Author manuscript; available in PMC: 2017 Jul 6.
Published in final edited form as: Nat Rev Genet. 2016 Nov 21;18(2):71–86. doi: 10.1038/nrg.2016.139

Regulatory activities of transposable elements: from conflicts to benefits

Edward B Chuong 1, Nels C Elde 1, Cédric Feschotte 1
PMCID: PMC5498291  NIHMSID: NIHMS860085  PMID: 27867194

Abstract

Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor binding sites and non-coding RNAs. A wealth of recent studies reinvigorates the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalyzed the evolution of gene regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic impact of regulatory activities encoded by TEs in health and disease.

Graphical abstract

Table of contents blurb: Transposable elements (TEs) are widely known for the pathological consequences of selfish propagation and mutagenesis. However, as described in this Review, TEs also provide hosts with rich, beneficial gene-regulatory machinery in the form of regulatory DNA elements and TE-derived gene products. The authors highlight the diverse regulatory contributions of TEs to organismal physiology and pathology, provide a framework for responsibly assigning functional roles to TEs and offer visions for the future.

Introduction

Transposable elements are ubiquitous in eukaryotic genomes, and persist through independent replication of their sequences. The idea that TEs play a fundamental role in the evolution of eukaryotic gene regulation reaches back 75 years to the seminal work of Barbara McClintock on ‘controlling elements’ of maize. She regarded these elements as “normal components of the chromosome responsible for controlling, differentially, the time and type of activity of individual genes”1. McClintock's perspective, although scorned by some of her contemporaries, was further developed by other pioneers, most notably Britten and Davidson in the late 1960s. Building on early insights into the complex repetitive nature of eukaryotic genomes2, which they correctly attributed to transposition activity, Britten and Davidson envisioned a model where the amplification of diverse repeat families in the genome could spread ‘pre-built’ regulatory elements to drive the evolution of gene regulatory networks3.

Half-a-century forward, we now appreciate that the movement and accumulation of TEs in genomes may be solely explained by their ‘selfish’ replication activities4,5 and other non-adaptive forces acting at the level of the host population, such as genetic drift [G], inexorably shaping genome architecture6. Many studies have documented the disruptive, and often deleterious impact of these activities, as well as the more ‘constructive’ influence of TEs in the evolution of chromosome structure and gene content. But to what extent the pervasive colonization of genomes by TEs has impacted the evolution of eukaryotic gene regulation remains a matter of speculation and controversy. At the heart of the debate lies a series of recent large-scale analyses of the genetic regulatory landscape of mammalian cells revealing the engagement of an unexpectedly large fraction of TE sequences in a wide range of regulatory processes and molecular interactions. These observations, now reported for a diverse range of organisms, have rejuvenated some of the original ideas proposed by McClintock, Britten and Davidson and repositioned transposition as a potent mechanism underlying the evolution of transcriptional gene networks in eukaryotes.

In this Review, we consider emerging evidence revealing TEs as a genome-wide source of regulatory elements. We also discuss recent advances in our ability to experimentally capture the regulatory activities of TEs, with a primary focus on their contribution as cis-regulatory DNA elements. We argue that most of these regulatory activities can be interpreted as relics of strategies used by TEs to spread within genomes and host populations. This simple view bypasses the need to invoke widespread adaptive consequences for the host, while recognizing the occasional long-term repercussions of TEs on genome evolution. We weigh current challenges in deciphering the biological impact and evolutionary relevance of TE-derived regulatory activities, drawing examples mostly from studies performed in mammalian species, but also from other eukaryotes including plants and insects. Finally, we propose how unbalanced control of TE-derived functions might result in transcriptional misregulation promoting disease states.

What predisposes TEs to cis-regulatory activity?

Evolutionary origins of TE cis-regulatory activities

Autonomous TEs encode genes that promote their replication independently of that of host chromosomes. However, as genomic parasites, TEs rely on host cell machinery to express their genes. Thus replication-competent TEs have evolved cis-regulatory sequences that function to mimic host promoters (Fig 1a). In long terminal repeat (LTR) elements [G], such as those of endogenous retroviruses (ERVs), cis-regulatory sequences [G] and RNA polymerase II (Pol II) promoters are present in duplicate within each of the two LTRs flanking the coding sequence of the elements8. By contrast, full-length long interspersed nuclear elements [G] (LINEs) possess an internal Pol II promoter located in their 5′ untranslated region (UTR)9 as well as an antisense promoter10. Other TEs collectively referred to as short interspersed nuclear elements [G] (SINEs) are derived from cellular genes transcribed by Pol III (e.g. tRNA or 7SL RNA). Thus, an intact, full-length SINE copy contains internal sequence motifs (A and B boxes) that are capable of recruiting Pol III9.

Figure 1. Origins of TE regulatory activities and how they may impact host genes.

a | Schematic of major transposable element (TE) classes and their typical genetic organization. The left panel depicts the idealized full-length version of each TE type. Most TEs harbour regulatory sequences that function to promote their own transcription and regulation, such as promoters for RNA polymerase II (Pol II) or RNA polymerase III (Pol III), and polyadenylation signals (PAS). The right panel shows the structure of each type of TE as it most commonly occurs in the genome, which can differ substantially based on the TE (see the main text for details). b | Diagram depicting different types of regulatory activities exerted by TEs. These include effects mediated by cis-regulatory DNA and RNA elements (right panel) as well as trans effects mediated by TE-produced non-coding RNAs and proteins (left panel). c | Hierarchy of evidence to consider when determining if a TE has been co-opted for host functions. Many TEs exhibit biochemical hallmarks of regulatory activity based on genome-wide assays. However, additional evidence is required to determine which of these TEs alter the regulation of host genes and affect organismal phenotype and fitness. Abbreviations: LTRs: long terminal repeats; ERVs: endogenous retroviruses; MITEs: miniature inverted-repeat transposable elements; sRNAs: small RNAs; TIR: terminal inverted repeats.

The diversity of TEs in their abundance, form, and replication mechanism greatly impacts the fate of the promoters and cis-regulatory elements they carry. For instance, the mechanism of LTR element replication dictates that each new insertion will introduce two exact copies of the LTR in the host genome8. However, following insertion, these elements often undergo ectopic recombination between their LTRs. These events result in the removal of the coding regions of these elements, but leave intact solitary LTRs, which contain the original cis-regulatory sequences. ERVs occupy ∼8% of the human genome but 90% exist as solitary LTRs11. By contrast, LINEs account for a larger fraction of the human genome (∼20%), but the vast majority suffered 5′ truncations upon insertion that removed their promoter sequences11. Similarly, DNA transposons [G], which generally transpose via a cut-and-paste mechanism, are mostly propagated as miniature inverted repeat TEs (MITEs), which arise from internal deletion derivatives of autonomous elements12. Predictably, many MITEs lack the promoter sequences of their parental element. In summary, not all TE insertions are ‘created equal’ in their ability to retain their original promoters and cis-regulatory elements (see also Fig 1a).

As with canonical host gene promoters, TE promoters often show spatially or temporally regulated activity that depends on cell type or responds to environmental cues such as stress or infection13,14. Although relatively few TE promoters have been extensively characterized, it is known that a variety of TEs exhibit regulated patterns of expression that are established by clusters of cis-regulatory sequences. Much like canonical promoters and enhancers, these sequences can recruit and ‘integrate’ specific combinations of host-encoded transcription factors9,15–22.

What evolutionary forces shaped the regulatory features of TE promoters? For new TE insertions to persist through vertical inheritance, transposition events must occur in the germline or prior to germline development, which results in strong selection for TEs to be transcriptionally active in the germline. Indeed, the expression of many TEs appears restricted to various stages of gametogenesis or early embryogenesis in both plants23 and animals24 (see Box 1). Paradoxically, some TEs also exhibit tightly regulated activity in somatic tissues in a variety of organisms23–26. Given that somatic transposition events are not trans-generationally inherited, activity in these tissues holds no immediately apparent benefit or consequence to the TE. It may be tempting to speculate that regulated somatic TE activity may reflect a symbiotic relationship between host and TE, but an alternative, yet not mutually exclusive explanation is that these somatic regulatory activities have been previously moulded by selection on the TE. For example, all ERVs originate from infectious retroviruses that must have infected germ cells (or their progenitors) but possibly also somatic cells from diverse tissues. The past viral lifestyle of ERVs may thus explain why their LTRs harbour complex regulatory elements that are capable of driving transcription in a variety of tissues and cell types. Regardless of the evolutionary forces driving somatic activity, it is clear that TEs have the potential to impact host gene regulation well beyond the germline and early embryo.

Box 1. A role for ERVs in regulating early mammalian development?

Genome-wide epigenetic silencing of transposable elements (TEs) is a pivotal step in early mammalian development, yet recent studies characterizing the transcriptomes of embryonic stem cells and early embryos have revealed surprisingly high transcriptional activity emanating from endogenous retrovirus (ERV) sequences. ERVs are amongst the first sequences to be transcribed during zygotic genome activation in mouse two-cell embryos39. Similarly, in human embryos, distinct families of primate-specific ERVs are expressed at each stage of pre-implantation development, and this activity ceases as cells differentiate into somatic cells 156,157. These ERVs are transcriptionally regulated by long terminal repeat (LTR) promoters that contain binding sites for transcription factors governing early development, such as OCT4 and NANOG24. Taken together, these findings indicate that ERVs are seemingly unleashed during pre-implantation development, with potentially widespread regulatory effects on the cellular transcriptome.

A major challenge now lies in understanding the biological consequences of ERV activity in embryonic development. RNA sequencing (RNA-seq) analyses have revealed that hundreds of LTRs drive expression of lncRNAs158 and protein-coding genes39,159, and chromatin immunoprecipitation followed by sequencing (ChIP-seq) analyses have revealed thousands more solitary LTRs that function as embryonic stem cell (ESC)-specific distal enhancers that may also influence host gene expression18. The expression of some host genes that are required for pre-implantation development may thus be regulated by ERV-derived promoters or enhancers.

The activity of some of these ERVs, such as the primate-specific HERV-H, is highly correlated with stem cell pluripotency139,159–161, and knockdown of long non-coding RNAs (lncRNAs) driven by HERV-H results in rapid differentiation into somatic cell types159,161–164. Further molecular investigation in human embryonic stem cells suggested that HERV-H transcripts may function as RNA scaffolds recruiting transcriptional activators including OCT4, Mediator, and p300164. Yet another lncRNA, named human pluripotency-associated transcript 5 (HPAT5) and derived from both a primate-specific HUERS-P1 ERV and an Alu element, was also discovered to promote pluripotency by functioning as a molecular sponge for the let-7 family of microRNAs (miRNAs)165 (Fig. 2a). Together these results indicate that ERV-derived lncRNAs are capable of modulating stem cell pluripotency, which may be important for proper development. Another study found that the protein Rec encoded by HERV-K binds to a multitude of host mRNAs and potentiates antiviral resistance in cell culture, raising the provocative hypothesis that HERV-K expression protects pre-implantation embryos from exogenous viral infection157.

These data provide mounting evidence that ERVs have been co-opted as essential regulators of human development. However, an important limitation to these studies is that they are typically performed in cultured embryonic stem cells; it therefore remains possible that functional phenotypes linked to ERV regulatory activity are not required for organismal development. One pioneering study used small interfering RNA (siRNA) injections to simultaneously knock down expression of three ERV-derived lncRNAs (HPAT2, HPAT3 and HPAT5) in human two-cell embryos, and found that cells depleted for these lncRNAs were no longer capable of contributing to the inner cell mass of the blastocyst165. This result suggests, but does not prove, that ERVs have been co-opted to regulate blastocyst development. Given the limited sample size and potential for off-target or transcriptional effects of siRNA silencing, further studies are necessary to conclusively establish a developmental role for human ERVs. Indeed it is possible that ERV activity in embryonic stem cells is not essential for development, but instead merely reflects the recent selfish exploitation of this developmental niche by retroviruses166.

Although the necessary experiments would be challenging with human material, mice similarly exhibit dynamic expression of ERVs during early embryonic development24. Whereas all ERVs with regulatory activity in human embryos are primate-specific, mouse embryos similarly exhibit rodent-specific ERVs that are expressed throughout early development39,158. If ERV activity is truly required for both human and mouse early development, it would imply a remarkable scenario whereby ERVs were independently co-opted in primates and rodents to regulate embryogenesis. Note there are precedents for such convergent TE co-option events167,168 (Box 3).

Non-random integration of TEs in the genome

As TEs evolved regulatory sequences that essentially mimic host cis-regulatory elements, it naturally follows that a TE insertion that lands in the vicinity of a host gene has a strong potential to interfere with its expression. Thus, where a TE initially inserts in the genome will often dictate its fate in the population. Accordingly, many TEs appear to have evolved mechanisms that favour integration into genomic regions that maximize their chance of propagation. Some TE families have evolved sophisticated molecular mechanisms to target genomic ‘safe havens’, such as gene-poor or heterochromatic regions, whereas others favour integration within areas of open, transcriptionally active chromatin27. For instance, Mutator elements in maize28, mPing MITEs in rice29, Tf1 retroelements in fission yeast30, and P elements in Drosophila31 all actively target the 5′ region of genes. Some retroviruses, including HIV-1, preferentially integrate within highly transcribed regions of the genome32. Presumably, this pattern of propagation ensures that newly integrated elements reside in a favourable environment for expression and transmission. Such elements are likely to have a stronger propensity for modulating adjacent gene expression and, if tolerated by natural selection, may be more prone to cis-regulatory co-option, possibly immediately upon insertion29. Finally, it is important to note that as time elapses after TE propagation, forces such as selection and genetic drift are likely to obscure initial integration preferences.

TEs are a rich source of regulatory activities

What is the impact of the widespread dispersal of cis-regulatory elements by TEs on genome regulation and cell function? To cope with the parasitic burden of TEs in their genomes, eukaryotes appear to have evolved multi-layered mechanisms to prevent TE transcription23,33,34. Consequently, the bulk of TE-derived DNA is generally assumed to be silenced and biochemically inert in most cells. Yet examples of host genes driven by TE promoters have been documented in diverse species over the past several decades15,22,35–37. Do these cases simply represent rare events where individual TEs have escaped host silencing, or are they hints of pervasive regulatory activity?

A large body of literature has illustrated the myriad mechanisms by which TEs can alter host gene expression both in cis and in trans, transcriptionally or post-transcriptionally15,38 (Fig 1b). Whereas most of these mechanisms were initially discovered through the genetic analysis of mutations caused by de novo TE insertions, genomic studies have revealed that these activities permeate throughout the genome and often emanate from TEs that have long been fixed in the host population. In particular, genome-wide assays such as those mapping transcriptional activity, open chromatin, or binding of transcription factors, have provided compelling evidence that a multitude of TE-derived sequences exhibit the biochemical hallmarks of active regulatory elements. Together these studies indicate that TEs are not as robustly or systematically silenced as commonly assumed.

Notably, transcriptome analyses have uncovered TEs as an abundant source of tissue-specific and/or alternative promoters. Two principal methods have allowed the mapping of TE-derived promoters at a genome-wide scale. First, deep sequencing of full-length cDNAs (a form of RNA sequencing (RNA-seq [G])) revealed that chimeric transcripts that initiate within TE-derived promoters constitute a considerable fraction of mammalian transcriptomes, notably during early development39 but also in many other mammalian tissues40. Second, approaches mapping sites of transcription initiation, such as cap analysis of gene expression followed by sequencing (CAGE-seq [G]), revealed surprising amounts of Pol II initiation within TEs in a variety of human and mouse cell types and tissues13 (e.g. up to 30% of all transcription start sites mapped in human embryonic cells) as well as in Drosophila melanogaster embryonic development41. There is also growing evidence that TEs are a substantial source of promoter activity in plants37. Thus TEs appear to have dispersed vast numbers of developmentally regulated promoters in a wide range of species that often remain active and drive the transcription of adjacent DNA in a tissue- or stage-specific fashion.

Evidence is mounting that TEs provide a profusion of other types of cis-regulatory elements including enhancers, insulators, and repressive elements. In mammals, chromatin immunoprecipitation followed by sequencing (ChIP-seq [G]) studies have revealed that for any given transcription factor (TF) and cell type examined, TEs contribute a substantial fraction of binding sites across the genome (5-40%, average ∼20%)16,20. LTR elements tend to contribute more than other TE types15,22,42, probably because LTRs are more likely to possess and retain their ancestral cis-regulatory activities as explained above (Fig. 1a). Importantly, for a given TF the majority of TE-derived binding sites are contributed by a relatively small number of specific TE families that are highly enriched for TF binding events relative to what would be expected based on the density of these TE families in the genome18,20,43,44. In most cases for which the origin and sequence of the binding sites have been examined, it could be inferred that a canonical binding motif pre-existed within the ancestral TE sequence and was subsequently dispersed through the genome via transposition – a scenario consistent with a ‘copy-and-paste’ model of TF binding site dispersion15.
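The notion of a TE family being "enriched for TF binding events relative to what would be expected based on the density of these TE families in the genome" can be made concrete with a simple observed-over-expected ratio. The following is a minimal sketch only: the function name, its parameters, and the example numbers are hypothetical illustrations, not values or methods taken from the studies cited above, and it assumes that peaks would otherwise fall uniformly across the genome.

```python
def fold_enrichment(peaks_in_family, total_peaks, family_bp, genome_bp):
    """Observed/expected ratio of TF binding peaks within one TE family.

    Assumption (not from the source): under the null model, peaks land
    uniformly across the genome, so the expected count in a family is
    proportional to the fraction of the genome that family occupies.
    """
    if total_peaks <= 0 or family_bp <= 0 or genome_bp <= 0:
        raise ValueError("total_peaks, family_bp and genome_bp must be positive")
    expected = total_peaks * (family_bp / genome_bp)
    return peaks_in_family / expected


# Hypothetical numbers for illustration only: a family covering 0.1% of a
# 3 Gb genome that contains 500 of 20,000 peaks for some transcription factor.
ratio = fold_enrichment(
    peaks_in_family=500,
    total_peaks=20_000,
    family_bp=3_000_000,
    genome_bp=3_000_000_000,
)
print(f"fold enrichment: {ratio:.1f}")
```

Published analyses typically go further than this ratio, for example by testing significance against shuffled or mappability-matched backgrounds; the uniform null used here is a deliberate simplification.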

Interestingly, TE-derived TF binding sites tend to be lineage-specific. For instance, in an analysis of binding events for 26 pairs of orthologous TFs across two comparable human and mouse cell lines, >98% of >130,000 TE-derived peaks identified in each species were species-specific20. This result can be partly explained by the fact that the majority of binding events occur within TEs that have amplified after the split of the two species examined, i.e. primate- or rodent-specific TE families. Another factor is that more ancient TEs have accumulated many more neutral substitutions, leading to degradation of their ancestral TF binding sites. The differential decay of these ancestral TE sequences across species may also result in species-specific transcription factor binding events. In any case, these data suggest that transposition represents a common mechanism for the gain of novel TF binding sites during mammalian evolution.
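The effect of age on motif decay described above can be illustrated with a back-of-the-envelope calculation. This is a rough sketch under loudly stated assumptions that are not made in the source: substitutions hit each site independently and at the same rate, and any single substitution within the motif abolishes binding. The function name and the divergence values are hypothetical.

```python
def motif_intact_probability(per_site_divergence, motif_length):
    """Probability that every position of a motif is still unchanged.

    Assumptions (illustrative only): sites diverge independently, each with
    probability `per_site_divergence`, and the motif only works if all
    `motif_length` positions are untouched.
    """
    if not 0.0 <= per_site_divergence <= 1.0:
        raise ValueError("per_site_divergence must be between 0 and 1")
    if motif_length < 0:
        raise ValueError("motif_length must be non-negative")
    return (1.0 - per_site_divergence) ** motif_length


# Hypothetical divergence levels for a recently amplified family and an
# ancient one, applied to a 10-bp binding motif.
for label, divergence in (("young family", 0.02), ("old family", 0.20)):
    p = motif_intact_probability(divergence, motif_length=10)
    print(f"{label}: P(motif intact) = {p:.3f}")
```

Real binding sites tolerate some substitutions and can also be gained de novo, so this toy model overstates decay; it is meant only to show why older TE copies are less likely to retain an ancestral site.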

TEs have also been documented to function as insulator and/or boundary elements. Such sequences serve to partition the genome into domains of active or inactive transcription ranging in size from 100 kb to 1 Mb, often by preventing the spread of heterochromatin45. Several studies showed that many TEs, in particular SINEs, harbour binding sites for factors such as CTCF or TFIIIC that confer insulator activity and organize nuclear architecture46–48. A subset of these TEs appears to play roles in the three-dimensional organization of the genome by functioning as ‘anchors’ that serve to isolate regions of active transcription. Indeed, studies investigating intra- or inter-chromosomal interactions underlying these topologies at a genome-wide scale using chromatin interaction profiles have found that SINEs are enriched at the borders of these domains49–51. By providing a fertile source of binding sites for architectural factors, TEs may be important contributors to high-order genomic organization, which governs the transcriptional regulation of large chromosomal regions containing many genes.

In addition to providing cis-regulatory DNA elements, TEs have also been documented to contribute a wealth of non-coding regulatory RNA transcripts such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) that can modulate gene expression in cis or in trans (see Fig. 1b and Box 2).

Box 2. TEs as a source of non-coding regulatory RNAs.

In addition to providing cis-regulatory DNA elements, transposable elements (TEs) have also been documented to contribute non-coding regulatory sequences that modulate gene expression post-transcriptionally. Best documented are TE-derived regulatory sequences embedded within the untranslated regions (UTRs) of protein-coding genes. For instance, Alu elements inserted within 3′ UTRs can provide recognition sequences for Staufen 1-mediated mRNA decay169 or influence mRNA localization170, whereas Alu elements incorporated within 5′ UTRs often modulate mRNA translation efficiency171.

TEs are also major contributors to the evolutionary origination and biogenesis of various regulatory RNAs that regulate gene expression either transcriptionally or post-transcriptionally, including microRNAs (miRNAs)172, long non-coding RNAs (lncRNAs)139,140, and circular RNAs (circRNAs)173,174. In human, mouse and zebrafish, more than two thirds of lncRNAs were found to contain exonic TE sequence, whereas TE sequences seldom occur in protein-coding transcripts139,140. Of course, many of these lncRNAs may have no biological function and could merely reflect the pervasive transcriptional activity of TEs. Their abundance could also reflect the greater capacity of lncRNAs to tolerate and assimilate TE insertions over evolutionary time15,70,175. Nevertheless, there is a growing number of examples where the sequences conferring regulatory activities to non-coding RNAs are directly derived from TEs70,164,165,175–177 (Box 1).

Many of the conceptual ideas developed in this Review for TE-derived cis-regulatory DNA elements can be applied to those acting at the RNA level. In particular, a cis-acting sequence (e.g. a recognition motif for an RNA-binding protein) that is present within an ancestral TE sequence will be seeded as the TE amplifies throughout the genome, opening the door for the co-option of multiple TE copies to assemble complex regulatory circuits modulating the expression of a vast number of genes15,176,178. Further details on the role of TEs in the evolution of regulatory RNAs and post-transcriptional gene regulation have been reviewed elsewhere38,70,175.

Finally, while we primarily focus this Review on regulatory activities encoded within TE sequences, TEs can also alter gene expression through many other effects, including the disruptive effects of their chromosomal integration. An interesting example is the adaptive insertion of a POGON1 DNA transposon within the 3′ UTR of the gene CG11699 in D. melanogaster52. The insertion disrupts the ‘normal’ (ancestral) polyadenylation signal of the gene resulting in a shorter 3′ UTR, elevated mRNA levels, and increased resistance to xenobiotic stress52. TEs may also exert a broad influence on genome regulation as a result of their targeting by host silencing pathways33 (Fig. 1b). In particular, the repressive chromatin nucleated at TE sequences may ‘spread’ to adjacent regions and silence the expression of nearby genes, a phenomenon which has been observed in several organisms including plants53 and mammals54.

The biological impact of TE regulatory activities

The work summarized above firmly anchors the idea that TEs are a prolific source of biochemical regulatory activity in host cells. These observations raise the question as to whether this phenomenon has exerted a significant influence on the biology and phenotypic evolution of species. Although it may be tempting to interpret the tightly-controlled regulatory activity originating from TEs as indicative of a widespread role in physiology or development, the biochemical manifestation of this activity does not alone signify that it is important for proper genome function55,56 (Fig. 1c). The terms ‘function’, ‘activity’ or ‘co-option’ are widely used yet can have different meanings depending on the field of study, and we caution that imprecise usage of these terms might confuse our understanding of the influence of TE regulatory activity on biological outcomes. Coming from an evolutionary perspective, we use the term “co-option”, which is also referred to as exaptation57 or domestication58, when a TE-derived sequence acquired a cellular function conferring a selectable phenotype that predictably increases the fitness of the host organism (Fig. 1c). Given that this review focuses on regulatory activity of TEs rather than their protein products, we will use the term ‘co-opted TEs’ to refer specifically to the co-option of DNA sequences within the TE that are responsible for cis-regulatory activity.

With the goal of advancing our current understanding of the biological importance of TE regulatory activities, we discuss the major challenges involved in disentangling effects that are simply relics of the selfish ‘behaviour’ of TEs with no benefit to the host from those that reflect bona fide co-option events resulting in adaptive cellular innovations.

Co-option of TEs revealed by evolutionary sequence conservation

How can evolutionarily co-opted, or ‘domesticated’, TEs be distinguished from TEs whose regulatory activity is non-adaptive and simply tolerated by host cells? One of the most direct lines of evidence that a DNA sequence has exerted biological function is to show that it has evolved under a regime of purifying selection [G]. Selective constraint on non-coding DNA is most effectively revealed by detecting a signature of evolutionary sequence conservation when compared across distantly related species, relative to DNA sequences that are assumed to be unconstrained and neutrally evolving. Such comparative genomic studies have revealed tens of thousands of non-coding TE fragments in the human genome that are orthologous and highly conserved across species and display clear signatures of purifying selection15,56,59. Although most of these sequences derive from relatively ancient TEs (which are old enough to be compared orthologously across distant species), they are of various ages and types. They are also unevenly distributed in the genome and tend to be enriched near transcription factor genes and other developmental genes. Together these data strongly suggest that TEs have indeed provided a rich source of non-coding sequence material fueling regulatory innovation during vertebrate evolution.

Striking examples of TEs co-opted for gene regulation identified by virtue of evolutionary conservation are those of the so-called ‘living fossil’ SINE (LF-SINE) family60. Multiple LF-SINE copies are highly conserved across tetrapods, and one was shown to possess tissue-specific enhancer activity in a mouse reporter assay [G] (Table 1). Another interesting example is a highly conserved mammalian-wide interspersed repeat (MIR) element that in mice serves as an intronic enhancer for the FOXP3 gene required for extrathymic generation of regulatory T cells, which reinforces maternal-fetal immunotolerance during pregnancy61. Additionally, two ancient TEs, a mammalian apparent LTR retrotransposon [G] (MaLR) and a SINE, were independently co-opted during mammalian evolution to serve as hypothalamus-specific enhancers that are required to regulate the proopiomelanocortin (POMC) gene, which functions in the brain to control food intake62. Examples like these highlight how evolutionary conservation of TEs can guide experiments validating regulatory functions.

Table 1.

Case examples of TEs functioning as cis-regulatory DNA elements.

| Species | Gene product | TE | Cis-regulatory activity | Function | Evidence | References |
| --- | --- | --- | --- | --- | --- | --- |
| Tetrapod vertebrates | ISL1 | LF-SINE (vertebrate SINE) | Neural enhancer | Brain development (inferred) | Sequence conservation, mouse reporter assay | 60 |
| Human | AIM2 | MER41 (primate ERV) | Interferon-inducible enhancer | Regulates inflammatory response | CRISPR knockout in human cell culture | 44 |
| Human | β-globin | ERV9 (primate ERV) | Erythroid enhancer | Controls developmental switch from fetal to adult globin | Cre-LoxP knockout in transgenic mouse BAC, chromatin looping (3C) | 182 |
| Mouse | Dicer | MT-C (rodent LTR retrotransposon) | Oocyte-specific promoter | Necessary for female oocyte function and fertility | TALEN knockout in mouse | 75 |
| Mouse | Growth Hormone | B2 SINE (mouse retrotransposon) | Insulator | Developmental regulation of growth hormone (inferred) | Enhancer blocking assay | 183 |
| Human | Prolactin | MER39 (ERV) | Endometrial-specific promoter | Prolactin expression during pregnancy | 5′ RACE on endometrial tissue | 184 |
| Fly (D. melanogaster) | Cyp6g1 | Accord (LTR retrotransposon) | Enhancer | Increased resistance to DDT insecticide | Selective sweep, genetic mapping, reporter assay | 185 |
| Fly (D. melanogaster) | Jheh1/2 | Bari1 | Antioxidant response elements (enhancer) | Increased resistance to oxidative stress | Selective sweep, phenotypic assays | 186 |
| Fly (D. simulans) | Slowpoke | Shellder (LTR retrotransposon) | Altered splicing | Courtship song variation | Trait mapping, in vivo CRISPR knockout | 187 |
| Maize | TB1 | Hopscotch (teosinte LTR retrotransposon) | Enhancer | Apical dominance branching pattern | Trait mapping, reporter assay in maize leaf protoplast | 77 |
| Sicilian blood orange | Ruby | Tcs1 (LTR retrotransposon) | Cold-responsive promoter | Required for cold-induced accumulation of anthocyanin | Trait mapping, 5′ RACE | 78 |

Further examples can be found in the Catalogue of genes affected by TEs (C-GATE) https://sites.google.com/site/tecatalog/ (Ref 36). 3C, chromatin conformation capture; BAC, bacterial artificial chromosome; ERV, endogenous retrovirus; LF-SINE, living-fossil SINE; LTR, long terminal repeat; RACE, rapid amplification of cDNA ends; SINE, short interspersed nuclear element; TALEN, transcription activator-like effector nuclease.

Although tens of thousands of TEs in the human genome contain sequences that have evolved under purifying selection59, as well as sequence motifs and biochemical signatures that are consistent with some form of cis-regulatory activity63,64, it is important to emphasize that, thus far, very few of these elements have been characterized using functional assays. Interestingly, a recent study65 identified a DNA segment that is highly conserved across eutherian mammals and derived from three distinct juxtaposed TEs (AmnSINE1, X6b_DNA, and MER117) that appears to function as an enhancer driving frontonasal expression of the crucial developmental gene Wnt5 in the mouse embryo. Surprisingly, however, mice carrying a homozygous deletion of the TE-derived enhancer did not show any obvious phenotypic defects, raising the paradox (not unprecedented66) that even sequences with a strong signature of evolutionary constraint and demonstrated cis-regulatory activity may only exert subtle effects on organismal development.

Uncovering recently co-opted TEs

Although sequence conservation is a strong predictor of biological function, there are several caveats that may preclude comparative genomics from identifying TEs that have been co-opted for host gene regulation. First, only relatively ancient TEs that have been co-opted for a sufficiently long evolutionary period will display a robust signature of sequence constraint67. These ancient TEs are technically challenging to recognize as such, especially in species with a rapid rate of evolution such as fruit flies (Box 3). In human and mouse, the vast majority of TEs exhibiting regulatory activity in functional genomics assays are non-orthologous and derive from relatively young insertions20,42. The recent origins of these elements make it difficult to assess whether they have evolved under functional constraint, although some methods have been developed in an attempt to address this issue68. The matter is worsened by the intrinsic shortness and degeneracy of the sequence motifs underlying regulatory activity, which increase the difficulty of detecting purifying selection acting on these motifs through sequence analysis alone69,70.

Box 3. Evolutionary dynamics of TE regulatory activities.

What is the mode and tempo of transposable element (TE) co-option for cis-regulatory function? Does TE co-option occur in major bursts that coincide with the invasion of TEs, or is it a more gradual process tapping into a reservoir of decaying TEs with various levels of predisposition?

The most straightforward path to co-option is when a TE confers an adaptive regulatory function immediately upon insertion (See the figure, part a, right). This scenario may conceivably be common because the cis-regulatory elements mapped within adaptive TEs are often inferred to have pre-existed at the time of their insertion in the genome44. Some of the adaptive TE insertions that recently swept in the Drosophila melanogaster population may represent examples of new cis-regulatory sequences gained by transposition179. It could also be that some TEs have immediate cis-regulatory effects that are independent of their own sequences, simply through their disruptive52 or epigenetic activities.

There is also evidence that a TE may become ‘latently’ co-opted long after it inserted in the genome (See the figure, part a, middle). Although many TEs possess built-in cis-regulatory sequences, these may remain silent and/or inconsequential for adjacent gene expression until additional mutations unmask or bolster their regulatory activity. Along those lines, Emera and Wagner proposed a model of ‘epistatic capture’ to describe the series of mutational events by which a TE was transformed into a strong promoter for decidual prolactin72. It is also conceivable that such epistatic mutational events occurring outside of the TE sequence but within its genomic neighbourhood could also promote latent co-option of the TE. A variety of mechanisms can be envisioned, such as point mutations introducing or removing other TF binding sites, as well as insertion or deletion events of cis-regulatory elements, genes, or even other TEs.

What is the evolutionary fate of a TE that has become co-opted for host function? As discussed in the main text, a recent study in Drosophila miranda found that dosage compensation on a newly evolved ‘neo-X’ chromosome (∼1 million years old) was mediated by the recent spread of an ISX transposon, which harbours a ‘recognition element’ motif for the male-specific lethal (MSL) dosage compensation complex103. Remarkably, the authors also found evidence that dosage compensation on the older XR chromosome (∼15 million years old) was partly established by an older expansion of a related TE family, named ISXR103. The older ISXR elements were characterized by stronger MSL recognition elements and decaying TE sequence. The authors also found evidence that the MSL recognition elements in the younger ISX elements have already undergone additional evolutionary ‘fine-tuning’ to optimize binding of the MSL dosage complex, facilitated by a gene-conversion-like process between copies180. These data suggest a two-step model of TE domestication, whereby elements with a regulatory predisposition upon insertion are further refined by post-insertional mutation and selection. Eventually, co-opted TEs will lose signatures of their TE origins as all but the most essential nucleotides are eroded by neutral substitutions103.

Another emerging insight is that the co-option of TEs as regulatory elements may be a surprisingly volatile process. Several cases have now been described whereby multiple species have independently, but convergently co-opted lineage-specific TEs to regulate the same gene. In mammals, this is exemplified by the prolactin gene72 and also the neuronal apoptosis inhibitory protein (NAIP) gene181. One potential explanation is that similar selective pressures drove independent co-option of TEs at the same locus in different species. However, it is also possible that the ancestral locus was also regulated by a TE, which has since been replaced by a new TE in each lineage. Thus, analogous to a gene duplication event, the insertion of a TE with similar regulatory properties might result in relaxed selection and eventual decay of a nearby co-opted TE, and over time this would perpetuate in a cycle of co-opted TEs being replaced by newer TEs (See the figure, part b). Such a turnover model could explain the prevalence of lineage-specific TEs associated with cellular processes that have deep evolutionary origins, such as pregnancy90 or stem cell development18.

Despite these hurdles, there is a growing set of recent TEs for which solid evidence has been gathered to support the idea that they were co-opted for important regulatory innovation36 (some examples are provided in Table 1 and Fig. 2). These are often cases in which the TE serves as a tissue-specific promoter for a gene with crucial function in that tissue. Indeed, these cases can be readily discerned by the detection of a tissue-specific gene transcript initiating within a TE sequence35. A classic example in mammals is the acquisition of salivary expression of the digestive enzyme amylase from a retroviral LTR inserted in the common ancestor of anthropoid primates71. Another interesting case is the endometrial expression of the hormone prolactin, which is critical for pregnancy in mammals72,73 (Box 3).
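The detection step mentioned above, finding transcripts whose start site falls within an annotated TE, amounts to an interval-overlap query. The following is a minimal sketch under stated assumptions: the `Interval` type, the function name, the coordinates and the element names are all hypothetical, and coordinates are taken to be 0-based with an exclusive end. Tissue specificity would still need to be established separately from expression data.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def te_derived_tss(
    tss_sites: List[Tuple[str, int, str]],
    te_annotations: List[Interval],
) -> List[Tuple[str, str]]:
    """Return (transcript, TE name) pairs where the TSS lies inside a TE."""
    hits = []
    for chrom, pos, transcript in tss_sites:
        for te in te_annotations:
            if te.chrom == chrom and te.start <= pos < te.end:
                hits.append((transcript, te.name))
    return hits


# Hypothetical annotations and transcription start sites.
te_annotations = [
    Interval("chr1", 1000, 1500, "LTR_example_1"),
    Interval("chr2", 200, 800, "SINE_example_1"),
]
tss_sites = [
    ("chr1", 1200, "transcript_A"),  # inside LTR_example_1
    ("chr1", 5000, "transcript_B"),  # outside any annotated TE
    ("chr2", 800, "transcript_C"),   # at the exclusive end, so not inside
]
print(te_derived_tss(tss_sites, te_annotations))
```

The nested loop is adequate for a toy example; at genome scale a sorted sweep or an interval tree would be the usual choice.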

Figure 2. Examples of phenotypes driven by TE regulatory activity.

a | The human pluripotency-associated transcript 5 (HPAT5) is a long non-coding RNA (lncRNA) that is derived from a composite of a HUERS-P1 endogenous retrovirus (ERV) element and an Alu short interspersed nuclear element (SINE). In all figure parts, wavy lines indicate pre-mRNA transcripts and angled lines indicate spliced introns. HPAT5 regulates the let7 family of microRNAs through let7 binding sites carried by the Alu element, and was shown to be essential for inner cell mass formation during early embryonic development165. b | A MER41 transposable element (TE) provides an interferon-inducible enhancer upstream of the human AIM2 gene, which regulates inflammation in response to infection44. c | In the peppered moth, a polymorphic Carbonaria TE insertion within an intron of the Cortex gene enhances Cortex expression levels (dotted line indicates an uncharacterized regulatory mechanism), which underlies adaptive cryptic colouration during the industrial revolution83. d | In oil palm, sporadic demethylation of a Karma TE within an intron of the MANTLED gene results in unmasking of a cryptic splice acceptor site and premature termination signal, causing the mantled fruit phenotype144. Left panel depicts genomic locus (not drawn to scale). The fruit images in part d are reproduced from ref 144.

In contrast to TEs with promoter activity, the function of TEs located distal to genes is more challenging to discern and validate. Therefore, far fewer examples of TE-derived enhancers with clear biological roles have been documented36. A traditional experimental approach to delineate the cis-regulatory effects of a TE is a reporter assay. If the reporter expression pattern driven by the TE recapitulates that of the endogenous gene associated with the TE, it may be reasonable to conclude that the TE contributes to the regulation of the endogenous gene in vivo60,74. However, these experiments are still limited by the fact that they dissociate the TE from its native chromosomal context and cannot establish a direct causal link between the cis-regulatory activity of the TE and endogenous gene expression.

A more conclusive approach to assess the effects of an individual TE on host gene regulation is to perform a loss-of-function experiment. The recent development of precise genome editing technologies has begun to facilitate such experimental knockout studies. In one study, transcription activator-like effector nuclease (TALEN)-mediated deletion of a mouse MT element providing an alternative promoter for the Dicer1 gene led to oocyte malfunction and female sterility75. In another study, CRISPR-mediated deletion of a MER41 ERV-derived interferon-inducible enhancer associated with the AIM2 gene in human cells impaired the innate immune response to viral infection44 (Fig. 2b). In a third example, a screen for cis-elements regulating the pregnancy-specific histocompatibility gene HLA-G, combining massively parallel reporter assays and CRISPR genome editing, identified an LTR71-derived distal enhancer required for HLA-G expression in the placenta76.

TE-mediated cis-regulatory variation and the domestication of species

TEs were first discovered in maize and have since been characterized as a major component of other crops. Intriguingly, there is mounting evidence that many traits associated with the domestication of crop plants evolved through artificial selection of TEs with cis-regulatory effects on adjacent host genes14,37. In maize for instance, a Hopscotch retrotransposon insertion serves as an enhancer for the teosinte branched1 gene underlying increased apical dominance relative to its ancestral wild relative teosinte77. Another study examining blood oranges identified a copia-like retrotransposon that functions as a cold temperature-inducible enhancer of the ruby locus that modulates fruit color78 (Table 1).

Furthermore, there is evidence that regulatory variation introduced by TEs has facilitated the domestication of animals. A study investigating the domesticated silkworm found that the insertion of a Taguchi LINE enhances expression of the ecdysone oxidase (EO) gene, which inhibits premature metamorphosis of silk-producing pupae79. Additionally, several TE insertions with cis-regulatory effects have been associated with traits selected in domestic dog and cat breeds, such as small body size80 and coat color pattern81,82.

A recent report83 identified an intronic insertion of a TE that enhances the expression of the cortex gene, and this increased expression was found to underlie industrial melanism in the English peppered moth (Fig. 2c). In this textbook example of adaptation, peppered moths were not domesticated per se, but were subject to strong artificial selection by coal pollution during the industrial revolution.

Together these findings suggest that TEs frequently play pivotal roles facilitating plant and animal domestication, which is characterized by artificial selection for specific traits often modulated through changes in gene expression84.

TEs and the evolution of gene regulatory networks

Whereas most previous studies focused on individual cases of a particular TE regulating a single host gene, it is theoretically possible that TEs exert a more extensive influence in regulatory evolution by providing the ‘wiring’ connecting large gene batteries. Such regulatory networks coordinate the expression of multiple gene products that act in concert to control entire pathways and complex biological processes. The assembly of new gene regulatory networks is thought to underlie major evolutionary innovations including the emergence of new morphological structures and cell types85. There is mounting evidence that modification of the cis-regulatory architecture underlying the transcriptional control of genes is an important force driving the evolution of new networks86. Transposition has long been proposed as an attractive mechanism to facilitate the concurrent mutational events that are required to deeply remodel cis-regulatory architecture3,15.

This model has gained support from recent studies linking the expansion of certain TE families with the dispersal of a regulatory module that is important for the execution of a specific developmental program. For instance, MER20 transposons appear to have deposited numerous cAMP-inducible enhancers specifying endometrial gene expression in the eutherian mammal ancestor, which coincided with the emergence of mammalian pregnancy over 100 million years ago63. Another example is the dispersal of hundreds of enhancers with forebrain-specific activity in the mouse through the ancient expansion of the MER130 family, which may have been associated with the evolution of the mammalian neocortex64. There is also evidence supporting a role for lineage-specific TEs in driving more recent adaptive evolution of gene regulatory networks. For example, the murine-specific RLTR13 family of ERVs dispersed hundreds of placenta-specific enhancers within the past 15-25 million years87, which may reflect and possibly directly contribute to the rapid morphological diversification of the mammalian placenta88.

Functional evidence for network rewiring by TEs

These recent findings support a model whereby waves of TE activity deposited the raw material for large-scale cis-regulatory changes underlying major evolutionary innovations. However, current evidence is mostly limited to correlative observations based on chromosomal association of TEs with nearby genes encoding evocative functions63,64,87,89,90. Although these associations are tantalizing, they should be interpreted with caution. An alternative, but not mutually exclusive interpretation is that TEs inserted within or near genes that are highly transcribed in a given cell type are more likely to be accessible to regulatory proteins (e.g. TFs) expressed in those cells. First, because TEs may have an intrinsic preference for integration within ‘open’ transcriptionally permissive chromatin91,92 and must also replicate in the germline for new insertions to be inherited, these properties may introduce a biased association with genes that are highly expressed in the germline and early embryonic cell types93–95. Second, there is probably some level of selection against insertions that significantly alter host gene expression patterns upon insertion, meaning that most TEs retained in the genome might cause minimal changes in host gene expression96–102. This bias would also favour the retention of TEs near or within genes that share similar expression profiles, as these insertions are less likely to significantly alter existing gene expression patterns. Thus, although genomic analyses often appear to support a causal role for TEs in shaping regulatory networks, these data alone cannot falsify the null hypothesis that a given TE has no regulatory effect on a nearby gene, even when exhibiting biochemical signatures that are consistent with cis-regulatory activity56.
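The integration-bias caveat can be made concrete with a toy calculation. Assume, purely for illustration, that a fraction f of the genome lies in open chromatin near highly expressed genes and that insertions are b times more likely per base to land there than elsewhere. Both parameters, and the function name, are assumptions of this sketch and do not come from the cited studies. The expected enrichment of TEs near those genes is then b / (b·f + 1 − f), even if no TE has any regulatory effect.

```python
def expected_enrichment(open_fraction: float, insertion_bias: float) -> float:
    """Expected fold enrichment of insertions in open chromatin under a
    null model in which TEs have no regulatory effect at all.

    open_fraction  -- share of the genome in open chromatin (0 < f < 1)
    insertion_bias -- relative per-base insertion preference for open
                      chromatin (b = 1 means no preference)
    """
    if not 0.0 < open_fraction < 1.0:
        raise ValueError("open_fraction must be strictly between 0 and 1")
    if insertion_bias <= 0.0:
        raise ValueError("insertion_bias must be positive")
    weighted_open = insertion_bias * open_fraction
    fraction_in_open = weighted_open / (weighted_open + (1.0 - open_fraction))
    return fraction_in_open / open_fraction


# Illustrative values only: 10% of the genome open, 3-fold preference.
print(expected_enrichment(0.1, 3.0))
```

With these illustrative values the null model already predicts roughly 2.5-fold enrichment, which is why an observed association needs to be compared against a background that accounts for insertion preference before it is read as evidence of co-option.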

A few recent studies have gone a step further in testing more directly the role of TE co-option in the evolution of gene regulatory networks by using experimental genetic manipulation. In a study investigating the interferon-inducible gene regulatory network, CRISPR-mediated deletion of four separate primate-specific MER41 elements impaired the expression of adjacent genes with known innate immune functions44. Thus the co-option of multiple MER41 elements with the same regulatory properties at several genomic loci appears to have facilitated the establishment of a coordinated transcriptional response to infection during primate evolution.

Another genetic network where TEs have been implicated is the cis-regulatory circuit enabling dosage compensation on sex chromosomes. Studies of the emergence of dosage compensation on the newly evolved (<1 million year old) ‘neo-X’ chromosome of the fruit fly Drosophila miranda showed that an ISX transposon was responsible for spreading dozens of binding sites for the dosage compensation machinery103 (Box 3). These ISX elements are strikingly enriched on the X chromosome, and experimental insertion of an ISX element into an autosomal chromosome was sufficient to recruit the dosage compensation complex to this ectopic site. Together with data indicative of the involvement of TEs in mammalian dosage compensation104,105, these findings suggest that transposition played a recurring role in the evolution of sex chromosomes by enabling the rapid copying-and-pasting of cis-regulatory sequences along entire chromosomes.

A growing number of studies also suggest a potential role for TEs in rewiring gene regulatory networks orchestrating early mammalian development24. Although the biological relevance of these findings remains to be clarified, these reports raise the exciting possibility that ERVs have been repeatedly co-opted during mammalian evolution to remodel the complex genetic circuits underlying early embryonic development (Box 1).

Pathogenic effects of TE regulatory activities

In line with their parasitic origins and selfish behaviour, TEs have long been associated with mutant phenotypes and disease. TEs are well documented to cause disease through two primary mechanisms: insertional mutagenesis and chromosomal rearrangements. For example, de novo germline TE insertions disrupting normal gene function have been implicated in over a hundred human inherited diseases106. Furthermore, both transposition and TE-mediated chromosomal rearrangements in somatic cells have been causally linked to several types of cancer107. Here we explore how the regulatory activities of TEs may also represent an underappreciated source of disease phenotypes.

TEs as pathogenic regulatory variants

Cis-regulatory variation is increasingly recognized as an important factor influencing disease susceptibility108. As TE insertions represent a common form of structural variation [G] in human genomes109, it is plausible that some of these polymorphic insertions contribute to disease risk by modulating the expression of adjacent genes. This idea has yet to be investigated systematically, but a plausible example recently surfaced from the discovery of a link between the complement C4 system and schizophrenia risk110. The authors found that individuals carrying a polymorphic ERV intronic insertion in the C4 gene have elevated C4 expression, which in turn was found to cause synapse over-pruning, which is a phenotype that is associated with schizophrenia110. Although the evidence linking ERV regulatory activity to disease remains indirect, this case is intriguing in light of previous observations of an association between schizophrenia and elevated ERV transcriptional activity111.

Causes and consequences of TE reactivation

TEs that are fixed in the human population, but encode dormant regulatory sequences, may also contribute to pathogenesis. It has been widely observed that TE transcript levels are significantly increased in numerous cancers and other disease states112,113. The reasons why particular TEs appear to be upregulated in certain disease conditions remain poorly understood. Recent data suggest that environmental stimuli, including infection114 and cellular stress115, as well as natural cellular processes such as senescence116–118, destabilize epigenetic marks that normally silence the bulk of TEs in the genome, thereby triggering their sporadic transcriptional activation.

Does the derepression of TEs play a major role in driving disease states, or is it simply a side effect of pathogenesis113,119,120? There are several routes by which inappropriate transcriptional activation of TEs might have pathogenic consequences for the host (Fig. 3a). First, this activity may result in increased rates of propagation through transposition. Indeed, DNA hypomethylation and transcriptional reactivation of replication-competent LINE-1 copies seem to explain why some tumours, particularly epithelial ones, may exhibit an increased rate of transposition relative to matched healthy tissues121–123. The role of somatic insertional mutagenesis and its contribution to tumorigenesis and malignancy is an area of considerable interest119. Indeed, recent studies have demonstrated that de novo LINE-1 insertions can activate oncogenic pathways in hepatocellular carcinoma124 and colorectal cancer125. Activation of the LINE-1 transposition machinery may have additional pathological consequences, such as the infliction of double strand breaks in the genome through LINE-1-encoded endonuclease activity126.

Figure 3. TEs can be aberrantly unmasked to promote disease states.

a | Epigenetic perturbations can result in global transposable element (TE) reactivation and pathogenic consequences. The repressive epigenetic marks that normally silence TE transcription include DNA methylation and binding by KRAB zinc finger (ZNF) transcription factors, and these can be depleted upon various stresses. The reactivation of TE sequences across the genome can result in a wide variety of pathogenic consequences, including genome instability via transposition, other pathogenic activities of TE-encoded peptides or non-coding RNAs, and cellular toxicity due to buildup of RNA or cDNA intermediate molecules. b | Studies examining B-cell lymphomas have revealed oncogenic activation of colony-stimulating factor 1 receptor (CSF1R) (driven by a THE1B promoter in Hodgkin Lymphoma)135, fatty acid-binding protein gene (FABP7) (driven by an LTR2 promoter in diffuse large B-cell lymphoma)136, and interferon regulatory factor 5 (IRF5) (driven by a LOR1a promoter in Hodgkin Lymphoma)137. All these examples involve long terminal repeat (LTR) or endogenous retrovirus (ERV) elements, highlighting again the proclivity of this class of TEs to retain potent but generally repressed cis-regulatory activity in the human genome.

It is important to keep in mind that the vast majority of TEs that are transcriptionally activated in disease (such as ERVs) are no longer replication competent, and as such are unable to generate new transposition events. However, there are other potential routes for an excess of TE-encoded transcripts to lead to pathogenesis (Fig. 3a). For example, some of these transcripts may encode peptides with adverse cellular activities. Overexpression of ERV envelope proteins, as seen in the brain of patients affected with some neurodegenerative and autoimmune disorders, can induce a wide range of cellular processes and abnormalities associated with these pathologies, such as neurodegeneration127, autoinflammation128, demyelination129, and superantigen activity130. Furthermore, nucleic acids derived from activated TEs that accumulate in the cytoplasm, including double-stranded RNA, reverse transcribed cDNA, or RNA–DNA hybrids, are increasingly regarded as potent immunological ‘adjuvants’ that may trigger autoimmune responses131,132 and other toxicities when present in elevated quantities133,134.

Misregulation of host genes by TEs

Perhaps most importantly, and in line with the major theme of this Review, the reactivation of TEs may promote disease states indirectly by altering host gene expression. TEs that are normally silenced by DNA methylation may, once derepressed, exert regulatory activity that could cause global dysregulation of host genes in cis or trans (through the various mechanisms illustrated in Fig. 1b). Thus ectopic activation of dormant TE cis-regulatory sequences may result in the pathogenic activation of genes or pathways in some cells. Although evidence supporting this model remains limited, there is a growing number of studies showing how derepression of a particular TE copy activates transcription of an adjacent proto-oncogene135–137 (Fig. 3b). Whereas the loss of regulatory control at these specific elements may be a rare stochastic event occurring in a small subset of cells, it is possible that this process could be favoured by selection during tumour evolution. In this model, termed ‘onco-exaptation’, individual cells where an oncogenic TE is aberrantly unmasked acquire a fitness advantage over other cells as a result of altered oncogene expression138. This process would favour clonal propagation of cells where the TE is unmasked and perpetuate tumour growth.
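The selection argument behind onco-exaptation can be sketched with a deliberately simple two-population model. This is an illustrative assumption, not a model from the cited work: cells with the TE unmasked are given relative fitness 1 + s, all other cells fitness 1, generations are discrete, and no further unmasking events occur. The function name and parameter values are hypothetical.

```python
from typing import List


def unmasked_fraction(p0: float, s: float, generations: int) -> List[float]:
    """Trajectory of the fraction of cells carrying an unmasked TE.

    p0          -- initial fraction of cells with the TE unmasked
    s           -- selective advantage of those cells (fitness 1 + s)
    generations -- number of discrete cell generations to simulate
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must be between 0 and 1")
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        # Mean fitness of the population is 1 + p * s.
        p = p * (1.0 + s) / (1.0 + p * s)
        trajectory.append(p)
    return trajectory


# Hypothetical values: 1 in 10,000 cells unmasked, 10% growth advantage.
trajectory = unmasked_fraction(1e-4, 0.1, 100)
print(trajectory[0], trajectory[-1])
```

Under these assumed values a clone that starts as a rare event comes to make up more than half of the population within 100 generations, which is the qualitative behaviour the onco-exaptation model relies on. With s = 0 the fraction does not change.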

In addition to altering the expression of adjacent protein-coding genes, reactivated TEs can also drive widespread expression of non-coding RNAs, which themselves are largely derived from TE sequences139–141. Many of these transcripts are likely to be non-functional, but there is evidence that some have oncogenic properties. For example, the lncRNA BANCR is specifically expressed in melanoma cells and promotes the proliferation and migration of these cells in culture142. Interestingly, BANCR exons are largely derived from a MER41 ERV insertion, and its promoter is derived from the MER41 LTR140. A similar example is EVADR, which is a lncRNA derived from the ERV MER48 that is recurrently overexpressed in adenocarcinomas143. Further studies are warranted to determine the mechanisms by which these TE-derived transcripts might contribute to tumorigenesis.

The regulatory activities of TEs are also emerging as drivers of pathogenesis in non-human species. In the oil palm for example, a fixed Karma TE insertion within an intron of the gene MANTLED is normally methylated, but sporadic demethylation of the region provides an alternative splice site and a premature termination signal for MANTLED144 (Fig. 2d). This epigenetic dysregulation was found to underlie the spontaneous production of deformed oil palm fruits by genetic clones144.

The double-edged sword of co-opting TEs

How can potentially pathogenic TEs persist in host genomes and not be purged by negative selection? In some cases it might just be a matter of time until TE insertions segregating at low frequency in the population are eventually eliminated. It is also possible that the pathogenic activity of certain TEs is only unmasked in a small number of individuals, or exerts only weak or post-reproductive effects, and therefore imposes an insufficient fitness cost to be purged by natural selection. Another possibility is that some TEs play an adaptive role, but occasional misregulation acts as a negative, disease-causing side effect of co-option. Under such a scenario co-opted TEs might be viewed as ‘double-edged swords’ in host evolution: beneficial in some situations or individuals, but detrimental in others.

Our recent discovery of a MER41 element conferring interferon-inducibility to the AIM2 gene44 (Fig. 2b) may represent an example of a double-edged TE. AIM2 encodes an important immunity factor that also acts as a potent tumour suppressor145. Constitutive transcriptional dysregulation of AIM2 (either up or down) has been recurrently observed in cancer and autoimmune diseases, although the mechanisms underlying this misregulation are not well understood146. In one study examining colorectal cancer cells, where AIM2 is constitutively silent and unresponsive to interferon, the upstream promoter region (∼700 bp upstream of the AIM2 gene) was found to be consistently hypermethylated in cancer samples147. Intriguingly, this region coincides with the location of the MER41 element, suggesting that aberrant methylation of this TE might account for silencing of AIM2 in the cancer cells.

Conclusions and future perspectives

The past decade was marked by tremendous progress in our understanding of how TEs shape genome evolution. This progress was propelled in large part by advances in technologies enabling genome-scale analysis. Genome-wide surveys revealed TEs as a substantial source of cis-regulatory elements in diverse eukaryotic species, lending credence to ideas pioneered decades ago by McClintock, Britten and Davidson, among others. It is now apparent that TEs evolved many complex mechanisms and biochemical activities that, to various extents, predispose them to transitions from parasitic elements to integral components of host gene regulation; ‘from conflicts to benefits’. A pressing challenge is to obtain more direct assessments of the biological consequences of the regulatory activity of TEs, which in turn will provide a better understanding of the long-term evolutionary implications of TE dispersal. Despite decades of genetic analyses of mutant and disease phenotypes illustrating how TE insertions and rearrangements can alter host gene expression patterns in many ways, there is still relatively little experimental evidence that TEs have promoted evolutionary innovations in organismal development and physiology.

Spurred by new tools for direct genetic manipulation such as the CRISPR-Cas9 system, recent studies add to the small but growing list of TEs that have been unambiguously co-opted for regulating cellular functions. A major outstanding task is to gain a better grasp on the role of TE co-option in driving the evolution of gene regulatory networks on a broader scale. Recent work suggests that TEs can extensively remodel regulatory networks that are involved in specific processes including dosage compensation103, immunity44, and early embryonic development24. Other comparative studies of enhancer evolution in the liver148 and the neocortex149 among mammals found only a minor contribution for host co-option of TE activity. These observations raise the question of whether TE-mediated regulatory evolution may be inherently biased for certain biological processes, or if it is a more general mechanism for large-scale genetic innovation.

Phylogenetic and other retrospective approaches for assessing the function of ancient TEs will also provide insights into how more-recent TEs have shaped gene regulatory networks in modern species. Here one of the major hurdles is to reach back to events of the distant past to study the specific steps leading to TE co-option, which are obscured by millions of years of evolution (Box 2). Therefore, in addition to these types of retrospective studies, investigations of ongoing adaptation in populations150,151 and of real-time experimental evolution152,153 will shed light on how TEs can facilitate regulatory evolution in response to defined selective pressures during the earliest stages of adaptation.

There is also growing evidence linking aberrant TE regulatory activity to disease states. Although only a few examples of pathogenic TE misregulation contributing to cancer have been documented so far, TE reactivation is clearly associated with other pathological conditions including ageing154, neurological disorders155, and autoimmunity133. Our mechanistic understanding of the causes and consequences of such ectopic TE activation for disease aetiology or progression is still in its infancy. As technological advances continue to increase our ability to functionally test the involvement of non-coding regulatory processes in pathogenesis, new avenues will open to assess the role of TE-mediated gene dysregulation in disease.

Key points.

  • Transposable elements (TEs) are increasingly recognized as a potent source of regulatory sequences in eukaryotic genomes.

  • The selfish replication cycle of TEs drove the evolution of finely tuned regulatory activities that favoured their propagation and has predisposed them to be co-opted for the regulation of host genes.

  • There is a growing number of examples of TE-derived sequences that have been co-opted to regulate important biological processes in organismal development and physiology.

  • Dysfunction of TE-derived regulatory sequences is also emerging as a potential driver of diseases including cancer and autoimmunity.

  • Functional genomics and genome editing technologies herald an exciting era for understanding the biological impact of TEs.

Acknowledgments

We apologize to many colleagues who have produced primary research on the topic but could not be cited or discussed owing to space limitations. This work was supported by funds from the US National Institutes of Health (GM77582, GM112972, GM059290 to C.F. and GM114514 to N.C.E.). E.B.C. was supported by a Howard Hughes Medical Institute postdoctoral fellowship from the Jane Coffin Childs Memorial Fund. N.C.E. was supported by the Biomedical Scholars Program of the Pew Charitable Trusts.

Glossary

Genetic drift

A process by which mutations become fixed in the population by chance alone

Cis-regulatory sequences

Segments of DNA that regulate the transcription of adjacent genes

Long terminal repeat (LTR) elements

A class of retrotransposons containing direct long terminal repeats flanking the protein-coding sequence

Long interspersed nuclear elements (LINEs)

A class of non-LTR retrotransposons that retrotranspose by target-primed reverse transcription

Short interspersed nuclear elements (SINEs)

A class of non-autonomous retrotransposons that are copied by the LINE replication machinery

Retrotransposon

A type of transposable element that replicates through an RNA intermediate in a ‘copy-and-paste’ mechanism

DNA transposons

Transposable elements that do not generate an RNA intermediate during transposition, which generally occurs through a ‘cut-and-paste’ mechanism

CAGE-seq (Cap Analysis of Gene Expression followed by sequencing)

A method used to precisely map the transcription start sites of capped RNAs genome-wide

RNA-seq

High-throughput sequencing of complementary DNAs derived from RNAs extracted from cells or tissues

ChIP-seq (Chromatin immunoprecipitation followed by sequencing)

A method for identifying protein–DNA interactions genome-wide. Following crosslinking, a protein of interest is immunoprecipitated, and its binding sites in the genome are identified by high-throughput sequencing of the co-purified DNA fragments

Reporter assay

A putative cis-regulatory DNA sequence is cloned upstream of a reporter gene (such as luciferase) either in an episomal vector or as a chromosomally integrated construct and tested for its ability to enhance transcription of the reporter gene

Purifying selection

Selection against mutations that are deleterious to the fitness of the individual

Structural variation

Genomic variation resulting from large-scale DNA mutations such as deletions, insertions, or rearrangements

Biographies

Edward B. Chuong

Edward Chuong received his Ph.D. from Stanford University in 2013, where he worked with Julie Baker to study the impact of endogenous retroviruses on mammalian placenta development. Since 2013, he has been a Postdoctoral Fellow at the University of Utah School of Medicine, Salt Lake City, USA, working with Nels Elde and Cédric Feschotte to study how transposable elements influence gene regulatory evolution in host immunity.

Edward B. Chuong's homepage: http://controlelements.org

Nels C. Elde

Nels Elde is an Assistant Professor in the Department of Human Genetics at the University of Utah School of Medicine, Salt Lake City, USA, where he holds the Mario R. Capecchi Endowed Chair in Genetics. He trained as a cell biologist in the laboratory of Aaron Turkewitz at the University of Chicago, USA, and conducted his postdoctoral training in evolutionary genetics in the laboratory of Harmit Malik at the Fred Hutchinson Cancer Research Center in Seattle, USA. The Elde lab studies how microbes, pathogens, and related genetic elements influence the biology of cells and genomes.

Nels C. Elde's homepage: http://cellvolution.org/

Cédric Feschotte

Cédric Feschotte received his Ph.D. in 2001 from the University of Paris VI, France, during which he characterized mosquito transposable elements with Claude Mouchès. From 2000 to 2004, he was a Postdoctoral Associate in the group of Susan R. Wessler at the University of Georgia, Athens, USA, where he explored the origin and amplification mechanism of ‘MITE’ transposons in plant genomes. In 2004, he became an Assistant Professor in the Department of Biology at the University of Texas, Arlington, USA, and an Associate Professor in 2009. In June 2012, he joined the Department of Human Genetics at the University of Utah School of Medicine, Salt Lake City, USA, where he currently holds the title of Professor and runs a laboratory investigating the biological impact of mobile genetic elements.

Cédric Feschotte's homepage: http://feschotte.genetics.utah.edu/

References

  • 1. McClintock B. Intranuclear systems controlling gene action and mutation. Brookhaven Symp Biol. 1956:58–74.
  • 2. Britten RJ, Kohne DE. Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science. 1968;161:529–540. doi: 10.1126/science.161.3841.529.
  • 3. Britten RJ, Davidson EH. Repetitive and Non-Repetitive DNA Sequences and a Speculation on the Origins of Evolutionary Novelty. Q Rev Biol. 1971;46:111–138. doi: 10.1086/406830.
  • 4. Orgel LE, Crick FH, Sapienza C. Selfish DNA. Nature. 1980;288:645–646. doi: 10.1038/288645a0.
  • 5. Doolittle WF, Sapienza C. Selfish genes, the phenotype paradigm and genome evolution. Nature. 1980;284:601–603. doi: 10.1038/284601a0.
  • 6. Lynch M. The Origins of Genome Architecture. Sinauer Associates Incorporated; 2007.
  • 7. Britten RJ, Kohne DE. Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science. 1968;161:529–540. doi: 10.1126/science.161.3841.529.
  • 8. Mager DL, Stoye JP. Mammalian Endogenous Retroviruses. Microbiol Spectr. 2015;3:MDNA3–0009–2014. doi: 10.1128/microbiolspec.MDNA3-0009-2014.
  • 9. Richardson SR, et al. The Influence of LINE-1 and SINE Retrotransposons on Mammalian Genomes. Microbiol Spectr. 2015;3:MDNA3–0061–2014. doi: 10.1128/microbiolspec.MDNA3-0061-2014.
  • 10. Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Mol Cell Biol. 2001;21:1973–1985. doi: 10.1128/MCB.21.6.1973-1985.2001.
  • 11. Lander ES, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062.
  • 12. Feschotte C, Zhang X, Wessler SR. Miniature inverted-repeat transposable elements (MITEs) and their relationship with established DNA transposons. Mobile DNA II. :1147–1158.
  • 13. Faulkner GJ, et al. The regulated retrotransposon transcriptome of mammalian cells. Nat Genet. 2009;41:563–571. doi: 10.1038/ng.368.
  • 14. Lisch D. How important are transposons for plant evolution? Nat Rev Genet. 2013;14:49–61. doi: 10.1038/nrg3374.
  • 15. Feschotte C. Transposable elements and the evolution of regulatory networks. Nat Rev Genet. 2008;9:397–405. doi: 10.1038/nrg2337.
  • 16. Bourque G, et al. Evolution of the mammalian transcription factor binding repertoire via transposable elements. Genome Res. 2008;18:1752–1762. doi: 10.1101/gr.080663.108.
  • 17. Roman AC, Benitez DA, Carvajal-Gonzalez JM, Fernandez-Salguero PM. Genome-wide B1 retrotransposon binds the transcription factors dioxin receptor and Slug and regulates gene expression in vivo. Proc Natl Acad Sci U S A. 2008;105:1632–1637. doi: 10.1073/pnas.0708366105.
  • 18. Kunarso G, et al. Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nat Genet. 2010;42:631–634. doi: 10.1038/ng.600.
  • 19. Testori A, et al. The role of Transposable Elements in shaping the combinatorial interaction of Transcription Factors. BMC Genomics. 2012;13:400. doi: 10.1186/1471-2164-13-400.
  • 20. Sundaram V, et al. Widespread contribution of transposable elements to the innovation of gene regulatory networks. Genome Res. 2014;24:1963–1976. doi: 10.1101/gr.168872.113. The most comprehensive assessment to date of the contribution of TEs to the genomic landscape of transcription factor binding in human and mouse cells.
  • 21. Teng L, He B, Gao P, Gao L, Tan K. Discover context-specific combinatorial transcription factor interactions by integrating diverse ChIP-Seq data sets. Nucleic Acids Res. 2014;42:e24. doi: 10.1093/nar/gkt1105.
  • 22. Thompson PJ, Macfarlan TS, Lorincz MC. Long Terminal Repeats: From Parasitic Elements to Building Blocks of the Transcriptional Regulatory Repertoire. Mol Cell. 2016;62:766–776. doi: 10.1016/j.molcel.2016.03.029.
  • 23. Lisch D, Damon L. Regulation of transposable elements in maize. Curr Opin Plant Biol. 2012;15:511–516. doi: 10.1016/j.pbi.2012.07.001.
  • 24. Gerdes P, Richardson SR, Mager DL, Faulkner GJ. Transposable elements in the mammalian embryo: pioneers surviving through stealth and service. Genome Biol. 2016;17:100. doi: 10.1186/s13059-016-0965-5.
  • 25. Kazazian HH., Jr Mobile DNA transposition in somatic cells. BMC Biol. 2011;9:62. doi: 10.1186/1741-7007-9-62.
  • 26. Perrat PN, et al. Transposition-driven genomic heterogeneity in the Drosophila brain. Science. 2013;340:91–95. doi: 10.1126/science.1231965.
  • 27. Levin HL, Moran JV. Dynamic interactions between transposable elements and their hosts. Nat Rev Genet. 2011;12:615–627. doi: 10.1038/nrg3030.
  • 28. Liu S, et al. Mu transposon insertion sites and meiotic recombination events co-localize with epigenetic marks for open chromatin across the maize genome. PLoS Genet. 2009;5:e1000733. doi: 10.1371/journal.pgen.1000733.
  • 29. Naito K, et al. Unexpected consequences of a sudden and massive transposon amplification on rice gene expression. Nature. 2009;461:1130–1134. doi: 10.1038/nature08479.
  • 30. Leem YE, et al. Retrotransposon Tf1 is targeted to Pol II promoters by transcription activators. Mol Cell. 2008;30:98–107. doi: 10.1016/j.molcel.2008.02.016.
  • 31. Liao GC, Rehm EJ, Rubin GM. Insertion site preferences of the P transposable element in Drosophila melanogaster. Proc Natl Acad Sci U S A. 2000;97:3347–3351. doi: 10.1073/pnas.050017397.
  • 32. Schröder ARW, et al. HIV-1 integration in the human genome favors active genes and local hotspots. Cell. 2002;110:521–529. doi: 10.1016/s0092-8674(02)00864-4.
  • 33. Friedli M, Trono D. The developmental control of transposable elements and the evolution of higher species. Annu Rev Cell Dev Biol. 2015;31:429–451. doi: 10.1146/annurev-cellbio-100814-125514.
  • 34. Wolf G, Greenberg D, Macfarlan TS. Spotting the enemy within: Targeted silencing of foreign DNA in mammalian genomes by the Krüppel-associated box zinc finger protein family. Mob DNA. 2015;6:17. doi: 10.1186/s13100-015-0050-8.
  • 35. Cohen CJ, Lock WM, Mager DL. Endogenous retroviral LTRs as promoters for human genes: A critical assessment. Gene. 2009;448:105–114. doi: 10.1016/j.gene.2009.06.020.
  • 36. Rebollo R, Farivar S, Mager DL. C-GATE - catalogue of genes affected by transposable elements. Mob DNA. 2012;3:9. doi: 10.1186/1759-8753-3-9.
  • 37. Hirsch CD, Springer NM. Transposable element influences on gene expression in plants. Biochim Biophys Acta. 2016. doi: 10.1016/j.bbagrm.2016.05.010.
  • 38. Elbarbary RA, Lucas BA, Maquat LE. Retrotransposons as regulators of gene expression. Science. 2016;351:aac7247. doi: 10.1126/science.aac7247.
  • 39. Macfarlan TS, et al. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature. 2012;487:57–63. doi: 10.1038/nature11244. This paper and Ref. 18 are seminal studies hinting at the recurrent regulatory co-option of ERVs in mammalian early development.
  • 40. Melé M, et al. Human genomics. The human transcriptome across tissues and individuals. Science. 2015;348:660–665. doi: 10.1126/science.aaa0355.
  • 41. Batut P, Dobin A, Plessy C, Carninci P, Gingeras TR. High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression. Genome Res. 2013;23:169–180. doi: 10.1101/gr.139618.112.
  • 42. Jacques PÉ, Jeyakani J, Bourque G. The majority of primate-specific regulatory sequences are derived from transposable elements. PLoS Genet. 2013;9:e1003504. doi: 10.1371/journal.pgen.1003504.
  • 43. Wang T, et al. Species-specific endogenous retroviruses shape the transcriptional network of the human tumor suppressor protein p53. Proc Natl Acad Sci U S A. 2007;104:18613–18618. doi: 10.1073/pnas.0703637104.
  • 44. Chuong EB, Elde NC, Feschotte C. Regulatory evolution of innate immunity through co-option of endogenous retroviruses. Science. 2016;351:1083–1087. doi: 10.1126/science.aad5497. The first application of the CRISPR-Cas9 system to establish the functional importance of TE-derived cis-regulatory elements, here shown to be co-opted for the human innate immune response.
  • 45. Van Bortle K, Corces VG. The role of chromatin insulators in nuclear architecture and genome function. Curr Opin Genet Dev. 2013;23:212–218. doi: 10.1016/j.gde.2012.11.003.
  • 46. Schmidt D, et al. Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages. Cell. 2012;148:335–348. doi: 10.1016/j.cell.2011.11.058.
  • 47. Crepaldi L, et al. Binding of TFIIIC to sine elements controls the relocation of activity-dependent neuronal genes to transcription factories. PLoS Genet. 2013;9:e1003699. doi: 10.1371/journal.pgen.1003699.
  • 48. Wang J, et al. MIR retrotransposon sequences provide insulators to the human genome. Proc Natl Acad Sci U S A. 2015;112:E4428–37. doi: 10.1073/pnas.1507253112. Refs. 46–48 show that SINEs frequently exhibit properties of architectural elements partitioning the mammalian genome into distinct topological domains.
  • 49. Dixon JR, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380. doi: 10.1038/nature11082.
  • 50. Rao SSP, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021.
  • 51. Cournac A, Koszul R, Mozziconacci J. The 3D folding of metazoan genomes correlates with the association of similar repetitive elements. Nucleic Acids Res. 2016;44:245–255. doi: 10.1093/nar/gkv1292.
  • 52. Mateo L, Ullastres A, Gonzalez J. A transposable element insertion confers xenobiotic resistance in Drosophila. PLoS Genet. 2014;10:e1004560. doi: 10.1371/journal.pgen.1004560. A meticulous investigation of the physiological impact of a TE insertion with adaptive regulatory effects in D. melanogaster.
  • 53. Lippman Z, et al. Role of transposable elements in heterochromatin and epigenetic control. Nature. 2004;430:471–476. doi: 10.1038/nature02651.
  • 54. Rebollo R, et al. Retrotransposon-induced heterochromatin spreading in the mouse revealed by insertional polymorphisms. PLoS Genet. 2011;7:e1002301. doi: 10.1371/journal.pgen.1002301.
  • 55. de Souza FSJ, Franchini LF, Rubinstein M. Exaptation of transposable elements into novel cis-regulatory elements: is the evidence always strong? Mol Biol Evol. 2013;30:1239–1251. doi: 10.1093/molbev/mst045.
  • 56. del Rosario RCH, Rayan NA, Shyam P. Noncoding origins of anthropoid traits and a new null model of transposon functionalization. Genome Res. 2014;24:1469–1484. doi: 10.1101/gr.168963.113.
  • 57. Brosius J, Gould SJ. On ‘genomenclature’: a comprehensive (and respectful) taxonomy for pseudogenes and other ‘junk DNA’. Proc Natl Acad Sci U S A. 1992;89:10706–10710. doi: 10.1073/pnas.89.22.10706.
  • 58. Miller WJ, McDonald JF, Pinsker W. Molecular domestication of mobile elements. Genetica. 1997;100:261–270.
  • 59. Lowe CB, Haussler D. 29 mammalian genomes reveal novel exaptations of mobile elements for likely regulatory functions in the human genome. PLoS One. 2012;7:e43128. doi: 10.1371/journal.pone.0043128.
  • 60. Bejerano G, et al. A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature. 2006;441:87–90. doi: 10.1038/nature04696.
  • 61. Samstein RM, Josefowicz SZ, Arvey A, Treuting PM, Rudensky AY. Extrathymic generation of regulatory T cells in placental mammals mitigates maternal-fetal conflict. Cell. 2012;150:29–38. doi: 10.1016/j.cell.2012.05.031. A careful study tracing the origination of an enhancer that is essential for extrathymic regulatory T cell differentiation to an ancient TE insertion that was co-opted in the common ancestor of placental mammals.
  • 62. Lam DD, et al. Partially redundant enhancers cooperatively maintain Mammalian pomc expression above a critical functional threshold. PLoS Genet. 2015;11:e1004935. doi: 10.1371/journal.pgen.1004935.
  • 63. Lynch VJ, Leclerc RD, May G, Wagner GP. Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nat Genet. 2011;43:1154–1159. doi: 10.1038/ng.917.
  • 64. Notwell JH, Chung T, Heavner W, Bejerano G. A family of transposable elements co-opted into developmental enhancers in the mouse neocortex. Nat Commun. 2015;6:6644. doi: 10.1038/ncomms7644.
  • 65. Nishihara H, et al. Coordinately co-opted multiple transposable elements constitute an enhancer for wnt5a expression in the mammalian secondary palate. PLoS Genet. doi: 10.1371/journal.pgen.1006380.
  • 66. Ahituv N, et al. Deletion of Ultraconserved Elements Yields Viable Mice. PLoS Biol. 2007;5:e234. doi: 10.1371/journal.pbio.0050234.
  • 67. Stone EA, Cooper GM, Sidow A. Trade-offs in detecting evolutionary constrained sequence by comparative genomics. Annu Rev Genomics Hum Genet. 2005;6:143–164. doi: 10.1146/annurev.genom.6.080604.162146.
  • 68. Boffelli D, et al. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science. 2003;299:1391–1394. doi: 10.1126/science.1081331.
  • 69. Weirauch MT, Hughes TR. Conserved expression without conserved regulatory sequence: the more things change, the more they stay the same. Trends Genet. 2010;26:66–74. doi: 10.1016/j.tig.2009.12.002.
  • 70. Kapusta A, Feschotte C. Volatile evolution of long noncoding RNA repertoires: mechanisms and biological implications. Trends Genet. 2014;30:439–452. doi: 10.1016/j.tig.2014.08.004.
  • 71. Ting CN, Rosenberg MP, Snow CM, Samuelson LC, Meisler MH. Endogenous retroviral sequences are required for tissue-specific expression of a human salivary amylase gene. Genes Dev. 1992;6:1457–1465. doi: 10.1101/gad.6.8.1457.
  • 72. Emera D, Wagner GP. Transformation of a transposon into a derived prolactin promoter with function during human pregnancy. Proc Natl Acad Sci U S A. 2012;109:11246–11251. doi: 10.1073/pnas.1118566109.
  • 73. Gerlo S, Davis JRE, Mager DL, Kooijman R. Prolactin in man: a tale of two promoters. Bioessays. 2006;28:1051–1055. doi: 10.1002/bies.20468.
  • 74. Sasaki T, et al. Possible involvement of SINEs in mammalian-specific brain formation. Proc Natl Acad Sci U S A. 2008;105:4220–4225. doi: 10.1073/pnas.0709398105.
  • 75. Flemr M, et al. A retrotransposon-driven dicer isoform directs endogenous small interfering RNA production in mouse oocytes. Cell. 2013;155:807–816. doi: 10.1016/j.cell.2013.10.001. A compelling study using TALEN editing to demonstrate the phenotypic importance of a lineage-specific TE insertion driving the emergence of an alternative protein isoform with a novel function.
  • 76. Ferreira LMR, et al. A distant trophoblast-specific enhancer controls HLA-G expression at the maternal-fetal interface. Proc Natl Acad Sci U S A. 2016;113:5364–5369. doi: 10.1073/pnas.1602886113.
  • 77. Studer A, Zhao Q, Ross-Ibarra J, Doebley J. Identification of a functional transposon insertion in the maize domestication gene tb1. Nat Genet. 2011;43:1160–1163. doi: 10.1038/ng.942.
  • 78. Butelli E, et al. Retrotransposons control fruit-specific, cold-dependent accumulation of anthocyanins in blood oranges. Plant Cell. 2012;24:1242–1255. doi: 10.1105/tpc.111.095232.
  • 79. Sun W, Shen YH, Han MJ, Cao YF, Zhang Z. An adaptive transposable element insertion in the regulatory region of the EO gene in the domesticated silkworm, Bombyx mori. Mol Biol Evol. 2014;31:3302–3313. doi: 10.1093/molbev/msu261.
  • 80. Klütsch CFC, de Caprona MDC. The IGF1 small dog haplotype is derived from Middle Eastern grey wolves: a closer look at statistics, sampling, and the alleged Middle Eastern origin of small dogs. BMC Biol. 2010;8:119. doi: 10.1186/1741-7007-8-119.
  • 81. Clark LA, Wahl JM, Rees CA, Murphy KE. Retrotransposon insertion in SILV is responsible for merle patterning of the domestic dog. Proc Natl Acad Sci U S A. 2006;103:1376–1381. doi: 10.1073/pnas.0506940103.
  • 82. David VA, et al. Endogenous retrovirus insertion in the KIT oncogene determines white and white spotting in domestic cats. G3. 2014;4:1881–1891. doi: 10.1534/g3.114.013425.
  • 83. Van't Hof AE, et al. The industrial melanism mutation in British peppered moths is a transposable element. Nature. 2016;534:102–105. doi: 10.1038/nature17951. A classic example of environmental adaptation traced to a TE insertion with cis-regulatory effect.
  • 84. Swinnen G, Goossens A, Pauwels L. Lessons from Domestication: Targeting Cis-Regulatory Elements for Crop Improvement. Trends Plant Sci. 2016;21:506–515. doi: 10.1016/j.tplants.2016.01.014.
  • 85. Shubin N, Tabin C, Carroll S. Deep homology and the origins of evolutionary novelty. Nature. 2009;457:818–823. doi: 10.1038/nature07891.
  • 86. Sorrells TR, Johnson AD. Making sense of transcription networks. Cell. 2015;161:714–723. doi: 10.1016/j.cell.2015.04.014.
  • 87. Chuong EB, Rumi MAK, Soares MJ, Baker JC. Endogenous retroviruses function as species-specific enhancer elements in the placenta. Nat Genet. 2013;45:325–329. doi: 10.1038/ng.2553.
  • 88. Chuong EB. Retroviruses facilitate the rapid evolution of the mammalian placenta. Bioessays. 2013;35:853–861. doi: 10.1002/bies.201300059.
  • 89. Xie M, et al. DNA hypomethylation within specific transposable element families associates with tissue-specific enhancer landscape. Nat Genet. 2013;45:836–841. doi: 10.1038/ng.2649.
  • 90. Lynch VJ, et al. Ancient transposable elements transformed the uterine regulatory landscape and transcriptome during the evolution of mammalian pregnancy. Cell Rep. 2015;10:551–561. doi: 10.1016/j.celrep.2014.12.052.
  • 91. Dewannieux M, Dupressoir A, Harper F, Pierron G, Heidmann T. Identification of autonomous IAP LTR retrotransposons mobile in mammalian cells. Nat Genet. 2004;36:534–539. doi: 10.1038/ng1353.
  • 92. Brady T, et al. Integration target site selection by a resurrected human endogenous retrovirus. Genes Dev. 2009;23:633–642. doi: 10.1101/gad.1762309.
  • 93. Fontanillas P, Hartl DL, Reuter M. Genome organization and gene expression shape the transposable element distribution in the Drosophila melanogaster euchromatin. PLoS Genet. 2007;3:e210. doi: 10.1371/journal.pgen.0030210.
  • 94. McVicker G, Green P. Genomic signatures of germline gene expression. Genome Res. 2010;20:1503–1511. doi: 10.1101/gr.106666.110.
  • 95. Zhang Y, Ying Z, Mager DL. Gene Properties and Chromatin State Influence the Accumulation of Transposable Elements in Genes. PLoS One. 2012;7:e30158. doi: 10.1371/journal.pone.0030158.
  • 96. Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004;429:268–274. doi: 10.1038/nature02536.
  • 97. Sironi M, et al. Gene function and expression level influence the insertion/fixation dynamics of distinct transposon families in mammalian introns. Genome Biol. 2006;7:R120. doi: 10.1186/gb-2006-7-12-r120.
  • 98. Simons C, Pheasant M, Makunin IV, Mattick JS. Transposon-free regions in mammalian genomes. Genome Res. 2006;16:164–172. doi: 10.1101/gr.4624306.
  • 99. Kvikstad EM, Makova KD. The (r)evolution of SINE versus LINE distributions in primate genomes: sex chromosomes are important. Genome Res. 2010;20:600–613. doi: 10.1101/gr.099044.109.
  • 100. Zhang Y, Romanish MT, Mager DL. Distributions of transposable elements reveal hazardous zones in mammalian introns. PLoS Comput Biol. 2011;7:e1002046. doi: 10.1371/journal.pcbi.1002046.
  • 101. Nellåker C, et al. The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Genome Biol. 2012;13:R45. doi: 10.1186/gb-2012-13-6-r45.
  • 102. Campos-Sánchez R, Cremona MA, Pini A, Chiaromonte F, Makova KD. Integration and Fixation Preferences of Human and Mouse Endogenous Retroviruses Uncovered with Functional Data Analysis. PLoS Comput Biol. 2016;12:e1004956. doi: 10.1371/journal.pcbi.1004956.
  • 103. Ellison CE, Bachtrog D. Dosage compensation via transposable element mediated rewiring of a regulatory network. Science. 2013;342:846–850. doi: 10.1126/science.1239552.
  • 104.Lyon MF. X-chromosome inactivation: a repeat hypothesis. Cytogenet Cell Genet. 1998;80:133–137. doi: 10.1159/000014969. [DOI] [PubMed] [Google Scholar]
  • 105.Chow JC, et al. LINE-1 activity in facultative heterochromatin formation during X chromosome inactivation. Cell. 2010;141:956–969. doi: 10.1016/j.cell.2010.04.042. [DOI] [PubMed] [Google Scholar]
  • 106.Hancks DC, Kazazian HH., Jr Roles for retrotransposon insertions in human disease. Mob DNA. 2016;7:9. doi: 10.1186/s13100-016-0065-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107. Belancio VP, Roy-Engel AM, Deininger PL. All y'all need to know 'bout retroelements in cancer. Semin Cancer Biol. 2010;20:200–210. doi: 10.1016/j.semcancer.2010.06.001.
  • 108. Albert FW, Kruglyak L. The role of regulatory variation in complex traits and disease. Nat Rev Genet. 2015;16:197–212. doi: 10.1038/nrg3891.
  • 109. Sudmant PH, et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526:75–81. doi: 10.1038/nature15394.
  • 110. Sekar A, et al. Schizophrenia risk from complex variation of complement component 4. Nature. 2016;530:177–183. doi: 10.1038/nature16549.
  • 111. Leboyer M, Tamouza R, Charron D, Faucard R, Perron H. Human endogenous retrovirus type W (HERV-W) in schizophrenia: a new avenue of research at the gene-environment interface. World J Biol Psychiatry. 2013;14:80–90. doi: 10.3109/15622975.2010.601760.
  • 112. Rodić N, et al. Long interspersed element-1 protein expression is a hallmark of many human cancers. Am J Pathol. 2014;184:1280–1286. doi: 10.1016/j.ajpath.2014.01.007.
  • 113. Kassiotis G. Endogenous retroviruses and the development of cancer. J Immunol. 2014;192:1343–1349. doi: 10.4049/jimmunol.1302972.
  • 114. van der Kuyl AC. HIV infection and HERV expression: a review. Retrovirology. 2012;9:6. doi: 10.1186/1742-4690-9-6.
  • 115. Mourier T, Nielsen LP, Hansen AJ, Willerslev E. Transposable elements in cancer as a by-product of stress-induced evolvability. Front Genet. 2014;5:156. doi: 10.3389/fgene.2014.00156.
  • 116. De Cecco M, et al. Transposable elements become active and mobile in the genomes of aging mammalian somatic tissues. Aging. 2013;5:867–883. doi: 10.18632/aging.100621.
  • 117. Van Meter M, et al. SIRT6 represses LINE1 retrotransposons by ribosylating KAP1 but this repression fails with stress and age. Nat Commun. 2014;5:5011. doi: 10.1038/ncomms6011.
  • 118. Wood JG, et al. Chromatin-modifying genetic interventions suppress age-associated transposable element activation and extend life span in Drosophila. Proc Natl Acad Sci U S A. 2016 doi: 10.1073/pnas.1604621113.
  • 119. Rodić N, Burns KH. Long Interspersed Element–1 (LINE-1): Passenger or Driver in Human Neoplasms? PLoS Genet. 2013;9:e1003402. doi: 10.1371/journal.pgen.1003402.
  • 120. Magiorkinis G, Belshaw R, Katzourakis A. 'There and back again': revisiting the pathophysiological roles of human endogenous retroviruses in the post-genomic era. Philos Trans R Soc Lond B Biol Sci. 2013;368:20120504. doi: 10.1098/rstb.2012.0504.
  • 121. Lee E, et al. Landscape of somatic retrotransposition in human cancers. Science. 2012;337:967–971. doi: 10.1126/science.1222077.
  • 122. Rodić N, et al. Retrotransposon insertions in the clonal evolution of pancreatic ductal adenocarcinoma. Nat Med. 2015;21:1060–1064. doi: 10.1038/nm.3919.
  • 123. Doucet-O'Hare TT, et al. LINE-1 expression and retrotransposition in Barrett's esophagus and esophageal carcinoma. Proc Natl Acad Sci U S A. 2015;112:E4894–E4900. doi: 10.1073/pnas.1502474112.
  • 124. Shukla R, et al. Endogenous retrotransposition activates oncogenic pathways in hepatocellular carcinoma. Cell. 2013;153:101–111. doi: 10.1016/j.cell.2013.02.032.
  • 125. Scott EC, et al. A hot L1 retrotransposon evades somatic repression and initiates human colorectal cancer. Genome Res. 2016;26:745–755. doi: 10.1101/gr.201814.115.
  • 126. Gasior SL, Wakeman TP, Xu B, Deininger PL. The human LINE-1 retrotransposon creates DNA double-strand breaks. J Mol Biol. 2006;357:1383–1393. doi: 10.1016/j.jmb.2006.01.089.
  • 127. Li W, et al. Human endogenous retrovirus-K contributes to motor neuron disease. Sci Transl Med. 2015;7:307ra153. doi: 10.1126/scitranslmed.aac8201.
  • 128. Duperray A, et al. Inflammatory response of endothelial cells to a human endogenous retrovirus associated with multiple sclerosis is mediated by TLR4. Int Immunol. 2015;27:545–553. doi: 10.1093/intimm/dxv025.
  • 129. Kremer D, et al. Human endogenous retrovirus type W envelope protein inhibits oligodendroglial precursor cell differentiation. Ann Neurol. 2013;74:721–732. doi: 10.1002/ana.23970.
  • 130. Conrad B, et al. A Human Endogenous Retroviral Superantigen as Candidate Autoimmune Gene in Type I Diabetes. Cell. 1997;90:303–313. doi: 10.1016/s0092-8674(00)80338-4.
  • 131. Yu P. The potential role of retroviruses in autoimmunity. Immunol Rev. 2015;269:85–99. doi: 10.1111/imr.12371.
  • 132. Hurst TP, Magiorkinis G. Activation of the innate immune response by endogenous retroviruses. J Gen Virol. 2015;96:1207–1218. doi: 10.1099/jgv.0.000017.
  • 133. Volkman HE, Stetson DB. The enemy within: endogenous retroelements and autoimmune disease. Nat Immunol. 2014;15:415–422. doi: 10.1038/ni.2872.
  • 134. Chiappinelli KB, et al. Inhibiting DNA Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses. Cell. 2016;164:1073. doi: 10.1016/j.cell.2015.10.020.
  • 135. Lamprecht B, et al. Derepression of an endogenous long terminal repeat activates the CSF1R proto-oncogene in human lymphoma. Nat Med. 2010;16:571–579. doi: 10.1038/nm.2129.
  • 136. Lock FE, et al. Distinct isoform of FABP7 revealed by screening for retroelement-activated genes in diffuse large B-cell lymphoma. Proc Natl Acad Sci U S A. 2014;111:E3534–43. doi: 10.1073/pnas.1405507111.
  • 137. Babaian A, et al. Onco-exaptation of an endogenous retroviral LTR drives IRF5 expression in Hodgkin lymphoma. Oncogene. 2016;35:2542–2546. doi: 10.1038/onc.2015.308. Refs 135–137 provide clear instances where the derepression of LTR elements leads to the activation of proto-oncogenes in human lymphoma.
  • 138. Babaian A, Mager DL. Endogenous retroviral promoter exaptation in human cancer. Mobile DNA (in press) doi: 10.1186/s13100-016-0080-x.
  • 139. Kelley D, Rinn J. Transposable elements reveal a stem cell-specific class of long noncoding RNAs. Genome Biol. 2012;13:R107. doi: 10.1186/gb-2012-13-11-r107.
  • 140. Kapusta A, et al. Transposable elements are major contributors to the origin, diversification, and regulation of vertebrate long noncoding RNAs. PLoS Genet. 2013;9:e1003470. doi: 10.1371/journal.pgen.1003470.
  • 141. Hashimoto K, et al. CAGE profiling of ncRNAs in hepatocellular carcinoma reveals widespread activation of retroviral LTR promoters in virus-induced tumors. Genome Res. 2015;25:1812–1824. doi: 10.1101/gr.191031.115.
  • 142. Flockhart RJ, et al. BRAFV600E remodels the melanocyte transcriptome and induces BANCR to regulate melanoma cell migration. Genome Res. 2012;22:1006–1014. doi: 10.1101/gr.140061.112.
  • 143. Gibb EA, et al. Activation of an endogenous retrovirus-associated long non-coding RNA in human adenocarcinoma. Genome Med. 2015;7:22. doi: 10.1186/s13073-015-0142-6.
  • 144. Ong-Abdullah M, et al. Loss of Karma transposon methylation underlies the mantled somaclonal variant of oil palm. Nature. 2015;525:533–537. doi: 10.1038/nature15365. A striking case of epigenetic derepression of a TE associated with a deleterious phenotype that has long plagued the oil palm industry.
  • 145. Man SM, et al. Critical Role for the DNA Sensor AIM2 in Stem Cell Proliferation and Cancer. Cell. 2015;162:45–58. doi: 10.1016/j.cell.2015.06.001.
  • 146. Man SM, Karki R, Kanneganti TD. AIM2 inflammasome in infection, cancer, and autoimmunity: Role in DNA sensing, inflammation, and innate immunity. Eur J Immunol. 2016;46:269–280. doi: 10.1002/eji.201545839.
  • 147. Woerner SM, et al. The putative tumor suppressor AIM2 is frequently affected by different genetic alterations in microsatellite unstable colon cancers. Genes Chromosomes Cancer. 2007;46:1080–1089. doi: 10.1002/gcc.20493.
  • 148. Villar D, et al. Enhancer evolution across 20 mammalian species. Cell. 2015;160:554–566. doi: 10.1016/j.cell.2015.01.006.
  • 149. Emera D, Yin J, Reilly SK, Gockley J, Noonan JP. Origin and evolution of developmental enhancers in the mammalian neocortex. Proc Natl Acad Sci U S A. 2016;113:E2617–26. doi: 10.1073/pnas.1603718113.
  • 150. Casacuberta E, González J. The impact of transposable elements in environmental adaptation. Mol Ecol. 2013;22:1503–1517. doi: 10.1111/mec.12170.
  • 151. Quadrana L, et al. The Arabidopsis thaliana mobilome and its impact at the species level. Elife. 2016;5 doi: 10.7554/eLife.15716.
  • 152. Feng G, Leem YE, Levin HL. Transposon integration enhances expression of stress response genes. Nucleic Acids Res. 2013;41:775–789. doi: 10.1093/nar/gks1185.
  • 153. Barroso-Batista J, et al. The first steps of adaptation of Escherichia coli to the gut are dominated by soft sweeps. PLoS Genet. 2014;10:e1004182. doi: 10.1371/journal.pgen.1004182.
  • 154. Li W, et al. Activation of transposable elements during aging and neuronal decline in Drosophila. Nat Neurosci. 2013;16:529–531. doi: 10.1038/nn.3368.
  • 155. Reilly MT, Faulkner GJ, Dubnau J, Ponomarev I, Gage FH. The role of transposable elements in health and diseases of the central nervous system. J Neurosci. 2013;33:17577–17586. doi: 10.1523/JNEUROSCI.3369-13.2013.
  • 156. Göke J, et al. Dynamic transcription of distinct classes of endogenous retroviral elements marks specific populations of early human embryonic cells. Cell Stem Cell. 2015;16:135–141. doi: 10.1016/j.stem.2015.01.005.
  • 157. Grow EJ, et al. Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells. Nature. 2015;522:221–225. doi: 10.1038/nature14308. This study reports that the HERV-K family is precisely activated during early human development and expresses proteins that promote the formation of retroviral particles and potentially modulate embryonic gene expression.
  • 158. Fort A, et al. Deep transcriptome profiling of mammalian stem cells supports a regulatory role for retrotransposons in pluripotency maintenance. Nat Genet. 2014;46:558–566. doi: 10.1038/ng.2965.
  • 159. Wang J, et al. Primate-specific endogenous retrovirus-driven transcription defines naive-like stem cells. Nature. 2014;516:405–409. doi: 10.1038/nature13804.
  • 160. Santoni FA, Guerra J, Luban J. HERV-H RNA is abundant in human embryonic stem cells and a precise marker for pluripotency. Retrovirology. 2012;9:111. doi: 10.1186/1742-4690-9-111.
  • 161. Ohnuki M, et al. Dynamic regulation of human endogenous retroviruses mediates factor-induced reprogramming and differentiation potential. Proc Natl Acad Sci U S A. 2014;111:12426–12431. doi: 10.1073/pnas.1413299111.
  • 162. Loewer S, et al. Large intergenic non-coding RNA-RoR modulates reprogramming of human induced pluripotent stem cells. Nat Genet. 2010;42:1113–1117. doi: 10.1038/ng.710.
  • 163. Wang Y, et al. Endogenous miRNA sponge lincRNA-RoR regulates Oct4, Nanog, and Sox2 in human embryonic stem cell self-renewal. Dev Cell. 2013;25:69–80. doi: 10.1016/j.devcel.2013.03.002.
  • 164. Lu X, et al. The retrovirus HERVH is a long noncoding RNA required for human embryonic stem cell identity. Nat Struct Mol Biol. 2014;21:423–425. doi: 10.1038/nsmb.2799.
  • 165. Durruthy-Durruthy J, et al. The primate-specific noncoding RNA HPAT5 regulates pluripotency during human preimplantation development and nuclear reprogramming. Nat Genet. 2015;48:44–52. doi: 10.1038/ng.3449.
  • 166. Izsvák Z, Wang J, Singh M, Mager DL, Hurst LD. Pluripotency and the endogenous retrovirus HERVH: Conflict or serendipity? Bioessays. 2016;38:109–117. doi: 10.1002/bies.201500096.
  • 167. Casola C, Hucks D, Feschotte C. Convergent Domestication of pogo-like Transposases into Centromere-Binding Proteins in Fission Yeast and Mammals. Mol Biol Evol. 2007;25:29–41. doi: 10.1093/molbev/msm221.
  • 168. Cornelis G, et al. Retroviral envelope gene captures and syncytin exaptation for placentation in marsupials. Proc Natl Acad Sci U S A. 2015;112:E487–96. doi: 10.1073/pnas.1417000112.
  • 169. Gong C, Maquat LE. lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3′ UTRs via Alu elements. Nature. 2011;470:284–288. doi: 10.1038/nature09701.
  • 170. Hu SB, et al. Protein arginine methyltransferase CARM1 attenuates the paraspeckle-mediated nuclear retention of mRNAs containing IRAlus. Genes Dev. 2015;29:630–645. doi: 10.1101/gad.257048.114.
  • 171. Shen S, et al. Widespread establishment and regulatory impact of Alu exons in human genes. Proc Natl Acad Sci U S A. 2011;108:2837–2842. doi: 10.1073/pnas.1012834108.
  • 172. Roberts JT, Cardin SE, Borchert GM. Burgeoning evidence indicates that microRNAs were initially formed from transposable element sequences. Mob Genet Elements. 2014;4:e29255. doi: 10.4161/mge.29255.
  • 173. Zhang XO, et al. Complementary sequence-mediated exon circularization. Cell. 2014;159:134–147. doi: 10.1016/j.cell.2014.09.001.
  • 174. Liang D, Wilusz JE. Short intronic repeat sequences facilitate circular RNA production. Genes Dev. 2014;28:2233–2247. doi: 10.1101/gad.251926.114.
  • 175. Johnson R, Guigó R. The RIDL hypothesis: transposable elements as functional domains of long noncoding RNAs. RNA. 2014;20:959–976. doi: 10.1261/rna.044560.114.
  • 176. Spengler RM, Oakley CK, Davidson BL. Functional microRNAs and target sites are created by lineage-specific transposition. Hum Mol Genet. 2014;23:1783–1793. doi: 10.1093/hmg/ddt569.
  • 177. Creasey KM, et al. miRNAs trigger widespread epigenetically activated siRNAs from transposons in Arabidopsis. Nature. 2014;508:411–415. doi: 10.1038/nature13069.
  • 178. Kelley DR, Hendrickson DG, Tenen D, Rinn JL. Transposable elements modulate human RNA abundance and splicing via specific RNA-protein interactions. Genome Biol. 2014;15:537. doi: 10.1186/s13059-014-0537-5.
  • 179. González J, Lenkov K, Lipatov M, Macpherson JM, Petrov DA. High Rate of Recent Transposable Element-Induced Adaptation in Drosophila melanogaster. PLoS Biol. 2008;6:e251. doi: 10.1371/journal.pbio.0060251.
  • 180. Ellison CE, Bachtrog D. Non-allelic gene conversion enables rapid evolutionary change at multiple regulatory sites encoded by transposable elements. Elife. 2015;4 doi: 10.7554/eLife.05899. This study reveals a gene conversion-like process by which cis-regulatory sequences derived from related TE copies may be rapidly ameliorated to fine-tune their binding affinity for a trans-regulatory protein.
  • 181. Romanish MT, Lock WM, van de Lagemaat LN, Dunn CA, Mager DL. Repeated Recruitment of LTR Retrotransposons as Promoters by the Anti-Apoptotic Locus NAIP during Mammalian Evolution. PLoS Genet. 2007;3:e10. doi: 10.1371/journal.pgen.0030010.
  • 182. Tuan D, Pi W. In Human Beta-Globin Gene Locus, ERV-9 LTR Retrotransposon Interacts with and Activates Beta-but Not Gamma-Globin Gene. Blood. 2014;124:2686–2686.
  • 183. Lunyak VV, et al. Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science. 2007;317:248–251. doi: 10.1126/science.1140871.
  • 184. Emera D, et al. Convergent evolution of endometrial prolactin expression in primates, mice, and elephants through the independent recruitment of transposable elements. Mol Biol Evol. 2012;29:239–247. doi: 10.1093/molbev/msr189.
  • 185. Chung H, et al. Cis-regulatory elements in the Accord retrotransposon result in tissue-specific expression of the Drosophila melanogaster insecticide resistance gene Cyp6g1. Genetics. 2007;175:1071–1077. doi: 10.1534/genetics.106.066597.
  • 186. Guio L, Barrón MG, González J. The transposable element Bari-Jheh mediates oxidative stress response in Drosophila. Mol Ecol. 2014;23:2020–2030. doi: 10.1111/mec.12711.
  • 187. Ding Y, et al. Natural courtship song variation caused by an intronic retroelement in an ion channel gene. Nature. 2016;536:329–332. doi: 10.1038/nature19093. This study employs an elegant mix of genetic mapping and genome editing to show that a polymorphic TE insertion within an intron of an ion channel gene is responsible for driving courtship song variation in Drosophila simulans.
