Author manuscript; available in PMC: 2015 Aug 1.
Published in final edited form as: Genomics. 2014 Jul 17;104(2):96–104. doi: 10.1016/j.ygeno.2014.07.001

De novo LINE-1 Retrotransposition in HepG2 Cells Preferentially Targets Gene Poor Regions of Chromosome 13

Pasano Bojang Jr 1, Mark Anderton 2, Ruth Roberts 2, Kenneth S Ramos 1,3
PMCID: PMC4157570  NIHMSID: NIHMS615020  PMID: 25043885

Abstract

Long interspersed nuclear elements (Line-1 or L1s) account for ~17% of the human genome. While the majority of human L1s are inactive, ~80–100 elements remain retrotransposition competent and mobilize through RNA intermediates to different locations within the genome. De novo insertions of L1s account for polymorphic variation of the human genome and disruption of target loci at their new location. In the present study, fluorescence in situ hybridization and DNA sequencing were used to characterize retrotransposition profiles of L1RP in cultured human HepG2 cells. While expression of synthetic L1RP was associated with full-length and truncated insertions throughout the entire genome, a strong preference for gene-poor regions, such as those found in chromosome 13, was observed for full-length insertions. These findings shed light on L1 targeting mechanisms within the human genome and question the putative randomness of L1 retrotransposition.

Keywords: Long Interspersed Nuclear Element-1(Line-1/L1), Retrotransposons, HepG2, Fluorescence in situ hybridization, chromosome 13

INTRODUCTION

A full-length human Long Interspersed Nuclear Element-1 (Line-1 or L1) is approximately 6 kb and contains four major components: a 5′ untranslated region (5′UTR), two open reading frames (ORFs) separated by a 63 bp inter-ORF region, and a 3′UTR with a poly A tail and signal [1, 2]. The L1 5′UTR is 907 bp long and contains an internal promoter that harbors several transcription factor binding sites, including those for Yin Yang-1 (YY1), Sox11, E2F and RUNX3 [3–7]. ORF1 encodes a ~40 kDa protein with RNA binding and nucleic acid chaperone activities [8]. ORF2 encodes a ~150 kDa protein with endonuclease (EN) and reverse transcriptase (RT) activities, and a zinc finger domain (ZF) believed to mediate ORF2-DNA interactions [9–11]. L1s have been shown to mobilize through target-primed reverse transcription (TPRT), also known as the “copy and paste” mechanism, although alternate mechanisms are known to exist [12, 13].

Full-length L1 mRNA is transcribed from its internal promoter by RNA polymerase II/III and exported to the cytoplasm [3]. Upon translation, ORF1p and ORF2p exhibit cis-preference and bind their encoding mRNA to form a ribonucleoprotein particle (RNP) [14, 15]. The L1 RNP translocates into the nucleus and nicks a single strand of genomic DNA to expose a 3′-OH group, which is then used by the RT domain of ORF2 to prime and synthesize the first strand of L1 cDNA [16, 17]. In the majority of cases, ORF2 falls off the template before reaching the 5′ end due to its non-processive nature and the presence of premature cryptic polyadenylation sites in the ORF2 cDNA sequence [18]. This explains the overwhelming presence of truncated L1s littered throughout the genome. For example, there are ~516,000 copies of L1 in the human genome, with only 80–100 estimated to be full-length and retrotransposition competent [19, 20]. The RNP of L1 can also act in trans to mobilize short interspersed elements (SINEs), such as Alu sequences, noncoding RNAs such as U6 snRNA, and some cellular mRNAs, leading to the formation of processed pseudogenes [21–24]. The integration of L1 sequences near genes can modulate their expression, induce alternative splicing, reshuffle the genome causing inter-individual genetic variations and/or lead to epigenetic dysregulation at the insertion site [25, 26, 27]. Within this context, we recently showed that forced expression of L1RP, an active L1 isolated from exon 1 of the retinitis pigmentosa gene (RP) of a patient with X-linked retinitis pigmentosa [28, 29], modulates genetic networks involved in the regulation of inflammation, adhesion and cellular metabolism in HepG2 cells [30]. The HepG2 cell line is frequently used because it retains high functional activity of liver-specific genes [31, 32]. Both wildtype L1RP and its RT domain mutant (i.e. D702Y) induce epithelial to mesenchymal transition (EMT) in HepG2 cells, confirming a biological role for L1RP that does not involve retrotransposition [30].

To further evaluate molecular mechanisms of L1 mobilization and genetic reprogramming within the HepG2 genome, a follow-up study was conducted to characterize the retrotransposition profiles of L1RP. Analysis of L1 retrotransposition in HepG2 cells by fluorescence in-situ hybridization (FISH), an approach previously used to localize genomic consensus sequences of L1 ORF2 [30], coupled with DNA sequencing, established a strong preference for insertion into gene-poor regions of chromosome 13. These findings establish for the first time that L1 insertions are not entirely random events as originally proposed, and raise questions about the biology and molecular mechanism involved in the regulation of L1 retrotransposition.

MATERIALS AND METHODS

Cloning, Western Blotting, RT-PCR and Indirect Immunofluorescence

Cloning of vectors pB001CTR (vector backbone or CTR), pB016MUT (Aspartate (D) to Tyrosine (Y) mutant at position 702 (D702Y)) and pB015WT (Wildtype or WT) was done as described in Bojang et al. 2013 [30]. These vectors were used to generate stably transfected HepG2 cell lines used to monitor retrotransposition activity of L1RP. Total protein was extracted using the m-PER reagent (Thermo Scientific, Rockford, IL) and L1 ORF1-HASTREP and ORF2-FlAGMYC proteins were detected using antibodies against Strep (Strep (S10D4) sc-52234, Santa Cruz, CA) and Flag (Anti-Flag (F1804): Sigma, St. Louis, MO) tags, respectively. RNA was extracted using the RNeasy Mini kit (Qiagen, Maryland, cat# 74104) according to the manufacturer’s instructions. L1 ORF1 and ORF2 mRNA was measured with primers specific for just the transfected L1 ORF1 (L1-ORF1exo-1F: 5’-GAAGGAAGCGCTAAACATGG-3’ and L1-ORF1exo-1R: 5’-TGGGACGTCGTATGGGTATT-3’) and ORF2 (L1-ORF2exo-1F: 5’-TGAAATTGGAAACCATCATTCTC-3’ and L1-ORF2exo-1R: 5’-CCTTGTCATCGTCATCCTTGT-3’), and the ΔΔCT method was used to calculate relative levels of message in wildtype and control cells. Indirect immunofluorescence was done as described in Bojang et al., 2013 [30] with antibodies against the HA tag (i.e. Anti-HA-tag (6E2) Mouse mAB-Alexa-594 conjugated antibody) of ORF1 and the Flag tag (i.e. Anti-flag tag M2-Alexa Fluor-488 conjugated antibody) of ORF2 (Cell Signaling Technology Inc., Danvers, MA). Images were analyzed using the Axiovert Inverted microscope at 63× magnification.
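The ΔΔCT calculation referred to above can be sketched as follows. This is a minimal illustration only: it assumes roughly 100% amplification efficiency and normalization to a reference (housekeeping) transcript, neither of which is specified in the text, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the 2^-ddCt method.

    Assumes ~100% PCR efficiency (one doubling per cycle); the reference
    transcript used for normalization is an assumption of this sketch.
    """
    # Normalize the target Ct to the reference Ct within each condition
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Difference between conditions, then convert cycles to fold change
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only
fold = relative_expression(22.0, 18.0, 27.0, 18.0)
print(fold)
```

A target that crosses threshold five cycles earlier in the sample than in the control, after normalization, corresponds to a 2^5 fold higher message level under these assumptions.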

Degenerate Oligonucleotide Primed Polymerase Chain Reaction (DOP-PCR) and FISH

Two pairs of degenerate primers were used to amplify the L1 5’-UTR (907bp) (UTR-1f: CAAGATGGCCGAATAGGAAC and UTR-1r: TACTTTTGGTCTTTGATGATGGTG) and the spliced 1kb neomycin gene (Neo-1f: GGATAGCATTGGGAGATATACCT and Neo-1r: ATTGAACAAGATGGATTGCACGC). The PCR fragment of the spliced neomycin gene was labeled using Degenerate Oligonucleotide Primed PCR (DOP-PCR) as described in Bojang et al., 2013 [30]. After initial PCR of the L1-5’UTR, 2µg of the L1-5’UTR was chemically labeled with CY3 using the MIRUS FISH labeling kit (Cat # MIR 6510) at 37°C for 1 hour. Both PCR products were visualized on a 1% agarose gel, purified and quantitated using the NanoDrop. FISH analysis was done as described in Bojang et al., 2013 [30].

Characterization of L1 Insertions by Inverse PCR and DNA Sequencing

Genomic DNA (gDNA) was isolated from HepG2 cells stably expressing wildtype L1 and 500 ng was digested with MluI. gDNA was then phenol/chloroform extracted, ethanol precipitated, and ligated using T4 DNA ligase. DNA rings containing reverse transcribed and mobilized L1 sequences were isolated using primers specific for the neomycin gene (Inverse-Neo-1f: AGTGACAACGTCGAGCACAG; Inverse-Neo-1r: ATCAGGACATAGCGTTGGCT). Amplicons were gel purified and cloned into pCR2.1 TOPO TA (Invitrogen). Vectors were digested with EcoRI to confirm insertions and each clone was sequenced using M13 forward and reverse primers. DNA sequences were blasted against the NCBI and UCSC Blat genome browser databases to identify L1 insertion sites.
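The logic of the inverse PCR step can be illustrated in silico. The sketch below is a simplified, single-stranded toy model, not the authors' pipeline: it cuts a linear sequence at MluI sites (A^CGCGT), treats the fragment carrying a known marker as a self-ligated circle, and reads the unknown flanking sequence outward from the marker. The toy sequences, the marker, and the function names are hypothetical, and the model assumes the marker does not span the ligation junction.

```python
MLUI_SITE = "ACGCGT"  # MluI recognition sequence; cut falls after the first A


def digest(seq, site=MLUI_SITE, cut_offset=1):
    """Cut a linear sequence at every occurrence of `site`; return fragments."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def flanking_from_circle(fragment, known):
    """Treat `fragment` as a self-ligated circle and return everything
    outside `known`, read from the end of `known` around to its start."""
    i = fragment.find(known)
    if i == -1:
        return None  # this fragment does not carry the marker
    return fragment[i + len(known):] + fragment[:i]


# Toy example: "GATTACA" stands in for the known neomycin sequence
genome = "CCCACGCGTTTAAGATTACAGGCCACGCGTAAA"
known = "GATTACA"

fragments = digest(genome)
carrier = next(f for f in fragments if known in f)
flank = flanking_from_circle(carrier, known)
print(fragments)
print(flank)
```

In this toy case the ligation junction regenerates the MluI site inside the recovered flank, which is what marks the boundary between the two genomic sides of the insertion in a real inverse PCR product.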

RESULTS

Ectopic L1 undergoes complete cycles of retrotransposition in cultured HepG2 cells

We have previously shown that L1RP is expressed in HepG2 cells and remains retrotransposition competent after serial passage [30]. Here, we further characterized the expression profiles of L1 ORF1 and ORF2 proteins in HepG2 cells and evaluated retrotransposition profiles after extended culture. Figure 1A shows a schematic of the L1RP wildtype vector used in our studies, and described earlier in Bojang et al. 2013 [30, 34]. RT-PCR (Figure 1B) and Western experiments (Figure 1C) confirmed that L1 ORF1 and ORF2 transcripts and proteins are readily detected in HepG2 cells transfected with L1RP, but not control plasmid. Measurements of integration and expression of the final spliced neomycin gene product into genomic DNA isolated from control (CTR), D702Y mutant (D702Y) and wildtype (WT) HepG2 cells showed that the un-spliced and spliced forms of the neomycin gene are only detected in wildtype cells (Figure 1D). In keeping with this observation, a 1 kb neomycin gene product is only amplified from the cDNA of wildtype clones (Figure 1E). Figure 2 shows that the rate of retrotransposition is highly variable in different nuclei isolated from different clones of stably transfected HepG2 cells. Together, these data indicate that ectopic L1 undergoes complete cycles of retrotransposition in HepG2 cells, with rates that vary among different nuclei.

Figure 1. Expression of ectopic L1 proteins in HepG2 cells.

A. Schematic diagram of L1 vectors used to examine the expression and retrotransposition of L1RP within the HepG2 genome. pB001CTR (CTR) is the vector backbone, while pB015WT (WT) consists of L1 ORF1 tagged with Strep and HA (green), ORF2 tagged with Myc and Flag (red) and a neomycin cassette placed in opposite orientation. The schematic also shows the retrotransposition of ectopic L1 leading to full-length or truncated insertions into the HepG2 genome. B. RT-PCR analysis of L1 ORF1 and ORF2 using primers specific for ectopic L1. The location of the primer sets is indicated in Figure 1A. C. Western blot of ORF1 and ORF2 proteins with antibodies directed against Strep and Flag, respectively, detected ~40 kDa and ~150 kDa bands which are absent in cells transfected with control plasmid. D. Detection of spliced neomycin gene from gDNA and cDNA of HepG2 cells indicates the retrotransposition of ectopic L1. The spliced neomycin gene is absent from the cDNA of cells expressing control plasmid. E. Degenerate Oligonucleotide Primed PCR (DOP-PCR) product of neomycin gene from HepG2 genomic DNA. The results confirmed the increased size of the biotin-dTTP labeled (L) probe compared to unlabeled probe (Un).

Figure 2.

Analyses of retrotransposition rates in nuclei from different clones of stably transfected HepG2 cells. FISH was completed to evaluate L1 retrotransposition rates in individual nuclei. Column 1 shows chromosome spreads stained with DAPI, column 2 shows the neomycin probe stained with FITC/CY3 and column 3 shows the merged signals. Differences in neomycin staining indicate that L1 retrotransposition rates are specific for individual nuclei.

Ectopic full-length and truncated L1 insertions are detected in the genome of HepG2 cells

The non-processive nature of L1 RT, the presence of cryptic polyadenylation sites in L1-ORF2 sequence, and the mode of translation of L1-ORF2 often lead to insertion of 5’UTR truncated L1 sequences [20]. As such, we sought to track the integration of full-length versus truncated L1 insertions by FISH. Two unique probes, L1RP-5’UTR and spliced neomycin gene probes were designed to distinguish full-length and truncated L1 sequences. Each probe was labeled with either biotin-dTTP or CY3. An increase in the apparent size of labeled probes indicated the incorporation of biotin-dTTP or CY3, respectively (Figure 3A). These two probes were then used to track full-length and truncated L1RP insertions in cultured HepG2 cells (Figure 3B), and to quantify the number of insertions. In our studies, only spreads with more than one insertion were counted, as this was judged to represent true, active retrotransposition events as opposed to stable integration. Arrows denote staining of the neomycin probe as an index of L1 retrotransposition. A total of 15 spreads was examined for the L1-5’UTR and 11 spreads for the neomycin gene, with 40 to 67 individual spots counted for each probe (Table 1). The results identified an average of 2.67 full-length insertions compared to 6.10 truncated insertions, confirming the assertion that the majority of L1 insertions are truncated at the 5’-UTR (Table 1). To further distinguish truncated from full length insertions, we analyzed the expression of L1 ORF1/2 by indirect immunofluorescence (Supplementary Figure 1a). We reasoned that cells with truncated 5’L1 should lose the expression of both proteins and as expected, some populations of wildtype cells lacked expression of both L1 ORF1 and ORF2 (Supplementary Figure 1a, compare columns 1 and 2 to columns 3 and 4).

Figure 3. Detection and quantification of full-length and truncated L1 insertions.

A. CY3 chemically-labeled L1RP-5’UTR using the MIRUS FISH labeling kit. Results showed a faint band that is larger in size compared to the unlabeled probe. The increase indicates the incorporation of CY3, while faintness is an indication that UV emission at ~300 nm is not optimal for detection of fluorescent labeled L1RP-5’UTR (Top). DOP-PCR product of neomycin gene from HepG2 genomic DNA showing the increase in size of the biotin-dTTP labeled (L) probe compared to unlabeled probe (Un). B. FISH analysis with L1RP-5’UTR (top) and neomycin (bottom) probes. The data indicate that retrotransposition of ectopic L1 can be readily assayed using FISH analysis. C. Dual FISH analysis of L1RP using the L1RP-5’UTR and neomycin probes. Column 1 shows DAPI staining (blue) of chromosomes, column 2 shows FITC staining of the neomycin probe (green), column 3 shows CY3 staining of the L1RP-5’UTR probe (red), column 4 shows matched CY3 and FITC staining and column 5 shows the merged staining for all dyes. Colocalization of FITC and CY3 (red arrow) indicates full-length L1RP insertions, while single FITC staining indicates truncated insertions. Scale bar is 10µm.

Table 1.

PROBE              CRITERIA         NUMBER OF SPREADS   SPOTS COUNTED   AVERAGE/SPREAD
L1-5’-UTR          # OF SPOTS > 1   15                  40              2.67
SPLICED NEOMYCIN   # OF SPOTS > 1   11                  63              5.25

Chromosome spreads were counted for full length (L1-5’-UTR probe) or truncated (spliced Neomycin probe) insertions, with only spreads showing more than one staining event counted.

Genome analysis has previously demonstrated that the 5’UTRs of different L1s are remarkably similar in sequence, such that the 5’-UTR probe used may have recognized endogenous L1s within the HepG2 genome. To test this hypothesis, combined FISH analysis using L1RP-5’UTR and neomycin probes was used as an index of full-length insertions. Figure 3C shows that CY3 and FITC (column 4) colocalize in HepG2 cells, indicating the presence of a full-length ectopic L1RP insertion (red arrow), and the specificity of our probe for ectopic L1. It should be noted that the 5’UTR probe used did not stain multiple chromosomal locations in double FISH analysis. However, when the UTR probe was used by itself, several chromosomes were stained, including chromosomes with multiple insertions (red arrow) (Supplementary Figure 1b). The lack of staining might have been caused by the lack of accessibility of the probes to these regions due to the compacted heterochromatin of these ancient L1 sequences. This conclusion is supported by the fact that L1 sequences are heavily methylated, which in turn induces the formation of heterochromatin [7, 35]. Figure 3C also shows that not all neomycin staining colocalized with the 5’UTR probe staining (white arrow), supporting the conclusion that the number of truncated insertions greatly exceeds the number of full-length insertions. Together, these data indicate that the retrotransposition activity of ectopic L1 can be readily assayed using FISH to differentiate between full-length and truncated L1 insertions.
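The dual-probe scoring rule described above can be written down as a small decision function. This is only a sketch of the interpretation given in the text (colocalized neomycin and 5’UTR signal read as full-length, neomycin signal alone read as truncated); the function name, the label strings, and the example spot calls are hypothetical, and the treatment of a 5’UTR-only signal as a possible endogenous L1 follows the caveat raised in the paragraph rather than any measured result.

```python
from collections import Counter


def classify_spot(neo_signal, utr_signal):
    """Classify one FISH spot from the presence of each probe signal."""
    if neo_signal and utr_signal:
        return "full-length"        # both probes colocalize
    if neo_signal:
        return "truncated"          # spliced neomycin without 5'UTR
    if utr_signal:
        return "5'UTR only"         # possibly an endogenous L1
    return "no signal"


# Hypothetical spot calls for one spread: (neomycin, 5'UTR)
spots = [(True, True), (True, False), (True, False), (False, True)]
tally = Counter(classify_spot(neo, utr) for neo, utr in spots)
print(tally)
```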

Preferential L1 Insertions into Gene Poor Regions of Chromosome 13

Next, we sought to examine the randomness of L1 insertions after extended culturing. HepG2 cells underwent > 25 passages and were then processed for FISH using the Cy3-labeled neomycin probe. While the majority of spreads showed a random pattern of retrotransposition (Figure 4A), multiple insertions were consistently observed into a single chromosome (Figure 4B). The metaphase spreads presented in Figure 4B displayed three L1 insertions in the first row, which increased to five and then six in subsequent rows, as a function of serial passage in culture. To authenticate preferential insertions, chromosome spreads were isolated from two independent clones and FISH analysis repeated. Again, both of these clones showed preferential insertion into the same chromosome, confirming the initial observation (Supplementary Figure 2). These repeated insertions represent full-length L1 insertions since truncated insertions do not have the ability to retrotranspose.

Figure 4. Patterns of L1 insertion after prolonged culturing.

Metaphase chromosome spreads isolated from cells expressing wildtype L1 and probed with biotin-labeled neomycin probes followed by streptavidin-CY3 or streptavidin-FITC secondary antibodies. A. Random insertion of ectopic L1 into three (top) and two (bottom) different chromosomes. Notice that sometimes not all chromosomes are released from the nucleus. B. Repeated or preferential insertion of ectopic L1 into the same chromosome. In the first panel (row 1) there are three L1 insertions, in panel 2 (row 2) there are five L1 insertions and in panel 3 (row 3) there are six L1 insertions into the same chromosome. Scale bar is 10µm.

Since L1RP may be preferentially targeted to this particular chromosome, G-banding experiments were conducted. G-banding analysis identified random insertions into chromosomes 8 and 21, while repeated insertions were identified into a chromosome that could either be 13 or Y based on the G-banding profile (Figure 5). To authenticate the results, and to more definitively identify the chromosome targeted for preferential insertion, inverse PCR of gDNA was completed. Genomic DNA from HepG2 cells was MluI digested and T4 ligated, followed by isolation of L1 rings using primers specific for the neomycin cassette (Figure 6A). Six unique amplicons (Clones 1–6) of sizes ~2.2, 1.0, 4.0, 0.7, 0.9 and 0.7 kb, respectively, were isolated, gel purified, and cloned into the pCR2.1TA cloning vector (Figure 6B). Each clone was sequenced using M13 forward and reverse primers specific for the pCR2.1TA vector. The flanking genomic sequences of L1 insertions were identified using BLAT (http://genome.ucsc.edu) and BLASTN (http://blast.ncbi.nlm.nih.gov/blastn) sequence alignment search engines against the Human Genome Sequence. Four of the clones (Clones 2, 4, 5, 6) matched to chromosome 13 at different sites (Figure 6C), with clones 4 and 6 being identical in both size and flanking sequence (Figure 6B & C). Clone 3 inserted into chromosome 8, while the insertion site of Clone 1 could not be definitively identified given that flanking sequence was not obtained. Overall, these results identify with confidence chromosome 13 as an autosome with preferential targeting or duplication of L1 insertions.

Figure 5. Identification of the chromosome repeatedly targeted by L1 insertion.

FISH analysis (left) and the corresponding G-banded spread (right) of chromosomes from HepG2 cells stably expressing wildtype L1. The neomycin probe stained with FITC indicates the retrotransposition of L1 and the G-banded spread indicates that L1 retrotransposed into chromosomes 8, 21 and 13. Data indicate that chromosome 13 is repeatedly targeted by L1RP for insertion. Scale bar is 10µm.

Figure 6. Authentication of preferential insertions using inverse PCR followed by DNA sequencing to determine chromosome identity and insertion sites of L1 retrotransposition.

A. Schematic of the inverse PCR procedure. Briefly, gDNA was digested with MluI, then ligated and PCR amplified using neomycin-specific primers. B. Agarose gel electrophoresis of independent inverse PCR products. C. Sequencing data after TA-cloning of each gel-purified amplicon. Table indicates clone number, approximate size, flanking sequence adjacent to the neomycin insert, and chromosome location. The results indicate that Clones 2, 4, 5 and 6 inserted into chromosome 13, while Clone 3 inserted into chromosome 8. The insertion site of Clone 1 could not be determined as no flanking sequence was resolved for this clone. D. A schematic depiction of L1 insertion sites into Chromosome 13. Dark shaded regions denote gene-rich areas, while gray shaded areas denote gene-poor regions. In all cases, L1 insertions occur within gene-poor regions of the chromosome.

Lastly, insertion profiles into Chromosome 13 were examined given that this autosome has the lowest gene load, CpG island density, and exon coverage of all human autosomes (Dunham et al., 2004). The criteria employed by Dunham and coworkers were applied, where a region is classified as gene rich if it contains five or more genes per megabase (Mb), or gene poor if the density is less than five. Using the flanking sequence for each clone, BLAT analysis showed that all insertions occurred within intronic regions, and more specifically, into gene poor regions denoted as gray shaded segments in Figure 6D. It should be noted that this analysis was limited to genes validated using the updated GRCh38/hg38 build (http://genome.ucsc.edu).
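The gene-density criterion stated above (five or more genes per Mb is gene rich, fewer is gene poor) can be expressed as a short function. This is a minimal sketch: the function name and the example gene counts and window sizes are hypothetical and are not taken from the data in this study.

```python
GENE_RICH_THRESHOLD = 5.0  # genes per Mb, per the criterion stated in the text


def classify_region(gene_count, length_bp):
    """Return (genes per Mb, label) for a genomic window."""
    density = gene_count / (length_bp / 1_000_000)
    label = "gene rich" if density >= GENE_RICH_THRESHOLD else "gene poor"
    return density, label


# Hypothetical 4 Mb windows, for illustration only
print(classify_region(12, 4_000_000))
print(classify_region(30, 4_000_000))
```

Note that a window with exactly five genes per Mb is classified as gene rich under this rule.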

DISCUSSION

Evidence is presented here that the pB015WT plasmid can be reliably used to monitor retrotransposition at the single chromosomal and cellular levels. In the past, detection of L1 encoded proteins has proven challenging given the stringent restrictions posed by the structural organization of genetic elements and the widespread silencing of L1 expression. L1RP consists of two ORFs in-frame which are separated by an inter-ORF region containing two in-frame stop codons after the stop codon of ORF1. Both ORFs are transcribed from a common promoter, but the modes of translation for the two proteins are different. While L1 ORF1 is translated using the 40S ribosomal scanning model, L1 ORF2 relies on the translation/termination model giving rise to lower protein levels [36–38]. Our own findings lend support to these views, with higher levels of ORF1 mRNA and protein than ORF2 mRNA and protein detected in HepG2 cells in all instances examined (Figure 1).

L1 sequences can be inserted into the genome as full-length or 5’-truncated insertions, with only full-length insertions retaining the capacity to remobilize to new locations. Both insertion types can create epigenetic hot spots, alternate splice sites or alternate promoters, which in turn function to fine-tune the expression of nearby genes [39]. Thus, understanding the frequency of full-length and truncated insertions at the chromosomal and single cell levels is paramount for advancing our understanding of the genomic basis of human disease. The conventional methodology used for monitoring retrotransposition activity in cultured cells is to count the number of fluorescent signals, or G418-resistant foci, and to factor this as a function of the total number of cells (i.e. number of hygromycin resistant cells). This methodology likely underestimates actual retrotransposition rates because a single cell or chromosome may contain more than one L1 insertion (see Figures 1–6). In contrast, FISH methodology allows efficient tracking of L1 retrotransposition events and captures the pattern and number of L1 insertions at the single chromosomal or cell levels. Of particular note is that identification of cells with more than one insertion, as evidenced by the occurrence of multiple loci with selectable markers, facilitates the differentiation of full-length and truncated insertions. As such, the approach can be used reliably to ask questions concerning retrotransposition events in a single G418-positive clone, or chromosome, and also to assess the randomness of this process. The approach can also be readily combined with restriction digestion, inverse PCR and/or sequence alignment to accurately determine both the pattern and mode of L1 insertions as shown here (Figure 6).
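The underestimation argument above can be made concrete with a small calculation. The sketch below compares a foci-style rate, which scores each cell once regardless of how many insertions it carries, with a per-insertion rate of the kind FISH counting makes available. The function name and the per-cell counts are hypothetical and serve only to illustrate the arithmetic.

```python
def retrotransposition_rates(insertions_per_cell):
    """Return (fraction of positive cells, mean insertions per cell)."""
    total_cells = len(insertions_per_cell)
    # Foci-style scoring: a cell counts once if it has any insertion
    positive_cells = sum(1 for n in insertions_per_cell if n > 0)
    # FISH-style scoring: every insertion is counted
    total_insertions = sum(insertions_per_cell)
    return positive_cells / total_cells, total_insertions / total_cells


# Hypothetical insertion counts for ten cells
counts = [0, 0, 3, 1, 0, 5, 0, 0, 2, 0]
per_cell_rate, per_insertion_rate = retrotransposition_rates(counts)
print(per_cell_rate, per_insertion_rate)
```

With these illustrative counts, four of ten cells are positive, yet the ten cells carry eleven insertions in total, so scoring by positive cells alone reports well under half of the events.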

Although unknown for L1, many transposable elements (TEs) have developed highly specific targeting mechanisms that direct their integration to safe regions of the genome [40–42]. For instance, characterization of the Tf1 fission yeast retrotransposon has shown that 95% of the integrations are clustered upstream of ORFs, with most of the promoters targeted by Tf1s representing genes associated with stress [40, 41]. Likewise, Ty5 transposable elements in yeast have been shown to change their target sites in response to stress, with integration into ORFs as opposed to heterochromatin in cells deprived of nitrogen [43]. Interestingly, Ty5 integration into heterochromatin is dependent on phosphorylation of Ser1095, with mutation of Ser1095 redirecting integration to expressed regions of the genome [43]. Experiments in maize have shown that integration of DNA transposons leads to variegated corn color phenotypes, while integration of the Hatvine1-rrm DNA transposon into the promoter region of the VvTFL1A gene influences the branching pattern and the fruit size of grapevines [43, 45]. A question that remains unanswered is whether similar mechanisms exist in higher organisms to direct the integration of mobile elements into the host genome.

Our analysis of L1RP retrotransposition in HepG2 cells revealed random insertions into all chromosomes, except chromosome 13 where preferential insertions were documented by FISH and DNA sequencing (Figures 4 & 6). Previous studies have revealed that older L1s specifically integrate into gene poor and AT rich regions of the genome, while new L1 integrations are interspersed, occurring near or within intronic regions at a loosely defined sequence of 5′-TTTT/A-3′ [27, 47]. These findings establish an evolving, but adaptive mechanism for L1 insertion into the host genome that might not be random, but rather contextual in a manner that affords selective advantage to the host. For example, others have noted the enrichment for recent L1 insertions into the human Y chromosome, including the unusually high number of full-length L1s [48]. Our own G-banding studies would support this notion since we could not readily distinguish between chromosomes 13 and Y, and one of the clones obtained by inverse PCR could be matched in sequence with confidence to either chromosome. The occurrence of preferential insertions into the genome may be linked to a low gene load, with chromosome 13 having the lowest gene density (6.5 genes per Mb) of all human autosomes and containing a central region of 38 Mb where the gene density drops to only 3.1 genes per Mb [49]. We regard these regions as permissive “safe haven” regions for insertion. Of note is that all of the insertions identified by DNA sequencing targeted intronic regions or repeat regions of high homology among all chromosomes, including the sex chromosomes (Figure 6). Thus, another likely target for preferential insertion may be the Y chromosome, where faulty selection and a low gene load have also been documented [50, 51]. Graves et al. established the inability of the Y chromosome to sort through its genes, and suggested that this may account for its propensity to accumulate junk DNA [50]. A low gene load would make preferential L1 insertions less harmful to the organism such that insertional mutagenesis would be of modest negative impact on overall survival.

FISH analysis is a routine procedure employed in the clinical genetics laboratory. As such, the approaches described here can be readily adapted in the clinical setting to address questions related to the role of L1 in human pathogenesis. Such studies are important given increasing recognition that genetic variation between individuals is largely attributed to the polymorphic expression of transposable elements [49, 52]. Further, the activities of TEs can be strongly regulated by environmental cues that define and dictate differences in disease susceptibility [12, 13, 27]. To date, up to 100 human diseases have been linked to the activity of TEs [53–55]. Given that most repetitive regions of the genome cannot be easily sequenced using current methodologies, FISH analysis of transposable elements can be used in the clinical setting to evaluate polymorphic variations between individuals. These findings shed new light on L1 targeting within the genome, raise important questions about the cellular mechanisms responsible for L1 retrotransposition and strongly suggest that L1 retrotransposition is not entirely random.

Highlights.

  • L1 retrotransposition in HepG2 cells is not an entirely random process.

  • L1 exhibits a preference for insertion into gene-poor regions such as those found in chromosome 13.

  • FISH provides a tool to identify full-length and truncated L1 insertions into the host genome.

  • Identifying L1 insertions at the chromosome level may define the genomic basis of human disease.

ACKNOWLEDGMENTS

This work was supported in part by grants from the National Institute of Environmental Health Sciences (ES014443 and ES017274) and Astra Zeneca to KSR. We thank Dr. Alexander Asamoah for his referral and Ms. Margaret Barch for excellent technical assistance and helpful discussions regarding G-banding measurements. We also gratefully acknowledge Dr. Bhagavatula Moorthy for his assistance with gDNA sequencing.

Footnotes

COMPETING INTEREST

The authors declare no potential competing interests.

REFERENCES

  • 1. Scott AF, Schmeckpeper BJ, Abdelrazik M, Comey CT, O'Hara B, Rossiter JP, Cooley T, Heath P, Smith KD, Margolet L. Origin of the human L1 elements: proposed progenitor genes deduced from a consensus DNA sequence. Genomics. 1987;1:113–125. doi: 10.1016/0888-7543(87)90003-6.
  • 2. Dombroski BA, Mathias SL, Nanthakumar E, Scott AF, Kazazian HH Jr. Isolation of an active human transposable element. Science. 1991;254:1805–1808. doi: 10.1126/science.1662412.
  • 3. Swergold GD. Identification, characterization, and cell specificity of a human LINE-1 promoter. Mol Cell Biol. 1990;10:6718–6729. doi: 10.1128/mcb.10.12.6718.
  • 4. Becker KG, Swergold GD, Ozato K, Thayer RE. Binding of the ubiquitous nuclear transcription factor YY1 to a cis regulatory sequence in the human LINE-1 transposable element. Hum Mol Genet. 1993;2:1697–1702. doi: 10.1093/hmg/2.10.1697.
  • 5. Tchenio T, Casella JF, Heidmann T. Members of the SRY family regulate the human LINE retrotransposons. Nucleic Acids Res. 2000;28:411–415. doi: 10.1093/nar/28.2.411.
  • 6. Yang N, Zhang L, Zhang Y, Kazazian HH Jr. An important role for RUNX3 in human L1 transcription and retrotransposition. Nucleic Acids Res. 2003;31:4929–4940. doi: 10.1093/nar/gkg663.
  • 7. Montoya-Durango DE, Liu Y, Teneng I, Kalbfleisch T, Lacy ME, Steffen MC, Ramos KS. Epigenetic control of mammalian LINE-1 retrotransposon by retinoblastoma proteins. Mutat Res. 2009;665:20–28. doi: 10.1016/j.mrfmmm.2009.02.011.
  • 8. Holmes SE, Singer MF, Swergold GD. Studies on p40, the leucine zipper motif-containing protein encoded by the first open reading frame of an active human LINE-1 transposable element. J Biol Chem. 1992;267:19765–19768.
  • 9. Mathias SL, Scott AF, Kazazian HH Jr, Boeke JD, Gabriel A. Reverse transcriptase encoded by a human transposable element. Science. 1991;254:1808–1810. doi: 10.1126/science.1722352.
  • 10. Feng Q, Moran JV, Kazazian HH Jr, Boeke JD. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87:905–916. doi: 10.1016/s0092-8674(00)81997-2.
  • 11. Kolosha VO, Martin SL. In vitro properties of the first ORF protein from mouse LINE-1 support its role in ribonucleoprotein particle formation during retrotransposition. Proc Natl Acad Sci USA. 1997;94:10155–10160. doi: 10.1073/pnas.94.19.10155.
  • 12. Myers JS, Vincent BJ, Udall H, Watkins WS, Morrish TA, Kilroy GE, Swergold GD, Henke J, Henke L, Moran JV, Jorde LB, Batzer MA. A comprehensive analysis of recently integrated human Ta L1 elements. Am J Hum Genet. 2002;71:312–326. doi: 10.1086/341718.
  • 13. Hedges DJ, Deininger PL. Inviting instability. Transposable elements, double-strand breaks, and the maintenance of genome integrity. Mutat Res. 2006;616:46–59. doi: 10.1016/j.mrfmmm.2006.11.021.
  • 14. Martin SL. Characterization of a LINE-1 cDNA that originated from RNA present in ribonucleoprotein particles: implications for the structure of an active mouse LINE-1. Gene. 1995;153:261–266. doi: 10.1016/0378-1119(94)00785-q.
  • 15. Wei W, Gilbert N, Ooi SL, Lawler JF, Ostertag EM, Kazazian HH, Boeke JD, Moran JV. Human L1 retrotransposition: cis preference versus trans complementation. Mol Cell Biol. 2001;21:1429–1439. doi: 10.1128/MCB.21.4.1429-1439.2001.
  • 16. Jurka J. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proc Natl Acad Sci USA. 1997;94:1872–1877. doi: 10.1073/pnas.94.5.1872.
  • 17. Cost GJ, Boeke JD. Targeting of human retrotransposon integration is directed by the specificity of the L1 endonuclease for regions of unusual DNA structure. Biochemistry. 1998;37:18081–18093. doi: 10.1021/bi981858s.
  • 18. Han JS, Boeke JD. A highly active synthetic mammalian retrotransposon. Nature. 2004;429:314–318. doi: 10.1038/nature02535.
  • 19. Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH Jr. High frequency retrotransposition in cultured mammalian cells. Cell. 1996;87:917–927. doi: 10.1016/s0092-8674(00)81998-4.
  • 20. Brouha B, Schustak J, Badge RM, Lutz-Prigge S, Farley AH, Moran JV, Kazazian HH Jr. Hot L1s account for the bulk of retrotransposition in the human population. Proc Natl Acad Sci USA. 2003;100:5280–5285. doi: 10.1073/pnas.0831042100.
  • 21. Esnault C, Maestre J, Heidmann T. Human LINE retrotransposons generate processed pseudogenes. Nat Genet. 2000;24:363–367. doi: 10.1038/74184.
  • 22. Buzdin A, Ustyugova S, Gogvadze E, Vinogradova T, Lebedev Y, Sverdlov E. A new family of chimeric retrotranscripts formed by a full copy of U6 small nuclear RNA fused to the 3' terminus of L1. Genomics. 2002;80:402–406. doi: 10.1006/geno.2002.6843.
  • 23. Dewannieux M, Esnault C, Heidmann T. LINE-mediated retrotransposition of marked Alu sequences. Nat Genet. 2003;35:41–48. doi: 10.1038/ng1223.
  • 24. Hancks DC, Kazazian HH Jr. SVA retrotransposons: Evolution and genetic instability. Semin Cancer Biol. 2010;20:234–245. doi: 10.1016/j.semcancer.2010.04.001.
  • 25. Narita N, Nishio H, Kitoh Y, Ishikawa Y, Minami R, Nakamura H, Matsuo M. Insertion of a 5' truncated L1 element into the 3' end of exon 44 of the dystrophin gene resulted in skipping of the exon during splicing in a case of Duchenne muscular dystrophy. J Clin Invest. 1993;91:1862–1867. doi: 10.1172/JCI116402.
  • 26. Takahara T, Ohsumi T, Kuromitsu J, Shibata K, Sasaki N, Okazaki Y, Shibata H, Sato S, Yoshiki A, Kusakabe M, Muramatsu M, Ueki M, Okuda K, Hayashizaki Y. Dysfunction of the Orleans reeler gene arising from exon skipping due to transposition of a full-length copy of an active L1 sequence into the skipped exon. Hum Mol Genet. 1996;5:989–993. doi: 10.1093/hmg/5.7.989.
  • 27. Han JS, Szak ST, Boeke JD. Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004;429:268–274. doi: 10.1038/nature02536.
  • 28. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062.
  • 29. Schwahn U, Lenzner S, Dong J, Feil S, Hinzmann B, van Duijnhoven G, Kirschner R, Hemberger M, Bergen AA, Rosenberg T, Pinckers AJ, Fundele R, Rosenthal A, Cremers FP, Ropers HH, Berger W. Positional cloning of the gene for X-linked retinitis pigmentosa 2. Nat Genet. 1998;19:327–332. doi: 10.1038/1214.
  • 30. Bojang P Jr, Roberts RA, Anderton M, Ramos KS. Reprogramming of HepG2 Genome by Long Interspersed Nuclear Element-1. Mol Oncol. 2013;7:812–825. doi: 10.1016/j.molonc.2013.04.003.
  • 31. Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, Epstein CB, Frietze S, Harrow J, Kaul R, Khatun J, Lajoie BR, Landt SG, Lee BK, Pauli F, Rosenbloom KR, Sabo P, Safi A, Sanyal A, Shoresh N, Simon JM, Song L, Trinklein ND, Altshuler RC, Birney E, Brown JB, Cheng C, Djebali S, Dong X, Ernst J, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
  • 32. Jennen DG, Magkoufopoulou C, Ketelslegers HB, van Herwijnen MH, Kleinjans JC, van Delft JH. Comparison of HepG2 and HepaRG by whole-genome gene expression analysis for the purpose of chemical hazard identification. Toxicol Sci. 2010;115:66–79. doi: 10.1093/toxsci/kfq026.
  • 33. Waters PD, Dobigny G, Waddell PJ, Robinson TJ. LINE-1 elements: analysis by fluorescence in-situ hybridization and nucleotide sequences. Methods Mol Biol. 2008;422:227–237. doi: 10.1007/978-1-59745-581-7_14.
  • 34. Goodier JL, Zhang L, Vetter MR, Kazazian HH Jr. LINE-1 ORF1 protein localizes in stress granules with other RNA-binding proteins, including components of RNA interference RNA-induced silencing complex. Mol Cell Biol. 2007;27:6469–6483. doi: 10.1128/MCB.00332-07.
  • 35. Teneng I, Montoya-Durango DE, Quertermous JL, Lacy ME, Ramos KS. Reactivation of L1 retrotransposon by benzo(a)pyrene involves complex genetic and epigenetic regulation. Epigenetics. 2011;6:355–367. doi: 10.4161/epi.6.3.14282.
  • 36. Ilves H, Kahre O, Speek M. Translation of the rat LINE bicistronic RNAs in vitro involves ribosomal reinitiation instead of frameshifting. Mol Cell Biol. 1992;12:4242–4248. doi: 10.1128/mcb.12.9.4242.
  • 37. McMillan JP, Singer MF. Translation of the human LINE-1 element, L1Hs. Proc Natl Acad Sci USA. 1993;90:11533–11537. doi: 10.1073/pnas.90.24.11533.
  • 38. Alisch RS, Garcia-Perez JL, Muotri AR, Gage FH, Moran JV. Unconventional translation of mammalian LINE-1 retrotransposons. Genes Dev. 2006;20:210–224. doi: 10.1101/gad.1380406.
  • 39. Montoya DE, Ramos KS. L1 retrotransposon and retinoblastoma: Molecular linkages between epigenetics and cancer. Curr Mol Med. 2010;10:511–521. doi: 10.2174/156652410791608234.
  • 40. Leem YE, Ripmaster TL, Kelly FD, Ebina H, Heincelman ME, Zhang K, Grewal SI, Hoffman CS, Levin HL. Retrotransposon Tf1 is targeted to Pol II promoters by transcription activators. Mol Cell. 2008;30:98–107. doi: 10.1016/j.molcel.2008.02.016.
  • 41. Devine SE, Boeke JD. Integration of the yeast retrotransposon Ty1 is targeted to regions upstream of genes transcribed by RNA polymerase III. Genes Dev. 1996;10:620–633. doi: 10.1101/gad.10.5.620.
  • 42. Sandmeyer S. Integration by design. Proc Natl Acad Sci USA. 2003;100:5586–5588. doi: 10.1073/pnas.1031802100.
  • 43. Guo Y, Levin HL. High-throughput sequencing of retrotransposon integration provides a saturated profile of target activity in Schizosaccharomyces pombe. Genome Res. 2010;20:239–248. doi: 10.1101/gr.099648.109.
  • 44. Dai J, Xie W, Brady TL, Gao J, Voytas DF. Phosphorylation regulates integration of the yeast Ty5 retrotransposon into heterochromatin. Mol Cell. 2007;27:289–299. doi: 10.1016/j.molcel.2007.06.010.
  • 45. McClintock B. The origin and behavior of mutable loci in maize. Proc Natl Acad Sci USA. 1950;36:344–355. doi: 10.1073/pnas.36.6.344.
  • 46. Fernandez L, Torregrosa L, Segura V, Bouquet A, Martinez-Zapater JM. Transposon-induced gene activation as a mechanism generating cluster shape somatic variation in grapevine. Plant J. 2010;61:545–557. doi: 10.1111/j.1365-313X.2009.04090.x.
  • 47. Beck CR, Garcia-Perez JL, Badge RM, Moran JV. LINE-1 elements in structural variation and disease. Annu Rev Genomics Hum Genet. 2011;12:187–215. doi: 10.1146/annurev-genom-082509-141802.
  • 48. Boissinot S, Entezam A, Furano AV. Selection against deleterious LINE-1-containing loci in the human lineage. Mol Biol Evol. 2001;18:926–935. doi: 10.1093/oxfordjournals.molbev.a003893.
  • 49. Sheen FM, Sherry ST, Risch GM, Robichaux M, Nasidze I, Stoneking M, Batzer MA, Swergold GD. Reading between the LINEs: human genomic variation induced by LINE-1 retrotransposition. Genome Res. 2000;10:1496–1508. doi: 10.1101/gr.149400.
  • 50. Rozen S, Skaletsky H, Marszalek JD, Minx PJ, Cordum HS, Waterston RH, Wilson RK, Page DC. Abundant gene conversion between arms of palindromes in human and ape Y chromosomes. Nature. 2003;423:873–876. doi: 10.1038/nature01723.
  • 51. Graves JA, Koina E, Sankovic N. How the gene content of human sex chromosomes evolved. Curr Opin Genet Dev. 2006;16:219–224. doi: 10.1016/j.gde.2006.04.007.
  • 52. Dunham A, Matthews LH, Burton J, Ashurst JL, Howe KL, Ashcroft KJ, Beare DM, Burford DC, Hunt SE, Griffiths-Jones S, Jones MC, Keenan SJ, Oliver K, Scott CE, Ainscough R, Almeida JP, Ambrose KD, Andrews DT, Ashwell RI, Babbage AK, Bagguley CL, Bailey J, Bannerjee R, Barlow KF, Bates K, Beasley H, Bird CP, Bray-Allen S, Brown AJ, Brown JY, et al. The DNA sequence and analysis of human chromosome 13. Nature. 2004;428:522–528. doi: 10.1038/nature02379.
  • 53. Seleme MC, Vetter MR, Cordaux R, Bastone L, Batzer MA, Kazazian HH Jr. Extensive individual variation in L1 retrotransposition capability contributes to human genetic diversity. Proc Natl Acad Sci USA. 2006;103:6611–6616. doi: 10.1073/pnas.0601324103.
  • 54. Belancio VP, Hedges DJ, Deininger P. Mammalian non-LTR retrotransposons: for better or worse, in sickness and in health. Genome Res. 2008;18:343–358. doi: 10.1101/gr.5558208.
  • 55. Chen JM, Stenson PD, Cooper DN, Ferec C. A systematic analysis of LINE-1 endonuclease-dependent retrotranspositional events causing human genetic disease. Hum Genet. 2005;117:411–427. doi: 10.1007/s00439-005-1321-0.
