Abstract
Retrotransposons account for almost half of our genome. They are mobile genetic elements, also known as jumping genes, but only the L1HS subfamily of Long Interspersed Nuclear Elements (LINEs) has retained the ability to jump autonomously in modern humans. Their mobilization in the germline, and also in some somatic tissues, contributes to human genetic diversity and to diseases such as cancer. Here, we present euL1db, the European database of L1HS retrotransposon insertions in humans (available at http://euL1db.unice.fr). euL1db provides a curated and comprehensive summary of L1HS insertion polymorphisms identified in healthy or pathological human samples and published in peer-reviewed journals. A key feature of euL1db is its sample-wise organization: L1HS insertion polymorphisms are connected to samples, individuals, families and clinical conditions. The current version of euL1db centralizes results obtained in 32 studies. It contains >900 samples, >140 000 sample-wise insertions and almost 9000 distinct merged insertions. euL1db will help in understanding the link between L1 retrotransposon insertion polymorphisms and phenotype or disease.
INTRODUCTION
Repetitive DNA accounts for half of our genome. Most of these repeats are retrotransposons, i.e. mobile genetic elements, which proliferate through an RNA-mediated copy-and-paste mechanism called retrotransposition. A tiny fraction of human retrotransposons is still able to autonomously generate new copies in modern humans (1). These active elements all belong to the L1HS subfamily (HS stands for human-specific), a subgroup of the L1 (Long Interspersed Nuclear Element-1 or LINE-1) clade of non-Long Terminal Repeat (LTR) retrotransposons found in vertebrates, plants and fungi. The L1 retrotransposon machinery is also able to mobilize, in trans, non-autonomous retrotransposons belonging to the Short Interspersed Nuclear Element (SINE) class (Alu, SVA) as well as cellular RNAs (U6, mRNA), the latter resulting in processed pseudogene formation (see (2–4) for recent reviews). Other transposable elements are molecular fossils and do not mobilize in modern humans.
A full-length human L1 is ∼6.0 kb in length, contains an internal promoter located in the 5′-untranslated region and encodes two proteins, ORF1p and ORF2p, both of which are required for L1 retrotransposition. ORF1p is an RNA-binding protein (5) and ORF2p an enzyme with endonuclease and reverse transcriptase activities (6,7). These proteins associate with the L1 mRNA to form a ribonucleoprotein particle, which is considered the core of the L1 retrotransposition machinery (8,9). A new L1 copy is produced when ORF2p nicks the genomic DNA and extends this newly formed 3′ end using the L1 mRNA as a template, a process known as target-primed reverse transcription (TPRT) (7,10,11). This process results in a short duplication of the target site (TSD, target-site duplication). Abortive retrotransposition often leads to 5′-truncated L1 copies (12,13). Some L1 insertions exhibit both a 5′ truncation and an inversion, due to twin priming (14). Finally, L1 insertions can also contain 5′- or 3′-transductions corresponding to genomic sequences immediately upstream or downstream of their progenitor copies. Such events originate from the retrotransposition of L1 transcripts generated from upstream promoters or ending downstream of the L1 sequence due to the weakness of the natural L1 polyadenylation signal (13,15,16). L1 target site preference is currently not fully defined, but both the endonuclease consensus sequence and the ability of the target site to partially anneal to the L1 mRNA poly(A) tail contribute to this process (7,17–19).
In the past 5 years, advances in deep-sequencing technologies have shed new light on the extent of L1-mediated genome variation (20,21). L1HS represents ∼3.3 Mb of the reference human genome (∼0.1%). These L1 copies are often referred to as ‘reference L1HS elements’. However, each individual has additional L1HS copies not present in the reference genome, referred to as ‘non-reference L1HS elements’, which contribute to our genetic diversity (22–27). On average, two human individual genomes differ at 285 sites with respect to L1 insertion presence or absence (27). These recent studies have also led to the discovery that L1HS is not only able to mobilize in the germline, resulting in inheritable genetic variations (3,28,29), but can also jump in some somatic tissues, such as brain (30–32), and in many cancers (26,33–39). Most retrotransposition events are the consequence of highly active, or ‘hot’, L1HS loci that constitute a small minority of full-length L1HS elements, with many of these being population-specific or even unique to a particular individual (private copies) (1,24). Therefore, understanding the link between L1HS insertion polymorphisms and phenotype or disease requires a comprehensive view of the different L1HS copies present in given individuals.
euL1db provides a curated and comprehensive summary of L1 retrotransposon insertion polymorphisms (RIPs) identified in healthy or pathological human samples and published in peer-reviewed journals. A sample is defined here as the primary biological material (e.g. tissue biopsy, blood, cell or cell line) from which a genomic DNA preparation was obtained and a sequencing library prepared. An important feature of euL1db is that insertions can be retrieved at a sample-by-sample level to facilitate correlations between the presence/absence of an L1 insertion with a specific phenotype or disease.
DATABASE STRUCTURE AND CONTENT
The euL1db database is organized in several tables, which are interconnected in a dynamic way, through the MySQL relational database management system. A simplified view of the object relationships is depicted in Figure 1 and a more detailed view of the underlying database structure is shown in Supplementary Figure S1.
Figure 1.
Database organization, data model and content. (A) Relationship between euL1db objects. euL1db is organized by study. Each study contains one or more samples. A sample originates from a single individual. Individuals can be analyzed in multiple studies. An SRIP (sample retrotransposon insertion polymorphism) is a real insertion detected in a given sample and has a unique ID prefixed by srip. Several samples from different individuals might possess an SRIP at the same genomic location. A private L1HS insertion will correspond to an SRIP only found in samples of the same individual. Inversely, an L1HS insertion which is fixed in the human population will appear as an SRIP at the same location in all the genome-wide samples of euL1db. Thus SRIP are highly redundant. In contrast, MRIP (meta-retrotransposon insertion polymorphisms) are virtual insertions obtained by merging overlapping or close SRIP, which are likely to correspond to the same retrotransposition event. Thus MRIP are non-redundant. (B) Approach used in euL1db to define unique L1HS insertion events. Nearby SRIPs are merged into a single MRIP if they satisfy all the following requirements: (i) they are located within 200 bp of each other, (ii) they share the same strand orientation, and (iii) they are all germline. Somatic retrotransposition events are unique by nature, and are not merged with germline events, nor merged together. Therefore, somatic SRIPs give rise to MRIPs containing only a single SRIP. (C) Overlap between euL1db and dbRIP. Numbers correspond to MRIP records in euL1db and to L1HS records in dbRIP (transposable elements not belonging to the L1HS subfamily were not taken into account to draw this Venn diagram).
The ‘Study’ table contains information about the study in which L1HS insertions were cataloged and mapped. Typically, a study will correspond to a single publication. Each study uses a coherent set of methods and analyses. Because these parameters determine to a large extent the variability that exists between data sets, all data in euL1db are organized by study. The ‘Individual’ table relates to the source individuals from whom the samples were originally taken. The same individual might have been subjected to multiple analyses, possibly in different studies (e.g. Figure 1, individual 1, present in study 1 and 2). When available, euL1db stores the gender, the geographical origin, potential familial links with other individuals in euL1db and health information. All individuals from the 1000 Genomes Project have been incorporated in euL1db, even though only a small portion has been analyzed for L1HS content. This was necessary to maintain the family architecture and to facilitate future updates. The ‘Sample’ table describes the primary biological sample taken from a given individual and from which L1HS insertions were cataloged and mapped. When available, euL1db stores the anatomical and potential pathological data for the sample, and whether it was prepared from a single cell or from multiple cells. Potential relationships between samples are also recorded (e.g. normal-tumor pairs). Importantly, a given sample can only be linked to a single study, and is given a unique ID. Unanalyzed individuals from the 1000 Genomes Project are not linked to any sample.
L1HS insertions found in a given sample are cataloged as ‘sample retrotransposon insertion polymorphism’ or SRIP. An SRIP is defined minimally by its genomic coordinates and is linked to a unique sample (Figure 1A). Additional optional information might include its genomic strand, its internal sequence, the length and sequence of its TSD or deletion, the presence of a 5′- or 3′-transduction, the presence of a 5′ inversion, the size of its downstream poly(A) sequence, its coordinates relative to the Repbase L1HS consensus sequence (and the positions of an inversion, if present), its allele frequency, if it is a somatic or a germline insertion, and its integrity (i.e. full length, 5′-truncated, 3′-truncated or internal fragment). Each SRIP is given a unique ID in euL1db, which is prefixed by srip (e.g. srip34564). Because several SRIP might actually correspond to the same original insertion event, some have identical or close genomic coordinates (e.g. srip 1, 3, 6, 8 and 10 in Figure 1A). To reduce this redundancy and to facilitate comparisons within and across studies, a set of virtual insertions named ‘meta-retrotransposon insertion polymorphism’ or MRIP has been computationally generated (Figure 1A and B). An MRIP refers to a unique genomic interval, which contains overlapping or close SRIP, likely corresponding to the same original insertion event. In practice, nearby SRIPs are merged into a single MRIP if they satisfy all the following requirements: (i) they are located within 200 bp of each other, (ii) they share the same strand orientation, and (iii) they are all germline insertions. Somatic retrotransposition events are unique by nature, and are not merged with germline events, nor merged together. Therefore, somatic SRIPs give rise to MRIPs containing only a single SRIP. 
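The three merging requirements can be illustrated with a minimal Python sketch. It is not the euL1db implementation: the `Srip` record, its field names and the `merge_srips` function are hypothetical, and "within 200 bp of each other" is interpreted here, as an assumption, as chaining each SRIP to the running end of the current cluster after sorting by position.

```python
from dataclasses import dataclass
from itertools import groupby

WINDOW = 200  # bp; merging distance stated in the text


@dataclass(frozen=True)
class Srip:
    """Hypothetical minimal SRIP record (field names are illustrative)."""
    srip_id: str
    chrom: str
    start: int
    end: int
    strand: str   # "+" or "-"
    somatic: bool


def merge_srips(srips, window=WINDOW):
    """Group SRIPs into MRIP-like clusters, each returned as a list of SRIPs."""
    # Somatic insertions are never merged: each one forms its own cluster.
    clusters = [[s] for s in srips if s.somatic]
    germline = sorted(
        (s for s in srips if not s.somatic),
        key=lambda s: (s.chrom, s.strand, s.start),
    )
    # Only germline SRIPs on the same chromosome and strand may be merged.
    for _, group in groupby(germline, key=lambda s: (s.chrom, s.strand)):
        current, current_end = [], None
        for s in group:
            if current and s.start - current_end <= window:
                # Overlapping or close enough: extend the current cluster.
                current.append(s)
                current_end = max(current_end, s.end)
            else:
                if current:
                    clusters.append(current)
                current, current_end = [s], s.end
        clusters.append(current)
    return clusters
```

With this chaining interpretation a cluster can span more than 200 bp when intermediate SRIPs bridge the gap; a stricter rule (every pair within 200 bp) would require a different grouping step.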
Using a 200-bp window around SRIP rather than precise coordinates was necessary since different methods and studies have variable accuracy in defining the precise location of L1HS insertions. The rationale for choosing the size of this window is detailed in the Supplementary Methods. Each MRIP is given a unique ID in euL1db, which is prefixed by mrip (e.g. mrip1234). Although the probability of finding two independent germline insertion events in the same 200-bp window is extremely low, it is not null. The euL1db framework allows users to compare annotations provided for each SRIP within a given MRIP. Depending on the study, SRIP annotations may include the length and/or sequence of the TSD, the reverse-transcribed L1 sequence or other additional potential rearrangements (inversion, transduction). In a situation where distinct insertion events were wrongly combined in a single MRIP, discrepancies in the SRIP annotations could alert the user that caution should be taken. This also applies to the most extreme case, i.e. two independent insertion events occurring at the same exact nucleotide. Since reference L1HS insertions are virtual insertions derived from a consensus reference sequence and not from a biological sample, we have chosen to include them in a distinct table, entitled the ‘Reference’ table, and to assign them an ID prefixed by ref (e.g. ref123). This table is used internally to determine whether a given SRIP or MRIP actually corresponds to a reference L1HS insertion, and to annotate each record. The total number of SRIPs and MRIPs included for each study is graphed in Supplementary Figure S3.
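As an illustration of how annotation discrepancies within an MRIP might be screened, the following Python sketch flags MRIPs whose member SRIPs report different TSD lengths. The function name and the input layout (tuples of MRIP ID, SRIP ID and TSD length) are assumptions made for this example and do not correspond to a euL1db export format; differing TSD lengths may also simply reflect differences in method accuracy, so a flag is only a prompt for manual inspection.

```python
from collections import defaultdict


def flag_discordant_mrips(srip_annotations):
    """Return the sorted IDs of MRIPs whose SRIPs disagree on TSD length.

    srip_annotations: iterable of (mrip_id, srip_id, tsd_length) tuples,
    where tsd_length is an int or None when the study did not report it.
    """
    tsd_by_mrip = defaultdict(set)
    for mrip_id, _srip_id, tsd_length in srip_annotations:
        # Missing annotations carry no evidence either way, so skip them.
        if tsd_length is not None:
            tsd_by_mrip[mrip_id].add(tsd_length)
    return sorted(m for m, lengths in tsd_by_mrip.items() if len(lengths) > 1)
```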
In addition to these main tables, euL1db uses a ‘Method’ table, which contains the methods used to call SRIPs in the different studies, and a ‘Family’ table, which classifies the familial relationships between euL1db individuals (mostly from the 1000 Genomes Project). An individual without a known relative in euL1db is not linked to any family.
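The relationships between these tables can be sketched as a small relational schema. The example below is a simplified illustration only: it uses Python's built-in `sqlite3` module rather than MySQL, all table and column names are hypothetical, the ‘Method’ and ‘Reference’ tables are omitted, and the direct link from SRIP to MRIP is an assumption about how the merge might be stored.

```python
import sqlite3

# Hypothetical, simplified schema; names are illustrative assumptions and
# not the actual euL1db MySQL schema (see Supplementary Figure S1).
SCHEMA = """
CREATE TABLE study (study_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE family (family_id INTEGER PRIMARY KEY);
CREATE TABLE individual (
    individual_id INTEGER PRIMARY KEY,
    family_id INTEGER REFERENCES family(family_id)  -- NULL: no known relative
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    study_id INTEGER NOT NULL REFERENCES study(study_id),  -- exactly one study
    individual_id INTEGER NOT NULL REFERENCES individual(individual_id)
);
CREATE TABLE mrip (
    mrip_id TEXT PRIMARY KEY, chrom TEXT, pos_start INTEGER, pos_end INTEGER
);
CREATE TABLE srip (
    srip_id TEXT PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    mrip_id TEXT REFERENCES mrip(mrip_id),
    chrom TEXT NOT NULL, pos_start INTEGER NOT NULL, pos_end INTEGER NOT NULL
);
"""


def build_demo():
    """Create an in-memory database with a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO study VALUES (?, ?)",
                     [(1, "study 1"), (2, "study 2")])
    conn.executemany("INSERT INTO individual VALUES (?, ?)",
                     [(1, None), (2, None)])
    # Individual 1 is analyzed in both studies, through two distinct samples.
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)",
                     [(10, 1, 1), (11, 2, 1), (12, 2, 2)])
    conn.execute("INSERT INTO mrip VALUES ('mripA', 'chr1', 1000, 1201)")
    conn.executemany("INSERT INTO srip VALUES (?, ?, ?, ?, ?, ?)", [
        ("sripA", 10, "mripA", "chr1", 1000, 1001),
        ("sripB", 11, "mripA", "chr1", 1010, 1011),
        ("sripC", 12, "mripA", "chr1", 1200, 1201),
    ])
    return conn


def individuals_with_mrip(conn, mrip_id):
    """List the individuals carrying at least one SRIP merged into mrip_id."""
    rows = conn.execute(
        "SELECT DISTINCT sample.individual_id FROM srip "
        "JOIN sample ON sample.sample_id = srip.sample_id "
        "WHERE srip.mrip_id = ? ORDER BY sample.individual_id",
        (mrip_id,),
    )
    return [row[0] for row in rows]
```

Routing every SRIP through its sample is what makes sample-wise retrieval possible: the same query pattern can be extended to join `study` or `family` without changing the SRIP records themselves.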
The data contained in euL1db originate from peer-reviewed publications and have been manually curated and entered. The source of data and the curation process are detailed in Supplementary Methods and in Supplementary Table S1. The reference L1HS insertions have been processed from the UCSC RepeatMasker track table. The summary statistics at the time of writing are displayed in Table 1.
Table 1. euL1db content statistics.
Record type | Number of records |
---|---|
Studies | 32 (a) |
Samples | 943 |
Individuals | 741 |
Families | 50 (b) |
SRIP | 142 495 |
MRIP | 8991 |
Reference L1HS | 1545 |
(a) Out of 32 studies, 10 used high-, 1 medium- and 21 low-throughput approaches.
(b) With at least two individuals analyzed.
DATA ACCESS
euL1db can be interrogated through a user-friendly Web Server (http://euL1db.unice.fr). A set of detailed tutorials and examples of use are accessible from the ‘Help’ tab. The Web Server architecture is described in Supplementary Figure S1.
There are several ways to query euL1db: (i) by searching SRIP or MRIP located in a single locus (genomic region, gene) or in a single individual (‘Search’ tab); (ii) by browsing the different tables and using filters to select a specific subset of data across and within studies, families, individuals, samples, insertions (‘Browse’ tab); (iii) by batch query using a list of multiple loci (genomic coordinates) or genes (gene names) (‘Utilities’ tab).
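The logic of a batch query by genomic coordinates can be sketched locally in Python, for instance on records previously exported from the database. This does not call the euL1db Web Server, whose request interface is not described here; the `chr:start-end` locus syntax, the closed-interval overlap rule and both function names are assumptions made for illustration.

```python
import re

# Assumed locus syntax: "chr1:1000-2000" (thousands separators tolerated).
LOCUS_RE = re.compile(r"^(chr\w+):(\d+)-(\d+)$")


def parse_locus(text):
    """Parse 'chrN:start-end' into (chrom, start, end)."""
    match = LOCUS_RE.match(text.replace(",", "").strip())
    if match is None:
        raise ValueError(f"unrecognized locus: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return chrom, start, end


def batch_query(insertions, loci):
    """Map each locus string to the IDs of the insertions overlapping it.

    insertions: list of (insertion_id, chrom, start, end) tuples.
    Intervals are treated as closed on both sides (an assumption).
    """
    hits = {}
    for locus in loci:
        chrom, start, end = parse_locus(locus)
        hits[locus] = [
            ins_id
            for ins_id, ins_chrom, ins_start, ins_end in insertions
            if ins_chrom == chrom and ins_start <= end and ins_end >= start
        ]
    return hits
```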
Users can choose to display L1 insertions as SRIP or MRIP in (i) graphical- (UCSC genome browser, dbVar genome browser) (40,41); (ii) tabular- (sortable html tables); or (iii) text-formats (including in standard BED format for subsequent analyses with other tools). Tables can be customized to display the information of interest for the user.
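For readers unfamiliar with BED, the short Python sketch below shows how one insertion record maps onto a six-column BED line (chrom, chromStart, chromEnd, name, score, strand). BED coordinates are 0-based and half-open; the assumption that the input coordinates are 1-based and inclusive, as well as the function name and the constant score of 0, are choices made for this example and not a description of the euL1db export.

```python
def to_bed_line(ins_id, chrom, start_1based, end_1based, strand="."):
    """Format one insertion as a tab-separated BED6 line.

    Input coordinates are assumed to be 1-based and inclusive, so only the
    start is shifted to obtain the 0-based, half-open BED interval.
    """
    if strand not in {"+", "-", "."}:
        raise ValueError(f"invalid strand: {strand!r}")
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("invalid coordinates")
    fields = [chrom, str(start_1based - 1), str(end_1based), ins_id, "0", strand]
    return "\t".join(fields)
```

A line produced this way can be passed to tools that accept BED input, provided the coordinate convention of the source records has been checked first.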
RELATIONSHIP AND DIFFERENCES WITH OTHER DATABASES
Several resources are related to, but distinct from, euL1db. Repbase is a database of consensus repetitive DNA sequences and as such does not contain any localization information (42). One of its entries, the L1HS consensus sequence, has been used to annotate the human reference genome and to identify the genomic loci corresponding to L1HS elements (Smit, A.F.A., Hubley, R. & Green, P. RepeatMasker Open-3.0. 1996–2010 <http://www.repeatmasker.org>). This information is available through the RepeatMasker table of the UCSC Genome Browser (40). The reference L1HS elements included in the ‘Reference’ table of euL1db have been processed and annotated using the latter. dbRIP was an early effort to catalog and annotate polymorphic retrotransposon insertions in humans (43). In contrast to dbRIP, euL1db stores data in a sample-wise manner and contains the most recent data sets obtained by high-throughput sequencing, including those from the 1000 Genomes Project. Although dbRIP could not be directly included in euL1db since samples are not documented in dbRIP, 94% of dbRIP L1HS records have an MRIP equivalent in euL1db (Figure 1C). dbRIP also has unique features since it contains non-autonomous human retrotransposons such as Alu or SVA sequences and not only L1HS insertions. As a particular case of structural variation, L1 retrotransposon insertions are also documented in dbVar/DGVa or DGV as mobile element insertions (MEI) (41). The data structure logic in dbVar/DGVa and euL1db is comparable (sample-wise variants and merged variants). euL1db is specialized for L1HS insertions, while dbVar/DGVa can include any type of structural variants, including MEI. However, the set of information for L1HS insertions provided by euL1db is much more exhaustive, and only a single study (out of 32 at the time of writing) stored in euL1db was also deposited in dbVar/DGVa.
CONCLUDING REMARKS
High-throughput sequencing technologies have considerably fostered the study of L1-mediated genomic variation and its impact on human health. We anticipate that this trend will continue in the coming years, particularly with the availability of long-read sequencing approaches, which might greatly facilitate the detection of L1HS insertions and their accurate positioning on the genome by generating reads that span the entire element and both flanking regions. In this respect, the euL1db database and server have been tailored to support a considerable increase in the number of SRIPs while keeping a fast response time. To summarize, euL1db provides centralized and user-friendly access to known germline and somatic L1HS insertions, which will be critical to elucidate the physiological or pathological impact of novel L1HS insertions. This resource will be useful in a large variety of fields such as human genetics, neurosciences or cancer genomics.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
Acknowledgments
We are grateful to the L1 community and to the 1000 Genomes Consortium for data sharing. The authors also thank R. Manas and M. Ducellier (University of Nice-Sophia Antipolis) for assistance with the IT infrastructure. We thank Aurelien Doucet for critical reading.
FUNDING
Institut National de la Santé Et de la Recherche Medicale and the Institut National du Cancer [2009–340 to G.C.]; the European Research Council [243312 to G.C.]; Agence Nationale pour la Recherche [ANR-11-LABX-0028–01 to G.C.]. Funding for open access charge: European Research Council [243312 to G.C.].
Conflict of interest statement. None declared.
REFERENCES
1. Brouha B., Schustak J., Badge R.M., Lutz-Prigge S., Farley A.H., Moran J.V., Kazazian H.H.J. Hot L1s account for the bulk of retrotransposition in the human population. Proc. Natl. Acad. Sci. U.S.A. 2003;100:5280–5285. doi: 10.1073/pnas.0831042100.
2. Burns K.H., Boeke J.D. Human transposon tectonics. Cell. 2012;149:740–752. doi: 10.1016/j.cell.2012.04.019.
3. Hancks D.C., Kazazian H.H. Active human retrotransposons: variation and disease. Curr. Opin. Genet. Dev. 2012;22:191–203. doi: 10.1016/j.gde.2012.02.006.
4. Beck C.R., Garcia-Perez J.L., Badge R.M., Moran J.V. LINE-1 elements in structural variation and disease. Annu. Rev. Genomics Hum. Genet. 2011;12:187–215. doi: 10.1146/annurev-genom-082509-141802.
5. Hohjoh H., Singer M.F. Sequence-specific single-strand RNA binding protein encoded by the human LINE-1 retrotransposon. EMBO J. 1997;16:6034–6043. doi: 10.1093/emboj/16.19.6034.
6. Mathias S.L., Scott A.F., Kazazian H.H., Boeke J.D., Gabriel A. Reverse transcriptase encoded by a human transposable element. Science. 1991;254:1808–1810. doi: 10.1126/science.1722352.
7. Feng Q., Moran J.V., Kazazian H.H., Boeke J.D. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87:905–916. doi: 10.1016/s0092-8674(00)81997-2.
8. Kulpa D.A., Moran J.V. Cis-preferential LINE-1 reverse transcriptase activity in ribonucleoprotein particles. Nat. Struct. Mol. Biol. 2006;13:655–660. doi: 10.1038/nsmb1107.
9. Doucet A.J., Hulme A.E., Sahinovic E., Kulpa D.A., Moldovan J.B., Kopera H.C., Athanikar J.N., Hasnaoui M., Bucheton A., et al. Characterization of LINE-1 ribonucleoprotein particles. PLoS Genet. 2010;6:e1001150. doi: 10.1371/journal.pgen.1001150.
10. Luan D.D., Korman M.H., Jakubczak J.L., Eickbush T.H. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell. 1993;72:595–605. doi: 10.1016/0092-8674(93)90078-5.
11. Cost G.J., Feng Q., Jacquier A., Boeke J.D. Human L1 element target-primed reverse transcription in vitro. EMBO J. 2002;21:5899–5910. doi: 10.1093/emboj/cdf592.
12. Gilbert N., Lutz-Prigge S., Moran J.V. Genomic deletions created upon LINE-1 retrotransposition. Cell. 2002;110:315–325. doi: 10.1016/s0092-8674(02)00828-0.
13. Symer D.E., Connelly C., Szak S.T., Caputo E.M., Cost G.J., Parmigiani G., Boeke J.D. Human L1 retrotransposition is associated with genetic instability in vivo. Cell. 2002;110:327–338. doi: 10.1016/s0092-8674(02)00839-5.
14. Ostertag E.M., Kazazian H.H. Twin priming: a proposed mechanism for the creation of inversions in L1 retrotransposition. Genome Res. 2001;11:2059–2065. doi: 10.1101/gr.205701.
15. Moran J.V., DeBerardinis R.J., Kazazian H.H.J. Exon shuffling by L1 retrotransposition. Science. 1999;283:1530–1534. doi: 10.1126/science.283.5407.1530.
16. Goodier J.L., Ostertag E.M., Kazazian H.H. Transduction of 3′-flanking sequences is common in L1 retrotransposition. Hum. Mol. Genet. 2000;9:653–657. doi: 10.1093/hmg/9.4.653.
17. Repanas K., Zingler N., Layer L.E., Schumann G.G., Perrakis A., Weichenrieder O. Determinants for DNA target structure selectivity of the human LINE-1 retrotransposon endonuclease. Nucleic Acids Res. 2007;35:4914–4926. doi: 10.1093/nar/gkm516.
18. Monot C., Kuciak M., Viollet S., Mir A.A., Gabus C., Darlix J.L., Cristofari G. The specificity and flexibility of L1 reverse transcription priming at imperfect T-tracts. PLoS Genet. 2013;9:e1003499. doi: 10.1371/journal.pgen.1003499.
19. Viollet S., Monot C., Cristofari G. L1 retrotransposition: the snap-velcro model and its consequences. Mob. Genet. Elem. 2014;4:e28907. doi: 10.4161/mge.28907.
20. O'Donnell K.A., Burns K.H. Mobilizing diversity: transposable element insertions in genetic variation and disease. Mob. DNA. 2010;1:21. doi: 10.1186/1759-8753-1-21.
21. Ray D.A., Batzer M.A. Reading TE leaves: new approaches to the identification of transposable element insertions. Genome Res. 2011;21:813–820. doi: 10.1101/gr.110528.110.
22. Ewing A.D., Kazazian H.H. Whole-genome resequencing allows detection of many rare LINE-1 insertion alleles in humans. Genome Res. 2011;21:985–990. doi: 10.1101/gr.114777.110.
23. Stewart C., Kural D., Strömberg M.P., Walker J.A., Konkel M.K., Stütz A.M., Urban A.E., Grubert F., Lam H.Y., Lee W.P., et al. A comprehensive map of mobile element insertion polymorphisms in humans. PLoS Genet. 2011;7:e1002236. doi: 10.1371/journal.pgen.1002236.
24. Beck C.R., Collier P., Macfarlane C., Malig M., Kidd J.M., Eichler E.E., Badge R.M., Moran J.V. LINE-1 retrotransposition activity in human genomes. Cell. 2010;141:1159–1170. doi: 10.1016/j.cell.2010.05.021.
25. Huang C.R., Schneider A.M., Lu Y., Niranjan T., Shen P., Robinson M.A., Steranka J.P., Valle D., Civin C.I., Wang T., et al. Mobile interspersed repeats are major structural variants in the human genome. Cell. 2010;141:1171–1182. doi: 10.1016/j.cell.2010.05.026.
26. Iskow R.C., McCabe M.T., Mills R.E., Torene S., Pittard W.S., Neuwald A.F., Van Meir E.G., Vertino P.M., Devine S.E. Natural mutagenesis of human genomes by endogenous retrotransposons. Cell. 2010;141:1253–1261. doi: 10.1016/j.cell.2010.05.020.
27. Ewing A.D., Kazazian H.H. High-throughput sequencing reveals extensive variation in human-specific L1 content in individual human genomes. Genome Res. 2010;20:1262–1270. doi: 10.1101/gr.106419.110.
28. Kazazian H.H., Wong C., Youssoufian H., Scott A.F., Phillips D.G., Antonarakis S.E. Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature. 1988;332:164–166. doi: 10.1038/332164a0.
29. Kaer K., Speek M. Retroelements in human disease. Gene. 2013;518:231–241. doi: 10.1016/j.gene.2013.01.008.
30. Coufal N.G., Garcia-Perez J.L., Peng G.E., Yeo G.W., Mu Y., Lovci M.T., Morell M., O'Shea K.S., Moran J.V., Gage F.H. L1 retrotransposition in human neural progenitor cells. Nature. 2009;460:1127–1131. doi: 10.1038/nature08248.
31. Baillie J.K., Barnett M.W., Upton K.R., Gerhardt D.J., Richmond T.A., De Sapio F., Brennan P.M., Rizzu P., Smith S., Fell M., et al. Somatic retrotransposition alters the genetic landscape of the human brain. Nature. 2011;479:534–537. doi: 10.1038/nature10531.
32. Evrony G.D., Cai X., Lee E., Hills L.B., Elhosary P.C., Lehmann H.S., Parker J.J., Atabay K.D., Gilmore E.C., Poduri A., et al. Single-neuron sequencing analysis of L1 retrotransposition and somatic mutation in the human brain. Cell. 2012;151:483–496. doi: 10.1016/j.cell.2012.09.035.
33. Shukla R., Upton K.R., Muñoz-Lopez M., Gerhardt D.J., Fisher M.E., Nguyen T., Brennan P.M., Baillie J.K., Collino A., Ghisletti S., et al. Endogenous retrotransposition activates oncogenic pathways in hepatocellular carcinoma. Cell. 2013;153:101–111. doi: 10.1016/j.cell.2013.02.032.
34. Solyom S., Ewing A.D., Rahrmann E.P., Doucet T., Nelson H.H., Burns M.B., Harris R.S., Sigmon D.F., Casella A., Erlanger B., et al. Extensive somatic L1 retrotransposition in colorectal tumors. Genome Res. 2012;22:2328–2338. doi: 10.1101/gr.145235.112.
35. Lee E., Iskow R., Yang L., Gokcumen O., Haseley P., Luquette L.J., Lohr J.G., Harris C.C., Ding L., Wilson R.K., et al. Landscape of somatic retrotransposition in human cancers. Science. 2012;337:967–971. doi: 10.1126/science.1222077.
36. Miki Y., Nishisho I., Horii A., Miyoshi Y., Utsunomiya J., Kinzler K.W., Vogelstein B., Nakamura Y. Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Res. 1992;52:643–645.
37. Rodić N., Burns K.H. Long interspersed element-1 (LINE-1): passenger or driver in human neoplasms? PLoS Genet. 2013;9:e1003402. doi: 10.1371/journal.pgen.1003402.
38. Goodier J.L. Retrotransposition in tumors and brains. Mob. DNA. 2014;5:11. doi: 10.1186/1759-8753-5-11.
39. Helman E., Lawrence M.S., Stewart C., Sougnez C., Getz G., Meyerson M. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing. Genome Res. 2014;24:1053–1063. doi: 10.1101/gr.163659.113.
40. Karolchik D., Barber G.P., Casper J., Clawson H., Cline M.S., Diekhans M., Dreszer T.R., Fujita P.A., Guruvadoo L., Haeussler M., et al. The UCSC genome browser database: 2014 update. Nucleic Acids Res. 2014;42:D764–D770. doi: 10.1093/nar/gkt1168.
41. Lappalainen I., Lopez J., Skipper L., Hefferon T., Spalding J.D., Garner J., Chen C., Maguire M., Corbett M., Zhou G., et al. dbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 2013;41:D936–D941. doi: 10.1093/nar/gks1213.
42. Jurka J., Kapitonov V.V., Pavlicek A., Klonowski P., Kohany O., Walichiewicz J. Repbase update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 2005;110:462–467. doi: 10.1159/000084979.
43. Wang J., Song L., Grover D., Azrak S., Batzer M.A., Liang P. dbRIP: a highly integrated database of retrotransposon insertion polymorphisms in humans. Hum. Mutat. 2006;27:323–329. doi: 10.1002/humu.20307.