letter
Curr. Biol. 2013 Nov 18;23(22):R994–R995. doi: 10.1016/j.cub.2013.10.028

Neanderthal and Denisovan retroviruses in modern humans

Emanuele Marchi 1,#, Alex Kanapin 2,#, Matthew Byott 3, Gkikas Magiorkinis 1,4,∗, Robert Belshaw 3,∗∗
PMCID: PMC3923971  PMID: 24262833

Summary

In the June 5th 2012 issue of Current Biology, Agoni et al. [1] reported finding 14 endogenous retrovirus (ERV) loci in the genome sequences of Neanderthal and/or Denisovan fossils (both ∼40,000 years old) that are not found in the human reference genome sequence. The authors [1] concluded that these retroviruses were infecting the germline of these archaic hominins at or subsequent to their divergence from modern humans (∼400,000 years ago). However, in our search for unfixed ERVs in the modern human population, we have found most of these loci. We explain this apparent contradiction using population genetic theory and suggest that it illustrates an important phenomenon for the study of transposable elements such as ERVs.

Main Text

The genomes of extinct human groups (archaic hominins), such as Neanderthals, are now accessible through high-throughput sequencing technology, which can produce millions of short (∼100 base) sequences called reads from fossil bone or teeth. An analysis of a Neanderthal and a Denisovan genome identified many reads that contained sequences of viral origin, similar to known integrations of retroviruses into the germline of modern humans [1]. Such so-called endogenous retroviruses (or ERVs) are common, making up ∼5% of our genome. Some of the reads spanned the integration site of an ERV, called here a locus, and thus were part viral DNA and part archaic hominin DNA (Figure 1). In some cases, the authors [1] did not find an ERV at the corresponding coordinate in the human genome reference sequence. Instead they found the pre-integration site, which is the sequence that existed before the virus inserted a copy of itself into the chromosome. All of these loci belonged to one ERV lineage (family), called HERVK(HML2) or HERVK, which is the only lineage that has continued to replicate within humans in the last few million years [2]. Agoni et al. [1] concluded that these retroviruses had infected the germline of the archaic hominins either after their divergence from modern humans (∼400,000 years ago) or immediately before divergence (with the integration and pre-integration sites then segregating differently in the lineages).

However, while searching many new genome sequences of modern humans for ERVs, we have found most of these loci. For example, of the eight Denisovan loci for which Agoni et al. [1] were able to give precise genome coordinates, at least seven exist in modern humans. We have found six in an analysis of 67 cancer patient genomes (Figure 1), and examination of another study of 43 such genomes [3] shows all seven to be present (Supplemental information). One is K113 (19p12b), which is well-described and has a frequency of 16% in modern humans [2].
The four reported Denisovan loci lacking coordinates are within repetitive or unassembled regions of the genome, and we can neither confirm nor refute their presence in the modern human population. For example, two loci are in transposable elements called Alu elements, of which there are ∼1,000,000 copies in the human genome (making up ∼10% of the human genome sequence). When an ERV integrates into another transposable element, finding this ERV locus can be a formidable computational challenge because there are many paralogous copies of the integration site. Two additional loci were reported from the Neanderthal fossil, and we have found one of these.

Figure 1. ERV loci absent from the human reference genome but present in both archaic hominins and modern humans.

For each Agoni et al. [1] locus that we recovered in modern humans, the top sequence with black background shows the corresponding pre-integration region in the human reference sequence (hg19), and below it are the reads from both the archaic hominins (with the viral regions in blue) and modern humans (viral regions in red). ‘De’ = Denisovan, ‘Ne’ = Neanderthal. In most cases there are reads spanning both upstream and downstream boundaries of the ERV, with the characteristic six-base Target Site Duplication (TSD) of the host genome between them (see also Figure S1; only a small sample of the available reads from modern humans is shown). An asterisk shows the first base of the ERV, which in five of the seven instances represented here has integrated in reverse orientation. Coordinates are taken from the UCSC Genome Browser at http://genome.ucsc.edu/ (Feb. 2009 (GRCh37/hg19) assembly). Both we and Agoni et al. [1] found the same A/G substitution in the TSD of HERV-K-De3.
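The role of the TSD in the figure can be illustrated with a minimal sketch. It assumes that the host-derived part of each junction read has already been separated from the viral part; the function names and the toy sequences are hypothetical and are not taken from our pipeline or from the data in Figure 1. The host flank upstream of the ERV should end with the same six bases that begin the downstream flank, and removing one copy of that duplication should give back the pre-integration site. An optional mismatch allowance covers cases such as a single substitution within the TSD.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_tsd(upstream_host, downstream_host, tsd_len=6, max_mismatches=0):
    """Return the two TSD copies flanking an integration, or None.

    upstream_host:   host bases of a read ending at the ERV's upstream boundary
    downstream_host: host bases of a read starting at the downstream boundary
    """
    if len(upstream_host) < tsd_len or len(downstream_host) < tsd_len:
        return None
    left = upstream_host[-tsd_len:]    # TSD copy just before the ERV
    right = downstream_host[:tsd_len]  # TSD copy just after the ERV
    if hamming(left, right) <= max_mismatches:
        return left, right
    return None


def pre_integration_site(upstream_host, downstream_host, tsd_len=6):
    """Rebuild the empty site by keeping only one copy of the TSD."""
    return upstream_host + downstream_host[tsd_len:]


# Toy (invented) flanks sharing the six-base duplication AGGTCA.
up = "ACGTTGCAAGGTCA"
down = "AGGTCATTCGGA"
exact = find_tsd(up, down)
empty_site = pre_integration_site(up, down)

# A single A/G difference in the downstream copy is accepted only
# when one mismatch is allowed.
down_variant = "GGGTCATTCGGA"
strict = find_tsd(up, down_variant)
relaxed = find_tsd(up, down_variant, max_mismatches=1)
```

In practice the reconstructed empty site would be compared with the reference sequence at the reported coordinate; that alignment step is outside the scope of this sketch.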

It is unlikely that these ERV loci in the archaic hominins are contaminants from modern human DNA. Average coverage of the Denisovan genome was only about twofold and the contamination rate among the reads was estimated using several approaches to have been less than 1% [4]. We believe that the explanation lies in fundamental population genetics. With the exception of co-opted ERV loci such as syncytins [5], which could increase in frequency due to positive selection, we assume ERV loci become common by genetic drift, and the average time for a neutral allele to go to fixation is 4Ne generations (where Ne is the effective population size). Given estimates of long-term human generation time and population size [6], this is ∼800,000 years. The population divergence of modern humans from the Denisovan/Neanderthal lineage is more recent, between 170,000 and 700,000 years ago according to a later and much deeper sequencing of the above Denisovan fossil [7], so many loci will have persisted at fluctuating frequencies in all three lineages.
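The arithmetic behind this comparison can be written out as a short sketch. The effective population size (10,000) and generation time (20 years) used below are illustrative assumptions chosen because they reproduce the ∼800,000-year figure; the text above does not state the exact values used, and the function name is hypothetical.

```python
def neutral_fixation_time_years(effective_population_size, generation_time_years):
    """Average time for a neutral allele that fixes to reach fixation.

    Uses the 4*Ne generations approximation, converted to years.
    """
    generations = 4 * effective_population_size
    return generations * generation_time_years


# Assumed, illustrative inputs (not stated in the text).
ASSUMED_NE = 10_000
ASSUMED_GENERATION_YEARS = 20

fixation_years = neutral_fixation_time_years(ASSUMED_NE, ASSUMED_GENERATION_YEARS)

# Divergence range quoted in the text for modern humans versus the
# Denisovan/Neanderthal lineage.
divergence_range_years = (170_000, 700_000)

# True if the average fixation time exceeds even the oldest divergence estimate.
fixation_slower_than_divergence = fixation_years > max(divergence_range_years)

print(fixation_years, fixation_slower_than_divergence)
```

Under these assumptions the average fixation time is longer than the whole range of divergence estimates, which is why a locus that arose before the split can still be polymorphic in each descendant lineage.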

As well as showing how differences in loci between one genome and another must be interpreted cautiously, our finding illustrates how single genomes, whether the human reference or one from an archaic hominin fossil, are likely to contain only those ERV loci that after almost a million years have drifted to high frequency. These old loci give us only a limited insight into the processes that created them; for example, they will have accrued multiple inactivating mutations during this time. In contrast, loci that have integrated recently are more likely to produce proteins and might even be replicating. Such loci are interesting, perhaps most importantly because they are more likely to be pathogenic. The long-running debate over whether or not ERVs cause disease in humans has been handicapped by our poor knowledge of ERV polymorphism. Characterising individual loci is necessary to test ERV involvement in disease [8,9], and will aid the potential exploitation of ERV proteins as cancer and HIV immunotherapy targets [10].
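The point about single genomes can be made concrete with a small calculation. This is a sketch under explicit assumptions: the reported frequency is treated as an allele frequency, chromosomes are sampled independently (Hardy-Weinberg proportions), and a locus is always detected when present. The function name is hypothetical.

```python
def prob_locus_sampled(allele_frequency, n_diploid_genomes):
    """Probability that at least one of 2n sampled chromosomes carries the locus.

    Assumes independent sampling of chromosomes and perfect detection.
    """
    chromosomes = 2 * n_diploid_genomes
    return 1 - (1 - allele_frequency) ** chromosomes


# A locus at 16% frequency (the value quoted earlier for K113).
p_single = prob_locus_sampled(0.16, 1)   # one diploid genome
p_panel = prob_locus_sampled(0.16, 67)   # a panel of 67 genomes

# An assumed rarer locus, at 1% frequency, in the same panel.
p_rare_panel = prob_locus_sampled(0.01, 67)

print(round(p_single, 4), p_panel > 0.999999, round(p_rare_panel, 2))
```

Under these assumptions a locus at 16% frequency is more likely to be missing from any one genome than present in it, yet is almost certain to appear somewhere in a panel of a few dozen genomes, whereas rarer loci can still be missed by such a panel.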

ERVs in fossil hominins also improve our understanding of both ERV and human evolution. When the ERV loci in modern humans have been reasonably well-sampled, fossil loci will help us build a robust mathematical model of ERV proliferation. Then, because ERV loci make easily detectable and irreversible genetic markers (the common mechanism called ‘recombinational deletion’ leaves a relict structure called a solo-LTR [9]), they might help us in the measurement of divergence dates and population sizes for these archaic hominins.

Acknowledgments

We are grateful to the TCGA Data Access Committee (project 3504: “Endogenous retroviruses and cancer”), the UCSC CGhub, and the WGS500 Project Consortium for access to the data. The WGS500 project is funded by the Wellcome Trust, Oxford NIHR Biomedical Research Centre and Illumina. E.M., A.K. and R.B. are supported by the Wellcome Trust, and G.M. by an MRC Clinician Scientist Fellowship.

Footnotes

Supplemental Information including experimental procedures, one table and one figure can be found with this article online at http://dx.doi.org/10.1016/j.cub.2013.10.028.

Contributor Information

Gkikas Magiorkinis, Email: gkikas.magiorkinis@zoo.ox.ac.uk.

Robert Belshaw, Email: robert.belshaw@plymouth.ac.uk.

Supplemental Information

Document S1. Experimental Procedures, one table and one figure
mmc1.pdf (110.8KB, pdf)

References

1. Agoni L., Golden A., Guha C., Lenz J. Neandertal and Denisovan retroviruses. Curr. Biol. 2012;22:R437–R438. doi: 10.1016/j.cub.2012.04.049.
2. Subramanian R.P., Wildschutte J.H., Russo C., Coffin J.M. Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses. Retrovirology. 2011;8:90. doi: 10.1186/1742-4690-8-90.
3. Lee E., Iskow R., Yang L., Gokcumen O., Haseley P., Luquette L.J., III, Lohr J.G., Harris C.C., Ding L., Wilson R.K. Landscape of somatic retrotransposition in human cancers. Science. 2012;337:967–971. doi: 10.1126/science.1222077.
4. Reich D., Green R.E., Kircher M., Krause J., Patterson N., Durand E.Y., Viola B., Briggs A.W., Stenzel U., Johnson P.L.F. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature. 2010;468:1053–1060. doi: 10.1038/nature09710.
5. Dupressoir A., Lavialle C., Heidmann T. From ancestral infectious retroviruses to bona fide cellular genes: Role of the captured syncytins in placentation. Placenta. 2012;33:663–671. doi: 10.1016/j.placenta.2012.05.005.
6. Belshaw R., Dawson A.L.A., Woolven-Allen J., Redding J., Burt A., Tristem M. Genomewide screening reveals high levels of insertional polymorphism in the human endogenous retrovirus family HERV-K(HML2): Implications for present-day activity. J. Virol. 2005;79:12507–12514. doi: 10.1128/JVI.79.19.12507-12514.2005.
7. Meyer M., Kircher M., Gansauge M.-T., Li H., Racimo F., Mallick S., Schraiber J.G., Jay F., Prüfer K., de Filippo C. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012;338:222–226. doi: 10.1126/science.1224344.
8. Young G.R., Stoye J.P., Kassiotis G. Are human endogenous retroviruses pathogenic? An approach to testing the hypothesis. Bioessays. 2013;35:794–803. doi: 10.1002/bies.201300049.
9. Magiorkinis G., Belshaw R., Katzourakis A. “There and back again”: revisiting the pathophysiological roles of Endogenous Retroviruses in the post-genomic era. Phil. Trans. Roy. Soc. B. 2013;368:20120504. doi: 10.1098/rstb.2012.0504.
10. Wang-Johanning F., Rycaj K., Plummer J., Li M., Yin B., Frerich K., Garza J., Shen J., Lin K., Yan P. Immunotherapeutic potential of anti-human endogenous retrovirus-K envelope protein antibodies in targeting breast tumors. J. Natl. Cancer Inst. 2012;104:189–210. doi: 10.1093/jnci/djr540.
