Cell Genomics. 2023 Aug 16;3(9):100377. doi: 10.1016/j.xgen.2023.100377

High-coverage genome of the Tyrolean Iceman reveals unusually high Anatolian farmer ancestry

Ke Wang 1,2,3, Kay Prüfer 2, Ben Krause-Kyora 4, Ainash Childebayeva 2, Verena J Schuenemann 5,6,7, Valentina Coia 8, Frank Maixner 8, Albert Zink 8, Stephan Schiffels 2, Johannes Krause 2,9,∗∗
PMCID: PMC10504632  PMID: 37719142

Summary

The Tyrolean Iceman is known as one of the oldest human glacier mummies, directly dated to 3350–3120 calibrated BCE. A previously published low-coverage genome provided novel insights into European prehistory, despite high present-day DNA contamination. Here, we generate a high-coverage genome (15.3×) with low contamination to gain further insights into the genetic history and phenotype of this individual. Contrary to previous studies, we found no detectable Steppe-related ancestry in the Iceman. Instead, he retained the highest Anatolian-farmer-related ancestry among contemporaneous European populations, indicating a rather isolated Alpine population with limited gene flow from hunter-gatherer-ancestry-related populations. Phenotypic analysis revealed that the Iceman likely had darker skin than present-day Europeans and carried risk alleles associated with male-pattern baldness, type 2 diabetes, and obesity-related metabolic syndrome. These results corroborate phenotypic observations of the preserved mummified body, such as high pigmentation of his skin and the absence of hair on his head.

Keywords: Iceman, population history, ancient DNA, Chalcolithic, Neolithic, glacier mummy, alpine population, early farmers


Highlights

  • High-coverage genome of the Iceman

  • Unusually high Anatolian-farmer-related ancestry

  • Dark skin and likely bald


Wang et al. reported a newly generated high-coverage genome of the Tyrolean Iceman and revealed his unusually high Anatolian-farmer-related ancestry as well as his potential male-pattern baldness and high levels of skin pigmentation.

Introduction

The Tyrolean Iceman (hereafter referred to as the Iceman), also known as “Ötzi,” is the world's oldest glacier mummy. Radiocarbon dating and stable isotope analysis have revealed that the Iceman lived during the Chalcolithic (Copper Age) on the southern slopes of the eastern Italian Alps.1,2 His remains were found in the Italian part of the Ötztal Alps in 1991 and were directly dated to 3350–3120 calibrated BCE. In 2012, Keller et al. published the first whole-genome sequence of the Iceman.3 Comparative analyses based on autosomal data reported a close genetic affinity between the Iceman and present-day Sardinians. These findings were, however, published before genomes from a larger number of ancient western Eurasian individuals became available. Genomic data from ancient European individuals dated to 4000–3000 BCE, whom we consider contemporaneous with the Iceman, showed that the genetic similarity between the Iceman and present-day Sardinians is due to common genetic components that were geographically widespread across Europe during the Neolithic period.4,5,6,7,8 The geographic region of the Alps, where the Iceman was discovered, remains, however, rather understudied.

The first Iceman genome from 2012 was generated using the ABI SOLiD sequencer platform, which requires complex computational infrastructure at high economic cost.9 It had relatively low coverage (7.6×) compared with the high-coverage genome generated in this study and showed the presence of modern human DNA contamination. Taking advantage of more recent sequencing technologies (Illumina technology) with higher output and lower cost, which have become standard in the field of ancient DNA research,9 we therefore produced a new high-coverage genome for the Iceman (15.3× coverage) from the same left iliac bone sample used for the 2012 study, with minimal modern human contamination (0.5% ± 0.06%). We found that the Iceman shows unusually high early Neolithic-farmer-related ancestry among the analyzed European individuals from the 4th millennium BCE. Moreover, we show that the two ancestry components present in the Iceman, European hunter-gatherer-related ancestry and early Neolithic-farmer-related ancestry, admixed rather recently before the Iceman’s death (56 ± 21 generations ago, namely 4880 ± 635 BCE), suggesting the survival of hunter-gatherer groups south of the Alps as late as 5000 to 4000 BCE.

Results

A new high-coverage genome

We obtained two samples of the left iliac bone and the surrounding tissue and performed a series of four extractions on each in order to improve the amount of recovered ancient DNA and reduce modern human DNA contamination. To identify the extracts with the highest human DNA content for further processing, we compared the percentage of mapped human reads (i.e., endogenous DNA) across DNA libraries generated from these eight extracts (Table S1A) after shotgun sequencing on an Illumina HiSeq platform; the values ranged from 1.58% to 51.02% endogenous human DNA (Table S1). From the best two extracts (1412E2 and 1412E3), four double-stranded DNA libraries were generated with uracil DNA glycosylase (UDG) treatment, which reduces substitutions due to ancient DNA damage (i.e., deaminated cytosines) at the ends of DNA fragments.10 We then performed paired-end shotgun sequencing on a total of 36 Illumina HiSeq sequencing lanes for all four libraries (Tables S1B and S1C). We processed the raw sequencing data with EAGER 1.92.211 and obtained a combined alignment with average genomic coverage of 15.3× after removing duplicates. In the end, we obtained 45.4% endogenous human DNA content (Table S1), with more than 90.6% of the genome covered by at least five reads (STAR Methods). We estimated the contamination level in the high-coverage genome using ANGSD based on the heterozygosity on the haploid X chromosome.12 The high-coverage Iceman genome has 0.5% ± 0.06% contamination, more than 10× lower than the contamination level found in the previously published genome sequence3 (7.5% ± 0.25%) (Table S2).

Genetic ancestry analysis

It has been shown that early Neolithic European farmers derived most of their ancestry from early Anatolian farmers, suggesting that farming spread with people from the Near East through Anatolia and the Balkan peninsula starting around 7000 BCE.4,5,13 The arrival of farmers in Europe is followed by an increasing amount of admixture with local hunter-gatherers at variable levels during the initial expansion14,15,16 and subsequently into the 4th millennium BCE. By the end of the 4th millennium BCE, admixture between early Neolithic-farmer-related ancestry originating from Anatolia and European hunter-gatherer-related ancestry had become prevalent in most parts of Europe.5,15,16,17,18,19 Later, beginning from 2900 BCE, herders from the Pontic-Caspian steppe introduced substantial levels of so-called “Steppe-related ancestry” throughout Europe.17 After the 3rd millennium BCE until present day, all three ancestry components are found in almost all European populations.13

We examined the Iceman’s ancestry makeup in the context of those three ancestral components with corresponding representative proxies: western hunter-gatherers (“WHGs”), early Neolithic farmers from Anatolia (“Anatolia_N”) or Germany (“Germany_EN_LBK”), and herders from the Samara region (“Russia_Samara_EBA_Yamnaya”). We analyzed these together with ancient populations from Germany, the northern Iberian Peninsula, Italy, and Sardinia also dated to the 4th millennium BCE (Figure 1; Table S3).

Figure 1. Geographic location of the Iceman and analyzed published ancient western Eurasian groups

See also Table S3.

Projecting both the high-coverage and the previously published3 Iceman genome on modern western Eurasian genetic variation using principal-component analysis (PCA) (STAR Methods), the high-coverage genome is slightly shifted compared with the previously published one3 (Figure 2). The high-coverage genome of the Iceman clusters between the two groups in the PCA formed by (1) Middle-Neolithic and Chalcolithic Europeans dated to the 4th millennium BCE and by (2) early Neolithic European farmers, which in turn fall closely together with earlier farmers from Anatolia. In the genetic affinity test, the high-coverage Iceman genome shows a similar pattern, presenting the closest genetic affinity to contemporaneous Europeans from the 4th millennium BCE and early Neolithic European farmers (Figure S1).

Figure 2. Principal-component analysis (PCA)

We project the high-coverage Iceman genome, the previously published Iceman genome, and related published ancient western Eurasian individuals onto present-day western Eurasian populations. See also Table S3.

In the PCA (Figure 2), the high-coverage Iceman genome locates closer to the cluster of early Neolithic Farmers from Europe (farmers associated with Linear Pottery culture, in short “LBK” hereafter) and Anatolia_N than other Middle-Late Neolithic to Chalcolithic Europeans (Spain_MLN, Italy_Sardinia_N, Italy_Sardinia_C, Italy_N.SG, Germany_MN_Baalberge, Germany_MN_Salzmuende, Germany_MN_Esperstedt, Italy_Broion_CA.SG),6,7,18,20 indicating that the Iceman may have more early Neolithic-farmer-related ancestry than other tested European individuals from the 4th millennium BCE.

To estimate the proportions of ancestral components in the Iceman and other contemporaneous ancient European groups, we applied qpAdm modeling to test three proxies for the early Neolithic-farmer-related ancestry: Germany_EN_LBK, Anatolia_N, and Germany_EN_LBK_Stuttgart.DG (a 7,000-year-old high-coverage shotgun genome from the LBK culture in Germany) (Tables S3 and S4). We found that the Iceman derives 90% ± 2.5% ancestry from early Neolithic farmer populations when using Anatolia_N as the proxy for the early Neolithic-farmer-related ancestry and WHGs as the other ancestral component (Figure 3; Table S4). When testing a 3-way admixture model that includes Steppe-related ancestry as the third source on both the previously published3 and the high-coverage genome, we found that our high-coverage genome shows no Steppe-related ancestry (Table S5), in contrast to the ancestry decomposition of the previously published Iceman genome.3,17 We conclude that the 7.5% Steppe-related ancestry estimated for the previously published Iceman genome3,17 is likely the result of modern human contamination.

Figure 3. Global and local ancestral proportions and admixture time of ancient groups in the 4th millennium BCE

(A) qpAdm estimates on the proportion of western hunter-gatherer (WHG) and Anatolia_N ancestry in Iceman and contemporaneous ancient European groups.

(B) Dating the admixture time using DATES for target populations shown in panel (A).

Horizontal bars in (A) and (B) represent ±1 standard error (SE) estimated by qpAdm and DATES correspondingly.

(C) Local ancestry assignments across 22 autosomes of the Iceman genome. The majority of the Iceman genome is inferred to be of farmer origin (LBK, orange), and a small fraction is inferred to be of WHG origin (blue), consistent with the global ancestry proportion estimates from qpAdm. Gray area represents short arms on the chromosomes, which do not contain any unique genetic material.

See also Table S4.

Compared with the Iceman, the analyzed contemporaneous European populations from Spain and Sardinia (Italy_Sardinia_C, Italy_Sardinia_N, Spain_MLN) show less early Neolithic-farmer-related ancestry, ranging from 27.2% to 86.9% (Figure 3A; Table S4). Even ancient Sardinian populations,7 who are located further south than the Iceman and are geographically separate from mainland Europe, derive no more than 85% ancestry from Anatolia_N (Figure 3; Table S4). The higher levels of hunter-gatherer ancestry in individuals from the 4th millennium BCE have been explained by an ongoing admixture between early farmers and hunter-gatherers in the Middle and Late Neolithic in various parts of Europe, including western Europe (Germany and France), central Europe, Iberia, and the Balkans.5,15,16,17,18,19 Only individuals from Italy_Broion_CA.SG found to the south of the Alps present similarly low hunter-gatherer ancestry as seen in the Iceman.21 We conclude that the Iceman and Italy_Broion_CA.SG might both be representatives of specific Chalcolithic groups carrying higher levels of early Neolithic-farmer-related ancestry than any other contemporaneous European group. This might indicate less gene flow from groups that are more admixed with hunter-gatherers or a smaller population size of hunter-gatherers in that region during the 5th and 4th millennium BCE.

Recent admixture between early farmers and hunter-gatherers in southern Europe

Given the high proportion of early Neolithic-farmer-related ancestry in the Iceman genome, we also tested if using the early Neolithic-farmer-related ancestry as a single source is sufficient. We found that qpWave results suggest neither Anatolia_N nor Germany_EN_LBK as an appropriate single source, confirming that the European hunter-gatherer-related ancestry is low but significantly present in the Iceman’s genome (p < 0.05; Table S6).

We estimated the admixture date between the early Neolithic-farmer-related (using Anatolia_N as proxy) and WHG-related ancestry sources using DATES22 to be 56 ± 21 generations before the Iceman’s death, which corresponds to 4880 ± 635 calibrated BCE assuming 29 years per generation23 (Figure 3B; Table S7) and considering the mean C14 date of this individual. Alternatively, using Germany_EN_LBK as the proxy for early Neolithic-farmer-related ancestry, we estimated the admixture date to be 40 ± 15 generations before his death (Table S7), or 4400 ± 432 calibrated BCE, overlapping with estimates from nearby Italy_Broion_CA.SG, who locate to the south of the Alps7,18,20 (Figure 3B).
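The conversion from generations to a calendar date can be illustrated with a short sketch. It assumes the 29-year generation time stated above and, as a stand-in for the mean C14 date (which is not given here), the midpoint of the 3350–3120 calibrated BCE range; the function name is hypothetical.

```python
def admixture_date_bce(generations, generations_se, death_bce, years_per_generation=29):
    """Convert an admixture time in generations before death into a date in years BCE.

    Returns (date_bce, standard_error_in_years). Only the uncertainty of the
    generation estimate is propagated; uncertainty in the death date is ignored.
    """
    date_bce = death_bce + generations * years_per_generation
    se_years = generations_se * years_per_generation
    return date_bce, se_years


# Assumption: midpoint of the radiocarbon range stands in for the mean C14 date.
death_bce = (3350 + 3120) / 2

for proxy, gens, se in [("Anatolia_N", 56, 21), ("Germany_EN_LBK", 40, 15)]:
    date, se_years = admixture_date_bce(gens, se, death_bce)
    print(f"{proxy}: {date:.0f} +/- {se_years:.0f} BCE")
```

Under these assumptions the sketch yields dates within a few decades of the reported 4880 ± 635 and 4400 ± 432 calibrated BCE; the remaining differences presumably reflect the exact mean C14 date and its uncertainty, which this simplified calculation does not model.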

Compared with the admixture time between early Neolithic farmers and hunter-gatherers in other parts of southern Europe, for instance in Spain and southern Italy, the admixture with hunter-gatherers seen in the Iceman and Italy_Broion_CA.SG is more recent (Figure 3B; Table S3), suggesting a potentially longer survival of hunter-gatherer-related ancestry in this geographical region.

Effective population size and heterozygosity

The high-coverage genome allows for additional analyses, such as estimating effective population sizes through time and genome-wide heterozygosity, that are not possible with lower-coverage or SNP-captured genomes. Specifically, we estimated the population-size history of the population represented by the Iceman and the two source populations represented by an early Neolithic farmer from Stuttgart in Germany (“Germany_EN_LBK_Stuttgart.DG”) and by a Mesolithic hunter-gatherer from Loschbour in Luxembourg (“Luxembourg_Loschbour.DG”)13 using MSMC2.24 The demographic histories estimated using the aforementioned three ancient high-coverage genomes share the same population bottleneck between 25,000 and 200,000 years ago, similar to that obtained from a present-day Sardinian individual, and they show a slight population size increase in a recent time epoch from 20,000–25,000 years ago (Figure S2). We observe a higher population size in recent times for the Iceman and the Germany_EN_LBK_Stuttgart.DG individual (both with high early Neolithic-farmer-related ancestry) compared with the Luxembourg_Loschbour.DG hunter-gatherer, which is possibly linked to the larger population size of early farming populations versus the hunter-gatherer populations in recent times.25

We estimated the rate of heterozygosity for the Iceman, Germany_EN_LBK_Stuttgart.DG, Luxembourg_Loschbour.DG, and a present-day Sardinian individual and plotted the per-chromosome estimate together with the standard error calculated from a weighted jackknife procedure in Figure S3 (Table S8). Both Germany_EN_LBK_Stuttgart.DG and the Iceman show higher heterozygosity levels than Luxembourg_Loschbour.DG, but the Iceman shows a relatively lower level than Germany_EN_LBK_Stuttgart.DG. This is consistent with the supposed relative isolation of the Iceman and the low WHG-derived ancestry seen in his genome.
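A weighted block jackknife of the kind mentioned above can be sketched as follows. This is an illustration only: it assumes that each block is one chromosome and that blocks are weighted by their number of callable sites, neither of which is specified in the text, and the counts are toy values rather than data from Table S8.

```python
import math


def weighted_jackknife(het_counts, site_counts):
    """Weighted block jackknife for heterozygosity (heterozygous sites / callable sites).

    Each block is left out in turn and weighted by its number of callable sites.
    Returns (full_estimate, jackknife_estimate, standard_error).
    """
    g = len(het_counts)
    if g < 2:
        raise ValueError("at least two blocks are required")
    n = sum(site_counts)
    total_het = sum(het_counts)
    theta = total_het / n

    # Leave-one-block-out estimates.
    loo = [(total_het - h) / (n - m) for h, m in zip(het_counts, site_counts)]

    # Bias-corrected jackknife estimate.
    theta_j = g * theta - sum((1 - m / n) * t for m, t in zip(site_counts, loo))

    # Variance from weighted pseudo-values.
    var = 0.0
    for m, t in zip(site_counts, loo):
        h = n / m
        pseudo = h * theta - (h - 1) * t
        var += (pseudo - theta_j) ** 2 / (h - 1)
    var /= g
    return theta, theta_j, math.sqrt(var)


# Toy per-chromosome counts (hypothetical numbers).
het = [120, 95, 130, 80]
sites = [200_000, 150_000, 210_000, 140_000]
estimate, jk_estimate, se = weighted_jackknife(het, sites)
print(f"heterozygosity = {estimate:.6f}, SE = {se:.6f}")
```

With equally sized blocks this reduces to the classical delete-one jackknife; the weighting matters when chromosomes contribute very different numbers of sites.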

New insights into the phenotypic traits and local ancestry assignments of the Iceman

The high-coverage genome allows us to investigate SNPs of phenotypic significance with sufficient coverage at individual allelic sites. We analyzed 147 SNPs of phenotypic interest, summarized in Table S9, including phenotypic sites examined in the previously published genome.3 We newly report alleles for phenotypic traits of the Iceman related to reduced hair curliness, black hair color, obesity-related metabolic disorders, reduced freckling, and male-pattern baldness (Tables S9 and S10), in addition to the previously reported phenotype of possibly light skin pigmentation, brown eyes, and blood type O from Keller et al.3 (Table S9).

In particular, five SNPs (rs4988235, rs1050152, rs1495741, rs4751995, rs174546) assumed to be related to the adaptation to an agricultural lifestyle26 suggest that the Iceman was a comparatively slow metabolizer with low concentrations of animal-oriented fatty acids but high concentrations of plant-oriented fatty acids and may have had an intermediate high-density lipoprotein-cholesterol concentration level in general (Tables S9 and S10).

While genetic information cannot yet be used to completely reconstruct the appearance of an individual, genetic models exist for specific phenotypic features. Among those, skin pigmentation is a relatively well-understood trait that can be inferred from genetic data. We examined 170 skin pigmentation-associated SNPs from the UK Biobank genome-wide association study (GWAS) for skin color27,28 and retrieved diploid genotype information from 154 biallelic sites in the Iceman. Each phenotype-informative SNP has a different effect size, i.e., the variance in pigmentation explained by individual SNPs differs. Thus, we combined the effect sizes of all pigmentation-informative SNPs with the examined effect alleles as an indication of the final phenotypic trait. To take effect size into account, we calculated a weighted genetic score, weighting each SNP by its UK Biobank GWAS-estimated effect size;27,28 the score is the weighted proportion of dark pigmentation alleles and serves as an indicator of skin pigmentation. The weighted genetic score of dark pigmentation in the Iceman is estimated to be 0.591. This is higher than the score of present-day southern European populations, taking Sardinians as an example (Table S11), with whom the Iceman shares the closest genetic affinity (Figure S1) and who represent the highest level of pigmentation among modern-day European groups.29 It is, however, lower than the score of ancient LBK farmers and the Luxembourg_Loschbour.DG hunter-gatherer (Table S11).
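One way to read "weighted proportion of dark pigmentation alleles" is sketched below. The exact normalization used in the study is not spelled out in the text, so the formula here (effect-size-weighted dark-allele count divided by the maximum attainable weighted count) is an assumption, and the SNP values are toy numbers rather than UK Biobank estimates.

```python
def weighted_dark_score(snps):
    """Weighted proportion of dark-pigmentation alleles at diploid sites.

    snps: iterable of (effect_size, dark_allele_count), with counts in {0, 1, 2}.
    The score is 0 when no dark alleles are carried and 1 when the individual
    is homozygous for the dark allele at every site.
    """
    snps = list(snps)
    weighted_dark = sum(abs(beta) * count for beta, count in snps)
    weighted_max = sum(2 * abs(beta) for beta, _ in snps)
    return weighted_dark / weighted_max


# Toy genotypes: (hypothetical effect size, number of dark alleles carried).
toy_snps = [(0.30, 2), (0.10, 1), (0.05, 0), (0.02, 2)]
print(f"weighted dark-pigmentation score: {weighted_dark_score(toy_snps):.3f}")
```

Because large-effect SNPs dominate the sum, two individuals with the same raw count of dark alleles can receive quite different scores, which is the motivation for weighting by effect size.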

The high-coverage genome also enables us to explore the ancestral origin of genomic regions along the genome given a set of phased genomes as ancestral references, making it possible to assign SNPs of phenotypic interest to a specific ancestral origin. We assigned local ancestry tracts across 22 autosomes of the Iceman using RFMix30 (Figure 3C), employing an imputed and phased dataset as the reference for WHG and the early Neolithic-farmer-related ancestry (STAR Methods). The WHG ancestry tracts have an average length of 1.174 cM across the 22 autosomes, close to the expected tract length calculated from admixture time and the WHG admixture proportion (1.754 cM). In total, genomic tracts of the early Neolithic-farmer-related ancestry account for 91.4% of the genome, with the remaining 8.6% assigned to WHG origin, in line with the global ancestry estimation from qpAdm (90%/10%; Table S4). Based on the assigned local ancestral tract distribution, we inferred two of the farmer-diet-related SNPs mentioned above (rs1495741, rs174546) to be of LBK farmer origin, as expected.

Discussion

The reconstruction of a high-quality genome of the Iceman using Illumina sequencing enabled reanalyses providing novel insights into the phenotypic traits and genetic history of this individual. Contamination estimates showed that the high-coverage genome is almost free of contamination, in contrast to the previous genome showing around 7% of human DNA contamination.

Unlike previous analyses performed on the previous Iceman genome,17 we found no evidence for the presence of Steppe-related ancestry in the high-coverage one. Instead, his genome is best modeled as a genetic mixture between European hunter-gatherer-related ancestry and early Neolithic-farmer-related ancestry. The absence of Steppe-related ancestry is consistent with the dating of this individual preceding the arrival of Steppe-related ancestry in central and southern Europe.17 We found that the Iceman, together with the contemporary Italy_Broion_CA.SG located to the south of the Alps, carries the largest proportion of early Neolithic-farmer-related ancestry among all contemporaneous European individuals analyzed so far, suggesting that these individuals were relatively isolated from other European individuals who were more genetically admixed with ancient European hunter-gatherers. The remote location of Alpine valleys might contribute to such an isolation. We did, however, not find lower levels of genetic diversity in the Iceman genome compared with other early European farmers (represented by LBK farmers in Germany), with no signs of inbreeding. Altogether, these observations add nuance to our understanding of the mixture processes underlying the rise of the European hunter-gatherer-related ancestry throughout Europe during the 5th and 4th millennium BCE.5,15,17,18,19,31 While the general trend of rising European hunter-gatherer-related ancestry has been described throughout Europe, the presence of individuals with low European hunter-gatherer-related ancestry south of and within the eastern Italian Alps suggests regional heterogeneity in Late Neolithic Europe.

The high-coverage genome yielded further novel insights into the possible phenotypic traits of the Iceman, especially for complex traits regulated by multiple SNPs in the genome. We found that SNPs associated with an agricultural diet were present in the Iceman genome, two of which are assigned to local ancestry tracts of farmer origin, in line with the estimated high early Neolithic farmer ancestry. Our estimation of skin pigmentation for the Iceman based on over 100 regulatory SNPs related to that trait suggests that he displayed a rather dark skin, as also displayed by the actual mummy.32 While this was discussed as a result of the mummification process itself,33 our findings indicate a relatively dark skin complexion during his lifetime. Additional support for this assumption comes from a previous histological analysis of the Iceman's skin, where a small layer of brown melanin granules had been identified in the stratum basale of the epidermis.32 The appearance of the baldness-related allele in the high-coverage Iceman genome may be related to the fact that almost no human hair was found with the otherwise well-preserved mummy.34 Similar approaches like polygenic risk score and heritability calculation on other ancient high-coverage genomes might allow a more accurate reconstruction of complex phenotypic appearance such as skin pigmentation35 of ancient human individuals.

Limitations of the study

This study produced a high-coverage genome for the Tyrolean Iceman that enabled the detection of high Anatolian-farmer-related genetic ancestry in his genome; a single individual has, however, limited resolution in representing the population history of his time and region. Nevertheless, another individual from Broion, Italy, bordering the southern Alps, presents similarly high Anatolian-farmer-related ancestry, supporting the observation for the Iceman genome. Future studies with a denser sampling from the southern Alps will be needed to replicate our findings and show whether the Iceman was an outlier or a representative of his population.

Moreover, this study makes exploratory cross-comparisons of genetic prediction scores for phenotypes based on high-coverage ancient genomes. For instance, we infer possible phenotypes from the presence of phenotype-related alleles and predict skin pigmentation from polygenic risk scores. We caution that the actual phenotype is a combined effect of genetic mechanisms and environmental exposures through gene-by-environment interaction,36 and multiple SNPs could be responsible for the heritability of complex traits like male-pattern baldness37 and skin pigmentation.27 Here, however, direct observation of the actual mummy allows us to validate some of the findings, such as pigmentation and baldness, corroborating the genetic prediction based on the genomic data. Genomic data from ancient mummies are, however, rather exceptional. In most ancient DNA studies, the predictive accuracy of ancient polygenic scores for complex traits should be interpreted carefully together with various confounding factors, such as allelic turnover38 or population stratification.39

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Biological samples

Human archaeological skeletal material | This study | Iceman

Chemicals, peptides, and recombinant proteins

USER enzyme | New England Biolabs | Cat# M5505
Uracil Glycosylase inhibitor (UGI) | New England Biolabs | Cat# M0281
AccuPrime Pfx DNA polymerase | Invitrogen | Cat# 12344024
Phusion High Fidelity DNA polymerase | Thermo Scientific | Cat# F530S
1x Tris-EDTA pH 8.0 | AppliChem | Cat# A85690500
0.5 M EDTA pH 8.0 | Life Technologies | Cat# AM9261
10x Buffer Tango | Life Technologies | Cat# BY5
Isopropanol | Merck | Cat# 1070222511
Ethanol | Merck | Cat# 1009832511
Proteinase K | Sigma Aldrich | Cat# P2308
Guanidine hydrochloride | Sigma Aldrich | Cat# G3272
3M Sodium Acetate pH 5.2 | Sigma Aldrich | Cat# S7899
Tween-20 | Sigma Aldrich | Cat# P9416
5M NaCl | Sigma Aldrich | Cat# S5150
ATP 100 mM | Thermo Fisher Scientific | Cat# R0441
1 M Tris-HCl pH 8.0 | Thermo Fisher Scientific | Cat# 15568025
GeneAmp 10x PCR Gold Buffer | Thermo Fisher Scientific | Cat# 4379874
dNTP Mix | Thermo Fisher Scientific | Cat# R1121
T4 DNA Polymerase | New England Biolabs | Cat# M0203
T4 Polynucleotide Kinase | New England Biolabs | Cat# M0201
Bst 2.0 DNA Polymerase | New England Biolabs | Cat# M0537
Quick Ligation Kit | New England Biolabs | Cat# M2200L

Critical commercial assays

MinElute PCR Purification Kit | QIAGEN | Cat# 28006

Deposited data

Raw and analyzed data | This study | ENA: PRJEB56570

Software and algorithms

EAGER v1.92.2 | Peltzer et al.11 | https://github.com/apeltzer/EAGER-GUI
AdapterRemoval v2.2.0 | Schubert et al.40 | https://github.com/MikkelSchubert/adapterremoval
BWA v0.7.12 | Li and Durbin41 | http://bio-bwa.sourceforge.net
dedup v0.12.2 | Peltzer et al.11 | https://github.com/apeltzer/DeDup
snpAD | Pruefer42 | https://bioinf.eva.mpg.de/snpAD/
Eigensoft v7.2.1 | Patterson et al.43 | https://github.com/DReichLab/EIG
DATES v753 | Narasimhan et al.22 | https://github.com/priyamoorjani/DATES
admixtools v5.1 | Patterson et al.44 | https://github.com/DReichLab/AdmixTools
Schmutzi | Renaud et al.45 | https://bioinf.eva.mpg.de/schmutzi
ANGSD v0.910 | Korneliussen et al.12 | http://www.popgen.dk/angsd/index.php/ANGSD
contamMix 1.0-10 | Fu et al.46 | https://github.com/plfjohnson/contamMix
RFMix v2.03-r0 | Maples et al.29 | http://med.stanford.edu/bustamantelab/
MSMC2 | Schiffels and Wang47 | https://github.com/stschiff/msmc-tools/
GLIMPSE | Rubinacci et al.48 | https://odelaneau.github.io/GLIMPSE

Resource availability

Lead contact

Further information and requests for resources should be directed to and will be fulfilled by the lead contact, Johannes Krause (krause@eva.mpg.de).

Materials availability

This study did not generate new unique reagents.

Experimental model and subject details

Genome generation

A bone biopsy (0.1 g) was taken from the Iceman’s left ilium under sterile conditions in the Iceman’s preservation cell at the South Tyrol Archaeological Museum in Bolzano, Italy. DNA was extracted at the Institute for Archaeological Sciences, University of Tübingen, Germany, using two powdered samples (1411 and 1412) of bone material and surrounding tissue from the pelvic bone for a series of four sequential extractions each to reduce contamination and maximise the recovery of endogenous DNA. We followed the extraction protocol by Dabney et al.,49 and varied incubation times and temperatures across the four extractions. In the first extraction (E1), the samples were incubated for 10 min with the extraction buffer at 37°C. The samples were then centrifuged and the extraction buffer removed. These steps were then repeated for a second extraction (E2). In the third extraction (E3), the incubation was extended to overnight at 37°C, and for the last extraction (E4), one hour at 56°C was used. From each of the extractions, we took an aliquot of 10 μL to build a double-stranded, dual-indexed DNA library,40,41 for which shallow shotgun sequencing was performed on an Illumina HiSeq platform in order to identify the optimal extracts for deeper sequencing. The resulting reads were processed with the EAGER11 pipeline, revealing the highest percentage of reads mapped to the human reference genome for the extracts 1412E2 and 1412E3 (Table S1). From these two extracts, four additional double-stranded DNA libraries were prepared with uracil DNA glycosylase (UDG) treatment.40,41 To generate a high-coverage genome, we then subjected the libraries to paired-end shotgun sequencing on a total of 36 Illumina HiSeq sequencing lanes.

Method details

Bioinformatic processing

We processed raw sequencing data per sequencing lane following the pipeline in EAGER 1.92.2,11 with adaptors removed by AdapterRemoval v2,42 reads mapped to the human reference genome by BWA v0.7.12,45 and polymerase chain reaction (PCR) duplicates removed by Dedup v0.12.2.11 We merged sequencing data for each sequencing library and performed duplicate removal again for each individual-library-based BAM (Table S1). Across the four sequencing libraries, there are 2,221,240,494 sequencing reads in total, of which 1,008,074,503 reads mapped to the human reference genome hs37d5. We then merged the four individual-library-based BAM files (after removing duplicates), yielding 707,832,853 reads (Table S1), on which we ran Dedup v0.12.211 again, finally obtaining 678,475,568 reads for genomic coverage calculation and downstream analyses.
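The read counts in this paragraph can be related to the summary figures reported in the Results with a few lines of arithmetic. The sketch below uses only the numbers given above; the variable names are illustrative.

```python
# Read counts as reported in the text.
total_reads = 2_221_240_494    # all sequencing reads across the four libraries
mapped_reads = 1_008_074_503   # reads mapped to hs37d5
merged_reads = 707_832_853     # after per-library duplicate removal and merging
final_reads = 678_475_568      # after the final duplicate-removal pass

# Endogenous content: share of all reads that map to the human reference.
endogenous = mapped_reads / total_reads

# Share of merged reads removed as duplicates in the final pass.
final_dup_removed = 1 - final_reads / merged_reads

print(f"endogenous DNA: {endogenous:.1%}")
print(f"removed in final deduplication: {final_dup_removed:.1%}")
```

The first ratio reproduces the 45.4% endogenous human DNA content quoted in the Results. The second shows that the final pass across merged libraries removed only about 4% of the remaining reads.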

We called diploid genotypes with snpAD46 using a minimum base quality (Phred-scaled) of 30 and a minimum mapping quality (Phred-scaled) of 30. This resulted in 1,062,059 SNPs overlapping with the 1240k SNP panel. The diploid genotypes were then merged with a publicly released dataset of ancient and present-day individuals for the same set of ∼1.24 million SNPs (https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data, v44.3) and with individuals from Saupe et al. 2021.21 The details of the genetic groups analyzed in this study are summarised in Table S3. We estimated nuclear contamination using ANGSD v0.910.12 We estimated mitochondrial contamination using schmutzi43 and contamMix 1.0-10.50

Principal component analysis (PCA)

We carried out a principal components analysis using smartpca v16000 from the eigensoft v7.2.1 package44 with “lsproject: YES” and no shrink mode. The PCA is based on 68 western Eurasian populations in the Human Origins datasets (Table S3).4,13,48,51 We projected our new high coverage Iceman genome and the previously published Iceman shotgun genome data from Keller et al. 20123 onto the PC space calculated from modern western Eurasians. We used DataGraph to visualize the PCA results.
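The idea behind least-squares projection can be illustrated with a small sketch: principal axes are computed from the reference samples only, and a projected sample is placed by fitting its coordinates using just the SNPs it has data for. This is a simplified illustration with two axes and toy loadings, not the smartpca implementation; the function name and data are hypothetical.

```python
def ls_project(loadings, genotypes):
    """Least-squares coordinates of one sample on two principal axes.

    loadings:  list of (v1, v2) SNP loadings from the reference PCA.
    genotypes: list of normalized genotype values, None where missing.
    Only SNPs with data enter the fit, so missing sites do not pull the
    sample toward the origin.
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for (v1, v2), x in zip(loadings, genotypes):
        if x is None:
            continue
        s11 += v1 * v1
        s12 += v1 * v2
        s22 += v2 * v2
        b1 += v1 * x
        b2 += v2 * x
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("too few informative SNPs to place the sample")
    # Solve the 2x2 normal equations.
    pc1 = (s22 * b1 - s12 * b2) / det
    pc2 = (s11 * b2 - s12 * b1) / det
    return pc1, pc2


# Toy loadings and a sample that truly sits at (1.5, -0.5), with one missing SNP.
toy_loadings = [(0.5, 0.1), (-0.3, 0.4), (0.2, -0.6), (0.7, 0.2), (-0.1, -0.5)]
toy_sample = [1.5 * v1 - 0.5 * v2 for v1, v2 in toy_loadings]
toy_sample[2] = None
print(ls_project(toy_loadings, toy_sample))
```

Fitting on observed sites only is what makes this approach suitable for ancient genomes with incomplete SNP coverage.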

Outgroup f3 statistics

We examined the genetic affinity of the high-coverage Iceman genome to previously published relevant ancient genetic groups and worldwide modern populations using the outgroup f3 statistics f3(Iceman, Ancient population; Mbuti.DG) and f3(Iceman, Modern population X; Mbuti.DG). We used the qp3Pop v435 program in the AdmixTools v5.1 package (ref. 47) for the f3 calculations. Modern population X and Mbuti.DG are from the Simons Genome Diversity Project (SGDP; ref. 52).
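
To make the quantity concrete, the sketch below computes a plain outgroup f3(A, B; O) as the average over SNPs of (o − a)(o − b), where a, b, and o are allele frequencies in the two test populations and the outgroup. It is a simplified illustration only: it omits the finite-sample bias correction and the block-jackknife standard errors that qp3Pop provides, and the toy frequencies are invented for the example.

```python
def outgroup_f3(freq_a, freq_b, freq_out):
    """Uncorrected outgroup f3(A, B; O): mean over SNPs of (o - a) * (o - b).

    Each argument is a list of allele frequencies (0..1) for the same SNPs,
    with None marking a SNP that has no data in that population. For a single
    diploid individual the frequency is 0, 0.5, or 1.
    """
    terms = [
        (o - a) * (o - b)
        for a, b, o in zip(freq_a, freq_b, freq_out)
        if a is not None and b is not None and o is not None
    ]
    if not terms:
        raise ValueError("no SNPs with data in all three populations")
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Invented toy frequencies, not data from this study.
    iceman = [0.0, 0.5, 1.0, 1.0]
    population_x = [0.1, 0.6, 0.9, None]  # last SNP missing, so it is skipped
    mbuti = [0.5, 0.5, 0.2, 0.0]
    print(outgroup_f3(iceman, population_x, mbuti))
```

Larger values indicate more shared genetic drift between the two test populations since their split from the outgroup, which is why the statistic is used here to rank populations by affinity to the Iceman.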

qpAdm and qpWave analyses

We modeled the ancestry of the high-coverage Iceman genome using qpAdm v810 and qpWave v410 in the AdmixTools v5.1 package (ref. 47). We tested various two-way combinations of a hunter-gatherer source (WHG or Luxembourg_Loschbour.DG) and an early farmer source (Anatolia_N, Germany_EN_LBK, or Germany_EN_LBK_Stuttgart.DG) (refs. 5, 13, 16, 19, 22, 53, 54) for the Iceman and five contemporaneous ancient groups from around the 4th millennium BCE in southern Europe: Italy_N.SG (ref. 20), Italy_Broion_CA.SG (ref. 21), Italy_Sardinia_C (refs. 6, 7), Italy_Sardinia_N (refs. 6, 7), and Spain_MLN (refs. 5, 16, 18, 55). We used a list of 13 ancient populations as our default outgroup list: Ethiopia_4500BP_published.SG (ref. 56), Russia_Ust_Ishim.DG (ref. 57), Russia_Kostenki14 (ref. 58), Belgium_UP_GoyetQ116_1_published_all (ref. 58), Czech_Vestonice16 (ref. 58), Russia_MA1_HG.SG (ref. 59), Spain_ElMiron (ref. 58), Italy_North_Villabruna_HG (ref. 58), Iran_GanjDareh_N (refs. 4, 53), Israel_Natufian_published (refs. 4, 53), CHG (ref. 60), EHG (refs. 5, 58), and Levant_N (ref. 4). To replicate the modeling results for the previously published Iceman genome (ref. 3) reported in Haak et al. 2015 (ref. 17), we also applied the ‘O9’ and ‘O9N’ outgroup lists used in that study and summarized the corresponding modeling results in Table S5.

Local ancestry decomposition

To examine tracts of specific ancestry, we decomposed the high-coverage Iceman genome using RFMix (ref. 30), which identifies genomic chunks of contiguous ancestry in a given phased genome using a reference panel of phased haplotypes from two or three source populations. As RFMix requires more than one phased genome for each source population as reference, we used four ancient genomes (refs. 13, 19) as the WHG source and 14 ancient LBK genomes (refs. 5, 13, 16, 19) as the early Neolithic farmer source. Low-coverage WHG and LBK genomes were imputed and phased with GLIMPSE (ref. 61) using 1000 Genomes as a reference panel. The high-coverage genomes Luxembourg_Loschbour.DG from Luxembourg (22×, 6221–5986 calBCE; ref. 62) and Germany_EN_LBK_Stuttgart.DG from Germany (19×, 5307–5071 calBCE; ref. 13) were called and phased following the same strategy as described for the high-coverage Iceman genome. We applied DATES (ref. 22) to estimate the admixture date in the Iceman using the same set of reference individuals, and estimated the expected average length of WHG ancestry tracts as 1/admixture date × (1 – WHG ancestry proportion).
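
The tract-length expectation can be illustrated with a short calculation. The sketch below assumes that the expression above is read as 1/(t × (1 − α)) in Morgans, where t is the admixture date in generations and α is the WHG ancestry proportion; this is the usual single-pulse approximation, but the reading, the function name, and the input values are our assumptions. The numbers are placeholders, not estimates from this study.

```python
def expected_tract_length_cm(generations_since_admixture, ancestry_proportion):
    """Expected mean tract length (in centimorgans) for one ancestry.

    Assumes a single admixture pulse t generations ago and reads the
    expression in the text as 1 / (t * (1 - alpha)) Morgans, where alpha is
    the genome-wide proportion of the ancestry whose tracts are measured.
    """
    if generations_since_admixture <= 0:
        raise ValueError("admixture date must be a positive number of generations")
    if not 0.0 <= ancestry_proportion < 1.0:
        raise ValueError("ancestry proportion must be in [0, 1)")
    morgans = 1.0 / (generations_since_admixture * (1.0 - ancestry_proportion))
    return 100.0 * morgans  # convert Morgans to centimorgans


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the arithmetic.
    t_generations = 50
    whg_proportion = 0.10
    print(f"{expected_tract_length_cm(t_generations, whg_proportion):.2f} cM")
```

Under this reading, older admixture dates give shorter expected tracts, so the tract lengths inferred by RFMix can be compared against the date estimated by DATES as a consistency check.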

Effective population size estimates

We estimated effective population sizes using MSMC2 (ref. 63) for the high-coverage Iceman genome and a modern Sardinian genome (ref. 64), as well as for Luxembourg_Loschbour.DG and Germany_EN_LBK_Stuttgart.DG, which represent ancient European hunter-gatherer and early Neolithic farmer populations, respectively. As MSMC2 requires high-coverage genomes, we randomly chose a high-coverage modern Sardinian genome (SS6004474) published in Prüfer et al. 2014 (ref. 64) to analyze alongside the two high-coverage ancient genomes. We followed the MSMC2 tutorial (https://github.com/stschiff/msmc-tools/blob/master/msmc-tutorial) for preparing individual masks and used VCF files with reliable diploid genotype calls from snpAD. For Sardinian SS6004474, we used masks as described in Supplementary section 5b of ref. 62 instead of individual masks. Together with negative universal masks (https://github.com/wangke16/MSMC-IM/tree/master/masks), we generated the input files required by MSMC2 using generate_multihetsep.py. We ran MSMC2 with the command line msmc2 -I 0,1 -o outfile.msmc2 input.chr*.multihetsep.txt.

Phenotypic SNP analyses

We first examined a list of 147 SNPs associated with traits related to diabetes, male-pattern baldness, metabolic disorders, skin, hair, and eye pigmentation, and an agriculturalist diet. In particular, we examined genome-wide SNPs associated with type 2 diabetes and male-pattern baldness based on previous findings from GWAS (https://www.ebi.ac.uk/gwas/api/search/downloads/studies_alternative). For each locus, we report the risk allele and the called genotype in the Iceman genome, as summarized in Tables S9 and S10. We also examined five SNPs (FADS1/rs174546, SLC22A4/rs1050152, MCM6 (LCT)/rs4988235, NAT2/rs1495741, PLRP2/rs4751995) that are related to putative agricultural adaptations, as summarized in Mathieson et al. 2018 (ref. 26). In addition, we examined 170 skin-pigmentation-related SNPs analyzed in Ju and Mathieson 2020 (ref. 27) and calculated a polygenic score for dark pigmentation as a weighted sum in the Iceman, Germany_EN_LBK_Stuttgart.DG, Luxembourg_Loschbour.DG, and Sardinian SS6004474.
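
The weighted-sum score can be sketched as follows. We assume here that each SNP contributes its effect-allele dosage (0, 1, or 2) multiplied by a per-SNP weight such as a GWAS effect size, and that SNPs without a genotype call are skipped; the text does not specify these details, and the SNP identifiers and weights below are invented placeholders rather than values from Table S9.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages -- dict mapping SNP id to the count of effect alleles (0, 1, 2),
               or None when no genotype could be called
    weights -- dict mapping SNP id to its weight (assumed: GWAS effect size
               of the allele that darkens pigmentation)
    Returns (score, number of SNPs that contributed).
    """
    score = 0.0
    used = 0
    for snp, weight in weights.items():
        dosage = dosages.get(snp)
        if dosage is None:
            continue  # uncalled or absent SNP: contributes nothing
        if dosage not in (0, 1, 2):
            raise ValueError(f"invalid dosage for {snp}: {dosage}")
        score += weight * dosage
        used += 1
    return score, used


if __name__ == "__main__":
    # Placeholder identifiers and weights for illustration only.
    weights = {"snp_a": 0.5, "snp_b": -0.2, "snp_c": 0.1}
    genotype = {"snp_a": 2, "snp_b": 1, "snp_c": None}
    print(polygenic_score(genotype, weights))
```

Because the number of contributing SNPs can differ between genomes, scores are most comparable when all individuals are evaluated on the same set of called sites.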

Estimates of heterozygosity

We estimated average heterozygosity for the Iceman, Germany_EN_LBK_Stuttgart.DG, Luxembourg_Loschbour.DG, and a modern Sardinian individual as the number of heterozygous sites divided by the total number of sites across the 22 autosomes. We also estimated heterozygosity after applying a set of filters for ancient genomes, which removed regions of low mappability and low sequence complexity, indels, and sites with unusually high or low coverage after correcting for local GC content, as described in Supplementary section 5b of ref. 62. We calculated error bars for our heterozygosity estimates via a weighted jackknife, following the formulas provided here (https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/wjack.pdf).
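
A minimal sketch of this calculation is given below. It computes heterozygosity as heterozygous sites over total sites and derives a standard error with a delete-one-block weighted jackknife, in which blocks are weighted by the number of sites they contain. Treating each chromosome as one block is our assumption, since the text does not state the block definition, and the counts in the example are invented.

```python
import math


def weighted_block_jackknife(het_counts, site_counts):
    """Heterozygosity with a weighted delete-one-block jackknife.

    het_counts[j]  -- heterozygous sites in block j
    site_counts[j] -- total callable sites in block j (the block weight)
    Returns (estimate, jackknife estimate, standard error).
    """
    g = len(site_counts)
    if g < 2 or len(het_counts) != g or min(site_counts) <= 0:
        raise ValueError("need at least two blocks with positive site counts")
    n = sum(site_counts)
    total_het = sum(het_counts)
    theta = total_het / n
    # Estimate recomputed with block j removed.
    leave_out = [(total_het - h) / (n - m) for h, m in zip(het_counts, site_counts)]
    theta_jack = g * theta - sum(
        (1.0 - m / n) * t for m, t in zip(site_counts, leave_out)
    )
    variance = 0.0
    for m, t in zip(site_counts, leave_out):
        h_j = n / m
        pseudo = h_j * theta - (h_j - 1.0) * t  # pseudo-value for block j
        variance += (pseudo - theta_jack) ** 2 / (h_j - 1.0)
    variance /= g
    return theta, theta_jack, math.sqrt(variance)


if __name__ == "__main__":
    # Invented per-block counts, not data from this study.
    het = [12, 18, 30]
    sites = [100, 200, 300]
    print(weighted_block_jackknife(het, sites))
```

In practice the blocks would be the 22 autosomes or fixed-size genomic windows, and the same routine would be run once on the unfiltered sites and once on the sites that pass the filters.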

Acknowledgments

We thank Alexander Peltzer for his help with the bioinformatic processing of shallow sequencing data. We thank the Multimedia department at the Max Planck Institute for Evolutionary Anthropology, Leipzig, and Michelle O'Reilly for help with the geographic map. We thank Dan Ju for discussion on the polygenic risk score of skin pigmentation. V.J.S. was supported by the University of Zurich’s University Research Priority Program “Evolution in Action: From Genomes to Ecosystems.”

Author contributions

J.K. and A.Z. conceived the study. J.K., S.S., and A.Z. supervised the study. A.Z., F.M., and V.C. provided archaeological material and advised on the material background and interpretation. V.J.S. and B.K.-K. performed laboratory work. K.W., K.P., A.C., and S.S. analyzed data. K.W. and J.K. wrote the manuscript with input from all co-authors.

Declaration of interests

The authors declare no competing interests.

Published: August 16, 2023

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.xgen.2023.100377.

Contributor Information

Albert Zink, Email: albert.zink@eurac.edu.

Johannes Krause, Email: krause@eva.mpg.de.

Supplemental information

Document S1. Figures S1–S3 and Tables S2, S6–S8, S10, and S11
mmc1.pdf (271.7KB, pdf)
Table S1. Sequencing details of the high-coverage Iceman genome, related to STAR Methods
mmc2.xlsx (12.9KB, xlsx)
Table S3. List of modern-day and ancient populations analyzed in this study, related to Figures 1, 2, and 3
mmc3.xlsx (14.4KB, xlsx)
Table S4. qpAdm modeling results of two-way admixture between hunter-gatherer-related and early Neolithic-farmer-related ancestry, related to Figure 3 and STAR Methods

We use WHG and Anatolia_N as the ultimate ancestry source, and use Loschbour and Germany_EN_LBK_Stuttgart.DG as the proximal ancestry source.

mmc4.xlsx (13.4KB, xlsx)
Table S5. Comparison of 3-way modeling results of the high-coverage Iceman genome and previously published Iceman genome, related to Figure 3 and STAR Methods

We tested three outgroup lists here: the default outgroup list used in this study and the ‘O9’ and ‘O9N’ outgroup panels used in Haak et al. (ref. 17).

mmc5.xlsx (13KB, xlsx)
Table S9. The whole list of phenotype-related SNPs we examined in this study, related to STAR Methods

Here we report the list of 147 SNPs examined for various phenotypes and the list of 170 UK Biobank SNPs used for calculating the polygenic risk score of skin pigmentation, with the corresponding genotypes in the Iceman. We also report a comparison of genotype calls in the new and old Iceman genomes based on the list of phenotype-related SNPs in Keller et al. (ref. 3).

mmc6.xlsx (46.1KB, xlsx)
Document S2. Article plus supplemental information
mmc7.pdf (4.3MB, pdf)

Data and code availability

Sequencing data are available at the European Nucleotide Archive (ENA) under ENA: PRJEB56570.

References

  • 1. Bonani G., Ivy S.D., Hajdas I., Niklaus T.R., Suter M. AMS 14C age determinations of tissue, bone and grass samples from the Ötztal Ice Man. Radiocarbon. 1994;36:247–250.
  • 2. Müller W., Fricke H., Halliday A.N., McCulloch M.T., Wartho J.-A. Origin and migration of the Alpine Iceman. Science. 2003;302:862–866. doi: 10.1126/science.1089837.
  • 3. Keller A., Graefen A., Ball M., Matzas M., Boisguerin V., Maixner F., Leidinger P., Backes C., Khairat R., Forster M., et al. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nat. Commun. 2012;3:698. doi: 10.1038/ncomms1701.
  • 4. Lazaridis I., Nadel D., Rollefson G., Merrett D.C., Rohland N., Mallick S., Fernandes D., Novak M., Gamarra B., Sirak K., et al. Genomic insights into the origin of farming in the ancient Near East. Nature. 2016;536:419–424. doi: 10.1038/nature19310.
  • 5. Mathieson I., Lazaridis I., Rohland N., Mallick S., Patterson N., Roodenberg S.A., Harney E., Stewardson K., Fernandes D., Novak M., et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature. 2015;528:499–503. doi: 10.1038/nature16152.
  • 6. Fernandes D.M., Mittnik A., Olalde I., Lazaridis I., Cheronet O., Rohland N., Mallick S., Bernardos R., Broomandkhoshbacht N., Carlsson J., et al. The spread of steppe and Iranian-related ancestry in the islands of the western Mediterranean. Nat. Ecol. Evol. 2020;4:334–345. doi: 10.1038/s41559-020-1102-0.
  • 7. Marcus J.H., Posth C., Ringbauer H., Lai L., Skeates R., Sidore C., Beckett J., Furtwängler A., Olivieri A., Chiang C.W.K., et al. Genetic history from the Middle Neolithic to present on the Mediterranean island of Sardinia. Nat. Commun. 2020;11:939. doi: 10.1038/s41467-020-14523-6.
  • 8. Raveane A., Aneli S., Montinaro F., Athanasiadis G., Barlera S., Birolo G., Boncoraglio G., Di Blasio A.M., Di Gaetano C., Pagani L., et al. Population structure of modern-day Italians reveals patterns of ancient and archaic ancestries in Southern Europe. Sci. Adv. 2019;5 doi: 10.1126/sciadv.aaw3492.
  • 9. Liu L., Li Y., Li S., Hu N., He Y., Pong R., Lin D., Lu L., Law M. Comparison of next-generation sequencing systems. J. Biomed. Biotechnol. 2012;2012:251364. doi: 10.1155/2012/251364.
  • 10. Rohland N., Harney E., Mallick S., Nordenfelt S., Reich D. Partial uracil-DNA-glycosylase treatment for screening of ancient DNA. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2015;370:20130624. doi: 10.1098/rstb.2013.0624.
  • 11. Peltzer A., Jäger G., Herbig A., Seitz A., Kniep C., Krause J., Nieselt K. EAGER: efficient ancient genome reconstruction. Genome Biol. 2016;17:60. doi: 10.1186/s13059-016-0918-z.
  • 12. Korneliussen T.S., Albrechtsen A., Nielsen R. ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinf. 2014;15:356. doi: 10.1186/s12859-014-0356-4.
  • 13. Lazaridis I., Patterson N., Mittnik A., Renaud G., Mallick S., Kirsanow K., Sudmant P.H., Schraiber J.G., Castellano S., Lipson M., et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature. 2014;513:409–413. doi: 10.1038/nature13673.
  • 14. Hofmanová Z., Kreutzer S., Hellenthal G., Sell C., Diekmann Y., Díez-Del-Molino D., van Dorp L., López S., Kousathanas A., Link V., et al. Early farmers from across Europe directly descended from Neolithic Aegeans. Proc. Natl. Acad. Sci. USA. 2016;113:6886–6891. doi: 10.1073/pnas.1523951113.
  • 15. Mathieson I., Alpaslan-Roodenberg S., Posth C., Szécsényi-Nagy A., Rohland N., Mallick S., Olalde I., Broomandkhoshbacht N., Candilio F., Cheronet O., et al. The genomic history of southeastern Europe. Nature. 2018;555:197–203. doi: 10.1038/nature25778.
  • 16. Lipson M., Szécsényi-Nagy A., Mallick S., Pósa A., Stégmár B., Keerl V., Rohland N., Stewardson K., Ferry M., Michel M., et al. Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature. 2017;551:368–372. doi: 10.1038/nature24476.
  • 17. Haak W., Lazaridis I., Patterson N., Rohland N., Mallick S., Llamas B., Brandt G., Nordenfelt S., Harney E., Stewardson K., et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature. 2015;522:207–211. doi: 10.1038/nature14317.
  • 18. Olalde I., Mallick S., Patterson N., Rohland N., Villalba-Mouco V., Silva M., Dulias K., Edwards C.J., Gandini F., Pala M., et al. The genomic history of the Iberian Peninsula over the past 8000 years. Science. 2019;363:1230–1234. doi: 10.1126/science.aav4040.
  • 19. Rivollat M., Jeong C., Schiffels S., Küçükkalıpçı İ., Pemonge M.-H., Rohrlach A.B., Alt K.W., Binder D., Friederich S., Ghesquière E., et al. Ancient genome-wide DNA from France highlights the complexity of interactions between Mesolithic hunter-gatherers and Neolithic farmers. Sci. Adv. 2020;6 doi: 10.1126/sciadv.aaz5344.
  • 20. Antonio M.L., Gao Z., Moots H.M., Lucci M., Candilio F., Sawyer S., Oberreiter V., Calderon D., Devitofranceschi K., Aikens R.C., et al. Ancient Rome: A genetic crossroads of Europe and the Mediterranean. Science. 2019;366:708–714. doi: 10.1126/science.aay6826.
  • 21. Saupe T., Montinaro F., Scaggion C., Carrara N., Kivisild T., D’Atanasio E., Hui R., Solnik A., Lebrasseur O., Larson G., et al. Ancient genomes reveal structural shifts after the arrival of Steppe-related ancestry in the Italian Peninsula. Curr. Biol. 2021;31:2576–2591.e12. doi: 10.1016/j.cub.2021.04.022.
  • 22. Narasimhan V.M., Patterson N., Moorjani P., Rohland N., Bernardos R., Mallick S., Lazaridis I., Nakatsuka N., Olalde I., Lipson M., et al. The formation of human populations in South and Central Asia. Science. 2019;365 doi: 10.1126/science.aat7487.
  • 23. Fenner J.N. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am. J. Phys. Anthropol. 2005;128:415–423. doi: 10.1002/ajpa.20188.
  • 24. Schiffels S., Wang K. In: Statistical Population Genomics. Dutheil J.Y., editor. Springer US; 2020. MSMC and MSMC2: The Multiple Sequentially Markovian Coalescent; pp. 147–166.
  • 25. Marchi N., Winkelbach L., Schulz I., Brami M., Hofmanová Z., Blöcher J., Reyna-Blanco C.S., Diekmann Y., Thiéry A., Kapopoulou A., et al. The genomic origins of the world’s first farmers. Cell. 2022;185:1842–1859.e18. doi: 10.1016/j.cell.2022.04.008.
  • 26. Mathieson S., Mathieson I. FADS1 and the Timing of Human Adaptation to Agriculture. Mol. Biol. Evol. 2018;35:2957–2970. doi: 10.1093/molbev/msy180.
  • 27. Ju D., Mathieson I. The evolution of skin pigmentation-associated variation in West Eurasia. Proc. Natl. Acad. Sci. USA. 2021;118 doi: 10.1073/pnas.2009227118.
  • 28. Lab N. 2018. UK Biobank GWAS. www.nealelab.is/uk-biobank
  • 29. Liu F., Visser M., Duffy D.L., Hysi P.G., Jacobs L.C., Lao O., Zhong K., Walsh S., Chaitanya L., Wollstein A., et al. Genetics of skin color variation in Europeans: genome-wide association studies with functional follow-up. Hum. Genet. 2015;134:823–835. doi: 10.1007/s00439-015-1559-0.
  • 30. Maples B.K., Gravel S., Kenny E.E., Bustamante C.D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 2013;93:278–288. doi: 10.1016/j.ajhg.2013.06.020.
  • 31. Lipson M., Reich D. A Working Model of the Deep Relationships of Diverse Modern Human Genetic Lineages Outside of Africa. Mol. Biol. Evol. 2017;34:889–902. doi: 10.1093/molbev/msw293.
  • 32. Pabst M.A., Letofsky-Papst I., Bock E., Moser M., Dorfer L., Egarter-Vigl E., Hofer F. The tattoos of the Tyrolean Iceman: a light microscopical, ultrastructural and element analytical study. J. Archaeol. Sci. 2009;36:2335–2341.
  • 33. Spindler K. 2000. Der Mann im Eis. Neue sensationelle Erkenntnisse über die Mumie in den Ötztaler Alpen (Goldmann)
  • 34. Wilfing H., Teschler-Nicola M., Seidler H., Weber G., Schlagenhaufen C., Irgolic K.J., Gössler W., Platzer W., Spindler K., Notdurfther H., et al. Untersuchungen an Haarresten des Mannes vom Hauslabjoch. Naturwiss. Rundsch. 1993;46:257–260.
  • 35. Martin A.R., Lin M., Granka J.M., Myrick J.W., Liu X., Sockell A., Atkinson E.G., Werely C.J., Möller M., Sandhu M.S., et al. An Unexpectedly Complex Architecture for Skin Pigmentation in Africans. Cell. 2017;171:1340–1353.e14. doi: 10.1016/j.cell.2017.11.015.
  • 36. Mostafavi H., Harpak A., Agarwal I., Conley D., Pritchard J.K., Przeworski M. Variable prediction accuracy of polygenic scores within an ancestry group. Elife. 2020;9 doi: 10.7554/eLife.48376.
  • 37. Pirastu N., Joshi P.K., de Vries P.S., Cornelis M.C., McKeigue P.M., Keum N., Franceschini N., Colombo M., Giovannucci E.L., Spiliopoulou A., et al. GWAS for male-pattern baldness identifies 71 susceptibility loci explaining 38% of the risk. Nat. Commun. 2017;8:1584. doi: 10.1038/s41467-017-01490-8.
  • 38. Carlson M.O., Rice D.P., Berg J.J., Steinrücken M. Polygenic score accuracy in ancient samples: Quantifying the effects of allelic turnover. PLoS Genet. 2022;18 doi: 10.1371/journal.pgen.1010170.
  • 39. Yang J., Manolio T.A., Pasquale L.R., Boerwinkle E., Caporaso N., Cunningham J.M., de Andrade M., Feenstra B., Feingold E., Hayes M.G., et al. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 2011;43:519–525. doi: 10.1038/ng.823.
  • 40. Meyer M., Kircher M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 2010;2010 doi: 10.1101/pdb.prot5448.
  • 41. Kircher M., Sawyer S., Meyer M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 2012;40 doi: 10.1093/nar/gkr771.
  • 42. Schubert M., Lindgreen S., Orlando L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes. 2016;9:88. doi: 10.1186/s13104-016-1900-2.
  • 43. Renaud G., Slon V., Duggan A.T., Kelso J. Schmutzi: estimation of contamination and endogenous mitochondrial consensus calling for ancient DNA. Genome Biol. 2015;16:224. doi: 10.1186/s13059-015-0776-0.
  • 44. Patterson N., Price A.L., Reich D. Population structure and eigenanalysis. PLoS Genet. 2006;2 doi: 10.1371/journal.pgen.0020190.
  • 45. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  • 46. Prüfer K. snpAD: an ancient DNA genotype caller. Bioinformatics. 2018;34:4165–4171. doi: 10.1093/bioinformatics/bty507.
  • 47. Patterson N., Moorjani P., Luo Y., Mallick S., Rohland N., Zhan Y., Genschoreck T., Webster T., Reich D. Ancient admixture in human history. Genetics. 2012;192:1065–1093. doi: 10.1534/genetics.112.145037.
  • 48. Jeong C., Balanovsky O., Lukianova E., Kahbatkyzy N., Flegontov P., Zaporozhchenko V., Immel A., Wang C.-C., Ixan O., Khussainova E., et al. The genetic history of admixture across inner Eurasia. Nat. Ecol. Evol. 2019;3:966–976. doi: 10.1038/s41559-019-0878-2.
  • 49. Dabney J., Knapp M., Glocke I., Gansauge M.-T., Weihmann A., Nickel B., Valdiosera C., García N., Pääbo S., Arsuaga J.-L., Meyer M. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl. Acad. Sci. USA. 2013;110:15758–15763. doi: 10.1073/pnas.1314445110.
  • 50. Fu Q., Mittnik A., Johnson P.L.F., Bos K., Lari M., Bollongino R., Sun C., Giemsch L., Schmitz R., Burger J., et al. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 2013;23:553–559. doi: 10.1016/j.cub.2013.02.044.
  • 51. Biagini S.A., Solé-Morata N., Matisoo-Smith E., Zalloua P., Comas D., Calafell F. People from Ibiza: an unexpected isolate in the Western Mediterranean. Eur. J. Hum. Genet. 2019;27:941–951. doi: 10.1038/s41431-019-0361-1.
  • 52. Mallick S., Li H., Lipson M., Mathieson I., Gymrek M., Racimo F., Zhao M., Chennagiri N., Nordenfelt S., Tandon A., et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature. 2016;538:201–206. doi: 10.1038/nature18964.
  • 53. Lazaridis I., Alpaslan-Roodenberg S., Acar A., Açıkkol A., Agelarakis A., Aghikyan L., Akyüz U., Andreeva D., Andrijašević G., Antonović D., et al. Ancient DNA from Mesopotamia suggests distinct Pre-Pottery and Pottery Neolithic migrations into Anatolia. Science. 2022;377:982–987. doi: 10.1126/science.abq0762.
  • 54. Olalde I., Brace S., Allentoft M.E., Armit I., Kristiansen K., Booth T., Rohland N., Mallick S., Szécsényi-Nagy A., Mittnik A., et al. The Beaker phenomenon and the genomic transformation of northwest Europe. Nature. 2018;555:190–196. doi: 10.1038/nature25738.
  • 55. Villalba-Mouco V., van de Loosdrecht M.S., Posth C., Mora R., Martínez-Moreno J., Rojo-Guerra M., Salazar-García D.C., Royo-Guillén J.I., Kunst M., Rougier H., et al. Survival of late Pleistocene Hunter-gatherer ancestry in the Iberian Peninsula. Curr. Biol. 2019;29:1169–1177.e7. doi: 10.1016/j.cub.2019.02.006.
  • 56. Gallego Llorente M., Jones E.R., Eriksson A., Siska V., Arthur K.W., Arthur J.W., Curtis M.C., Stock J.T., Coltorti M., Pieruccini P., et al. Ancient Ethiopian genome reveals extensive Eurasian admixture in Eastern Africa. Science. 2015;350:820–822. doi: 10.1126/science.aad2879.
  • 57. Fu Q., Li H., Moorjani P., Jay F., Slepchenko S.M., Bondarev A.A., Johnson P.L.F., Aximu-Petri A., Prüfer K., de Filippo C., et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature. 2014;514:445–449. doi: 10.1038/nature13810.
  • 58. Fu Q., Posth C., Hajdinjak M., Petr M., Mallick S., Fernandes D., Furtwängler A., Haak W., Meyer M., Mittnik A., et al. The genetic history of Ice Age Europe. Nature. 2016;534:200–205. doi: 10.1038/nature17993.
  • 59. Raghavan M., Skoglund P., Graf K.E., Metspalu M., Albrechtsen A., Moltke I., Rasmussen S., Stafford T.W., Jr., Orlando L., Metspalu E., et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature. 2014;505:87–91. doi: 10.1038/nature12736.
  • 60. Jones E.R., Gonzalez-Fortes G., Connell S., Siska V., Eriksson A., Martiniano R., McLaughlin R.L., Gallego Llorente M., Cassidy L.M., Gamba C., et al. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nat. Commun. 2015;6:8912. doi: 10.1038/ncomms9912.
  • 61. Rubinacci S., Ribeiro D.M., Hofmeister R.J., Delaneau O. Efficient phasing and imputation of low-coverage sequencing data using large reference panels. Nat. Genet. 2021;53:120–126. doi: 10.1038/s41588-020-00756-0.
  • 62. Prüfer K., de Filippo C., Grote S., Mafessoni F., Korlević P., Hajdinjak M., Vernot B., Skov L., Hsieh P., Peyrégne S., et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science. 2017;358:655–658. doi: 10.1126/science.aao1887.
  • 63. Schiffels S., Durbin R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 2014;46:919–925. doi: 10.1038/ng.3015.
  • 64. Prüfer K., Racimo F., Patterson N., Jay F., Sankararaman S., Sawyer S., Heinze A., Renaud G., Sudmant P.H., de Filippo C., et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 2014;505:43–49. doi: 10.1038/nature12886.



Articles from Cell Genomics are provided here courtesy of Elsevier
