Proceedings of the National Academy of Sciences of the United States of America. 2020 Dec 21;118(1):e2022112118. doi: 10.1073/pnas.2022112118

Ancient DNA from Guam and the peopling of the Pacific

Irina Pugach a, Alexander Hübner a,b, Hsiao-chun Hung c, Matthias Meyer a, Mike T Carson d,1, Mark Stoneking a,1
PMCID: PMC7817125  PMID: 33443177

Significance

We know more about the settlement of Polynesia than we do about the settlement of the Mariana Islands in the western Pacific. There is debate over where people came from to get to the Marianas, with various lines of evidence pointing to the Philippines, Indonesia, New Guinea, or the Bismarck Archipelago, and over how the ancestors of the present Mariana Islanders, the Chamorro, might be related to Polynesians. We analyzed ancient DNA from Guam from two skeletons dating to ∼2,200 y ago and found that their ancestry is linked to the Philippines. Moreover, they are closely related to early Lapita skeletons from Vanuatu and Tonga, suggesting that the early Mariana Islanders may have been involved in the colonization of Polynesia.

Keywords: ancient DNA, Micronesia, Polynesia, human settlement

Abstract

Humans reached the Mariana Islands in the western Pacific by ∼3,500 y ago, contemporaneous with or even earlier than the initial peopling of Polynesia. They crossed more than 2,000 km of open ocean to get there, whereas voyages of similar length did not occur anywhere else until more than 2,000 y later. Yet, the settlement of Polynesia has received far more attention than the settlement of the Marianas. There is uncertainty over both the origin of the first colonizers of the Marianas (with different lines of evidence suggesting variously the Philippines, Indonesia, New Guinea, or the Bismarck Archipelago) as well as what, if any, relationship they might have had with the first colonizers of Polynesia. To address these questions, we obtained ancient DNA data from two skeletons from the Ritidian Beach Cave Site in northern Guam, dating to ∼2,200 y ago. Analyses of complete mitochondrial DNA genome sequences and genome-wide SNP data strongly support ancestry from the Philippines, in agreement with some interpretations of the linguistic and archaeological evidence, but in contradiction to results based on computer simulations of sea voyaging. We also find a close link between the ancient Guam skeletons and early Lapita individuals from Vanuatu and Tonga, suggesting that the Marianas and Polynesia were colonized from the same source population, and raising the possibility that the Marianas played a role in the eventual settlement of Polynesia.


Many books have been written about where the Polynesians came from but nobody cares a straw about where the Guamanians came from. And yet it is probable that they can tell at least as much about the peopling of the Pacific as can the Polynesians.

–William Howells, The Pacific Islanders (1)

The human settlement of the Mariana Islands, in western Micronesia, was in some respects more remarkable than the settlement of Polynesia. And yet, as noted in the quote above and by others (2), the settlement of Polynesia has received far more attention than that of the Mariana Islands. Consisting of 15 islands (of which Guam is the largest and southernmost) stretching across some 750 km of sea, the Marianas archipelago is located ∼2,500 km east of the Philippines and ∼2,200 km north of New Guinea (Fig. 1). The earliest archaeological sites date to around 3.5 thousand y ago (kya) (3), and paleoenvironmental evidence suggests even older occupation, starting around 4.3 kya (4). Thus, the first human presence in the Marianas was at least contemporaneous with, and possibly even earlier than, the earliest Lapita sites in Island Melanesia and western Polynesia that date to after 3.3 kya (5) and are associated with the ancestors of Polynesians. However, reaching the Marianas necessitated crossing more than 2,000 km of open ocean, whereas voyages of similar length were not accomplished by Polynesian ancestors until they ventured into eastern Polynesia within the past 1,000 y (2, 6).

Fig. 1.

Map of the western Pacific, showing locations and areas mentioned in the text. The Inset shows the location of the Ritidian Site on Guam. Location names in red have been suggested as potential sources for the settlement of the Mariana Islands. Wallace’s Line divides biogeographic regions and lies at the boundary of the prehistoric continental landmasses of Sunda and Sahul. The dashed blue line indicates the boundary between Near and Remote Oceania: The islands of Near Oceania were colonized beginning 45 to 50 kya and involved relatively short, intervisible water crossings, while the islands of Remote Oceania required substantial water crossings that were not intervisible and that were not achieved until ∼3.5 kya or later. Red dots indicate the locations of the early Lapita samples from Vanuatu and Tonga; the blue arrow indicates the conventional route for the Austronesian expansion to the Bismarck Archipelago, which was then the source of initial voyages to Remote Oceania; the solid red arrow indicates the route for the settlement of the Marianas supported by this study; and the dashed red arrow indicates the potential additional contribution of Mariana Islanders to further settlement of the Pacific, suggested by this study.

Where these intrepid voyagers originated, and how they relate to Polynesians, are open questions. Mariana Islanders are unusual in many respects when compared with other Micronesians and Polynesians. Chamorro, the indigenous language of Guam, is classified as a Western Malayo-Polynesian language within the Austronesian language family, along with the languages of western Indonesia (the islands west of Wallace’s Line) (Fig. 1), Sulawesi, and the Philippines. Palauan, another indigenous language of western Micronesia, is also a Western Malayo-Polynesian language, whereas all other Micronesian and all Polynesian languages belong to the Oceanic subgroup of Eastern Malayo-Polynesian (7). The most definitive features of Lapita pottery, associated with the earliest presence of Austronesians in Island Melanesia and western Polynesia (8), are absent in the Marianas, as are the domestic animals, such as pigs, dogs, and chickens typically associated with Lapita sites and Polynesian settlement (9). Moreover, rice cultivation seems to have been present as an indigenous tradition in the Marianas (10), but so far no such evidence has been found elsewhere in Remote Oceania.

These linguistic and cultural differences have led most scholars to conclude that the settlement of western Micronesia and Polynesia had little to do with one another. To be sure, indications have been noted of morphological (1), cranial (11), and genetic (12–16) affinities between Micronesians and Polynesians [see also Addison and Matisoo-Smith (17)], and stylistic links between the pottery of the Philippines, the Marianas, and the Lapita region have also been illustrated (18). Nonetheless, the standard narrative for Polynesian origins (Fig. 1) is that they reflect a movement of Austronesian-speaking people from Taiwan beginning 4.5 to 4 kya that island-hopped through the Philippines and southeastward through Indonesia, reaching the Bismarck Archipelago around 3.5 to 3.3 kya. From there they spread into western Polynesia, with subsequent additional migrations from Near Oceania around 2.5 kya that brought more Papuan-related ancestry that ultimately spread throughout Polynesia. This narrative is supported by a large body of archaeological, linguistic, and genetic data (8, 19–27), and western Micronesia typically does not figure in this orthodox story.

Compared to Polynesians, the origins of the Mariana islanders are more uncertain. Most mitochondrial DNA (mtDNA) sequences of modern Chamorros belong to haplogroup E, which occurs across Island Southeast Asia and is thought to be associated with the initial peopling of the Marianas, while the less-frequent haplogroup B4 sequences, which are found in high frequency in Polynesians, are attributed to later contact (28). Studies of a limited number of autosomal short-tandem repeat loci similarly indicate differences in the affinities of western Micronesians (Palau and the Marianas) vs. eastern Micronesians, with the former showing ties to Southeast Asia and the latter to Polynesia (12, 15). The linguistic evidence for Chamorro would suggest an origin from Sulawesi in Indonesia (29) or directly from the central or northern Philippines (2, 21), and the oldest decorated pottery and other artifacts of the Marianas, dating to around 3.5 kya, have been matched with counterparts in the Philippines at around the same time or even earlier (30). However, alternative views have been proposed and debated (31, 32), and it is not clear to what extent the genetic and linguistic relationships of the contemporary Chamorro reflect initial settlement vs. later contact. Moreover, computer simulations of sea voyaging found no instances of successful voyaging from the Philippines or western Indonesia across to the Marianas; instead, these simulations pointed to New Guinea and the Bismarck Archipelago as the most likely starting points (33, 34).

Genomic evidence can shed light on this debate over the origin of the Chamorro, as well as on their relationships with Polynesians. Two main genetic ancestries are present in New Guinea and the Bismarck Archipelago: The aforementioned Austronesian (Malayo-Polynesian), which arrived with the spread of Austronesian speakers from Taiwan, and “Papuan,” which is a general term for the non-Austronesian ancestry that was present in New Guinea and Island Melanesia prior to the arrival of the Austronesians; it should be kept in mind that “Papuan” ancestry is quite heterogeneous in composition across the region (35–37). Papuan-related ancestry probably traces back to the original human populations of the region, at least 49 kya (38), and is readily distinguished from Austronesian ancestry. Papuan-related ancestry is present not only in New Guinea and the Bismarck Archipelago, but also at substantial frequencies in eastern Indonesia (39–41), defined here as all Indonesian islands to the east of Wallace’s Line (Fig. 1). However, Papuan-related ancestry is practically absent west of Wallace’s Line (Fig. 1), so if the first settlers of the Marianas started from the Philippines or west of Wallace’s Line, then they should have had little if any Papuan-related ancestry. Conversely, if they started from eastern Indonesia, New Guinea, or the Bismarck Archipelago, then they should have brought appreciable amounts of Papuan-related ancestry.

In principle, to address this issue, the ancestry of the modern inhabitants of the Marianas could be analyzed for Papuan-related ancestry. However, a common finding of ancient DNA studies is that the ancestry of people in a region today may not reflect the ancestry of people living in that region thousands of years ago (42). In particular for the Marianas, the archaeological evidence indicates substantial cultural change around 1 kya (28, 43), coinciding with the construction of stone-pillar houses in formal village arrangements (latte) at a time when nearly all of the Pacific Islands were populated and connected by long-distance sea voyaging (8). The presence of mtDNA haplogroup B4 sequences in modern Chamorro has been attributed to contact during the latte period (28).

In addition, population contacts and movements became more complicated during the European colonial period, starting with the arrival of Magellan in 1521 in the Marianas and continuing with the Manila-Acapulco galleons (and slave trade) from 1565 to 1815; Guam was a regular stopover on these voyages. European colonialism also involved multiple relocations and reductions in population size across the archipelago. These events undoubtedly had an impact on the genetic ancestry of the modern Chamorros, making it more difficult to assess their origins and potential relationships with Polynesians. It would therefore be preferable to address these issues with ancient DNA from the Marianas.

At the Ritidian Site in northern Guam (SI Appendix, Fig. S1), two skeletons clearly predating the latte period were found outside a ritual cave site (44). These individuals, RBC1 and RBC2, were buried side-by-side in extended positions, with heads and torsos removed (SI Appendix, Fig. S2). Direct radiocarbon dating of a bone from RBC2 produced a result of 2,180 ± 30 calibrated years BP (44), which is thus some 1,400 y after the initial settlement of Guam, but also some 1,000 y before the latte period. Here we report the analysis of ancient DNA retrieved from these remains; our results contribute to the debate over the starting point for the first voyages that led to human settlement of the Marianas, and we provide additional insights into the role of the Marianas in the larger view of the peopling of the Pacific.

Results

Shotgun sequencing of libraries constructed from DNA extracted from the ancient Guam skeletons revealed elevated C→T substitutions at the ends of fragments, as expected for ancient DNA (SI Appendix, Fig. S3 and Dataset S1). The percent endogenous DNA was too low for further shotgun sequencing (Dataset S1); we therefore proceeded by capture enrichment for the mtDNA genome, and for a panel of 1.2 million SNPs used in previous ancient DNA studies (24–26, 45, 46), prior to sequencing.

mtDNA and Y Chromosome.

After merging the sequence data from libraries enriched for mtDNA while excluding those that were highly contaminated (Dataset S2), we were able to obtain mtDNA genome sequences at an average coverage of 95.2-fold for RBC1 and 261.3-fold for RBC2. Estimated contamination in the mtDNA sequences, using a likelihood-based approach (47) subsequently referred to as contamMix (48), was 17.9% for RBC1 and 6.6% for RBC2 (Dataset S3). The sequences are identical where they overlap, and even with this relatively high level of contamination, both sequences are confidently assigned to haplogroup E2a (Dataset S3). In addition to the diagnostic mutations for haplogroup E2a, both sequences carry a novel-derived substitution at position 8981, which results in an amino acid substitution (Gln → Arg) in the ATP6 gene.

Haplogroup E2a is the most common haplogroup in the modern Chamorro population of Guam (28), with a frequency of 65%. Elsewhere it is reported to occur sporadically in populations from the Philippines and Indonesia (49–52), and in a single individual from the Solomon Islands (53); otherwise, it is absent from Oceania and has not been reported from Mainland Southeast Asia. The finding of this haplogroup in the ancient Guam skeletons thus suggests links to the Philippines and Indonesia, rather than New Guinea or the Bismarck Archipelago. Of additional importance, the high frequency of this haplogroup in modern Chamorros suggests a degree of genetic continuity with the population represented by the ancient skeletons, persisting through the interceding cross-population contacts since the latte period after ∼1 kya and later European colonial events.

Based on the ratio of the average coverage of X chromosome vs. X chromosome + autosomal reads in the shotgun sequencing data, RBC1 is male and RBC2 is female (SI Appendix, Fig. S4). The Y chromosome of RBC1 is assigned to haplogroup O2a2 (formerly haplogroup O3a3), based on having the derived allele for the diagnostic marker P201 (54); genotypes at all other informative Y-chromosome SNPs for which there are data from RBC1 are consistent with this haplogroup (Dataset S4). Haplogroup O2a2-P201 is widespread across Mainland and Island Southeast Asia and Oceania, and has been associated with the Austronesian expansion (55, 56).

Genome-Wide SNP Data: Ancient Guam Origins.

We enriched 13 sequencing libraries from RBC1 and 11 from RBC2 for ∼1.2 million SNPs (Dataset S5) and obtained data (Dataset S6) for 128,772 SNPs (39,760 in deaminated reads) for RBC1 and 361,982 SNPs (143,451 in deaminated reads) for RBC2. Given the relatively high contamination estimates for some of the libraries (Dataset S5), we either redid analyses using only deaminated reads (if there were enough deaminated reads), or included data from Europeans in the analysis, to ensure that contamination with modern European DNA was not influencing the results. The results reported below are based on all reads, as we did not find any indication of contamination influencing the results.

We first checked if RBC1 and RBC2 might be related by calculating the fraction of pairwise differences for the 33,400 overlapping SNPs between them, and comparing this to mean pairwise distances for first-, second-, and third-degree relatives in the 1000 Genomes Project dataset (57), using sites on the Human Origins Array (SI Appendix, Fig. S5A). While the pairwise distance between RBC1 and RBC2 is similar to that for first-degree relatives in the 1000 Genomes dataset, suggesting that they might be first-degree relatives, we obtained similar mean pairwise distances for other ancient samples from Southeast Asia and Oceania (SI Appendix, Fig. S5B). We therefore conclude that the limited amount of data and low overall genetic diversity characteristic of the ancient samples preclude accurate assessment of relatedness.
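The pairwise-difference comparison described above can be illustrated with a minimal sketch. It assumes pseudo-haploid genotype calls (one allele per individual per site, coded 0 or 1, with None for missing data), which is a common convention for low-coverage ancient DNA but is not stated in the text; the function name and the toy genotypes are hypothetical and are not the data or code used in the study.

```python
from typing import Optional, Sequence, Tuple


def pairwise_mismatch_rate(
    ind_a: Sequence[Optional[int]],
    ind_b: Sequence[Optional[int]],
) -> Tuple[float, int]:
    """Return (fraction of differing sites, number of overlapping sites).

    Only sites with a call in both individuals are counted.
    """
    if len(ind_a) != len(ind_b):
        raise ValueError("genotype vectors must cover the same SNP panel")
    overlap = 0
    mismatches = 0
    for call_a, call_b in zip(ind_a, ind_b):
        if call_a is None or call_b is None:
            continue  # site missing in at least one individual
        overlap += 1
        if call_a != call_b:
            mismatches += 1
    if overlap == 0:
        raise ValueError("no overlapping sites")
    return mismatches / overlap, overlap


# Toy pseudo-haploid calls for two individuals (hypothetical values).
toy_ind1 = [0, 1, None, 1, 0, 1]
toy_ind2 = [0, 0, 1, 1, None, 1]
rate, n_sites = pairwise_mismatch_rate(toy_ind1, toy_ind2)
print(f"{n_sites} overlapping sites, mismatch rate {rate:.2f}")
```

In such a comparison, a rate close to that seen for known first-degree relatives is only informative if unrelated individuals from the same population show clearly higher rates, which is the caveat raised above for populations with low overall diversity.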

We then projected RBC1 and RBC2 onto principal components (PCs) constructed with modern samples genotyped on the Affymetrix 6.0 platform and with data from the Simons Genome Diversity Project (SGDP); the overlap with this array is provided in Dataset S6, and details on the modern samples are in Dataset S7. This dataset has good coverage of populations from Island Southeast Asia, in particular from eastern Indonesia, which exhibit both Asian-related and Papuan-related ancestry and hence are a potential proxy for the ancestry in the ancient Guam samples if they in fact have Papuan-related ancestry. The results for the first two PCs (Fig. 2A) show three axes of variation, with Europe/South Asia, New Guinea, and Southeast Asia at the vertices. The two ancient Guam samples overlap samples from Taiwan and the Philippines. There is no indication of any Papuan-related ancestry in the ancient Guam samples, particularly when compared with eastern Indonesian samples, all of which have some Papuan-related ancestry and hence are clearly separated from other Southeast Asian samples.

Fig. 2.

PCA and ADMIXTURE analyses of the ancient Guam samples merged with modern samples genotyped on the Affymetrix 6.0 platform and with SGDP samples. (A) Plot of the first two PCs. The ancient Guam samples are projected. (B) ADMIXTURE results for K = 6. Population names are color-coded as in the PC plot.

We next carried out ADMIXTURE analysis of the same dataset; while the results for K = 3 are associated with the lowest cross-validation error (SI Appendix, Fig. S6A), the results for K = 6 distinguish different ancestry components for Mainland vs. Island Southeast Asia, so we show these results in Fig. 2B and the results for K = 2 to K = 8 in SI Appendix, Fig. S7. Notably, the yellow ancestry component, which is characteristic of New Guinea and is also present in eastern Indonesia, is completely lacking in the ancient Guam samples for all analyzed values of K (Fig. 2B and SI Appendix, Fig. S7). Moreover, at K = 6 the two ancient Guam samples have the dark blue ancestry component, which is at highest frequency in individuals from the Philippines and Taiwan (Fig. 2B). RBC1 also has a purple component, which likely reflects recent European DNA contamination.

Thus, these PC and ADMIXTURE analyses suggest that there is no Papuan-related ancestry in the ancient Guam samples, and moreover indicate that they are most similar to modern samples from the Philippines and Taiwan. However, the number of SNPs in the Affymetrix 6.0 dataset that overlap the ancient Guam samples is too small for more formal tests of population relationships (Dataset S6), and moreover this dataset has limited coverage of modern Oceanian populations. We therefore carried out all further analyses with the Human Origins dataset, which includes more modern samples from Near and Remote Oceania (Dataset S7), more overlap with the ancient Guam data (Dataset S6), and also includes data from ancient samples from Asia and the Pacific (Dataset S8), including early Lapita samples from Vanuatu and Tonga.

A principal components analysis (PCA) of these samples with the ancient samples projected (Fig. 3A) places the early Lapita samples at one vertex, East Asia at another, and New Guinea at the third vertex; the ancient Guam samples are now projected away from modern Taiwan and Philippine samples, in the direction of the early Lapita samples. An ADMIXTURE analysis of these data for K = 9 (Fig. 3B; results for K = 5 to K = 12 in SI Appendix, Fig. S8), which has the lowest cross-validation error (SI Appendix, Fig. S6B), now reveals two primary ancestry components in the ancient Guam samples: A dark blue component as before that is at highest frequency in Indonesia and the Philippines, and an orange component that is at highest frequency in Polynesia; the additional minor purple component likely reflects recent European contamination. As before, there is no indication from either the PCA or the ADMIXTURE analysis of any Papuan-related ancestry in the ancient Guam samples.

Fig. 3.

PCA and ADMIXTURE analyses of the ancient Guam samples merged with Human Origins Array data for modern and ancient samples. (A) Plot of the first two PCs. Ancient samples are projected. (B) ADMIXTURE results for K = 9. Population names are color-coded as in the PC plot.

While the presence of these two ancestry components in the ancient Guam samples could indicate admixture between a source population related to Indonesia/Philippines and another related to Polynesians, other explanations for the presence of multiple ancestry components are possible (37, 58). In particular, it could be that the ancient Guam samples are ancestrally related to both Indonesia/Philippines and to Polynesians, and that subsequent divergence and genetic drift have facilitated the identification of separate Indonesia/Philippine and Polynesia-related ancestry components in the ADMIXTURE analysis, both of which are present in the ancient Guam samples. To investigate the relationships of the ancient Guam samples in more detail, we analyzed outgroup-f3 and -f4 statistics. The outgroup-f3 analysis, which compares the amount of drift (i.e., ancestry) shared by the ancient Guam samples with other populations relative to an outgroup (Mbuti), shows that the ancient Guam samples share the most drift with the Lapita Vanuatu and Lapita Tonga samples, followed by an ancient sample from the Philippines and then by modern samples from the Philippines and Taiwan and late Neolithic samples from the Taiwan Strait Islands (Fig. 4A). Notably, the drift shared with New Guinea, and with the French, is less than that with any other population, indicating that the ancient Guam samples show the least relatedness with these two populations. These results further support the lack of any Papuan-related ancestry in the ancient Guam samples, and moreover also indicate that recent European contamination is not influencing these results.
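As an illustration of what an outgroup-f3 statistic measures, the following minimal sketch computes the uncorrected form, the average over SNPs of (o − a)(o − b), where o, a, and b are allele frequencies in the outgroup and in the two populations being compared. This is a simplified sketch and not the estimator used in the study: published implementations additionally correct for finite sample size and estimate SEs by block jackknife. The function name and all allele frequencies below are hypothetical.

```python
from typing import Sequence


def outgroup_f3(
    outgroup: Sequence[float],
    pop_a: Sequence[float],
    pop_b: Sequence[float],
) -> float:
    """Uncorrected f3(outgroup; A, B): mean of (o - a) * (o - b) over SNPs.

    Larger values mean A and B share more drift since splitting from
    the outgroup. No sample-size bias correction is applied here.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("frequency vectors must have equal length")
    if len(outgroup) == 0:
        raise ValueError("no SNPs supplied")
    terms = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(terms) / len(terms)


# Hypothetical allele frequencies at four SNPs.
toy_outgroup = [0.1, 0.2, 0.5, 0.9]
toy_target = [0.8, 0.9, 0.5, 0.1]
toy_close = [0.9, 0.8, 0.4, 0.2]    # drifts together with the target
toy_distant = [0.2, 0.1, 0.6, 0.8]  # stays near the outgroup

f3_close = outgroup_f3(toy_outgroup, toy_target, toy_close)
f3_distant = outgroup_f3(toy_outgroup, toy_target, toy_distant)
print(f"f3 with close population:   {f3_close:.3f}")
print(f"f3 with distant population: {f3_distant:.3f}")
```

Ranking populations by this value, as in Fig. 4A, orders them by shared drift with the target relative to the outgroup.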

Fig. 4.

Outgroup-f3 and -f4 results for the relationships of the ancient Guam samples with other populations. (A) Outgroup-f3 results comparing the ancient Guam samples to other modern and ancient samples, with Mbuti used as the outgroup. Bars indicate 1 SE. Larger values of the f3 statistic indicate more shared drift, and hence a closer relationship with the ancient Guam samples. (B) results for an f4 test of the form f4(test, Kankanaey; New Guinea highlands, Mbuti). f4 values that are significantly different from zero are in red.

We then constructed an f4 statistic of the form f4(test, Kankanaey; New Guinea highlands, Mbuti); values of this statistic that are equal to zero indicate that the test population forms a clade with Kankanaey relative to New Guinea; values less than zero indicate that Kankanaey shares more ancestry with New Guinea than does the test population; and values greater than zero indicate that the test population shares more ancestry with New Guinea than does Kankanaey. We used the ancient Guam samples and all other populations from Oceania as the test population; the results (Fig. 4B) indicate that all populations from Oceania tested share ancestry with New Guinea in comparison to Kankanaey, except for the ancient Guam samples. These form a clade with Kankanaey, as the Z-statistic is not significantly different from zero (Z = −1.93).
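The sign convention described above follows from the definition f4(A, B; C, D) = mean over SNPs of (a − b)(c − d). The sketch below computes this quantity together with a Z-score from a delete-one block jackknife. It is a simplified illustration under stated assumptions: blocks are contiguous and of equal size, whereas published implementations typically define blocks by genomic distance and weight them by SNP count; the function names and allele frequencies are hypothetical and are not taken from the study.

```python
import math
from typing import List, Sequence, Tuple


def f4(a: Sequence[float], b: Sequence[float],
       c: Sequence[float], d: Sequence[float]) -> float:
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    terms = [(w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)]
    return sum(terms) / len(terms)


def f4_jackknife(a: Sequence[float], b: Sequence[float],
                 c: Sequence[float], d: Sequence[float],
                 n_blocks: int) -> Tuple[float, float, float]:
    """Return (estimate, SE, Z) using equal-size contiguous blocks."""
    n = len(a)
    if n_blocks < 2 or n % n_blocks != 0:
        raise ValueError("SNP count must divide evenly into >= 2 blocks")
    size = n // n_blocks
    estimate = f4(a, b, c, d)
    leave_one_out: List[float] = []
    for k in range(n_blocks):
        # Indices of all SNPs outside block k.
        keep = [i for i in range(n) if not (k * size <= i < (k + 1) * size)]
        leave_one_out.append(
            f4([a[i] for i in keep], [b[i] for i in keep],
               [c[i] for i in keep], [d[i] for i in keep])
        )
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (t - mean_loo) ** 2 for t in leave_one_out
    )
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Hypothetical allele frequencies at six SNPs (three blocks of two).
toy_test = [0.8, 0.7, 0.3, 0.6, 0.2, 0.9]
toy_kankanaey = [0.6, 0.7, 0.5, 0.4, 0.2, 0.7]
toy_new_guinea = [0.9, 0.1, 0.2, 0.8, 0.5, 0.9]
toy_mbuti = [0.3, 0.2, 0.6, 0.4, 0.5, 0.1]

f4_est, f4_se, f4_z = f4_jackknife(
    toy_test, toy_kankanaey, toy_new_guinea, toy_mbuti, n_blocks=3
)
print(f"f4 = {f4_est:.4f}, SE = {f4_se:.4f}, Z = {f4_z:.2f}")
```

In this toy example the estimate is positive, which under the convention above corresponds to the test population sharing more ancestry with New Guinea than Kankanaey does; a value whose Z-score is not significantly different from zero would instead be consistent with the test population and Kankanaey forming a clade.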

Genome-Wide SNP Data: Relationships with Early Lapita Samples.

The PCA, ADMIXTURE, and outgroup-f3 analyses not only indicate affinities between the ancient Guam samples and Philippine/Taiwan populations, but additionally suggest strong affinities between the ancient Guam and early Lapita samples. To investigate in more detail the relationships among the ancient Guam and early Lapita samples with samples from Asia and Oceania, we conducted f4 analyses of the form (ancient Guam, early Lapita; Asia/Oceania, Mbuti), separately for the early Lapita Vanuatu and Tonga samples and for all modern and ancient Asian and Oceanian samples in the dataset. Values of this f4 statistic that are consistent with zero imply that the ancient Guam and early Lapita samples form a clade; negative values indicate excess shared ancestry between the early Lapita sample and the Asia/Oceania population; and positive values indicate excess shared ancestry between the ancient Guam samples and the Asia/Oceania population. The results (SI Appendix, Fig. S9) show that the ancient Guam and early Lapita samples always form a clade with one another when compared with any Asian population. However, both of the early Lapita samples share more ancestry with ancient and modern Polynesian samples (but not with any other samples from Oceania) than do the ancient Guam samples. This is further supported by outgroup-f3 comparisons of the ancient Guam and early Lapita samples with other populations (SI Appendix, Fig. S10): Both early Lapita samples share more drift with the modern and ancient Remote Oceanians sampled than do the ancient Guam samples. Nonetheless, f4 statistics of the form (Oceania, early Lapita; ancient Guam, Mbuti) are always significantly negative for both early Lapita samples, regardless of which Oceanian population is included in the test (SI Appendix, Fig. S11). 
These f4 results indicate that there is shared drift between the early Lapita and ancient Guam samples when compared with any other Oceanian sample, in keeping with the outgroup-f3 results (Fig. 4). Overall, the f3 and f4 results imply that while the early Lapita and ancient Guam samples are closely related to each other, the early Lapita samples are a better proxy for the Polynesian-related ancestry in modern and ancient Oceanian samples than are the ancient Guam samples.

We next used admixture graphs (i.e., trees that allow for admixture or migration) to further investigate the relationships among the ancient Guam, early Lapita, and other Asian and Oceanian samples. Included in these analyses were: New Guinea Highlanders as a source of Papuan ancestry; Han Chinese as a source of Asian ancestry; Kankanaey as a source of Austronesian ancestry; Tolai (mixed Papuan/Austronesian ancestry) and Baining_Marabu (Papuan ancestry only) from New Britain to investigate relationships with the Bismarck Archipelago; modern Vanuatu with mixed Papuan/Austronesian ancestry; and the ancient Guam, Lapita Vanuatu, and Lapita Tonga samples. We also included Mbuti as an outgroup. We first constructed a maximum-likelihood tree and added migration edges, using the software TreeMix (59); a tree with two migration edges (Fig. 5A) has all residuals within 3 SE (SI Appendix, Fig. S12) and thus provides a reasonable fit. This tree indicates shared drift between the ancient Guam and Lapita samples, with the migration edges bringing Lapita-related ancestry into the modern Vanuatu and Tolai samples.

Fig. 5.

Tree and graph depictions of the relationships of ancient Guam, early Lapita, and select Asian and Oceanian populations. (A) Maximum-likelihood tree with two migration edges. All residuals (SI Appendix, Fig. S12) are within 3 SE. (B) Consensus graph with nodes present in at least 50% of the topology sets recovered with AdmixtureBayes. (C) Admixture graph obtained with qpGraph for the topology found by AdmixtureBayes with the highest posterior probability (SI Appendix, Fig. S13). The colored arrows in B and C indicate drift (ancestry) shared between the ancient Guam and early Lapita samples.

We additionally investigated admixture graphs using a Markov chain Monte Carlo method, implemented in the software AdmixtureBayes (60), to sample the space of possible admixture graphs. The graph with the highest posterior probability (17.6%) supports shared drift between the ancient Guam and early Lapita samples (SI Appendix, Fig. S13); moreover, a consensus graph that depicts the nodes present in at least 50% of the posterior sample of 1,000 admixture graphs (Fig. 5B) indicates that the shared drift between the ancient Guam and early Lapita samples (node n3 in Fig. 5B) appears in 99% of the topologies. We further examined this topology, inferred in an unsupervised manner by both TreeMix and AdmixtureBayes, with a combination of f statistics using the qpGraph software. This topology has a worst-fitting Z-score of 4.56 (Fig. 5C), which is above the conventional threshold of the worst-fitting |Z-score| < 3 for an “acceptable” graph. In general, deviations between the fitted and observed data can be explained either by an incorrect topology (which, in the case of qpGraph, is specified by the user and not inferred from the data) or by unmodeled admixture. The worst-fitting f statistics tend to involve Han Chinese; when they are excluded the worst-fitting Z-score is reduced to −3.72. This graph has five f statistics with |Z-score| > 3, all of which involve Mbuti and New Guinea Highlanders, so this graph probably provides a reasonable depiction of the relationships of the Oceanian samples, in particular the shared drift between the ancient Guam and early Lapita samples. 
For the two populations with mixed ancestry, the modern Vanuatu sample is inferred to have 65% Papuan-related and 35% Austronesian-related ancestry, while the Tolai sample has 85% Papuan-related and 15% Austronesian/Lapita-related ancestry; these estimates are in close agreement with those from AdmixtureBayes (Vanuatu: 66% Papuan-related and 34% Austronesian-related ancestry; Tolai: 87% Papuan-related and 13% Austronesian-related ancestry).

We further investigated the shared drift between the ancient Guam and early Lapita samples by including ancient samples from Liangdao that share ancestry with aboriginal Taiwanese (61) in the admixture graph analyses. While the results suggest that Liangdao is a better proxy than modern samples for the Austronesian-related ancestry in the ancient Guam and early Lapita samples (SI Appendix, Fig. S14), there is still shared drift between the ancient Guam and early Lapita samples.

Discussion

Some caution is warranted in interpreting the results of this study of ancient DNA from Guam, as they are based on two skeletons that may be related and that date from ∼1,400 y after the first human settlement of Guam. Previous studies of ancient DNA from early Lapita sites in Remote Oceania have found that initial results based on a limited number of samples (26) did not capture the full complexity revealed when additional samples were analyzed (24, 25). Nonetheless, the relationships that these ancient Guam samples exhibit with other ancient samples, as well as with modern samples from the region, provide some interesting insights into the peopling of Guam and the further settlement of Remote Oceania that should be the basis for further investigations.

Origins of the Ancient Guam Samples.

The mtDNA and Y chromosome haplogroups of the ancient Guam samples suggest links with Southeast Asia rather than New Guinea or the Bismarck Archipelago. Moreover, none of the analyses of the genome-wide data found any trace of Papuan-related ancestry in the ancient Guam samples. Our results thus rule out any source for the ancestry of these individuals that is east of Wallace’s Line, as substantial amounts of Papuan-related ancestry are present in eastern Indonesia, New Guinea, and the Bismarck Archipelago. The most likely source is the Philippines, although western Indonesia is also possible; further sampling of Philippine and Indonesian populations—and ancient DNA from these regions—would help pinpoint the source. Moreover, in considering the archaeological evidence, finer-scale sampling is needed to contend with a rapid geographic spread of the red-slipped pottery horizon around 3.5 kya, reflecting population dispersal from the Philippines both eastward into the Marianas and southward into Sulawesi, as well as eventually farther.

A Philippine source for the foundational population of Guam is consistent with the findings of modern DNA sampling (28), the linguistic evidence (2, 21), and the archaeological signature at the time of first Marianas settlement about 3.5 kya (30, 43). However, computer simulations of sea voyaging instead have indicated New Guinea or the Bismarck Archipelago as probable origin points of voyages reaching the Marianas (33, 34). One potential scenario to reconcile these two lines of evidence is that people traveled from the Philippines to New Guinea or the Bismarcks, without mixing with any populations along the way, and then voyaged from New Guinea/the Bismarcks to Guam, again without first mixing with any resident populations. However, the TreeMix and AdmixtureBayes results (Fig. 5) do not support this scenario, nor does the linguistic and archaeological evidence. In particular, the earliest pottery in the Marianas, dating to around 3.5 kya (3, 43), likely predates the oldest Lapita sites to the east of New Guinea, dated to not more than 3.3 kya (5). Yet the pottery, fine shell ornaments, and other cultural objects in the Marianas dating to 3.5 kya are quite distinct from the Lapita tradition, and instead can be linked to material markers in the Philippines that date to 3.8 to 3.5 kya (19, 30, 62), thus supporting movement from the Philippines to the Marianas. Moreover, the computer simulations of sea voyaging do not adequately consider the ability of ancient voyagers to travel against strong ocean currents and prevailing winds; in particular, the single outrigger canoes of the Chamorros—the “flying proas”—impressed early visitors with their greater speed and maneuverability, compared with Spanish ships (63). There is even at least one historically documented event of a Chinese trader drifting in a “sampan” from Manila to Guam during the 1600s (64). 
Ancient DNA from early Lapita skeletons in the Bismarcks would provide a further test of the hypothesis that people moved from the Bismarcks to Guam. It is, of course, possible that later periods of cross-contact voyaging brought additional groups of people to the Marianas from elsewhere (including perhaps the Bismarcks, among other places).

Relationships between Ancient Guam and Early Lapita Samples.

What about a Micronesian route [for the colonization of Polynesia]? It is not in favor with the anthropologists, though after all it was not anthropologists who settled Polynesia.

–William Howells, The Pacific Islanders (1)

All analyses consistently point to a surprisingly close relationship between the ancient Guam and early Lapita samples. This closeness is particularly evident in the outgroup-f3 and various -f4 analyses (Fig. 4 and SI Appendix, Figs. S9–S11), and in the TreeMix and admixture graph results (Fig. 5), all of which indicate shared ancestry between the ancient Guam and early Lapita samples. Moreover, admixture graphs indicate that the ancient Guam samples diverged first, and do not support movement of people from the Bismarcks to Guam (Fig. 5 and SI Appendix, Figs. S13 and S14). However, the admixture graph results should be viewed with caution, as they may be influenced by including a mix of ancient and modern DNA samples in the analyses (usually with fewer ancient than modern samples for each population), with possible attractions between ancient samples due to similar patterns of contamination or sequencing errors due to damage. Nonetheless, it appears that people either moved from the Marianas to the Bismarcks (or elsewhere in Island Melanesia) and then to other parts of Remote Oceania, or that the ancestors of the ancient Guam and early Lapita samples migrated separately, and by different routes, from the same source population.

Our results do not allow us to distinguish between these two possibilities. Arguing against a direct role for the Marianas in the later colonization of Polynesia is the lack of a linguistic connection, definitive Lapita pottery, or domesticated animals characteristic of Polynesia. However, languages spoken by people today may reflect subsequent developments, domestic animals may have been introduced via other routes, and the pottery of the Marianas predates Lapita pottery by a few centuries and is considered by some to be a related variety of the finely decorated pottery that subsequently became elaborated in Lapita pottery (18, 43). Moreover, we point out that a direct movement of people from the Philippines (or nearby areas) to the Bismarcks, either via the Marianas or by some other path that bypassed eastern Indonesia and the rest of New Guinea, would account for one peculiar observation, and that is the lack of Papuan-related ancestry in the early Lapita samples from Vanuatu and Tonga (24–26). If the ancestors of Polynesians migrated from Taiwan or the Philippines to the Bismarcks by island-hopping through eastern Indonesia and along the coast of New Guinea (Fig. 1), in a process that took a few hundred years (perhaps 10 to 15 generations), then they would have encountered people with Papuan-related ancestry along the way, and there would have been ample opportunity for them to have picked up some Papuan-related ancestry. Perhaps the ancestors of Polynesians did move via this route, but did not immediately mix with the people along the way, because of social or other perceived differences. However, any such barrier to mixing did not last long, as Papuan-related ancestry shows up in Vanuatu almost at the same time as the early Lapita samples (25), and there is evidence for substantial later Papuan-related contact in Vanuatu, Santa Cruz, and Fiji that then spread throughout Polynesia (24, 25, 27, 37).
An alternative explanation that is worth considering is that the early ancestors of Polynesians lack Papuan-related ancestry because they did not encounter people with Papuan-related ancestry until they reached the Bismarcks, perhaps because they voyaged via the Marianas or otherwise bypassed eastern Indonesia and coastal New Guinea.

As the quotation from Howells (1) at the beginning of this section indicates, the settlement of Polynesia via Micronesia has generally not been considered by researchers. However, this possibility has been suggested based on pottery evidence (18), and the genetic evidence presented here provides further insights into the connections between Micronesians and Polynesians noted previously (11–17). Howells’ (1) suggestion of a role for Micronesia (specifically, the Marianas) in the settlement of Polynesia merits further consideration.

Methods

Site Description and Samples.

The two skeletons, RBC1 and RBC2, were uncovered outside Ritidian Beach Cave (also called Ritidian First Cave), within the larger Ritidian Site of northern Guam (SI Appendix, Figs. S1 and S2). The two individuals had been buried side by side, in extended position inside distinctive pits. The heads and torsos had been removed slightly later. Details of these findings have been reported elsewhere and situated within the larger site chronology and context (44). The two skeletons from Ritidian offer a rare view of ancient burial practice in the Marianas region, as similar burial practices have been observed in the Philippines (65, 66) and Indonesia (67). While the site and indeed this specific cave revealed multiple cultural occupation layers dating back to the first regional settlement about 3.5 kya, these two burials of RBC1 and RBC2 were found within the layer of ∼2.5 to 2 kya, confirmed by a direct radiocarbon date from a bone of RBC2 of 2,180 ± 30 y BP (44). A tarsal bone was provided from each skeleton for ancient DNA analysis.

DNA Extraction, Library Preparation, and Whole-Genome Sequencing.

In an ancient DNA clean room, ∼1 mm of material was removed from the surface of each specimen and ∼50 mg bone powder obtained by drilling into the bone with a dentistry drill at low speed. DNA was extracted following a protocol provided elsewhere, using spin columns and binding buffer option “D” (68). DNA libraries were prepared from 10-µL aliquots of each DNA extract using an automated protocol for single-stranded library preparation (69) with a Bravo NGS workstation. Negative controls were included both during DNA extraction and library preparation; these contained water instead of sample powder or DNA extract, respectively. The number of library molecules obtained from each sample DNA extract was more than 100 times higher than in the extraction and library negative controls (Dataset S1). All libraries, including the negative controls, were then amplified and double-indexed (70) as described elsewhere (69).

Whole-genome sequencing data were generated on the Illumina HiSeq 2500 platform (2× 76-bp paired-end sequencing). After de-multiplexing (requiring a perfect index), overlapping paired-end sequences were merged into full-length molecule sequences (71), and subsequently aligned to the human reference genome hg19 with decoy sequences (ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz), using bwa aln (72) with parameters optimized for ancient DNA [“-n 0.01 -o 2 -l 16500” (73)]. The sequencing data were filtered for a minimum read length of 35 bp and a minimal mapping quality of 25. Duplicate reads were removed using DeDup (74) and the number of substitutions compared with the human reference genome was quantified using DamageProfiler (https://github.com/Integrative-Transcriptomics/DamageProfiler). Finally, we subset the sequencing data to reads for which we observed a C→T substitution in the first three bases at either read end (Dataset S1).
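The read-level filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, and it assumes that each read and its aligned reference segment are supplied as equal-length strings without indels, in the orientation of the sequenced molecule.

```python
def has_terminal_c_to_t(read: str, ref: str, window: int = 3) -> bool:
    """Return True if a C->T substitution lies within `window` bases of either read end.

    Assumption: `read` and `ref` are aligned base by base (no indels) and are
    given in the orientation of the sequenced molecule.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference must have the same length")
    n = len(read)
    # Indices of the first and last `window` bases of the read.
    ends = set(range(min(window, n))) | set(range(max(n - window, 0), n))
    return any(ref[i] == "C" and read[i] == "T" for i in ends)


def passes_filters(read: str, ref: str, mapq: int,
                   min_len: int = 35, min_mapq: int = 25) -> bool:
    """Apply the length, mapping-quality, and terminal-damage filters together."""
    return (len(read) >= min_len
            and mapq >= min_mapq
            and has_terminal_c_to_t(read, ref))
```

The thresholds (35 bp, mapping quality 25, three terminal bases) are taken from the text; everything else is an illustrative assumption.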

MtDNA Enrichment and Sequencing.

Libraries were enriched for human mitochondrial DNA using a synthetic probe set (75) encompassing the revised Cambridge reference sequence (rCRS) (76) in 1-bp tiling. Hybridization capture was performed in two successive rounds, following an on-bead capture protocol (77) implemented on the Bravo NGS workstation (78). The enriched libraries were pooled with libraries from other projects and sequenced on an Illumina MiSeq in paired-end mode (2× 76 cycles).

The sequencing data were processed as described above for the whole-genome sequencing data, but mapped to the rCRS using bwa aln with the same settings. Sequences were assigned to their respective source libraries, requiring perfect matching of both indices. Sequences that were shorter than 35 bp or that did not produce alignments with a mapping quality of at least 25 were discarded. PCR duplicates were removed using DeDup (74). After discarding libraries with contamination >25% (Dataset S2), estimated by a likelihood-based method (47), we obtained 32,386 unique reads for RBC1 and 94,116 unique reads for RBC2 (Dataset S3). Elevated frequencies of C→T substitutions at the beginning and end of sequence alignments, which result from cytosine deamination in ancient DNA (79, 80), were detected in the mtDNA reads (Dataset S2).

An in-house pipeline (https://github.com/alexhbnr/mitoBench-ancientMT) was used to call the mtDNA consensus sequence, which required a minimum of three reads and used snpAD (81) to infer the consensus allele while taking into account ancient DNA damage. Contamination was estimated by a likelihood-based method (47), and HaploGrep2 (82) was used to call mtDNA haplogroups.

Genome-Wide SNP Capture Enrichment and Sequencing.

Enrichment of the libraries for a panel of ∼1.2 million SNPs was performed using a set of DNA capture probes [“1240k”, composed of SNP panels 1 and 2, as described elsewhere (45)] and two successive rounds of in-solution hybridization capture (75). Sequencing of the enriched libraries and raw data processing were performed as described for the whole-genome sequencing above. Genotypes were inferred by randomly sampling an allele observed at each site after masking Ts at the five terminal bases at each read end by replacing them with Ns. For determining the Y chromosome haplogroup of the male sample, we subset the genotypes to the sites located on the human Y chromosome and analyzed them using yHaplo (83) with the nondefault option “--ancStopThresh 1e6”.
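The genotype-inference step can be sketched in Python as follows. This is an illustration only, under the assumption that reads are plain strings and that the bases covering a site have already been collected into a list; the function names are hypothetical and do not correspond to the software used in this study.

```python
import random
from typing import List, Optional

VALID_BASES = {"A", "C", "G", "T"}


def mask_terminal_ts(read: str, n_terminal: int = 5) -> str:
    """Replace every T within the first or last `n_terminal` bases with N."""
    n = len(read)
    masked = []
    for i, base in enumerate(read):
        is_terminal = i < n_terminal or i >= n - n_terminal
        masked.append("N" if (is_terminal and base == "T") else base)
    return "".join(masked)


def sample_pseudohaploid(pileup: List[str], rng: random.Random) -> Optional[str]:
    """Draw one observed allele at random from the bases covering a site.

    Masked or otherwise ambiguous bases (N) are ignored; returns None when no
    usable base covers the site.
    """
    observed = [b for b in pileup if b in VALID_BASES]
    return rng.choice(observed) if observed else None
```

Passing an explicitly seeded `random.Random` instance makes the random draws reproducible, which is a design choice of this sketch and not a detail reported in the text.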

The sequencing data have been made available at the European Nucleotide Archive under accession no. PRJEB40707.

Genome-Wide SNP Data Analysis.

Comparative datasets.

Newly generated data from Guam were merged with published data from modern and ancient samples (Datasets S7 and S8) as follows. First, for comparisons to populations from Island Southeast Asia, the Guam data were merged with previously-curated data from 25 modern populations genotyped on the Affymetrix 6.0 array (27, 84, 85); to provide worldwide context these data were further merged with a subset of the whole genome sequences from the SGDP (86). Related individuals were identified based on kinship coefficients, estimated using the software KING (87), with subsequent removal of one individual from each pair. Pruning of SNPs in linkage disequilibrium (LD) was done using the PLINK tool (87) with the following settings: --indep-pairwise 200 25 0.4 (88). After these quality-filtering steps, there were 136,162 SNPs and 303 individuals from 72 populations from Eurasia and Oceania remaining for the analyses. This dataset was used only for PCA and ADMIXTURE analyses.
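The idea behind LD pruning can be conveyed with a simplified Python sketch. It is deliberately not a reimplementation of PLINK: it uses a single greedy pass that compares each SNP with previously retained SNPs lying fewer than `window` positions back, whereas PLINK slides a window in steps and applies its own rule for which SNP of a correlated pair to drop. Genotypes are assumed to be complete allele counts (0, 1, or 2), and the function names are hypothetical.

```python
from typing import List, Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation signal
    return (sxy * sxy) / (sxx * syy)


def greedy_ld_prune(snps: List[Sequence[float]], window: int = 200,
                    threshold: float = 0.4) -> List[int]:
    """Return indices of SNPs kept after a simplified greedy LD-pruning pass."""
    kept: List[int] = []
    for i, genotypes in enumerate(snps):
        # Only previously kept SNPs fewer than `window` positions back are compared.
        nearby = [j for j in kept if i - j < window]
        if all(r_squared(genotypes, snps[j]) <= threshold for j in nearby):
            kept.append(i)
    return kept
```

The defaults mirror the window size (200 SNPs) and r² threshold (0.4) given in the text; the step size of 25 has no counterpart in this single-pass simplification.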

Second, to better resolve relationships with populations from Near and Remote Oceania, as well as with other ancient samples from Asia and the Pacific, we used data from 53 modern populations from Oceania and 39 populations from East Asia genotyped on the Affymetrix Human Origins array (24, 26, 89–91), as well as previously published shotgun and capture-enrichment sequencing data from 82 ancient samples (24–26, 46, 61, 92). After removing related individuals as described above for the Affymetrix 6.0 data, this dataset consisted of 1,194 individuals and 593,124 SNPs. Not all samples were used for all of the analyses. For PCA and ADMIXTURE analyses, we used an LD-pruned dataset of 216,996 SNPs. In addition, ancient samples with more than 15,000 missing sites were excluded from the ADMIXTURE analysis.

Data analyses.

We attempted to estimate relatedness between RBC1 and RBC2 by calculating the fraction of pairwise differences at 33,040 overlapping sites that are included on the Human Origins array. For comparison, we also calculated this fraction for modern samples from the 1000 Genomes Project dataset (57), which includes individuals with known degrees of relatedness, and for ancient samples from Southeast Asia and Oceania; the ancient DNA data were obtained from the Reich laboratory website (https://reich.hms.harvard.edu/downloadable-genotypes-present-day-and-ancient-dna-data-compiled-published-papers; v42.4).
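The pairwise-difference calculation can be written as a short Python sketch, assuming that each individual is represented by one randomly sampled allele per site (or `None` where no call is available). The function name and data layout are hypothetical.

```python
from typing import Optional, Sequence


def pairwise_mismatch_rate(g1: Sequence[Optional[str]],
                           g2: Sequence[Optional[str]]) -> float:
    """Fraction of differing alleles among sites called in both individuals."""
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must cover the same sites")
    # Restrict the comparison to sites with a call in both individuals.
    overlapping = [(a, b) for a, b in zip(g1, g2)
                   if a is not None and b is not None]
    if not overlapping:
        raise ValueError("no overlapping sites between the two individuals")
    return sum(a != b for a, b in overlapping) / len(overlapping)
```

On its own the resulting fraction has no absolute interpretation; as in the text, it becomes informative only when compared with the same quantity computed for pairs of known relatedness.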

PCA was performed as described previously (93) with one modification, namely for the analyses which included ancient samples, the principal axes were calculated based on modern samples, and the ancient samples were projected using least-squares projection (which is more appropriate than orthogonal projection for samples with high amounts of missing data), as described in the documentation for the smartpca software (94).

To infer individual ancestry components and analyze population structure, we used the ADMIXTURE software (95) in the unsupervised mode. For each dataset, we first removed SNPs in strong LD (r² > 0.4) using the PLINK tool (96), and for the Human Origins dataset we further excluded ancient samples that had fewer than 15,000 SNPs remaining. We varied the number of ancestral populations (K value) from K = 2 to K = 8 for the Affymetrix 6.0 dataset, and from K = 5 to K = 12 for the Human Origins Array dataset. We performed 100 independent runs for each value of K, and used the cross-validation procedure implemented in the ADMIXTURE software to assess the best value of K.
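Selecting K from cross-validation errors can be sketched as below. The text does not state how errors from the 100 runs per K were summarized, so averaging across runs is an assumption of this sketch, as are the function name and the input layout (a mapping from K to the list of cross-validation errors collected from the runs).

```python
from statistics import mean
from typing import Dict, List


def best_k(cv_errors: Dict[int, List[float]]) -> int:
    """Return the K with the lowest mean cross-validation error.

    Assumption: errors from independent runs are summarized by their mean;
    other summaries (e.g. the minimum) would be equally easy to substitute.
    """
    if not cv_errors or any(len(v) == 0 for v in cv_errors.values()):
        raise ValueError("each K needs at least one cross-validation error")
    return min(cv_errors, key=lambda k: mean(cv_errors[k]))
```

For example, with hypothetical errors `{2: [0.52, 0.53], 3: [0.50, 0.51], 4: [0.51, 0.52]}` the function selects K = 3.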

To formally test population relationships suggested by PCA and ADMIXTURE analyses, we used outgroup-f3 and -f4 statistics, implemented in the ADMIXTOOLS software suite (90). All data processing and analyses were carried out using the admixr R package (97).
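The quantities behind these tests can be illustrated with a minimal Python sketch operating on per-SNP population allele frequencies. It is not the ADMIXTOOLS implementation: it omits the finite-sample bias correction applied to f3 and the block-jackknife standard errors used to obtain Z-scores, and the function names are hypothetical.

```python
from typing import Sequence


def f4(a: Sequence[float], b: Sequence[float],
       c: Sequence[float], d: Sequence[float]) -> float:
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    n = len(a)
    if not (len(b) == len(c) == len(d) == n) or n == 0:
        raise ValueError("all frequency vectors must be non-empty and equal in length")
    return sum((a[i] - b[i]) * (c[i] - d[i]) for i in range(n)) / n


def outgroup_f3_uncorrected(o: Sequence[float], x: Sequence[float],
                            y: Sequence[float]) -> float:
    """Outgroup f3(O; X, Y): mean over SNPs of (o - x) * (o - y), without bias correction."""
    n = len(o)
    if not (len(x) == len(y) == n) or n == 0:
        raise ValueError("all frequency vectors must be non-empty and equal in length")
    return sum((o[i] - x[i]) * (o[i] - y[i]) for i in range(n)) / n
```

A larger outgroup-f3 value indicates more drift shared by X and Y since their divergence from the outgroup, while an f4 value that differs significantly from zero indicates that the allele frequency differences between A and B are correlated with those between C and D.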

To model the relationships between modern and ancient samples, we first used the unsupervised TreeMix (59) and AdmixtureBayes methods (60) to infer topologies that were then tested using the qpGraph software implemented in ADMIXTOOLS (90). We performed 10 independent runs of TreeMix with zero to five migration events, and report the tree with the highest likelihood. For the AdmixtureBayes analyses we increased the default number of Markov chain Monte Carlo steps to 1,000,000, as recommended by the developers to avoid convergence problems for a model with 10 populations. We used the 10 topologies with the highest posterior probabilities estimated by AdmixtureBayes as input graphs for qpGraph, which we ran with parameters: blgsize: 0.05; forcezmode: YES; lsqmode: YES; diag: 0.0001; bigiter: 6; hires: YES; lambdascale: 1. All three methods were applied to the exact same dataset, using all samples available for each population (Dataset S7), with the exception of the admixed modern Vanuatu. The amount of Polynesian ancestry in this population is highly variable, with a range of 9 to 38% today, so for our analyses we took all individuals from the island of Futuna, where Polynesian-related ancestry is highest (24). TreeMix and AdmixtureBayes do not allow sites with missing data, so for each SNP each population is required to have at least one genotype call. Since our model included three ancient populations, the number of sites available for these analyses was reduced to 76,284. For qpGraph it is possible to use an option which would maximize the number of sites for each computed statistic, but since we have modern and ancient data in the same analysis, this would result in dramatically uneven SNP sets for different comparisons. As this could bias the results, we therefore chose not to use this option. Mbuti was used as an outgroup in all of the admixture graph analyses.
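The site-completeness requirement can be expressed as a small Python sketch. The nested-dictionary layout (population, then individual, then site, with `None` for a missing call) and the function name are assumptions made for illustration and do not reflect the input formats of TreeMix or AdmixtureBayes.

```python
from typing import Dict, List, Optional

# genotypes[population][individual][site] is an allele count, or None if missing.
Genotypes = Dict[str, List[List[Optional[int]]]]


def sites_with_complete_coverage(genotypes: Genotypes) -> List[int]:
    """Return indices of sites at which every population has at least one call."""
    if not genotypes or any(len(ind) == 0 for ind in genotypes.values()):
        raise ValueError("every population needs at least one individual")
    n_sites = len(next(iter(genotypes.values()))[0])
    kept = []
    for site in range(n_sites):
        # A site is kept only if each population has some non-missing genotype.
        if all(any(ind[site] is not None for ind in individuals)
               for individuals in genotypes.values()):
            kept.append(site)
    return kept
```

Because ancient populations often consist of one or two low-coverage individuals, each added ancient population tends to remove many sites, which is consistent with the reduction to 76,284 sites reported above.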

Statistical programming was done using the statistical program R v4.0.1 (https://www.R-project.org/). We used the tidyverse (98), data.table (https://CRAN.R-project.org/package=data.table), Hmisc (https://CRAN.R-project.org/package=Hmisc), and pheatmap (https://CRAN.R-project.org/package=pheatmap) packages.

Supplementary Material

pnas.2022112118.sd01.xlsx (14.2KB, xlsx)
pnas.2022112118.sd02.xlsx (11.9KB, xlsx)
pnas.2022112118.sd04.xlsx (87.4KB, xlsx)
pnas.2022112118.sd05.xlsx (14.7KB, xlsx)
pnas.2022112118.sd07.xlsx (15.5KB, xlsx)
pnas.2022112118.sd08.xlsx (12.3KB, xlsx)

Acknowledgments

We thank S. Nagel, B. Nickel, B. Schellbach, A. Schmidt, A. Weihmann, and M. Wunsch for performing the laboratory work; the Bioinformatics Group in the Department of Evolutionary Genetics at the Max Planck Institute for Evolutionary Anthropology for the initial processing of the sequencing data; P. Bellwood and R. Blust for comments on an earlier draft; and M. Hajdinjak for helpful discussion. Research at the Ritidian Site in Guam was performed under Special Use Permit 12518-1601 and Archaeological Resources Protection Act Permit 15GUA001, in cooperation with the Guam National Wildlife Refuge and the US Fish and Wildlife Service, and was funded by the Chiang Ching-kuo Foundation for International Scholarly Exchange (Grant RG021-P-10) and by the Australian Research Council (Grant DP150104458). Research was funded by the Max Planck Society. Disclaimer: PNAS policy is that reviewers of Inaugural Articles should not have coauthored a paper with any author within the past 4 years. Although reviewer G.R.S. and author H.-c.H. were coauthors of an invited festschrift chapter that appeared in Terra Australis in early 2017, their active collaboration for this work occurred in 2014, and G.R.S. and H.-c.H. have not collaborated since then. G.R.S. was therefore allowed to serve as a reviewer of this article.

Footnotes

The authors declare no competing interest.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2022112118/-/DCSupplemental.

Data Availability.

All data used in this paper are in the main text or in the SI Appendix. The new data reported in this paper have been deposited in the European Nucleotide Archive, https://www.ebi.ac.uk/ena/browser/home (accession no. PRJEB40707).

References

1. Howells W. W., The Pacific Islanders (Charles Scribner’s Sons, New York, 1973).
2. Blust R., Chamorro historical phonology (Mariana Islands, proto-Austronesian). Oceanic Linguistics 39, 83–122 (2000).
3. Carson M. T., Peopling of Oceania: Clarifying an initial settlement horizon in the Mariana Islands at 1500 BC. Radiocarbon (2020), 10.1017/RDC.2020.89.
4. Athens J. S., Dega M. F., Ward J. V., Austronesian colonization of the Mariana Islands: The palaeoenvironmental evidence. Bull. Indo-Pacific Prehistory Assoc. 24, 21–30 (2004).
5. Summerhayes G., “Lapita interaction—An update” in 2009 International Symposium on Austronesian Studies, Gadu M., Lin H., Eds. (National Museum of Prehistory, Taidong, 2010), pp. 11–40.
6. Wilmshurst J. M., Hunt T. L., Lipo C. P., Anderson A. J., High-precision radiocarbon dating shows recent and rapid initial human colonization of East Polynesia. Proc. Natl. Acad. Sci. U.S.A. 108, 1815–1820 (2011).
7. Blust R., The Austronesian Languages (Australian National University, Research School of Pacific and Asian Studies, Canberra, 2013).
8. Kirch P. V., Peopling of the Pacific: A holistic anthropological perspective. Annu. Rev. Anthropol. 39, 131–148 (2010).
9. Wickler S., “Modelling colonisation and migration in Micronesia from a zooarchaeological perspective” in Colonisation, Migration and Marginal Areas: A Zooarchaeological Approach, Mondini M., Munoz S., Wickler S., Eds. (Oxbow Books, Oxford, 2002), pp. 28–40.
10. Hunter-Anderson R. L., Thompson G. B., Moore D. R., Rice as a prehistoric valuable in the Mariana Islands, Micronesia. Asian Perspect. 34, 69–89 (1995).
11. Pietrusewsky M., Douglas M. T., “Review of Polynesian and Pacific skeletal biology” in Skeletal Biology of the Ancient Rapanui (Easter Islanders), Stefan V. H., Gill G. W., Eds. (Cambridge University Press, Cambridge, 2016), pp. 14–43.
12. Lum J. K., Cann R. L., mtDNA and language support a common origin of Micronesians and Polynesians in Island Southeast Asia. Am. J. Phys. Anthropol. 105, 109–119 (1998).
13. Lum J. K., Cann R. L., mtDNA lineage analyses: Origins and migrations of Micronesians and Polynesians. Am. J. Phys. Anthropol. 113, 151–168 (2000).
14. Lum J. K., Cann R. L., Martinson J. J., Jorde L. B., Mitochondrial and nuclear genetic relationships among Pacific Island and Asian populations. Am. J. Hum. Genet. 63, 613–624 (1998).
15. Lum J. K., Jorde L. B., Schiefenhovel W., Affinities among Melanesians, Micronesians, and Polynesians: A neutral biparental genetic perspective. Hum. Biol. 74, 413–430 (2002).
16. O’Shaughnessy D. F., Hill A. V. S., Bowden D. K., Weatherall D. J., Clegg J. B., Globin genes in Micronesia: Origins and affinities of Pacific Island peoples. Am. J. Hum. Genet. 46, 144–155 (1990).
17. Addison D. J., Matisoo-Smith E., Rethinking Polynesians origins: A West-Polynesia triple-I model. Archaeol. Ocean. 45, 1–12 (2010).
18. Carson M. T., Hung H. C., Summerhayes G., Bellwood P., The pottery trail from Southeast Asia to Remote Oceania. J. Island Coast. Archaeol. 8, 17–36 (2013).
19. Bellwood P., First Islanders: Prehistory and Human Migration in Island Southeast Asia (John Wiley & Sons, Hoboken, NJ, 2017).
20. Bellwood P., Dizon E., “Austronesian cultural origins out of Taiwan, via the Batanes Islands, and onwards to western Polynesia” in Past Human Migrations in East Asia: Matching Archaeology, Linguistics and Genetics, Sanchez-Mazas A., Blench R., Ross M. D., Peiros I., Lin M., Eds. (Routledge, London, 2008), vol. 5, pp. 23–39.
21. Blust R., The Austronesian homeland and dispersal. Annu. Rev. Linguist. 5, 417–434 (2019).
22. Gray R. D., Drummond A. J., Greenhill S. J., Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323, 479–483 (2009).
23. Kayser M., The human genetic history of Oceania: Near and remote views of dispersal. Curr. Biol. 20, R194–R201 (2010).
24. Lipson M., et al., Population turnover in remote Oceania shortly after initial settlement. Curr. Biol. 28, 1157–1165.e7 (2018).
25. Posth C., et al., Language continuity despite population replacement in Remote Oceania. Nat. Ecol. Evol. 2, 731–740 (2018).
26. Skoglund P., et al., Genomic insights into the peopling of the southwest Pacific. Nature 538, 510–513 (2016).
27. Wollstein A., et al., Demographic history of Oceania inferred from genome-wide data. Curr. Biol. 20, 1983–1992 (2010).
28. Vilar M. G., et al., The origins and genetic distinctiveness of the Chamorros of the Marianas Islands: An mtDNA perspective. Am. J. Hum. Biol. 25, 116–122 (2013).
29. Zobel E., “The position of Chamorro and Palauan in the Austronesian family tree: Evidence from verb morphosyntax” in The History and Typology of Western Austronesian Voice Systems, Wouk F., Ross M. D., Eds. (Pacific Linguistics, Canberra, 2002), pp. 405–434.
30. Hung H. C., et al., The first settlement of Remote Oceania: The Philippines to the Marianas. Antiquity 85, 909–926 (2011).
31. Hung H. C., Carson M. T., Bellwood P., Earliest settlement in the Marianas—A response. Antiquity 86, 910–914 (2012).
32. Winter O., Clark G., Anderson A., Lindahl A., Austronesian sailing to the northern Marianas, a comment on Hung et al. (2011). Antiquity 86, 898–910 (2012).
33. Fitzpatrick S. M., Callaghan R. T., Estimating trajectories of colonisation to the Mariana Islands, western Pacific. Antiquity 87, 840–853 (2013).
34. Montenegro Á., Callaghan R. T., Fitzpatrick S. M., Using seafaring simulations and shortest-hop trajectories to model the prehistoric colonization of Remote Oceania. Proc. Natl. Acad. Sci. U.S.A. 113, 12685–12690 (2016).
35. Bergström A., et al., A Neolithic expansion, but strong genetic structure, in the independent history of New Guinea. Science 357, 1160–1163 (2017).
36. Friedlaender J. S., et al., The genetic structure of Pacific Islanders. PLoS Genet. 4, e19 (2008).
37. Pugach I., et al., The gateway from Near into Remote Oceania: New insights from genome-wide data. Mol. Biol. Evol. 35, 871–886 (2018).
38. Summerhayes G. R., et al., Human adaptation and plant use in highland New Guinea 49,000 to 44,000 years ago. Science 330, 78–81 (2010).
39. Hudjashov G., et al., Complex patterns of admixture across the Indonesian archipelago. Mol. Biol. Evol. 34, 2439–2452 (2017).
40. Jacobs G. S., et al., Multiple deeply divergent Denisovan ancestries in Papuans. Cell 177, 1010–1021.e32 (2019).
41. Xu S., Pugach I., Stoneking M., Kayser M., Jin L.; HUGO Pan-Asian SNP Consortium, Genetic dating indicates that the Asian-Papuan admixture through Eastern Indonesia corresponds to the Austronesian expansion. Proc. Natl. Acad. Sci. U.S.A. 109, 4574–4579 (2012).
42. Skoglund P., Mathieson I., Ancient genomics of modern humans: The first decade. Annu. Rev. Genom. Hum. G. 19, 381–404 (2018).
43. Carson M. T., Archaeological Landscape Evolution: The Mariana Islands in the Asia-Pacific Region (Springer International, Cham, Switzerland, 2016).
44. Carson M. T., Cultural spaces inside and outside caves: A study in Guam, western Micronesia. Antiquity 91, 421–441 (2017).
45. Fu Q., et al., An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216 (2015).
46. Lipson M., et al., Ancient genomes document multiple waves of migration in Southeast Asian prehistory. Science 361, 92–95 (2018).
47. Fu Q., et al., A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 23, 553–559 (2013).
48. Renaud G., Slon V., Duggan A. T., Kelso J., Schmutzi: Estimation of contamination and endogenous mitochondrial consensus calling for ancient DNA. Genome Biol. 16, 224 (2015).
49. Delfin F., et al., Complete mtDNA genomes of Filipino ethnolinguistic groups: A melting pot of recent and ancient lineages in the Asia-pacific region. Eur. J. Hum. Genet. 22, 228–237 (2014).
50. Soares P., et al., Climate change and postglacial human dispersals in southeast Asia. Mol. Biol. Evol. 25, 1209–1218 (2008).
51. Tabbada K. A., et al., Philippine mitochondrial DNA diversity: A populated viaduct between Taiwan and Indonesia? Mol. Biol. Evol. 27, 21–31 (2010).
52. Tumonggor M. K., et al., The Indonesian archipelago: An ancient genetic highway linking Asia and the Pacific. J. Hum. Genet. 58, 165–173 (2013).
53. Duggan A. T., et al., Maternal history of Oceania from complete mtDNA genomes: Contrasting ancient diversity with recent homogenization due to the Austronesian expansion. Am. J. Hum. Genet. 94, 721–733 (2014).
54. Karafet T. M., et al., New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res. 18, 830–838 (2008).
55. Delfin F., et al., Bridging near and remote Oceania: mtDNA and NRY variation in the Solomon Islands. Mol. Biol. Evol. 29, 545–564 (2012).
56. Karafet T. M., et al., Major east-west division underlies Y chromosome stratification across Indonesia. Mol. Biol. Evol. 27, 1833–1844 (2010).
57. 1000 Genomes Consortium et al., A global reference for human genetic variation. Nature 526, 68–74 (2015).
58. Lawson D. J., van Dorp L., Falush D., A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots. Nat. Commun. 9, 3258 (2018).
59. Pickrell J. K., Pritchard J. K., Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967 (2012).
60. Nielsen S. V., Inferring Gene Flow Between Populations with Statistical Methods (Aarhus Universitet, Aarhus, Denmark, 2018).
61. Yang M. A., et al., Ancient DNA indicates human population shifts and admixture in northern and southern China. Science 369, 282–288 (2020).
62. Craib J. L., “Colonisation of the Mariana Islands: New evidence and implications for human movements in the western Pacific” in The Pacific from 5000 to 2000 BP: Colonisation and Transformations, Galipaud J. C., Lilley I., Eds. (IRD Editions, Paris, 1999), pp. 477–485.
63. Cunningham L. J., Ancient Chamorro Society (Bess Press, Honolulu, 1992).
64. García F., The Life and Martyrdom of the Venerable Father Diego Luis de san Vitores, of the Society of Jesus, First Apostle of the Mariana Islands and Events of These Islands from the Year 1668 through the Year 1681. E. b. J. A. McDonough, Ed., Monograph Series Number 3 (Richard Flores Taitano Micronesian Area Research Center, University of Guam, Mangilao, Guam, 2004).
  • 65.Bolunia M. J. L. A., The archaeological excavation of the Bequibel shell midden. Journal of Southeast Asian Archaeology 25, 31–42 (2005). [Google Scholar]
  • 66.Hung H. C., Migration and Cultural Interaction in Southern Coastal China, Taiwan and the Northern Philippines, 3000 BC to AD 100: The Early History of the Austronesian Speaking Populations (Australian National University, Canberra, 2008). [Google Scholar]
  • 67.Galipaud J.-C., et al. , The Pain Haka burial ground on Flores: Indonesian evidence for a shared Neolithic belief system in Southeast Asia. Antiquity 90, 1505–1521 (2016). [Google Scholar]
  • 68.Rohland N., Glocke I., Aximu-Petri A., Meyer M., Extraction of highly degraded DNA from ancient bones, teeth and sediments for high-throughput sequencing. Nat. Protoc. 13, 2447–2461 (2018). [DOI] [PubMed] [Google Scholar]
  • 69.Gansauge M. T., Aximu-Petri A., Nagel S., Meyer M., Manual and automated preparation of single-stranded DNA libraries for the sequencing of DNA from ancient biological remains and other sources of highly degraded DNA. Nat. Protoc. 15, 2279–2300 (2020). [DOI] [PubMed] [Google Scholar]
  • 70.Kircher M., Sawyer S., Meyer M., Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40, e3 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Renaud G., Stenzel U., Kelso J., leeHom: adaptor trimming and merging for Illumina sequencing reads. Nucleic Acids Res. 42, e141 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Li H., Durbin R., Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Meyer M., et al. , A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Peltzer A., et al. , EAGER: Efficient ancient genome reconstruction. Genome Biol. 17, 60 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Fu Q., et al. , DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl. Acad. Sci. U.S.A. 110, 2223–2227 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Andrews R. M., et al. , Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat. Genet. 23, 147 (1999). [DOI] [PubMed] [Google Scholar]
  • 77.Maricic T., Whitten M., Pääbo S., Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS One 5, e14004 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Slon V., et al. , Neandertal and Denisovan DNA from Pleistocene sediments. Science 356, 605–608 (2017). [DOI] [PubMed] [Google Scholar]
  • 79.Briggs A. W., et al. , Patterns of damage in genomic DNA sequences from a Neandertal. Proc. Natl. Acad. Sci. U.S.A. 104, 14616–14621 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Sawyer S., Krause J., Guschanski K., Savolainen V., Pääbo S., Temporal patterns of nucleotide misincorporations and DNA fragmentation in ancient DNA. PLoS One 7, e34131 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Prüfer K., snpAD: An ancient DNA genotype caller. Bioinformatics 34, 4165–4171 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Weissensteiner H., et al. , HaploGrep 2: Mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 44, W58–W63 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Poznik G., Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men. BioRxiv: 10.1101/088716 (19 November 2016). [DOI]
  • 84.Altshuler D. M.et al.; International HapMap 3 Consortium , Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Reich D., et al. , Denisova admixture and the first modern human dispersals into Southeast Asia and Oceania. Am. J. Hum. Genet. 89, 516–528 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Mallick S., et al. , The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Manichaikul A., et al. , Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Rasmussen M., et al. , An Aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Lazaridis I., et al. , Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Patterson N., et al. , Ancient admixture in human history. Genetics 192, 1065–1093 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Qin P., Stoneking M., Denisovan ancestry in east Eurasian and Native American populations. Mol. Biol. Evol. 32, 2665–2674 (2015). [DOI] [PubMed] [Google Scholar]
  • 92.McColl H., et al. , The prehistoric peopling of Southeast Asia. Science 361, 88–92 (2018). [DOI] [PubMed] [Google Scholar]
  • 93.Pugach I., Matveyev R., Wollstein A., Kayser M., Stoneking M., Dating the age of admixture via wavelet transform analysis of genome-wide data. Genome Biol. 12, R19 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Patterson N., Price A. L., Reich D., Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Alexander D. H., Novembre J., Lange K., Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Purcell S., et al. , PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Petr M., Vernot B., Kelso J., admixr-R package for reproducible analyses using ADMIXTOOLS. Bioinformatics 35, 3194–3195 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Wickham H., et al. , Welcome to the tidyverse. J. Open Source Softw. 4, (2019). [Google Scholar]

Associated Data


Supplementary Materials

Supplementary File: pnas.2022112118.sd01.xlsx (14.2KB, xlsx)
Supplementary File: pnas.2022112118.sd02.xlsx (11.9KB, xlsx)
Supplementary File: pnas.2022112118.sd04.xlsx (87.4KB, xlsx)
Supplementary File: pnas.2022112118.sd05.xlsx (14.7KB, xlsx)
Supplementary File: pnas.2022112118.sd07.xlsx (15.5KB, xlsx)
Supplementary File: pnas.2022112118.sd08.xlsx (12.3KB, xlsx)

Data Availability Statement

All data used in this paper are in the main text or in the SI Appendix. The new data reported in this paper have been deposited in the European Nucleotide Archive, https://www.ebi.ac.uk/ena/browser/home (accession no. PRJEB40707).


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
