Genome Biology and Evolution. 2024 Aug 28;16(9):evae190. doi: 10.1093/gbe/evae190

Comparative Population Genomics of Arctic Sled Dogs Reveals a Deep and Complex History

Tracy A Smith 1, Krishnamoorthy Srikanth 2, Heather Jay Huson 3
Editor: Davide Pisani
PMCID: PMC11403282  PMID: 39193769

Abstract

Recent evidence demonstrates genomic and morphological continuity in the Arctic ancestral lineage of dogs. Here, we use the Siberian Husky to investigate the genomic legacy of the northeast Eurasian Arctic lineage and model the deep population history using genome-wide single nucleotide polymorphisms. Utilizing ancient dog-calibrated molecular clocks, we found that at least two distinct lineages of Arctic dogs existed in ancient Eurasia at the end of the Pleistocene. This pushes back the origin of sled dogs in the northeast Siberian Arctic with humans likely intentionally selecting dogs to perform different functions and keeping breeding populations that overlap in time and space relatively reproductively isolated. In modern Siberian Huskies, we found significant population structure based on how they are used by humans, recent European breed introgression in about half of the dogs that participate in races, moderate levels of inbreeding, and fewer potentially harmful variants in populations under strong selection for form and function (show, sled show, and racing populations of Siberian Huskies). As the struggle to preserve unique evolutionary lineages while maintaining genetic health intensifies across pedigreed dogs, understanding the genomic history to guide policies and best practices for breed management is crucial to sustain these ancient lineages and their unique evolutionary identity.

Keywords: canine evolution, Arctic dogs, sled dogs, canine genetic health, Siberian Husky


Significance.

The early evolution and genetic health of dogs have spurred significant interest with the public as well as the broader scientific community. In this research, we focus on the Siberian Husky as a model from the northeast Eurasian Arctic ancestral lineage of dogs and perform one of the largest genomic surveys of any modern breed. We reconstructed the evolutionary history of this lineage and found contemporary American and Eurasian Arctic sled dogs to be more distantly related than previously thought, pushing back the origin of sled dogs to the end of the Pleistocene. Recent West Eurasian admixture, small population size, and inbreeding introduce a challenge to the conservation of these ancient lineages.

Introduction

Recent advances in phylogenomics, population genetics, ancient DNA analysis, and comparative genomics are allowing us to reconstruct the past and dig deeper into the origin and evolution of dogs. It is generally accepted that the dog evolved alongside hunter–gatherer human populations in Eurasia sometime during the late Pleistocene from a now-extinct wolf ancestor. Although exactly when, where, and how many times domestication occurred is still debated (Savolainen et al. 2002; Pang et al. 2009; Freedman et al. 2014; Skoglund et al. 2015; Perri et al. 2021), recent findings suggest that modern dogs are more closely related to ancient wolves from Eastern Siberia and China than to those from Western Eurasia, indicating a likely domestication event in East Eurasia (Bergström et al. 2022). However, ancient and modern western dogs have up to half of their ancestry related to southwestern Eurasian wolves, with evidence of genetic continuity from Neolithic to modern European dogs (Botigué et al. 2017). This may suggest a second domestication or a large amount of ancient western wolf admixture in these regions (Bergström et al. 2022). By 11,000 years ago, at least three major ancestral dog lineages appeared to be well established: East Asian, West Eurasian (including European, Indian, and African dogs), and Arctic/American (Frantz et al. 2016; Ní Leathlobhair et al. 2018; Bergström et al. 2020; Feuerborn et al. 2021). The Arctic/American lineage includes 9,500-year-old sled dog remains from Zhokhov Island, Siberia, several mid-Holocene dogs from Lake Baikal and other sites across Siberia, ancient North American precontact dogs, and modern sled dog breeds such as the Siberian Husky and Greenland sled dog (Sinding et al. 2020; Feuerborn et al. 2021). Notably, numerous dog and wooden sled remains from the Zhokhov site in Arctic Siberia provide compelling evidence that dogs were already pulling sleds 9,500 years ago (Pitulko and Kasparov 2017; Sinding et al. 2020). This phylogenetic clustering of modern-day and ancient sled dogs supports genetic continuity and indicates that the sled dog originated >9,500 years ago in the northeast Siberian Arctic (Sinding et al. 2020). Yet, while previous research has shown that the ancient American Arctic lineage of precontact dogs has been mostly lost after the colonization of the Americas (Ní Leathlobhair et al. 2018), the genomic history and legacy of the northeast Eurasian Arctic lineage of sled dogs remain largely unexplored.

Archaeological evidence suggests sled dogs were at an advanced stage of domestication by the time of the Zhokhov dogs. Pitulko and Kasparov (2017) concluded at least two fully domesticated and standardized dog forms were present at the ancient Zhokhov site with 10 out of 11 individuals “perfectly correspond[ing] to the modern Siberian Husky standard” in terms of body weight and height and a second larger breed similar in size to the modern Alaskan Malamute (hereafter Malamute) and Greenland sled dog. This is probably not a coincidence as we would expect strong purifying selection for optimal efficiency in working dogs. In sled dogs used for long-distance transportation, larger dogs cannot radiate heat fast enough and small dogs have problems conserving heat and keeping up with a team of larger dogs; thus, a body size of 38 to 55 lbs (17.2 to 24.9 kg) has been shown to be optimal in dogs that pull light loads over long distances (Phillips et al. 1981). Morphometric differences between the two forms suggest humans might have already been selecting dogs for different functions, with the Siberian Husky-like morph likely being better suited for pulling sleds long distances, and the larger morph for freighting, hunting, or defending against large predators. More recent, 2,000-year-old Arctic dog remains from the Aachim-Mayak site in West Chukotka also fall within the modern Siberian Husky standard (Pitulko and Kasparov 2017) and may demonstrate the temporal consistency of size and weight of at least one form of Arctic dog in this region.

The sled dogs that inhabited the far northeast of Siberia, east of the Indigirka River, and were predominantly associated with Chukchi, Koryak, Yukagir, and Kamchadal people, were the progenitors of the modern Siberian Husky breed (Thomas and Thomas 2015). In coastal regions, these sled dogs were vital to the survival of humans whose main sustenance came from hunting walrus and seals on sea ice (Bogoslovskaya 2022). Sled dogs were also used for transportation across mountainous tundra with severe climatic conditions and winters that last much of the year. Indeed, it may be this harsh and isolated environment that contributed to the historic genetic isolation of northeastern Siberian sled dogs. This contrasts with the northwestern Siberian dog populations that appear to have experienced multiple admixture episodes from the Eurasian Steppe and Europe over the past few thousand years (Feuerborn et al. 2021). However, by the late 1700s, Russian expansion had reached the far northeast, and by the early 20th century, Soviet collectivization led to significant conflict in this region (Kuzina 2022). Many affluent farmers, who were also the dog breeders, were killed (Crane 1977), resulting in a loss of knowledge and skills related to reindeer herding and dog sledding (Kuzina 2022). Around the same time, a large number of Siberian dogs were imported to Alaska to compete in sled dog races against the much larger native Alaskan sled dogs and other dogs brought to Alaska during the gold rush. A subset of these Siberian dogs, either imported directly from Siberia to Alaska between 1908 and 1930 or directly descended from such imports, was recognized as the foundation stock of the Siberian Husky breed by the American Kennel Club (AKC) in 1930.

By the mid-20th century in Russia, dog sledding was decreed as economically inefficient and damaging to the fisheries (Kuzina 2022). Many sled dogs were killed or interbred with foreign cultured breeds to produce larger dogs that could pull heavier loads and hunt and retrieve (Handford 1998; Kuzina 2022). However, the Chukchi people are believed to be the most likely to have maintained their sled dogs free of foreign admixture, particularly in isolated villages along the northeastern coast where the harsh climate also influences survival (Thomas and Thomas 2015; Potseluyeva 2022). By 1970, the sled dog population in eastern Siberia had collapsed due to modern transportation, a decline in hunting and fishing, food shortages, and disease introduced by foreign dogs (Bogoslovskaya 2022). Russian cynologists did not recognize the Chukotka sled dog as a breed, which, combined with the decree and efforts to interbreed them with foreign nonsled dogs, may suggest local extirpation or genomic replacement over the past century. In 1988, a survey of dogs in Chukotka districts and villages revealed 1,594 dogs remained (compared to 50,700 in 1937 in Kamchatka alone), but only about 400 were likely of Arctic origin with the rest considered mixes with varying amounts of sled dog ancestry (Bogoslovskaya 2022). Attempts to restore the breed, which was later recognized as the Chukotka Sled Dog (CSD) by the Russian Kynological Federation in the late 20th century, commenced in the 1980s.

The recognition of a breed by a kennel club primarily serves two purposes: gene pool isolation and pedigree record keeping. By prohibiting crossbreeding between recognized breeds, kennel clubs establish artificial breed barriers that ensure gene pool isolation and provide a detailed record of ancestry for registered dogs. Consequently, all the genetic diversity a kennel club-recognized breed will ever have is in the gene pool at breed formation, barring any permitted studbook additions or unpermitted crossbreeding and subsequent parentage and pedigree falsification. Here, we investigate the impact of breed barriers on the genetic health of the Siberian Husky, and we test for population differentiation within the breed based on how the dogs are used by humans (show, sled show, racing, pet, or Seppala). We define show dogs as those who compete in conformation shows and are generally bred to a written kennel club standard. Sled show dogs similarly compete in conformation but are also used for sledding and other mushing sports. Racing dogs are bred for function (e.g. to pull sleds for touring kennels or to participate in sled dog races) and may or may not be bred with a specific aesthetic vision. Pet dogs are typically bred for the sole purpose of selling pets to the public (and do not include dogs from the other populations who are also used as pets). Lastly, the Seppala dogs trace their history back to several Siberian dogs that were sold to a single kennel in the 1930s and a split between breeding philosophies. Recently, the Seppala Siberian population has been elevated to breed status in the Continental Kennel Club—although many are still registered as Siberian Huskies in other kennel clubs around the world.

To assess the impact of closed gene pools and breed barriers on genetic health, we leveraged the Zoonomia alignment of 240 mammals (Zoonomia Consortium 2020; Christmas et al. 2023) to infer damaging variants and measured runs of homozygosity (ROH) to assess levels of genetic diversity, with the aim of aiding the preservation of these evolutionarily unique dog lineages. We also used the Siberian Husky as a model to evaluate the genomic legacy of the Eurasian Arctic lineage using genome-wide single nucleotide polymorphism (SNP) data from 358 canids including 280 registered Siberian Huskies and 5 modern CSDs from Russia (see Materials and Methods). We analyzed these genomic data to investigate evolutionary relatedness, population structure, and West Eurasian admixture. We evaluated shared ancestry between modern sled dogs and ancient Arctic dogs, and we reconstructed the deep population history of Arctic dogs using fossil calibrations to estimate divergence times.

Results

Population Structure

To investigate the legacy of the northeast Eurasian Arctic lineage in modern dogs, we first performed a genomic survey to examine levels of isolation and admixture in the Siberian Husky (hereafter Siberian). Principal component analysis (PCA) (Fig. 1), ADMIXTURE (Fig. 2), and phylogenetic reconstruction (Fig. 3a) reveal significant population structure and suggest artificial isolation within the breed based on how they are used by humans has led to substantial genetic differentiation over small evolutionary timescales (<100 years). These populations are found in a near-continuous cline on the PCA (Fig. 1) with the show and Seppala populations forming divergent genetic clusters. ADMIXTURE (Alexander et al. 2009; Alexander and Lange 2011) results also suggest gene flow between modern populations of Siberian Huskies is low with four distinct genetic clusters in the Siberian breed: show, pet, Seppala, and racing (Fig. 2). Other than the sled show population, a few Siberians show a small amount of ancestry from other Siberian populations, but for the most part cluster into their respective owner-reported lineage. However, the sled show dogs did not group together using either method and instead clustered with show, pet, racing or a mixture of all other Siberian populations.

Fig. 1.

PCAs of Arctic and West Eurasian dogs. Each point represents an individual dog. Colors signify owner-reported population or breed. a) PCA of all breeds included in this study, including two projected ancient dogs (Zhokhov and AL3194), using the balanced data set (n = 91). The two main principal components explain ∼29% of the observed genetic variation. b) PCA of all Siberian Husky (show, sled show, pet, racing, and Seppala populations) and Alaskan sled dogs with closely related individuals removed (kinship coefficient < 0.088; n = 218). Numbers denote the ten subsampled individuals per population used in the balanced data set. The ellipse represents the 99% confidence interval assuming a multivariate t-distribution. The two main principal components explain ∼40% of the observed genetic variation. Starting in the lower left-hand corner, images are of dogs in the show, pet, racing (firmly inside ellipse) (copyright owner Megan Moberly), Seppala (copyright owner Louise Evans), racing (outside of ellipse), and Alaskan sled dog (copyright owner Jaye Foucher) populations.

Fig. 2.

Barplot showing individual admixture proportions inferred by ADMIXTURE (Alexander et al. 2009; Alexander and Lange 2011) for select values of K using the balanced data set of ten dogs per population (five for CSD and nine for Greenland) with closely related individuals removed (kinship < 0.088). Each vertical bar represents an individual dog. Colors represent assignment to K source populations. The ancient precontact dog AL3194 clusters with Alaskan and Greenland sled dogs. The ancient Zhokhov dog shares most of its ancestry with Alaskan and Greenland sled dogs and also shares 22% of its ancestry with modern Siberian dogs in K = 8. Pet, racing, Seppala, show, and sled show are populations within the Siberian Husky breed. Barplot was visualized in StructuRly v.0.1.0 (Criscuolo and Angelini 2020).

Fig. 3.

Evolutionary relationships and admixture modeling in the Siberian Husky and other breeds. a) Cladogram showing the phylogenetic relationships between dogs in the balanced data set inferred using a GTR model and the approximately maximum-likelihood method in FastTree v2.1.11 (Geneious version 2023.0.4; available from https://www.geneious.com). Ancient Yana wolf was used to root the tree. Bootstrap support values are shown above the branches. b) Admixture graph inferring the demographic history using qpGraph in ADMIXTOOLS v2.0.0 (Patterson et al. 2012). Dotted lines and percentages reflect admixture events and ancestry proportions. Branch lengths and numbers denote genetic drift units (f2 statistics) multiplied by 1,000. The best-fitting model shown here is the one with the lowest (negative) likelihood score (6.85) and worst residual z-score (WR = 1.08) with eight admixture events. c) The European ancestry proportions (alpha) are shown as points and were computed using the ratio of f4(GSP:Wolf; X:Greenland)/f4(GSP:Wolf; GSD:Greenland). Vertical and horizontal lines denote three standard errors. Golden Retriever, a European breed, was included as a positive control. Estimates with an absolute Z-score ≥ 3 were considered significant and are shown in color (gray is not significant).

To test for complex population structure, we identified the number of optimal genetic clusters using k-means, which were subsequently used in a discriminant analysis of principal components (DAPC). The show, sled show, pet, and racing Siberian populations all grouped together into a single cluster, and the Seppala population formed a unique cluster (supplementary figs. S1 to S4, Supplementary Material online); however, half of the racing individuals (Racing 1, 3, 5, 7, and 10) and four of the CSDs were assigned to the Alaskan/West Eurasian genetic cluster (supplementary figs. S3 and S4, Supplementary Material online). Yet, when we increased k to 9, the pet, racing, and show Siberians, as well as the West Eurasian dogs, also formed their own distinct genetic clusters. Additionally, at k = 9, all Siberians were also assigned to the correct owner-reported population (supplementary fig. S4b, Supplementary Material online) with the exception of the sled show Siberians, which clustered with show, racing, or pet populations. The four CSDs still clustered with the Alaskan sled dogs, and one CSD clustered with the pet Siberian population.

The placement of the 99% confidence ellipse around the Siberian Husky breed with closely related individuals removed in Fig. 1b excludes 13 racing Siberians found between Siberian and Alaskan sled dog clusters. Gene flow between Alaskan sled dogs (hereafter Alaskans) and some racing Siberians is also evident in the ADMIXTURE analysis (Fig. 2). At K = 8, half of the racing Siberians (5/10) were found to share between 15% and 40% of their ancestry with Alaskans, and three of them also share a small amount of Malamute ancestry (<2%), which is also common in modern Alaskans (Huson et al. 2010). Four of these dogs (Racing 1, 3, 5, and 7) were found outside the confidence ellipse on the PCA in Fig. 1b, while the fifth (Racing 10) was just inside the ellipse, and all five of these dogs clustered with the Alaskan/West Eurasian dogs in the DAPC (supplementary figs. S3 and S4, Supplementary Material online).

Consistent with ADMIXTURE, phylogenetic reconstruction also demonstrated four genetically distinct populations in the breed: show, pet, Seppala, and racing (Fig. 3a). However, while the show, pet, and Seppala populations were monophyletic, the racing population of Siberians was paraphyletic: the five individuals identified as having Alaskan ancestry in the ADMIXTURE analysis (Fig. 2) were placed as an outgroup to the rest of the breed. These same five dogs clustered with the Alaskan and West Eurasian dogs in the DAPC (supplementary figs. S3 and S4a, Supplementary Material online) and fell outside, or in one case just inside, the 99% confidence ellipse on the PCA (Fig. 1b). They also formed their own clade despite substantial geographic separation between these individuals (three from North America, one from Europe, and one from Oceania) and no recent shared kinship (kinship coefficient < 0.088).

West Eurasian Admixture in the Arctic Lineage

To better understand the complex evolutionary history of Siberians and how it more broadly relates to that of other Arctic and ancient dogs, we reconstructed the population history using qpGraph (Fig. 3b) and qpAdm (supplementary table S1, Supplementary Material online) as implemented in ADMIXTOOLS v2.0.0 (Patterson et al. 2012; Maier et al. 2023). The Siberian breed was modeled with substantial drift between populations, which is consistent with our previous results. Significant European ancestry was modeled in the racing and Seppala populations (Fig. 3b). QpAdm, which tests for plausible ancestral source populations, also supported a model of significant European admixture in these populations (P > 0.05) and inferred similar admixture proportions (supplementary table S1, Supplementary Material online). In contrast, the show and sled show populations of Siberian Huskies and Greenland sled dogs were not modeled with significant admixture from any of the included or related breeds or unsampled ghost lineages.

Next, to test for gene flow with West Eurasian breeds that have previously been found to be in the genetic makeup of modern Alaskans (Huson et al. 2010; Thorsrud and Huson 2021), we tested all dogs in our data set for European introgression using Patterson's D-statistic (Fig. 4; supplementary figs. S5 and S6, Supplementary Material online), f3 statistics (supplementary fig. S7 and table S2, Supplementary Material online), and f4 ratio (Fig. 3c; supplementary table S3, Supplementary Material online) methods. We discovered significant introgression with multiple West Eurasian breeds in all but one of the Alaskans in our analysis (Fig. 4a; supplementary fig. S6, Supplementary Material online). The Alaskan sled dogs were also shown to have ∼45% European ancestry in the f4 ratio test (|Z| = 14.811) (Fig. 3c; supplementary table S3, Supplementary Material online). Thus, if gene flow were from Alaskans to the racing Siberians as suggested by the ADMIXTURE analysis, we should detect significant introgression with at least some West Eurasian breeds in the Siberian dogs sharing ancestry with Alaskans.

Fig. 4.

Individual D-statistics showing West Eurasian introgression in the Arctic breeds. a) A summary of D-statistics for the first ten individuals per population using (((Greenland3, Arctic Dog) European) Wolf). All dogs are shown in supplementary fig. S6, Supplementary Material online. Points show the D-statistic and vertical and horizontal lines denote three standard errors. Progressively more negative D values indicate that the Arctic dog shares increasingly more derived alleles with the European breed, which signifies introgression. Values with |Z| ≥ 3 were considered significant and are shown in color, whereas values with |Z| < 3 are not significant and are shown in gray. b) A summary of West Eurasian introgression by population as determined by D-statistics across all dogs in supplementary fig. S6, Supplementary Material online. Numbers represent the number of individual dogs showing significant introgression (|Z| ≥ 3) with that European breed.
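The sign convention described in this caption can be made concrete with a small calculation. The following is a minimal sketch, assuming per-SNP allele frequencies for the four populations in the tree (((P1, P2), P3), Outgroup) and an ADMIXTOOLS-style definition of D in which D < 0 means that P2 shares more alleles with P3 than P1 does. The function names, the simulated frequencies, and the simple equal-weight block jackknife are illustrative assumptions, not the pipeline used to produce Fig. 4.

```python
import math
import random

def d_statistic(p1, p2, p3, p4):
    """D for the tree (((P1, P2), P3), P4) from per-SNP allele frequencies.

    Assumed sign convention: D < 0 when P2 shares more alleles with P3 than P1 does.
    """
    num = sum((a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(p1, p2, p3, p4))
    return num / den

def jackknife_z(p1, p2, p3, p4, n_blocks=20):
    """Return (D, Z) using a delete-one jackknife over contiguous SNP blocks."""
    sites = list(zip(p1, p2, p3, p4))
    size = math.ceil(len(sites) / n_blocks)
    blocks = [sites[i:i + size] for i in range(0, len(sites), size)]
    estimates = []
    for i in range(len(blocks)):
        kept = [s for j, blk in enumerate(blocks) if j != i for s in blk]
        estimates.append(d_statistic(*zip(*kept)))
    m = len(estimates)
    mean = sum(estimates) / m
    se = math.sqrt((m - 1) / m * sum((e - mean) ** 2 for e in estimates))
    d = d_statistic(p1, p2, p3, p4)
    return d, d / se

def clip(x):
    """Keep a simulated allele frequency inside [0, 1]."""
    return min(max(x, 0.0), 1.0)

# Simulated frequencies (not study data): P2 receives 20% of its ancestry from P3.
random.seed(1)
reference, admixed, european, wolf = [], [], [], []
for _ in range(5000):
    base = random.uniform(0.05, 0.95)
    dog = base + random.gauss(0, 0.1)
    arctic = dog + random.gauss(0, 0.1)
    eur = clip(dog + random.gauss(0, 0.1))
    unadmixed = clip(arctic + random.gauss(0, 0.05))
    reference.append(clip(arctic + random.gauss(0, 0.05)))   # P1
    admixed.append(0.8 * unadmixed + 0.2 * eur)              # P2
    european.append(eur)                                     # P3
    wolf.append(clip(base + random.gauss(0, 0.1)))           # outgroup

d_value, z_value = jackknife_z(reference, admixed, european, wolf)
```

In this toy setup the admixed population yields a negative D with a large |Z|, mirroring how the colored points in panel a) are read; real analyses would use genome-wide blocks defined by genetic distance rather than SNP counts.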

We found ∼17% European ancestry (|Z| = 6.041) in the racing Siberians found outside the confidence ellipse on the PCA (Fig. 3c; supplementary table S3, Supplementary Material online), and significant introgression with German Shorthaired Pointer (GSP), German Shepherd (GSD), Saluki, and Golden Retriever in this population (|Z| ≥ 3.925) (supplementary table S4, Supplementary Material online), which is consistent with the qpGraph and qpAdm analyses. These outlier Siberians also share significantly more derived alleles with European breeds compared to all other populations of Siberians (supplementary fig. S5, Supplementary Material online). Additionally, all dogs sharing ancestry with Alaskan sled dogs in the ADMIXTURE (Alexander et al. 2009; Alexander and Lange 2011) analysis in Fig. 2 (K = 8) also have significant introgression with at least one of the European breeds included in this analysis (Fig. 4a), providing evidence for Alaskan or West Eurasian to Siberian gene flow in these dogs. Hereafter, we will refer to these outlier racing Siberians as the “admixed racing” population.
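The f4 ratio used for these ancestry proportions (defined in the caption of Fig. 3c) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: allele frequencies are simulated rather than taken from the study, the function names `f4` and `f4_ratio` are hypothetical, and the target population is constructed as an exact 17%/83% mixture so that the estimator's behavior is easy to verify.

```python
import random

def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    return sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)) / len(a)

def f4_ratio(a, outgroup, target, b, c):
    """alpha = f4(A, O; X, C) / f4(A, O; B, C).

    Estimates the share of the target's (X) ancestry related to B,
    with the remainder related to C. A is a population related to B.
    """
    return f4(a, outgroup, target, c) / f4(a, outgroup, b, c)

# Simulated allele frequencies (illustrative only, not study data).
random.seed(0)
n_snps = 2000
wolf = [random.random() for _ in range(n_snps)]
greenland = [random.random() for _ in range(n_snps)]
gsd = [random.random() for _ in range(n_snps)]
# GSP is simulated as a close relative of GSD.
gsp = [min(max(p + random.gauss(0, 0.05), 0.0), 1.0) for p in gsd]
# Target built as 17% GSD-related and 83% Greenland-related ancestry.
target = [0.17 * e + 0.83 * g for e, g in zip(gsd, greenland)]

alpha = f4_ratio(gsp, wolf, target, gsd, greenland)
```

Because the simulated target is an exact linear mixture, the ratio recovers the mixing proportion; with real data, drift and sampling noise make a standard error (for example from a block jackknife) essential for interpreting alpha.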

Similarly, the racing and Seppala populations of Siberians were found to have significant introgression with GSP (|Z| > 3.261) and Saluki (|Z| > 3.566) (supplementary fig. S5 and table S4, Supplementary Material online) and f4 ratio tests showed ∼10% European ancestry (Fig. 3c; supplementary table S3, Supplementary Material online) in both populations exclusive of the admixed racing Siberians found outside of the confidence ellipse in the PCA, which is also consistent with the admixture model and proportions inferred by qpGraph (Fig. 3b) and qpAdm (supplementary table S1, Supplementary Material online). On an individual level, about half of the 111 racing and 24 Seppala lineage Siberians included in this analysis were found to have significant European introgression (Fig. 4b; supplementary fig. S6, Supplementary Material online). In contrast, only 5 out of the 73 show lineage and 4 out of the 50 sled show Siberians have significant European introgression (Fig. 4; supplementary fig. S6, Supplementary Material online), and neither of these populations was found to have significant Alaskan sled dog or European ancestry using ADMIXTURE (Fig. 2), f4 ratios (Fig. 3c; supplementary table S3, Supplementary Material online), D-statistics (supplementary table S4, Supplementary Material online), qpGraph (Figs. 3b and 5b), or qpAdm (supplementary table S1, Supplementary Material online). Similar to the modern Alaskans, all five of the modern CSDs included in this analysis have experienced admixture from European breeds, especially GSD (Figs. 4a and 5b and c; supplementary fig. S7, Supplementary Material online).

Fig. 5.

Fossil calibrated divergence time estimates and admixture modeling. a) A time-calibrated MCC tree using SNAPP v1.6.1 (Bryant et al. 2012), implemented in BEAST2 v2.7.5.0 (Bouckaert et al. 2014), using divergence time estimation methods of Stange et al. (2018). Labels to the right of the nodes are mean height node ages. The 95% HPD intervals for node ages are shown with gray bars. b) Admixture graph inferring demographic history using qpGraph in ADMIXTOOLS v2.0.0 (Patterson et al. 2012). Dotted lines and percentages reflect admixture events and ancestry proportions. Numbers on branches denote genetic drift/selection units (f2 statistics) multiplied by 1,000. The best-fitting model shown here has three admixture events with a score of 3.61 and WR = 1.16 and shows substantial ancient wolf ancestry in the Zhokhov/Greenland lineage. c) Pleistocene wolf introgression is also evident in the Greenland sled dog/Zhokhov lineage using TreeMix (Pickrell and Pritchard 2012) and is not found in other Arctic dog breeds. Residuals above 0 indicate the two breeds are more closely related than illustrated by the tree and thus may be admixed. d) A predicted phenotypic reconstruction of the ancient Zhokhov sled dog based on genotypes at various coat color and other trait loci (supplementary table S10, Supplementary Material online).

Ancient Arctic Ancestry

The Greenland sled dog was found to be the closest extant relative included in our analysis of the Zhokhov and Port au Choix precontact dog lineages (Figs. 3a and 5b and c; supplementary figs. S7 to S10 and tables S2 and S5 to S8, Supplementary Material online). ADMIXTURE analysis demonstrated a close affinity of the 4,000-year-old Port au Choix precontact dog (AL3194) with the American sled dogs, with the majority (72%) of its ancestry clustering with the Alaskans and 28% with the Greenland sled dog (K = 8; Fig. 2). The Zhokhov dog also clustered mostly with the American sled dogs (34% Alaskan and 44% Greenland), but 22% of its ancestry was shared with Siberian dog and CSD (K = 8; Fig. 2). All modern Arctic dogs were found to share a significant proportion of ancestry with the Zhokhov/American precontact dog lineages, but the Greenland sled dog and Malamute derived far more (70% to 72%) than the Siberian lineages (28% to 42%) (supplementary table S6, Supplementary Material online). In the Siberian dogs, the Seppala and modern CSD share the least ancestry with the ancient Zhokhov sled dog (28%), and the show and sled show populations share the most (42%) (supplementary table S6, Supplementary Material online). Similar to the modern European dogs, the Samoyed, which was found to have mostly West Eurasian ancestry (Fig. 3a to c; supplementary figs. S3 and S4, Supplementary Material online), was not found to share significant ancestry with the ancient sled dogs using f4 ratio (supplementary table S6, Supplementary Material online) but does share an excess of derived alleles with the Zhokhov dog compared to European dogs using D-statistics (supplementary table S7, Supplementary Material online).

Deep Population History

To estimate divergence times within the Arctic clade, we first utilized fossilized birth–death (FBD) (Heath et al. 2014) and optimized relaxed clock (Douglas et al. 2021) models in a Bayesian phylogenetic inference in BEAST2 (Bouckaert et al. 2014, 2019). This analysis (supplementary fig. S10, Supplementary Material online) strongly supported a topology congruent with our TreeMix (Pickrell and Pritchard 2012) and FastTree (as implemented in GENEIOUS PRIME v2023.0.4; https://www.geneious.com) results (Figs. 3a and 5c; supplementary fig. S9, Supplementary Material online), indicating the presence of at least two Arctic dog lineages in northeast Arctic Siberia at the end of the Pleistocene/early Holocene (95% Highest Posterior Density [HPD]: 9,700 to 14,300 years bp). Divergence timing estimates of the Arctic lineages using SNAPP (Bryant et al. 2012) and the methods of Stange et al. (2018) also confirmed these results (Fig. 5a), placing the divergence between these two modern Arctic lineages at the end of the Pleistocene ∼12,800 years ago.
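For readers less familiar with the 95% HPD reported above, it is the narrowest interval that contains 95% of the posterior samples for a node age. The sketch below illustrates that definition only; the function name `hpd_interval` and the simulated "posterior" draws are assumptions, and in practice such intervals are read from the summary tools distributed with BEAST2 rather than computed by hand.

```python
import math
import random

def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing a fraction `mass` of the samples."""
    x = sorted(samples)
    n = len(x)
    k = max(1, math.ceil(mass * n))       # samples that must fall inside
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: x[i + k - 1] - x[i])
    return x[start], x[start + k - 1]

# Simulated, right-skewed node ages in years (illustrative only, not study output).
random.seed(42)
posterior_ages = [9000 + random.gammavariate(4.0, 800.0) for _ in range(5000)]
lower, upper = hpd_interval(posterior_ages, mass=0.95)
```

For skewed posteriors like this one, the HPD differs from the equal-tailed 2.5% to 97.5% interval, which is one reason divergence-time studies report it explicitly.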

Next, to investigate the complex population history of these divergent Arctic dog lineages, we explored ancient admixture using the find_graph() function of qpGraph (Fig. 5b) (Patterson et al. 2012; Maier et al. 2023). The best-fitting model of the f-statistics shows the Zhokhov dog sharing the majority of its ancestry with the Arctic dog lineage but ∼36% derived from ancient wolves. TreeMix (Pickrell and Pritchard 2012) also showed Pleistocene wolf ancestry in the Zhokhov and Greenland sled dogs, as well as European breed introgression in the Malamute, Alaskan, and Chukotka sled dogs (Fig. 5c). In contrast, we did not find any wolf introgression in a large pool of modern Siberian Huskies and Alaskan sled dogs or in the few Alaskan Malamutes and CSDs included in this study. This is consistent with D-statistics that showed significant Pleistocene wolf introgression in Zhokhov, Port au Choix precontact dog AL3194, and Greenland sled dogs, but not in other Arctic dogs (supplementary table S9, Supplementary Material online).

Finally, to investigate phenotypic similarities to modern sled dog breeds, we predicted the appearance of the ancient Zhokhov dog from its genome sequence. Based on these genotypes, combined with ASIP data from Bannasch et al. (2021) and IGF1-AS data from Plassais et al. (2022), the Zhokhov dog, which was sequenced from the second largest mandible discovered at the Zhokhov site, was likely of medium body size with a double coat, a black coat with tan points, a moderate amount of white spotting, and brown eyes (Fig. 5d; supplementary table S10, Supplementary Material online). This ancient dog carried recessive red (e1) at the MC1R gene and an allele that decreases pheomelanin intensity at the MFSD12 gene, which is common in modern sled dogs. Interestingly, the Zhokhov dog does not carry the variants for blue eyes or ancient red (eA), which are both commonly found in modern Siberian Huskies but not in Greenland sled dogs.

Genetic Health

Population Size

Like many dog breeds today, the modern Arctic breeds have low effective population sizes ranging from 41 in Malamutes to a high of 192 in Greenland sled dogs and 571 in pet Siberian dogs (Fig. 6a). Among the Siberian Huskies, the show and Seppala populations have the greatest amount of linkage disequilibrium (LD) and smallest effective population size (Ne = 79 and 47, respectively). While the Siberian Husky and Greenland sled dog have effective population sizes over 50, the Malamute and Seppala Siberians do not. All breeds tested here, with the exception of the pet Siberian population, have Ne < 200, down from ∼3,000 in the ancestral pool of the Siberian Husky and Alaskan Sled dog, and 500 in the Greenland sled dog, just 125 generations ago (Fig. 6a).

Fig. 6.

Summary of the genetic health across populations of the Siberian Husky and other Arctic breeds. Asterisks represent a significant difference between populations using the Wilcoxon test. Show, racing, Seppala, and pet are populations of the Siberian Husky. a) Effective population size (Ne) over time estimated from LD in modern Arctic breeds. The Seppala Siberian and Malamute have the lowest Ne today at 47 and 41, respectively. b) Boxplot distribution of FROH across Siberian Husky populations and other Arctic breeds. Horizontal lines represent the median, vertical lines represent the range, larger circles represent the mean, and dots represent outliers. The pet and racing FROH are not significantly different. c) Distribution of the frequency of mutated evolutionarily constrained positions (PhyloP FDR < 0.05) in the Siberian Husky. d) Distribution of the frequency of deleterious mutations in the Siberian Husky.

Inbreeding

Using the Siberian Husky and its distinct populations with varying levels of admixture as a model to investigate the impact of closed gene pools on genetic diversity, we found the relatively nonadmixed show population had the highest average proportion of their genome contained in ROH (FROH = 31%) (Fig. 6b; supplementary fig. S11, Supplementary Material online) and the lowest heterozygosity out of all Siberian populations (supplementary fig. S12, Supplementary Material online). In contrast, the racing population had a significantly lower average FROH of 18% (Wilcoxon test, P = 2.2 × 10^−16) and the highest observed heterozygosity out of all Siberian populations (supplementary fig. S12, Supplementary Material online). Within the Siberian breed, the proportion of pairs of dogs in each population that share segments of DNA that are identical by descent (IBD) ranges from a low of 31% in the racing population to a high of 90% in the Seppala population (supplementary table S11, Supplementary Material online). The Greenland sled dog also has a very high proportion of pairs of dogs sharing IBD segments (92%), in contrast with the Alaskan sled dog with only 7%.

Potentially Harmful Variants

Using the dog-referenced version of Zoonomia's 240 mammal Cactus alignment to annotate positions that were evolutionarily constrained in the Siberian Husky (Zoonomia Consortium 2020; Christmas et al. 2023), we found the Seppala and pet Siberian populations to have significantly higher frequencies of mutated evolutionarily constrained variants compared to the racing population (Fig. 6c). The Seppala population also had significantly more variants predicted to be deleterious using the SnpEff annotation (Cingolani et al. 2012) (Fig. 6d).

Discussion

Population Structure

Population structure is a consequence of nonrandom mating when isolated populations share fewer common ancestors than would be expected by chance. When dogs are used for different purposes, they become more specialized with traits that enhance that purpose. This results in increased isolation as humans tend to artificially select mates within a population due to a perceived disadvantage of losing gains when outcrossing to populations that have not been bred for their specific purpose. Therefore, it is unsurprising to find population structure within the Siberian Husky breed based on how they are used by humans, and this divergence between populations will continue to increase over time unless substantial gene flow between populations occurs.

West Eurasian Admixture in the Arctic Lineage

Divergence between populations can also be caused by gene flow with other breeds, and this likely explains the second cluster of racing Siberians pulling toward the Alaskans in the PCA in Fig. 1. These dogs share a significant portion of their ancestry with Alaskan sled dogs (Fig. 2) and modern European breeds (Figs. 3c and 4), which explains their placement on the PCA (Fig. 1), DAPC (supplementary figs. S3 and S4, Supplementary Material online), and phylogenetic tree (Fig. 3a) as European/Alaskan admixture would increase their genetic distance to the other Siberian populations. The finding of European breed introgression in about half of the Siberian dogs that compete in sled dog races, with notable European ancestry in both the Racing and Seppala populations, but minimal admixture in other Siberian populations (Figs. 3b and c and 4; supplementary tables S1, S3, and S4, Supplementary Material online), supports Alaskan or European breed gene flow into the Siberian Husky after breed formation.

Alaskan sled dog or European breed crosses in the Siberian Husky may have been used to introduce performance-enhancing alleles from faster and more heat-tolerant breeds, which would have been advantageous as sled dog racing expanded into warmer climates. The Alaskan sled dog, primarily bred for performance and lacking formal recognition by kennel clubs, has become a leading competitor in sled dog races, including the Iditarod, winning numerous events across various distances. Although their origins are initially from the native village dogs in Alaska, today they are mainly purpose bred for sprint and distance racing and have included many different breeds in their genetic makeup. Crosses to European breed groups such as sighthounds and pointers have been used to improve speed and heat tolerance in the Alaskan over the past century (Huson et al. 2012). They have evolved their own unique genomic breed signature (Huson et al. 2010; Thorsrud and Huson 2021), which is supported by our data, yet they still share a large proportion of their ancestry with ancient sled dogs (supplementary tables S6 and S7, Supplementary Material online).

The modern CSD was also found to have significant European ancestry (Figs. 3c, 4, and 5b, c; supplementary tables S1, S3, and S4, Supplementary Material online) on an Arctic genetic background, which could explain their proximity to the Alaskans on the PCA and DAPC (Fig. 1; supplementary figs. S3 and S4, Supplementary Material online). The systematic breeding of sled dogs in Russia for a larger freighting dog (Handford 1998), as well as Soviet collectivization and human migrations to eastern Siberia along with their western dogs (Bogoslovskaya 2022), is a likely explanation of the sizable amount of European introgression in these modern dogs. Additionally, two of the CSDs share a large proportion of ancestry with modern Siberians and Alaskans (Fig. 2) and one also shares IBD segments with a large number of modern show and pet Siberian dogs, suggesting recent admixture with Siberian Huskies. These findings suggest the modern Siberian Husky, with the exception of about half the Siberians that participate in races, may be the least-admixed Eurasian Arctic lineage remaining without significant recent West Eurasian admixture. However, it should be noted that we only sampled five CSDs from two different kennels. Additional sampling may reveal more isolated populations with less West Eurasian admixture.

Ancient Arctic Ancestry

Previous studies have demonstrated a continuous Arctic lineage of dogs for at least 9,500 years (Brown et al. 2013; Sinding et al. 2020; Feuerborn et al. 2021). We also found the 9,500-year-old sled dog from Zhokhov Island, Siberia shares significantly more derived alleles, ancestry, and drift with modern Arctic dogs compared to modern European dogs (Fig. 5b; supplementary fig. S8 and tables S6 and S7, Supplementary Material online), which recapitulates our phylogenetic data (Figs. 3a and 5c; supplementary fig. S10, Supplementary Material online) showing the close phylogenetic relationship between ancient and modern sled dogs. Similar to Feuerborn et al. (2021), we also found the Greenland sled dog as sister to the ancient Siberian and American sled dogs (Figs. 3a and 5c; supplementary fig. S10, Supplementary Material online). In agreement with this phylogenetic topology, we observed that the Greenland sled dog derives a large portion of its ancestry from this ancient lineage while modern Siberian dogs share far less ancestry and drift with these ancient dogs (Figs. 2 and 5b; supplementary fig. S8 and tables S2 and S6, Supplementary Material online). A recent study found this was likely due to admixture in the Siberian Husky and demonstrated significant European introgression in non-Greenland sled dog breeds thus concluding that the Greenland sled dog shares the most genomic ancestry with the Zhokhov lineage because they are the least admixed (Sinding et al. 2020). However, we found that European admixture is mostly present in a proportion of Siberian Huskies that are used for racing and other Siberian populations do not have significant recent West Eurasian introgression. Moreover, we show the Malamute has significant European ancestry (13% to 16%, Z ≥ 4.248; Fig. 3c and supplementary table S3, Supplementary Material online) and derives greater ancestry from Zhokhov than the relatively nonadmixed show population of Siberian Husky (70% vs. 42%; supplementary table S6, Supplementary Material online). Thus, the recent West Eurasian admixture is likely not a sufficient explanation for why the Greenland sled dog and Malamute share more ancestry with the ancient Zhokhov dog than the Arctic dogs from Siberia. We thus think that the greater genetic similarity between Greenland dogs and the ancient Zhokhov dog, with reference to the Siberian Husky, does not reflect admixture levels in modern Siberian dogs, but the order in which these dog lineages split from each other as suggested by their phylogenetic placement.

Deep Population History

To test this hypothesis of ancient population substructure, we investigated divergence timing in the Arctic lineage using ancient dogs to calibrate relative time to an estimate of absolute age. Importantly, we performed extensive admixture analysis on our Arctic dog data set and included only those dogs and breeds showing no recent West Eurasian introgression in our investigation of divergence timing. Admixture between clades will confound phylogenomic analyses leading to an over- or underestimation of internal node ages and has been shown to be relatively common in modern pedigreed dogs (Meadows et al. 2023), which we have also demonstrated in this study.

Our results suggest an ancient split between Siberian and Greenland/Zhokhov sled dog lineages at the end of the Pleistocene (Fig. 5a; supplementary fig. S10, Supplementary Material online), which coincides with abrupt climatic shifts occurring as the Earth transitioned from a glacial to a warmer and wetter interglacial cycle characterized by habitat loss and subsequent large megafaunal extinctions (Monteath et al. 2021). This may have initiated or accelerated the utility of canids for long-distance transportation and hunting and hauling smaller game such as caribou and various sea mammals driving the selective breeding of dogs for different purposes. Our hypothesis of an earlier split of the Siberian from the Zhokhov/Greenland lineage is also supported by an Arctic/American genomic shared drift cline (supplementary fig. S8, Supplementary Material online) observed from Greenland across the Bering Strait through Siberia to Europe. Additionally, the Zhokhov and Greenland split occurred just before or during the flooding of the Bering land bridge that hampered and eventually prevented biotic migration ∼11,000 to 9,500 years ago (Elias et al. 1996) and likely represents a later dispersal of dogs into the Americas after the initial split found in previous studies (Ní Leathlobhair et al. 2018; Ameen et al. 2019). Our phylogenetic analyses also suggest modern Greenland dogs descend from a lineage that split more recently from the Port au Choix precontact dog compared to all other modern Arctic lineages, which is supported by an excess of shared derived alleles (supplementary table S8, Supplementary Material online) and greater shared genetic drift (supplementary table S2 and fig. S8, Supplementary Material online) with Greenland dogs compared to every other modern Arctic breed.

Pleistocene Siberian wolf ancestry has previously been found in the Arctic dog lineage (Skoglund et al. 2015; Ní Leathlobhair et al. 2018; Sinding et al. 2020; Feuerborn et al. 2021; Ramos Madrigal et al. 2021). Here, however, we find this ancient wolf admixture in Greenland, Zhokhov, and precontact AL3194 dogs, but we did not find evidence for excess allele sharing in a large pool of modern Siberian Huskies or other sled dogs compared to all other tested Arctic and European breeds, and neither Siberians nor CSDs were modeled with Pleistocene wolf ancestry using qpGraph or Treemix (Fig. 5b and c; supplementary table S9, Supplementary Material online). Previous studies have also shown a strong signal of Yana wolf admixture in Greenland and Zhokhov dogs, but the evidence for ancient wolf introgression in Siberian dogs was much weaker (Sinding et al. 2020; Feuerborn et al. 2021). Combined with our results, we think the most parsimonious explanation is that Pleistocene wolf admixture occurred after the Siberian lineage split off from the Zhokhov/Greenland lineage. It is possible that gene flow from Pleistocene wolves could have been used by ancient North Siberian people to increase size and hunting ability in the ancestors of the Zhokhov/Greenland lineage, which may have been useful as the climate warmed, megafaunal animals went extinct, and traditional food sources likely became scarce, while still maintaining lineages of sled dogs for transportation and other activities that are better performed by dogs than wolf hybrids. We do not think this finding can be explained by variable proportions of distinct eastern and western wolf progenitor ancestries within the Arctic dog clade since Greenland sled dogs and Siberian Huskies had similarly low proportions of the western wolf progenitor in Bergström et al. (2022), and none of the Arctic dogs in our data set other than Greenland and Zhokhov show an affinity to the Pleistocene wolf even when compared to modern West Eurasian dogs. Alternatively, it is also possible this pattern may result from unsampled admixture in the Siberian dogs. It is important to note that this ancient admixture with wolves in the Greenland/Zhokhov lineage may also confound our divergence time estimates. Additional analyses including more ancient Siberian dog and wolf genome sequences will help refine these divergence time estimates in the future.

Genetic Health

Population Size

Among the Siberian dogs, the show and Seppala populations have the greatest amount of LD and smallest effective population size (Ne = 79 and 47, respectively). This is probably due to the large amount of inbreeding in these populations, which leads to a sustained increase in LD and a reduction in effective population size (Fig. 6a). The recent increase in Ne over the past ten generations in pet Siberian Huskies may be due, in part, to the increased popularity of the breed among pet owners and hobby breeders; however, the shape of the curve also suggests recent admixture and thus these data should be interpreted with caution. In terms of conservation and viable populations, we aim for a minimum effective population size of 50 in the short term to reduce loss of heterozygosity and inbreeding depression and a long-term Ne > 500 to maintain evolutionary potential and reduce the random effects of genetic drift. These numbers were based on observations of domestic animals, which demonstrated that selection for performance and fertility could not overcome inbreeding depression if the inbreeding coefficient was increasing by more than 1% per generation (Franklin 1980).
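The correspondence between the Ne of 50 guideline and the 1% per generation threshold can be illustrated with the textbook relation for an idealized random-mating population, in which the expected increase in inbreeding per generation is 1/(2Ne). The short sketch below applies that relation to the Ne values reported here. It is an illustration only: the function names are hypothetical, and the idealized relation is an assumption rather than the method used to estimate Ne in this study.

```python
def inbreeding_rate(ne: float) -> float:
    """Expected per-generation increase in inbreeding (delta F) for an
    idealized population of effective size ne: delta F = 1 / (2 * ne)."""
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    return 1.0 / (2.0 * ne)


def heterozygosity_retained(ne: float, generations: int) -> float:
    """Expected fraction of heterozygosity remaining after a number of
    generations at constant ne: (1 - 1/(2*ne)) ** generations."""
    return (1.0 - inbreeding_rate(ne)) ** generations


if __name__ == "__main__":
    # Ne values taken from the text; the labels are only for display.
    for label, ne in [("Malamute", 41), ("Seppala", 47),
                      ("guideline", 50), ("show", 79), ("long-term", 500)]:
        print(f"{label:10s} Ne={ne:4d}  delta F={inbreeding_rate(ne):.4f}")
```

Under this relation, Ne = 50 corresponds to exactly 1% per generation, so populations below that size (Malamute and Seppala here) would be expected to exceed the threshold.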

Inbreeding

Pedigree-based inbreeding coefficients (i.e. COI) provide an estimate of the expected proportion of the genome that is IBD and contained in ROH. Using genome-wide DNA markers, we can not only determine the proportion of the genome that is in ROH, but we can also use the length of the ROH to infer when two lineages shared a common ancestor with longer segments resulting from more recent inbreeding. The racing population has the lowest amount of inbreeding out of all Siberian populations while the show population has the highest (Fig. 6b). Importantly, about half of the inbreeding in the modern Siberian Husky is due to recent inbreeding within the last ∼6 generations (supplementary fig. S11, Supplementary Material online). In contrast, the majority of inbreeding in the Greenland sled dog is much older. The number of ROH in individuals as a function of ROH size suggests an extreme historical bottleneck in the Greenland sled dogs with a large number of small fragments (supplementary fig. S13, Supplementary Material online).
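To make the two quantities in this paragraph concrete, the sketch below computes FROH as the summed ROH length divided by the autosomal genome length, and converts a ROH length into an approximate number of generations since the common ancestor. The conversion uses the commonly cited approximation that the expected segment length is 100/(2g) cM for an ancestor g generations back. The function names, the example coordinates, and the choice of genome length are hypothetical, and this approximation is an assumption, not necessarily the procedure used in this study.

```python
from typing import Iterable, Tuple


def froh(roh_segments: Iterable[Tuple[int, int]], autosome_length: int) -> float:
    """Fraction of the autosomal genome in runs of homozygosity.
    roh_segments holds (start, end) positions in the same units as
    autosome_length (for example base pairs)."""
    if autosome_length <= 0:
        raise ValueError("autosome_length must be positive")
    total = sum(end - start for start, end in roh_segments)
    return total / autosome_length


def approx_generations(segment_length_cm: float) -> float:
    """Approximate generations since the common ancestor for a ROH of the
    given genetic length, using expected length = 100 / (2 * g) cM."""
    if segment_length_cm <= 0:
        raise ValueError("segment length must be positive")
    return 100.0 / (2.0 * segment_length_cm)


if __name__ == "__main__":
    # Toy example: two ROH covering 30 of 100 units gives FROH = 0.30.
    print(froh([(0, 10), (20, 40)], 100))
    # Longer segments map to more recent ancestors.
    for cm in (2.0, 10.0, 25.0):
        print(cm, "cM ->", approx_generations(cm), "generations")
```

The inverse relation between segment length and generations is what allows long ROH to be read as recent inbreeding and many short ROH as an older bottleneck.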

The large proportion of genomes contained in ROH in these populations is also underscored by a large amount of IBD sharing between pairs of individuals within the Siberian, Malamute, and Greenland sled dog populations (supplementary table S11, Supplementary Material online). Nearly every pair of dogs in the Seppala and Greenland populations shares a proportion of their genome coming from a common ancestor, thereby making it challenging to avoid the mating of relatives and increasing the risk of inbreeding depression. Around two-thirds of the pairs of dogs in the show, sled show, and pet populations share IBD segments indicating at least one recent common ancestor. In contrast, only 31% of the pairs of dogs in the racing population share IBD segments, which may be explained by the recent admixture in this population which decreases IBD sharing (Blondeau Da Silva et al. 2024).

Potentially Harmful Variants

Mutations in positions that are evolutionarily constrained across species are enriched for disease-causing variants (Sullivan et al. 2023). Similarly, mutations that result in loss of protein function may result in disease phenotypes. Thus, the frequency of these types of variants in a population is informative for genetic health. Fewer potentially harmful variants in the racing, show, and sled show Siberian populations (Fig. 6c and d) may suggest that strong selection for form and function, combined with effective population sizes > 50, plays a role in purging genomes of some harmful variants and reducing genetic load. Alternatively, relaxed selection pressure in the pet population may allow variants that would typically reduce fitness without human intervention or medical care to persist. The Seppala sled dog likely represents a case of a critically low population size (Ne = 47) that makes it susceptible to the random effects of genetic drift and reduces the efficiency of selection. While the Seppala Siberian is genetically distinct from other Siberian Huskies, further breeding isolation may be deleterious to the health of the population based on these results.

Conclusion

European colonization, Soviet expansion, and intentional crossbreeding have led to West Eurasian breed introgression in modern Arctic dog populations, potentially threatening the genomic identity of, or even replacing, unique dog lineages that have existed for thousands of years (Handford 1998; Ní Leathlobhair et al. 2018; Sinding et al. 2020; Feuerborn et al. 2021; Bogoslovskaya 2022). Here, we found that the Siberian Husky (excluding around half of the dogs that participate in sled dog races) and Greenland sled dog represent two of the least-admixed remaining populations of modern Arctic dogs, descending from ancient dog lineages with a continuous history since the late Pleistocene. However, modern breed barriers and the isolation of small, purpose-specific populations have led to significant differentiation and moderate levels of inbreeding in the Siberian Husky over short evolutionary timescales, presenting challenges to the long-term conservation of this lineage. As the struggle to preserve distinct evolutionary lineages while maintaining genetic health intensifies across pedigreed dogs, understanding the genomic history is essential for developing effective policies and best practices for breed management.

Materials and Methods

Sample Collection

Three hundred and forty-four canids were sampled either by collecting blood (n = 114) or buccal swabs (n = 26) or by obtaining raw genomic data from commercially available Embark DNA testing directly from owners (n = 204). This includes 280 AKC or equivalent kennel club-registered Siberian Huskies, 10 AKC-registered Alaskan Malamutes, 42 Alaskan sled dogs (sprint/distance/heritage), 5 CSDs, and 1 AKC-registered dog each from the following breeds: GSD, Golden Retriever, GSP, Samoyed, and Saluki. Raw SNP data were also obtained from two gray wolves in the western United States and Arctic Canada. Prior to DNA or raw data collection, all owners completed an online survey to provide dog information and consent to use the sample. Owners were asked to report how dogs were used (pet, show, and racing/working) and their lineage (pet, show, racing/working, and Seppala). Sled show describes any dog reported as both show and racing/working. Pet describes any dog bred solely as a companion dog and excludes dogs bred for show or racing/working but used only as a pet. The final data set for Siberian Huskies included 111 racing/working lineage, 24 Seppala lineage, 73 show lineage, 50 sled show, and 22 pet lineage dogs. Geographically, 236 Siberians were from North America, 40 from Western Eurasia, and 4 from Oceania. Alaskan sled dogs were from North America (37) and Europe (5).

Blood samples were collected from dogs at AKC events or at their home kennels in North America. Whole blood samples were collected from the cephalic vein in 3- to 5-mL EDTA or ACD tubes. Embark raw data were collected from volunteers via social media posts or through targeted sampling (owners were mailed an Embark DNA collection kit) to include global representation of lineages.

DNA Extraction and SNP Genotyping

For blood samples, genomic DNA was extracted using the standard Qiagen DNA Extraction Protocol. An Epoch microplate spectrophotometer was used to determine DNA purity and quantity. Cheek swabs were sent directly to Illumina for processing. All dogs were genotyped on the Embark customized Illumina CanineHD Whole-Genome BeadChip, which is commercially available for research and industry purposes. The data set for 344 canids included 237,211 variants with a genotyping rate of 0.913. To this data set, we added SNP data from the Port au Choix precontact dog (Ní Leathlobhair et al. 2018: AL3194) and diploid called genotypes from publicly available genomes from Greenland sled dogs (n = 9) (Sinding et al. 2020) and two ancient canids, Zhokhov dog (Sinding et al. 2020; Zh-03-97) and Yana wolf (Sinding et al. 2020; Y06-NP-18994). The Greenland, Zhokhov, and Yana wolf genome sequences were downloaded from the sequence read archive (Bioproject id: PRJNA608847). The reads were processed using the PALEOMIX pipeline v.1.3.4 (Schubert et al. 2014). After trimming the reads with AdapterRemoval 2.0 (Schubert et al. 2016), the reads were mapped to the dog reference genome, CanFam 3.1, with BWA aln v.0.7.15 (Li and Durbin 2009). Duplicate reads were discarded with Picard tools v.2.26.0 (https://broadinstitute.github.io/picard), genotype calling was performed with the HaplotypeCaller algorithm implemented in GATK v.4.3.0 (DePristo et al. 2011), and bases with base quality < 20 and reads with mapping quality < 40 were filtered. SNP data for the Port au Choix precontact dog AL3194 were downloaded from DRYAD (https://doi.org/10.5061/dryad.s1k47j4; Ní Leathlobhair et al. 2018).

Prior to merging the Port au Choix precontact dog with the rest of the data set, the initial genomic data set of 237,211 variants was filtered in PLINK v.1.9 (Purcell et al. 2007; Chang et al. 2015) to remove variants with missing call rates above 1%, minor allele frequencies below 1%, and singletons. One pet Siberian Husky was excluded for missing genotype data > 33%. Data were also pruned for variants in LD (--indep-pairwise 200 100 0.9), leaving 108,433 variants and a genotyping rate of 0.999. Next, all removed variants among the 41,605 variants from the precontact dog were added back to the filtered and pruned data set prior to merging the precontact dog, since this individual had lower coverage and we wanted to retain as many informative SNPs as possible for this dog. The final data set included 355 canids and 124,905 variants with a genotyping rate of 0.973. This data set was used for all downstream analyses unless otherwise noted.

To validate conclusions, we also merged two Vietnamese village dogs and one Andean fox from DRYAD (https://doi.org/10.5061/dryad.s1k47j4; Ní Leathlobhair et al. 2018) for a total of 358 canids. However, they were missing 74% and 67% of the SNP data, respectively, and thus were excluded from primary analyses.

A limitation of this study is its reliance on SNP array data, which restricted the number of variants available from ancient dogs and other publicly available sequences. This limitation may have impacted the statistical power of our analyses, especially those involving ancient sequences. While SNP array data may be subject to ascertainment bias (Albrechtsen et al. 2010), we do not anticipate this bias to significantly alter our results (Durand et al. 2011; Patterson et al. 2012; Suissa et al. 2024). The variants on the Illumina CanineHD beadchip used in this study were selected from a comprehensive discovery panel of 2.5 million SNPs derived from a diverse range of dog breeds and other canids in the Dog Genome Project (Lindblad-Toh et al. 2005). These variants were validated for use across 26 diverse dog breeds, including the Greenland sled dog. Therefore, any potential bias in SNP selection is likely to be consistent across Arctic lineages, potentially compensating for any proportional bias in branch lengths or ABBA and BABA patterns, and thereby not affecting the overall conclusions.

Population Structure and Admixture

PCA

Two data sets were explored using a PCA, and the results were visualized in the ggplot2 package in R v4.1.2. One PCA included individuals from all breeds in our data set, including the ancient samples, and was intended to explore population structure on a larger scale (Fig. 1a). Since related individuals can introduce uninformative clustering in downstream analyses, the 321 dog Siberian and Alaskan Husky data set was first filtered using --king-cutoff in PLINK v.2.0 (Purcell et al. 2007; Chang et al. 2015) to remove individuals related to the second degree or higher (>0.088 kinship coefficient), leaving 218 individuals. Next, to control for possible bias due to unbalanced sampling (Toyama et al. 2020), the first ten remaining randomly sorted individuals from each owner-reported population were subsampled to use in this analysis (hereafter referred to as the “balanced data set”). Data from ancient dogs and other breeds, a total of 31 individuals, were added to this balanced data set for a total of 91 dogs. Ancient dogs were projected onto the balanced PCA using smartsnp v1.1.0 (Herrando-Pérez et al. 2021).
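The balanced subsampling step described above (randomly sort the unrelated individuals, then keep the first ten per owner-reported population) can be sketched as follows. This is a minimal illustration: the function name, the seed, and the toy sample identifiers are hypothetical, and it assumes relatedness filtering has already been done upstream.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def balanced_subsample(samples: Sequence[Tuple[str, str]],
                       per_group: int = 10,
                       seed: int = 0) -> Dict[str, List[str]]:
    """Shuffle (sample_id, population) pairs, then keep the first
    per_group individuals encountered for each population."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    picked: Dict[str, List[str]] = defaultdict(list)
    for sample_id, population in shuffled:
        if len(picked[population]) < per_group:
            picked[population].append(sample_id)
    return dict(picked)


if __name__ == "__main__":
    # Toy input with unequal group sizes (identifiers are made up).
    toy = ([(f"show_{i}", "show") for i in range(15)]
           + [(f"racing_{i}", "racing") for i in range(30)]
           + [(f"pet_{i}", "pet") for i in range(12)])
    subset = balanced_subsample(toy)
    print({pop: len(ids) for pop, ids in subset.items()})
```

Capping every group at the same size keeps heavily sampled populations from dominating the principal components.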

A second PCA (Fig. 1b) was performed with PLINK v1.9 with the --pca flag using data from all Siberian Huskies and Alaskan sled dogs with close relatives removed (kinship < 0.088; n = 218) to examine population structure, genetic diversity, and gene flow within these groups.

ADMIXTURE

Unsupervised model-based clustering to test for population structure and to visualize genetic ancestry assuming two to nine clusters was performed on 86 individuals from the balanced data set excluding five European breeds due to unbalanced sampling (Saluki, GSP, GSD, Samoyed, and Golden Retriever) using the program ADMIXTURE (Alexander et al. 2009; Alexander and Lange 2011) and visualized in StructuRly v.0.1.0 (Criscuolo and Angelini 2020). The lowest CV value was found for K = 3.

DAPC

DAPC was conducted on the balanced data set using the adegenet package v.2.1.10 (Jombart 2008; Jombart et al. 2010; Jombart and Ahmed 2011) in R to identify genetic clusters and test for complex population structure. DAPC involves first using the k-means algorithm on the first 88 PCs to determine the optimal number of genetic clusters, varying the possible number from 2 to 10, and then assessing the best-supported model from the Bayesian information criterion (BIC) value for each value of K. The optimal number of genetic clusters (5) was determined using the inflection point and lowest BIC value in the plot of BIC values for each value of K (supplementary fig. S1, Supplementary Material online). The optimal number of PCs (6) to use in DAPC was determined using the optimized a-score (supplementary fig. S2, Supplementary Material online).

D and f-Statistics

To explicitly test for gene flow, Patterson's D-statistic (Green et al. 2010; Durand et al. 2011) was calculated using either single individuals or breeds/populations as tips of the four-taxon tree with the admixR wrapper in R, which is based on the ADMIXTOOLS software (Patterson et al. 2012; Petr et al. 2019). The D-statistic tests for introgression by comparing patterns of ancestral “A” and derived “B” alleles (known as ABBA and BABA sites) using a simple four-taxon species tree with topology (((P1, P2), P3), O), where P1 and P2 are sister taxa, P3 is a candidate introgressing taxon that should be equally related to P1 and P2, and O is an ancestral outgroup. To test for introgression between European breeds and modern Arctic/American breeds, the two gray wolves were used as an outgroup, European breeds were set as P3, an individual or an Arctic breed/population corresponded to P2, and either an individual or all Greenlandic or show Siberian samples were used as P1 in the tree (((Greenland/Show, X), European), Wolf). The European breeds used as P3, the candidates for introgression, included the Golden Retriever, GSD, Saluki, and GSP. Three of these candidate introgressing breeds have been previously shown to be in the recent ancestry of the Alaskan sled dog (Huson et al. 2010; Thorsrud and Huson 2021), and three are in the top 10 most popular breeds in AKC. Greenland sled dog was selected as the P1 sister taxon to X since it was previously shown to have no significant European ancestry (Sinding et al. 2020), which was also confirmed here (see below for f4 ratio or qpAdm tests), and since it has a large proportion of shared ancestry with the ancient sled dogs. admixR calculates D as (BABA − ABBA)/(BABA + ABBA), with positive values suggesting gene flow between European (P3) and Greenlandic (P1) breeds, and negative values indicating an excess of shared derived alleles between European (P3) and other Arctic (P2) breeds.
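The sign convention described above can be checked with a small frequency-based calculation. The sketch below follows the standard formulation in which, at each site, the ABBA weight is (1 − p1)·p2·p3·(1 − pO) and the BABA weight is p1·(1 − p2)·p3·(1 − pO), where each p is a derived-allele frequency. The function name and toy frequencies are hypothetical; this is not the admixR or ADMIXTOOLS implementation, and it omits the block jackknife used to obtain Z-scores.

```python
from typing import Sequence


def d_statistic(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], outgroup: Sequence[float]) -> float:
    """D = (BABA - ABBA) / (BABA + ABBA) for the tree (((P1, P2), P3), O),
    computed from per-site derived-allele frequencies."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        abba = (1 - a) * b * c * (1 - o)   # P2 and P3 share the derived allele
        baba = a * (1 - b) * c * (1 - o)   # P1 and P3 share the derived allele
        numerator += baba - abba
        denominator += baba + abba
    if denominator == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return numerator / denominator


if __name__ == "__main__":
    # Toy data: two BABA sites and one ABBA site.
    greenland = [1.0, 0.0, 1.0]   # P1
    x_breed = [0.0, 1.0, 0.0]     # P2
    european = [1.0, 1.0, 1.0]    # P3
    wolf = [0.0, 0.0, 0.0]        # outgroup
    print(d_statistic(greenland, x_breed, european, wolf))
```

With this convention, swapping P1 and P2 flips the sign, so a negative D for (((Greenland, X), European), Wolf) points to excess allele sharing between X and the European breed.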

We also calculated outgroup f3 statistics to estimate shared genetic drift and relative time since divergence using gray wolf as an outgroup. Higher f3 values indicate a closer affinity between two populations compared to the outgroup and thus determine the degree of similarity between two populations.
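As a rough illustration of why higher outgroup f3 values indicate closer affinity, the statistic can be written as the average over sites of (o − a)(o − b), where o, a, and b are allele frequencies in the outgroup and the two test populations. The sketch below uses that basic definition. The function name and toy frequencies are hypothetical, and the finite-sample corrections and standard errors applied by ADMIXTOOLS are deliberately left out.

```python
from typing import Sequence


def outgroup_f3(outgroup: Sequence[float], pop_a: Sequence[float],
                pop_b: Sequence[float]) -> float:
    """Uncorrected outgroup f3(O; A, B): mean of (o - a) * (o - b) over sites.
    Larger values mean A and B share more drift since splitting from O."""
    if not (len(outgroup) == len(pop_a) == len(pop_b)) or not outgroup:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b))
    return total / len(outgroup)


if __name__ == "__main__":
    wolf = [0.0, 0.0]
    ancient = [1.0, 0.5]
    close_breed = [1.0, 0.5]    # drifted in the same direction as the ancient dog
    distant_breed = [0.0, 0.5]  # shares less drift with the ancient dog
    print(outgroup_f3(wolf, ancient, close_breed))
    print(outgroup_f3(wolf, ancient, distant_breed))
```

In this toy case the first pair yields the larger value because both populations moved away from the outgroup frequencies together.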

To further investigate West Eurasian admixture in the Siberian Husky, we used the f4 ratio and qpAdm tests implemented in the admixR (Reich et al. 2009; Patterson et al. 2012; Petr et al. 2019) wrapper in R. f4 statistics measure the correlation of allele frequencies between two pairs of populations and thus can estimate admixture proportions, while qpAdm models target populations as a mixture of two source or reference populations. A P-value of >0.05 suggests the model is plausible. We did not test East Asian ancestry as a potential source population since we found no evidence for recent gene flow from East Asia to the Siberian or American Arctic (supplementary table S12, Supplementary Material online), in agreement with previous studies (Bergström et al. 2020; Feuerborn et al. 2021). To test for the presence of East Asian gene flow in modern Arctic dogs, we used the Vietnamese village dog to represent East Asian ancestry since it is one of the few East Asian populations that has not been admixed with European dogs (Ní Leathlobhair et al. 2018). For the qpAdm test, we examined all possible target populations using European breeds (GSP and GSD) and the show Siberian Husky population as potential sources or closely related reference populations. Gray wolves, the ancient Zhokhov dog, and the ancient Yana wolf were used as outgroups. Importantly, the source populations are likely not actual representatives of the true ancestral populations; however, they are phylogenetically closer to the target populations than they are to the outgroups, which allows for accurate ancestry estimation of related populations. The five taxa used in the f4 ratio test included two European breeds (GSP and GSD), a potentially introgressed taxon X (Siberian Husky populations, Alaskan Malamute, CSD, Alaskan sled dog, Samoyed, Golden Retriever, and Saluki were all tested), the sister taxon to X (Greenland sled dogs), and two modern gray wolves as an outgroup. 
The European ancestry proportion, alpha, was computed as the ratio f4(GSP:Wolf; X:Greenland)/f4(GSP:Wolf; GSD:Greenland) or f4(GSD:Wolf; X:Greenland)/f4(GSD:Wolf; GSP:Greenland).
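The ratio above can be illustrated with a small numerical sketch. It assumes the uncorrected definition f4(A:B; C:D) = mean of (a - b)(c - d) across SNPs, uses hypothetical allele frequencies, and constructs the target X as an exact 25:75 mixture of the GSD and Greenland frequencies so that the expected alpha is known. This is not the admixr implementation and omits standard errors.

```python
def f4(pop_a, pop_b, pop_c, pop_d):
    """Uncorrected f4(A:B; C:D): mean of (a - b) * (c - d) across SNPs."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pop_a, pop_b, pop_c, pop_d)]
    return sum(terms) / len(terms)


def f4_ratio_alpha(gsp, wolf, target, greenland, gsd):
    """alpha = f4(GSP:Wolf; X:Greenland) / f4(GSP:Wolf; GSD:Greenland)."""
    return f4(gsp, wolf, target, greenland) / f4(gsp, wolf, gsd, greenland)


# Hypothetical allele frequencies at four SNPs (illustration only).
gsp = [0.9, 0.1, 0.8, 0.3]
wolf = [0.2, 0.6, 0.3, 0.7]
gsd = [0.8, 0.2, 0.9, 0.2]
greenland = [0.3, 0.7, 0.2, 0.6]

# Assumption: X is an exact mixture of GSD-like and Greenland-like ancestry.
true_alpha = 0.25
target = [true_alpha * e + (1 - true_alpha) * g for e, g in zip(gsd, greenland)]

alpha = f4_ratio_alpha(gsp, wolf, target, greenland, gsd)
print(round(alpha, 6))
```

Because the numerator is linear in the target frequencies, the estimate recovers the mixing proportion exactly in this idealized case; with real data, drift and sampling noise make alpha an estimate with an associated standard error.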

Finally, we used D-statistics, f3, and f4 ratio tests (Reich et al. 2009; Green et al. 2010; Durand et al. 2011; Patterson et al. 2012) in the admixr wrapper (Petr et al. 2019) in R to investigate ancient Arctic/American ancestry in modern Arctic/American and other breeds. We estimated ancient Arctic ancestry by computing alpha as f4(AL3194:Andean Fox; X:VVD)/f4(AL3194:Andean Fox; Zhokhov:VVD). We also investigated the affinity of Yana Wolf, Zhokhov, and Port au Choix precontact dog to all Siberian populations and other breeds in our study using D-statistics of the form (((Greenland, X) Ancient) Gray Wolf) and (((GSD, X) Ancient) Gray Wolf), and we measured shared genetic drift with both ancient dogs using f3 statistics (Ancient Dog, X, Gray Wolf). We did not remove transitions in the ancient analyses due to the limited number of SNPs available. All data were visualized using ggplot2 (Wickham 2011), and an absolute z-score > 3 was considered significant. We chose this conservative threshold to limit false positives; under a normal approximation, |z| > 3 corresponds to a two-sided confidence level greater than 99.7%.
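The link between the |z| > 3 threshold and the 99.7% figure can be checked with a short calculation. This sketch assumes the test statistic is approximately standard normal under the null hypothesis; the function names are hypothetical.

```python
import math


def two_sided_coverage(z):
    """P(|Z| <= z) for a standard normal variable Z."""
    return math.erf(z / math.sqrt(2.0))


def is_significant(z_score, threshold=3.0):
    """Apply the absolute z-score cutoff used for the f- and D-statistics."""
    return abs(z_score) > threshold


coverage_at_3 = two_sided_coverage(3.0)
print(f"{coverage_at_3:.4f}")
```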

Phylogenetic Reconstruction

Evolutionary relatedness between modern Arctic breeds and the ancient dogs was first explored using a GTR model and the approximately maximum-likelihood method in FastTree v2.1.11 (as implemented in GENEIOUS PRIME v2023.0.4; https://www.geneious.com) with the balanced data set (124,905 variants). Yana Wolf and two gray wolves were added to root the tree. The sled show population was excluded from this and other phylogenetic analyses since it was shown to be a mix of other Siberian populations and would therefore confound phylogenetic reconstruction.

TreeMix

We further used TreeMix v.1.13 (Pickrell and Pritchard 2012) to reconstruct a maximum likelihood tree of the canids in our balanced data set (Yana and gray wolves added) and to model gene flow between them. We first converted allele frequency outputs from PLINK v.1.9 (Purcell et al. 2007; Chang et al. 2015) into TreeMix format using the plink2treemix.py script included with the software. We performed 30 independent runs of 500 bootstrap replications and generated a consensus tree using the PHYLIP Consense program (Felsenstein 1993). Node robustness was estimated using 500 bootstrap replicates and plotted using the treemix.bootstrap function in the R package BITE (Milanesi et al. 2017). The following options were used in TreeMix to generate the maximum likelihood tree: the SNP block size (-k) was set to 20, the -noss flag was used to avoid sample size overcorrection, wolf was used as the outgroup to root the tree (-root), and one to six migration edges were modeled (-m).
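The TreeMix options listed above could be assembled into command lines roughly as follows. This is a minimal sketch: the input file name, the output prefixes, and the population label Wolf are hypothetical placeholders, and the commands are only constructed here, not executed.

```python
def treemix_command(input_file, out_prefix, migrations,
                    root="Wolf", block_size=20, bootstrap=False):
    """Build a TreeMix argument list using the options described in the text."""
    cmd = [
        "treemix",
        "-i", input_file,       # gzipped allele-count file from plink2treemix.py
        "-o", out_prefix,       # prefix for the output files
        "-k", str(block_size),  # SNPs per block
        "-noss",                # turn off the sample size correction
        "-root", root,          # outgroup population (placeholder label)
        "-m", str(migrations),  # number of migration edges
    ]
    if bootstrap:
        cmd.append("-bootstrap")  # resample blocks of SNPs for a bootstrap replicate
    return cmd


# One command per number of migration edges (one to six).
commands = [
    treemix_command("balanced.treemix.frq.gz", f"treemix_m{m}", m)
    for m in range(1, 7)
]
for cmd in commands:
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run on a system where TreeMix is installed; bootstrap replicates would be generated by calling the same function with bootstrap=True.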

Deep Population History

BEAST2

To estimate divergence times of modern sled dog clades, we first performed a Bayesian Markov chain Monte Carlo (MCMC) analysis with BEAST2 (Bouckaert et al. 2014, 2019) using a lognormal optimized relaxed clock and a fossilized birth-death (FBD) model to reconstruct a maximum clade credibility (MCC) tree, which was summarized in TreeAnnotator v2.7.5 using mean node heights and visualized in FigTree. XML files were prepared in BEAUTi 2 v2.7.5.0 using a modified balanced data set with all 124,905 SNPs: from the Arctic populations, only show Siberian Huskies 1 to 9 and all nine Greenland sled dogs were included since they showed the least amount of European admixture. Substitution rates were estimated using a GTR model with gamma-distributed rate heterogeneity. We ran four independent runs for a total of 450 million states (112.5 million per run), sampling every 1,000 generations with a 10% burn-in. Trees were combined using LogCombiner, resampling at a lower frequency (every 10,000 states). Tracer v1.7.2 was used to evaluate convergence and verify that effective sample size (ESS) values for all parameters were over 100. Ancient Yana Wolf, Zhokhov, and Port au Choix precontact dogs were included as Sampled Ancestor MRCA priors using the SA package with a uniform distribution of tip age and age/height values of 33,020 for Yana Wolf (upper 34,000, lower 32,000) (Sinding et al. 2020), 9,515 for Zhokhov dog (upper 10,000, lower 9,000) (Sinding et al. 2020), and 4,000 years before present for Port au Choix precontact dog (upper 4,500, lower 3,500) (Ní Leathlobhair et al. 2018). Siberians and Greenland sled dogs were set with uniform, monophyletic priors with 80 to 140 years and 750 to 2,000 years as lower and upper bounds, respectively.
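The sampling scheme described above implies the following approximate numbers of posterior samples. This is a worked arithmetic sketch under stated assumptions: the burn-in is taken to be removed from each run before thinning, the resampling value of 10,000 is interpreted as keeping one tree every 10,000 states, and the initial state logged at step 0 is ignored.

```python
total_states = 450_000_000
n_runs = 4
sample_every = 1_000       # states between logged samples
burnin_percent = 10
resample_every = 10_000    # assumed LogCombiner resampling interval, in states

states_per_run = total_states // n_runs                        # 112,500,000
samples_per_run = states_per_run // sample_every               # 112,500
burnin_samples = samples_per_run * burnin_percent // 100       # 11,250
retained_per_run = samples_per_run - burnin_samples            # 101,250
thinning_factor = resample_every // sample_every               # keep every 10th sample
thinned_per_run = retained_per_run // thinning_factor          # 10,125
combined_samples = thinned_per_run * n_runs                    # 40,500

print(states_per_run, retained_per_run, combined_samples)
```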

To further investigate the divergence timing between the two Arctic dog lineages, we used the protocol recommended by Stange et al. (2018) with SNAPP v1.6.1 (Bryant et al. 2012), implemented in BEAST2 v2.7.5.0. Due to the long computational time of SNAPP, we used a reduced data set composed of only one individual from each of four taxa (Greenland3, Siberian [Show 4 or Racing 8], Yana wolf, and Zhokhov) along with all 124,905 SNPs. We calibrated the molecular clock using the age of the Yana wolf, setting the crown age of the root to a normal distribution with a mean of 33,020 years. Zhokhov and Yana wolf MRCA priors were set with normally distributed tip ages of 9,515 and 33,020, respectively, and the Zhokhov and Greenland dogs were set with a monophyletic prior. XML files were prepared with the script snapp_prep.rb (freely available at https://github.com/mmatschiner/snapp_prep). It is important to recognize that this approach relies on the assumptions of a strict molecular clock and constant population sizes within lineages. These assumptions may not hold true for modern dog populations, many of which have experienced recent bottlenecks and recent intense artificial selection. Nonetheless, we employed this method to validate our previous analysis, given its suitability for estimating divergence times using SNP data. Two independent runs of 2,000,000 generations each were performed, sampling trees every 1,000 generations. Stationarity and convergence were evaluated using Tracer v1.7.2, verifying that all ESS values were over 200 after discarding the first 15% of each MCMC chain as burn-in. Tree files were then combined using LogCombiner, and an MCC tree was summarized in TreeAnnotator v2.7.5 using mean node heights and visualized in FigTree. Model performance for both BEAST2 analyses was assessed by sampling from the prior without sequence data to confirm that the data, rather than the priors, were informing the divergence timing.

qpGraph

We further investigated complex relationships among all individuals in our data set using qpGraph implemented in ADMIXTOOLS v2.0.0 (Patterson et al. 2012; Maier et al. 2023). We modeled population histories by automatically finding and optimizing topologies and parameters using the find_graph() function, which searches for admixture graphs that are most compatible with the f-statistics (Maier et al. 2023). Default settings were used, and the two gray wolves were used as the outgroup. Populations modeled with more than three admixture events (e.g. the pet and sled show populations of Siberians and CSDs) or an excess of missing variants (e.g. precontact dog AL3194) were removed from the first model to improve the ability to model the history of Arctic dogs. We continued adding admixture events and automating the search for the best-fitting graph with the lowest maximum absolute z-score of the difference between observed and modeled f4 statistics. We chose as the best-fitting model the one that minimized the likelihood score (6.85) and the worst residual z-score (1.08). Initial weights were resampled 100 times, yielding the same drift lengths and admixture weights. We then modeled a reduced number of populations, including only the least-admixed show population of Siberian Huskies with Zhokhov and CSDs, to investigate ancient ancestry using the Pleistocene Yana wolf as the outgroup. The best-fitting model had three admixture events and a score of 3.61 with a worst residual z-score of 1.16.

Genetic Health

Population Size

Historical effective population size over the past 125 generations in the modern Arctic dogs was estimated from LD in GONE (Santiago et al. 2020). The original 354-dog data set (AL3194 was not merged) was filtered in PLINK v1.9 (Purcell et al. 2007; Chang et al. 2015) to remove variants with missing call rates above 1% and individuals missing more than 33% of genotype data, and to exclude nonautosomal variants and the eight West Eurasian breeds/wolves. The resulting ped and map files (135,451 variants) for each population were imported into GONE (Santiago et al. 2020). Default parameters were used other than cMMb, which was set to 0.97 (Wong et al. 2010). To control for unbalanced sampling, we used the same individuals as included in the balanced data set and only included those populations with at least nine individuals.

Inbreeding

IBD sharing was evaluated across all pairs of dogs within a population using the --genome flag in PLINK v.1.9 (Purcell et al. 2007; Chang et al. 2015). ROHs were generated in PLINK v.1.9 (Purcell et al. 2007; Chang et al. 2015) using the same 135,451 variants (genotyping rate of 0.999) as used in the GONE analysis but including all dogs in each population instead of balanced sampling. The parameters were the same as in Sams and Boyko (2019). The .hom file was then imported into the detectRUNS v.0.9.6 package (Biscarini et al. 2019) in R for ROH analysis. To measure the proportion of the genome in ROH (FROH), we used a minimum length of 500 kb (--homozyg-kb 500) and calculated FROH in detectRUNS. To investigate the distribution of inbreeding over time in each population, FROH was calculated for minimum lengths ranging from 500 kb to >16 Mb (using Froh_inbreedingClass in detectRUNS). The mean FROH distribution for each population was determined across five size classes (0.5 to 2, 2 to 4, 4 to 8, 8 to 16, and >16 Mb) in detectRUNS. We tested for statistically significant differences between Siberian populations using a Wilcoxon test in R.
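To make the FROH calculation concrete, the sketch below sums ROH lengths within the size classes listed above and divides by the autosomal genome length. It is a simplified illustration and not the detectRUNS implementation: the classes here are treated as mutually exclusive bins, the ROH lengths are made up, and the autosome length of 2,200,000 kb is an assumed round figure.

```python
# Size classes in Mb; the last class is open-ended.
SIZE_CLASSES_MB = [(0.5, 2), (2, 4), (4, 8), (8, 16), (16, float("inf"))]


def froh_by_class(roh_lengths_kb, autosome_length_kb):
    """Proportion of the autosomes covered by ROH in each size class."""
    totals = {size_class: 0.0 for size_class in SIZE_CLASSES_MB}
    for length_kb in roh_lengths_kb:
        length_mb = length_kb / 1000
        for low, high in SIZE_CLASSES_MB:
            if low <= length_mb < high:
                totals[(low, high)] += length_kb
                break  # runs shorter than 0.5 Mb match no class and are ignored
    return {size_class: total / autosome_length_kb
            for size_class, total in totals.items()}


# Hypothetical ROH lengths (kb) for one dog; the 300 kb run is below the minimum.
roh_lengths_kb = [300, 600, 1500, 3000, 9000, 20000]
autosome_length_kb = 2_200_000  # assumed value for illustration

froh = froh_by_class(roh_lengths_kb, autosome_length_kb)
froh_total = sum(froh.values())
print(round(froh_total, 4))
```

Longer runs reflect more recent inbreeding, so comparing the class-wise proportions between populations gives a rough view of how inbreeding is distributed over time.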

We also calculated observed (Ho) and unbiased expected heterozygosities (uHe) and estimated the inbreeding coefficient (Fis) as 1 − (mean(Ho)/mean(uHe)) using the gl.report.heterozygosity() function in the dartR Package v2.9.7 (Mijangos et al. 2022) in R.
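The Fis formula above can be illustrated with a small sketch. It assumes biallelic genotypes coded as counts of the alternate allele (0, 1, or 2), no missing data, and the common small-sample correction uHe = 2pq × 2n/(2n − 1); the function name and genotypes are hypothetical, and dartR may handle details such as missing data differently.

```python
def heterozygosity_summary(genotypes_by_locus):
    """Return (mean Ho, mean uHe, Fis) with Fis = 1 - mean(Ho) / mean(uHe)."""
    ho_values, uhe_values = [], []
    for genotypes in genotypes_by_locus:
        n = len(genotypes)                              # number of individuals
        ho = sum(1 for g in genotypes if g == 1) / n    # observed heterozygosity
        p = sum(genotypes) / (2 * n)                    # alternate allele frequency
        he = 2 * p * (1 - p)                            # expected heterozygosity
        uhe = he * (2 * n) / (2 * n - 1)                # unbiased expected heterozygosity
        ho_values.append(ho)
        uhe_values.append(uhe)
    mean_ho = sum(ho_values) / len(ho_values)
    mean_uhe = sum(uhe_values) / len(uhe_values)
    return mean_ho, mean_uhe, 1 - mean_ho / mean_uhe


# Hypothetical genotypes for four dogs at two loci (illustration only).
loci = [
    [0, 1, 1, 2],
    [0, 0, 2, 2],
]
mean_ho, mean_uhe, fis = heterozygosity_summary(loci)
print(mean_ho, round(mean_uhe, 4), round(fis, 4))
```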

Potentially Harmful Variants

Using the balanced data set, we used the dog-referenced version of Zoonomia's 240-mammal Cactus alignment (Zoonomia Consortium 2020; Christmas et al. 2023), as in Moon et al. (2023), to annotate positions in the Siberian Husky that were evolutionarily constrained (phyloP score above the FDR 0.05 cutoff of 2.56) and calculated the allele frequency of the mutated evolutionarily constrained positions. We also calculated the frequency of deleterious mutations within these populations using SnpEff annotation (Cingolani et al. 2012). Deleterious variants are high-impact mutations resulting in protein loss of function, truncation, or frameshifting. We then tested for statistically significant differences between Siberian populations in the frequencies of mutated evolutionarily constrained positions and deleterious mutations using a Wilcoxon test in R.

Acknowledgments

We thank all owners of the dogs included in this study for contributing DNA or raw genetic data. We would also like to thank Erika Tsogoeva for drawing the ancient Zhokhov dog for this manuscript.

Contributor Information

Tracy A Smith, Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, MD 21250, USA.

Krishnamoorthy Srikanth, Department of Animal Sciences, Cornell University College of Agriculture and Life Sciences, Ithaca, NY 14853, USA.

Heather Jay Huson, Department of Animal Sciences, Cornell University College of Agriculture and Life Sciences, Ithaca, NY 14853, USA.

Supplementary Material

Supplementary material is available at Genome Biology and Evolution online.

Author Contributions

Conceptualization: T.A.S. and H.J.H. Methodology: T.A.S., H.J.H., and K.S. Investigation: T.A.S. and K.S. Visualization: T.A.S. and K.S. Funding acquisition: T.A.S. and H.J.H. Project administration: T.A.S. and H.J.H. Supervision: T.A.S. and H.J.H. Writing—original draft: T.A.S. Writing—review & editing: T.A.S., K.S., and H.J.H.

Funding

This work was supported by the University of Maryland Baltimore County College of Natural and Mathematical Sciences (Excellence Award 2021-22; T.A.S.), Neogen Genomics (H.J.H.), and the Siberian Husky Club of America (H.J.H.).

Conflict of Interest

T.A.S. has both show and racing Siberian Huskies and participates in AKC conformation and sled dog racing sports. H.J.H. has formerly participated in sled dog races with Alaskan sled dogs. K.S. declares no competing interests.

Data Availability

The data underlying this article are available in Dryad: https://doi.org/10.5061/dryad.8gtht76w4.

Literature Cited

  1. Albrechtsen A, Nielsen FC, Nielsen R. Ascertainment biases in SNP chips affect measures of population divergence. Mol Biol Evol. 2010:27(11):2534–2547. 10.1093/molbev/msq148.
  2. Alexander DH, Lange K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinfor. 2011:12(1):246. 10.1186/1471-2105-12-246.
  3. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009:19(9):1655–1664. 10.1101/gr.094052.109.
  4. Ameen C, Feuerborn TR, Brown SK, Linderholm A, Hulme-Beaman A, Lebrasseur O, Sinding MHS, Lounsberry ZT, Lin AT, Appelt M, et al. Specialized sledge dogs accompanied Inuit dispersal across the North American Arctic. Proc Biol Sci. 2019:286(1916):20191929. 10.1098/rspb.2019.1929.
  5. Bannasch DL, Kaelin CB, Letko A, Loechel R, Hug P, Jagannathan V, Henkel J, Roosje P, Hytönen MK, Lohi H, et al. Dog colour patterns explained by modular promoters of ancient canid origin. Nat Ecol Evol. 2021:5(10):1415–1423. 10.1038/s41559-021-01524-x.
  6. Bergström A, Frantz L, Schmidt R, Ersmark E, Lebrasseur O, Girdland-Flink L, Lin AT, Storå J, Sjögren KG, Anthony D, et al. Origins and genetic legacy of prehistoric dogs. Science. 2020:370(6516):557–564. 10.1126/science.aba9572.
  7. Bergström A, Stanton DW, Taron UH, Frantz L, Sinding MHS, Ersmark E, Pfrengle S, Cassatt-Johnstone M, Lebrasseur O, Girdland-Flink L, et al. Grey wolf genomic history reveals a dual ancestry of dogs. Nature. 2022:607(7918):313–320. 10.1038/s41586-022-04824-9.
  8. Biscarini F, Cozzi P, Gaspa G, Marras G. detectRUNS: Detect Runs of homozygosity and runs of heterozygosity in diploid genomes. R package version 0.9.6. 2019. Retrieved from: https://cran.r-project.org/web/packages/detectRUNS
  9. Blondeau Da Silva S, Mwacharo JM, Li M, Ahbara A, Muchadeyi FC, Dzomba EF, Lenstra JA, Da Silva A. IBD sharing patterns as intra-breed admixture indicators in small ruminants. Heredity. 2024:132(1):30–42. 10.1038/s41437-023-00658-x.
  10. Bogoslovskaya L. Sled dogs of Russia. In: Beregovoy V, editor. The dogs of our ancestors. India: CATNUS Publishers; 2022. p. 459–462.
  11. Botigué LR, Song S, Scheu A, Gopalan S, Pendleton AL, Oetjens M, Taravella AM, Seregély T, Zeeb-Lanz A, Arbogast RM, et al. Ancient European dog genomes reveal continuity since the Early Neolithic. Nat Commun. 2017:8(1):16082. 10.1038/ncomms16082.
  12. Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu CH, Xie D, Suchard MA, Rambaut A, Drummond AJ. BEAST 2: a software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2014:10(4):e1003537. 10.1371/journal.pcbi.1003537.
  13. Bouckaert R, Vaughan TG, Barido-Sottani J, Duchêne S, Fourment M, Gavryushkina A, Heled J, Jones G, Kühnert D, De Maio N, et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2019:15(4):e1006650. 10.1371/journal.pcbi.1006650.
  14. Brown SK, Darwent CM, Sacks BN. Ancient DNA evidence for genetic continuity in Arctic dogs. J Archaeol Sci. 2013:40(2):1279–1288. 10.1016/j.jas.2012.09.010.
  15. Bryant D, Bouckaert RR, Felsenstein J, Rosenberg NA, RoyChoudhury A. Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis. Mol Biol Evol. 2012:29(8):1917–1932. 10.1093/molbev/mss086.
  16. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015:4(1):7. 10.1186/s13742-015-0047-8.
  17. Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, et al. Evolutionary constraint and innovation across hundreds of placental mammals. Science. 2023:380(6643):eabn3943. 10.1126/science.abn3943.
  18. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. 2012:6(2):80–92. 10.4161/fly.19695.
  19. Crane B. The rise and decline of Russia's Sled dogs. In: International Siberian Husky Club presents the Siberian Husky. Oregon: Lee's Print Shop; 1977. p. 25–26.
  20. Criscuolo NG, Angelini C. StructuRly: a novel shiny app to produce comprehensive, detailed and interactive plots for population genetic analysis. PLoS One. 2020:15(2):e0229330. 10.1371/journal.pone.0229330.
  21. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, Del Angel G, Rivas MA, Hanna M, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011:43(5):491–498. 10.1038/ng.806.
  22. Douglas J, Zhang R, Bouckaert R. Adaptive dating and fast proposals: revisiting the phylogenetic relaxed clock model. PLoS Comput Biol. 2021:17(2):e1008322. 10.1371/journal.pcbi.1008322.
  23. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient admixture between closely related populations. Mol Biol Evol. 2011:28(8):2239–2252. 10.1093/molbev/msr048.
  24. Elias SA, Short SK, Nelson CH, Birks HH. Life and times of the Bering land bridge. Nature. 1996:382(6586):60–63. 10.1038/382060a0.
  25. Felsenstein J. PHYLIP (phylogeny inference package), version 3.5 c. University of Washington, Seattle: Joseph Felsenstein; 1993.
  26. Feuerborn TR, Carmagnini A, Losey RJ, Nomokonova T, Askeyev A, Askeyev I, Askeyev O, Antipina EE, Appelt M, Bachura OP, et al. Modern Siberian dog ancestry was shaped by several thousand years of Eurasian-wide trade and human dispersal. Proc Natl Acad Sci U S A. 2021:118(39):e2100338118. 10.1073/pnas.2100338118.
  27. Franklin IA. Evolutionary change in small population. Sunderland: Sinauer Associates; 1980.
  28. Frantz LA, Mullin VE, Pionnier-Capitan M, Lebrasseur O, Ollivier M, Perri A, Linderholm A, Mattiangeli V, Teasdale MD, Dimopoulos EA, et al. Genomic and archaeological evidence suggest a dual origin of domestic dogs. Science. 2016:352(6290):1228–1231. 10.1126/science.aaf3161.
  29. Freedman AH, Gronau I, Schweizer RM, Ortega-Del Vecchyo D, Han E, Silva PM, Galaverni M, Fan Z, Marx P, Lorente-Galdos B, et al. Genome sequencing highlights the dynamic early history of dogs. PLoS Genet. 2014:10(1):e1004016. 10.1371/journal.pgen.1004016.
  30. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MHY, et al. A draft sequence of the Neandertal genome. Science. 2010:328(5979):710–722. 10.1126/science.1188021.
  31. Handford JM. Dog sledging in the eighteenth century: North America and Siberia. Polar Rec. 1998:34(190):237–248. 10.1017/S0032247400025705.
  32. Heath TA, Huelsenbeck JP, Stadler T. The fossilized birth–death process for coherent calibration of divergence-time estimates. Proc Natl Acad Sci U S A. 2014:111(29):E2957–E2966. 10.1073/pnas.1319091111.
  33. Herrando-Pérez S, Tobler R, Huber CD. Smartsnp, an R package for fast multivariate analyses of big genomic data. Methods Ecol Evol. 2021:12(11):2084–2093. 10.1111/2041-210X.13684.
  34. Huson HJ, Parker HG, Runstadler J, Ostrander EA. A genetic dissection of breed composition and performance enhancement in the Alaskan sled dog. BMC Genet. 2010:11(1):71–14. 10.1186/1471-2156-11-71.
  35. Huson HJ, Vonholdt BM, Rimbault M, Byers AM, Runstadler JA, Parker HG, Ostrander EA. Breed-specific ancestry studies and genome-wide association analysis highlight an association between the MYH9 gene and heat tolerance in Alaskan sprint racing sled dogs. Mamm Genome. 2012:23(1-2):178–194. 10.1007/s00335-011-9374-y.
  36. Jombart T. adegenet: A R package for the multivariate analysis of genetic markers. Bioinformatics. 2008:24:1403–1405.
  37. Jombart T, Ahmed I. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. 2011:27:3070–3071. 10.1093/bioinformatics/btr521.
  38. Jombart T, Devillard S, Balloux F. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet. 2010:11:1–15.
  39. Kuzina M. Laikas. In: Beregovoy V, editor. The dogs of our ancestors. India: CATNUS Publishers; 2022. p. 12–14.
  40. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009:25(14):1754–1760. 10.1093/bioinformatics/btp324.
  41. Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas III EJ, Zody MC, et al. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005:438(7069):803–819. 10.1038/nature04338.
  42. Maier R, Flegontov P, Flegontova O, Işıldak U, Changmai P, Reich D. On the limits of fitting complex models of population history to f-statistics. eLife. 2023:12:e85492. 10.7554/eLife.85492.
  43. Meadows JRS, Kidd JM, Wang GD. Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture. Genome Biol. 2023:24. 10.1186/s13059-023-03023-7.
  44. Mijangos JL, Gruber B, Berry O, Pacioni C, Georges A. dartR v2: an accessible genetic analysis platform for conservation, ecology and agriculture. Methods Ecol Evol. 2022:13(10):2150–2158. 10.1111/2041-210X.13918.
  45. Milanesi M, Capomaccio S, Vajana E, Bomba L, Garcia JF, Ajmone-Marsan P, Colli L. BITE: an R package for biodiversity analyses. bioRxiv 181610. 10.1101/181610. 29 August 2017.
  46. Monteath AJ, Gaglioti BV, Edwards ME, Froese D. Late Pleistocene shrub expansion preceded megafauna turnover and extinctions in eastern Beringia. Proc Natl Acad Sci U S A. 2021:118(52):e2107977118. 10.1073/pnas.2107977118.
  47. Moon KL, Huson HJ, Morrill K, Wang MS, Li X, Srikanth K, Zoonomia Consortium, Lindblad-Toh K, Svenson GJ, Karlsson EK, et al. Comparative genomics of Balto, a famous historic dog, captures lost diversity of 1920s sled dogs. Science. 2023:380(6643):eabn5887. 10.1126/science.abn5887.
  48. Ní Leathlobhair M, Perri AR, Irving-Pease EK, Witt KE, Linderholm A, Haile J, Lebrasseur O, Ameen C, Blick J, Boyko AR, et al. The evolutionary history of dogs in the Americas. Science. 2018:361(6397):81–85. 10.1126/science.aao4776.
  49. Pang JF, Kluetsch C, Zou XJ, Zhang AB, Luo LY, Angleby H, Ardalan A, Ekström C, Sköllermo A, Lundeberg J, et al. mtDNA data indicate a single origin for dogs south of Yangtze River, less than 16,300 years ago, from numerous wolves. Mol Biol Evol. 2009:26(12):2849–2864. 10.1093/molbev/msp195.
  50. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, Genschoreck T, Webster T, Reich D. Ancient admixture in human history. Genetics. 2012:192(3):1065–1093. 10.1534/genetics.112.145037.
  51. Perri AR, Feuerborn TR, Frantz LA, Larson G, Malhi RS, Meltzer DJ, Witt KE. Dog domestication and the dual dispersal of people and dogs into the Americas. Proc Natl Acad Sci U S A. 2021:118(6):e2010083118. 10.1073/pnas.2010083118.
  52. Petr M, Vernot B, Kelso J. admixr—R package for reproducible analyses using ADMIXTOOLS. Bioinformatics. 2019:35(17):3194–3195. 10.1093/bioinformatics/btz030.
  53. Phillips CJ, Coppinger RP, Schimel DS. Hyperthermia in running sled dogs. J Appl Physiol. 1981:51(1):135–142. 10.1152/jappl.1981.51.1.135.
  54. Pickrell J, Pritchard J. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012:8(11):e1002967. 10.1371/journal.pgen.1002967.
  55. Pitulko VV, Kasparov AK. Archaeological dogs from the early Holocene Zhokhov site in the eastern Siberian Arctic. J Archaeol Sci Rep. 2017:13:491–515. 10.1016/j.jasrep.2017.04.003.
  56. Plassais J, Parker HG, Carmagnini A, Dubos N, Papa I, Bevant K, Derrien T, Hennelly LM, Whitaker DT, Harris AC, et al. Natural and human-driven selection of a single non-coding body size variant in ancient and modern canids. Curr Biol. 2022:32(4):889–897.e9. 10.1016/j.cub.2021.12.036.
  57. Potseluyeva Y. Historic and climatic prerequisites of appearance of the sled dog population of the Chukotka Peninsula shoreline. In: Beregovoy V, editor. The dogs of our ancestors. India: CATNUS Publishers; 2022. p. 914–918, 927–930.
  58. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, De Bakker PI, Daly MJ, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007:81(3):559–575. 10.1086/519795.
  59. Ramos Madrigal J, Sinding MHS, Carøe C, Mak SS, Niemann J, Samaniego Castruita JA, Fedorov S, Kandyba A, Germonpré M, Bocherens H, et al. Genomes of extinct Pleistocene Siberian wolves provide insights into the origin of present-day wolves. Curr Biol. 2021:31(1):1–12.e5. 10.1016/j.cub.2020.09.070.
  60. Reich D, Thangaraj K, Patterson N, Price AL, Singh L. Reconstructing Indian population history. Nature. 2009:461(7263):489–494. 10.1038/nature08365.
  61. Sams AJ, Boyko AR. Fine-scale resolution of runs of homozygosity reveal patterns of inbreeding and substantial overlap with recessive disease genotypes in domestic dogs. G3 Genes Genomes Genet. 2019:9(1):117–123. 10.1534/g3.118.200836.
  62. Santiago E, Novo I, Pardiñas AF, Saura M, Wang J, Caballero A. Recent demographic history inferred by high-resolution analysis of linkage disequilibrium. Mol Biol Evol. 2020:37(12):3642–3653. 10.1093/molbev/msaa169.
  63. Savolainen P, Zhang YP, Luo J, Lundeberg J, Leitner T. Genetic evidence for an East Asian origin of domestic dogs. Science. 2002:298(5598):1610–1613. 10.1126/science.1073906.
  64. Schubert M, Ermini L, Sarkissian CD, Jónsson H, Ginolhac A, Schaefer R, Martin MD, Fernández R, Kircher M, McCue M, et al. Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. Nat Protoc. 2014:9(5):1056–1082. 10.1038/nprot.2014.063.
  65. Schubert M, Lindgreen S, Orlando L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res Notes. 2016:9(1):88–87. 10.1186/s13104-016-1900-2.
  66. Sinding MHS, Gopalakrishnan S, Ramos-Madrigal J, de Manuel M, Pitulko VV, Kuderna L, Feuerborn TR, Frantz LA, Vieira FG, Niemann J, et al. Arctic-adapted dogs emerged at the Pleistocene–Holocene transition. Science. 2020:368(6498):1495–1499. 10.1126/science.aaz8599.
  67. Skoglund P, Ersmark E, Palkopoulou E, Dalén L. Ancient wolf genome reveals an early divergence of domestic dog ancestors and admixture into high-latitude breeds. Curr Biol. 2015:25(11):1515–1519. 10.1016/j.cub.2015.04.019.
  68. Stange M, Sánchez-Villagra MR, Salzburger W, Matschiner M. Bayesian divergence-time estimation with genome-wide SNP data of sea catfishes (Ariidae) supports Miocene closure of the Panamanian isthmus. Syst Biol. 2018:67(4):681–699. 10.1093/sysbio/syy006.
  69. Suissa JS, De La Cerda GY, Graber LC, Jelley C, Wickell D, Phillips HR, Grinage AD, Moreau CS, Specht CD, Doyle JJ, et al. Comparative phylogenomic analyses of SNP versus full locus datasets: insights and recommendations for researchers. bioRxiv 556036. 10.1101/2023.09.02.556036. Last Accessed 17 March 2024.
  70. Sullivan PF, Meadows JR, Gazal S, Phan BN, Li X, Genereux DP, Dong MX, Bianchi M, Andrews G, Sakthikumar S, et al. Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science. 2023:380(6643):eabn2937. 10.1126/science.abn2937.
  71. Thomas B, Thomas P. Leonhard Seppala: the Siberian Dog and the golden age of Sleddog Racing 1908–1941. Missoula, Montana: Pictorial Histories Publishing Company; 2015.
  72. Thorsrud JA, Huson HJ. Description of breed ancestry and genetic health traits in Arctic sled dog breeds. Canine Med Genet. 2021:8(1):8–13. 10.1186/s40575-021-00108-z.
  73. Toyama KS, Crochet PA, Leblois R. Sampling schemes and drift can bias admixture proportions inferred by structure. Mol Ecol Resour. 2020:20(6):1769–1785. 10.1111/1755-0998.13234.
  74. Wickham H. ggplot2. Wiley Interdiscip Rev Comput Stat. 2011:3(2):180–185. 10.1002/wics.147.
  75. Wong AK, Ruhe AL, Dumont BL, Robertson KR, Guerrero G, Shull SM, Ziegle JS, Millon LV, Broman KW, Payseur BA, et al. A comprehensive linkage map of the dog genome. Genetics. 2010:184(2):595–605. 10.1534/genetics.109.106831.
  76. Zoonomia Consortium. A comparative genomics multitool for scientific discovery and conservation. Nature. 2020:587(7833):240–245. 10.1038/s41586-020-2876-6.

Articles from Genome Biology and Evolution are provided here courtesy of Oxford University Press