Abstract
Distinctive color patterns in dogs are an integral component of canine diversity. Color pattern differences are thought to have arisen from mutation and artificial selection during and after domestication from wolves but important gaps remain in understanding how these patterns evolved and are genetically controlled. In other mammals, variation at the ASIP gene controls both the temporal and spatial distribution of yellow and black pigments. Here we identify independent regulatory modules for ventral and hair cycle ASIP expression, and we characterize their action and evolutionary origin. Structural variants define multiple alleles for each regulatory module and are combined in different ways to explain five distinctive dog color patterns. Phylogenetic analysis reveals that the haplotype combination for one of these patterns is shared with arctic white wolves and that its hair cycle-specific module likely originated from an extinct canid that diverged from grey wolves more than 2 million years before present. Natural selection for a lighter coat during the Pleistocene provided the genetic framework for widespread color variation in dogs and wolves.
A central aspect of the amazing morphologic diversity among domestic dogs is their colors and color patterns. In many mammals, specific color patterns arise through differential regulation of Agouti (ASIP), which encodes a paracrine signaling molecule and antagonist of the melanocortin 1 receptor (MC1R) that causes hair follicle melanocytes to switch from making eumelanin (black or brown pigment) to pheomelanin (yellow to nearly white pigment) 1–4. In laboratory mice, Asip expression is controlled by alternative promoters in specific body regions, and at specific times during hair growth, and gives rise to the light-bellied agouti phenotype, with ventral hair that is yellow and dorsal hair that contains a mixture of black and yellow pigment 4. Genetic variation in ASIP affects color pattern in many mammals; however, in dogs, the situation is still unresolved, in large part due to the complexity of different pattern types, epistatic relationships with variants at other loci, and challenges in distinguishing whether genetic association of one or more variants truly represents causal variation or just close linkage 5.
Here we investigate non-coding variation in ASIP regulatory modules and their effect on patterning phenotypes in domestic dogs. We expand our analysis to include modern and ancient wild canids and uncover an evolutionary history in which natural selection during the Pleistocene provided a molecular substrate for color pattern diversity today.
Results
Expression of ASIP promotes pheomelanin synthesis; therefore, ASIP alleles associated with a yellow color are dominant to those associated with a black color. Although dominant yellow (DY) is common in dogs from diverse geographic locations, the most common coat pattern of modern wolves is agouti (AG) 6, in which the dorsum has banded hairs and the ventrum is light. Three additional color patterns are recognizable, but all have been described historically by different, inconsistent and sometimes overlapping names that predate genomic analysis; we refer to these as shaded yellow (SY), black saddle (BS), and black back (BB) (Fig. 1, Supplementary Table 1).
We analyzed skin RNA-seq data available from dogs of dominant yellow and black back patterns, and identified three alternative untranslated first exons for dog ASIP (Fig. 2a, Extended Data Fig. 1, Supplementary Table 2). As described below, two of the three transcripts vary in abundance between dominant yellow and black back dogs, and the corresponding 5’-flanking promoters have sequence variation associated with dog pattern phenotypes. The 5’-flanking promoter regions for these two transcripts are orthologous to the ventral promoter (VP) and hair cycle promoter (HCP) in the laboratory mouse 4; however, our genetic analyses (Fig. 2) reveal that the dog VP and HCP give rise to more complex patterns than their mouse counterparts. Transcripts associated with the third promoter, which lies ~16 kb upstream of the VP (Fig. 2b) did not vary in abundance in our dataset.
To better understand the relationship between promoter usage and pattern phenotypes, we inspected whole genome sequence data from 77 dog and wolf samples with known color patterns (Supplementary Table 3). We used dogs that were homozygous at the ASIP locus to infer two VP haplotypes and five HCP haplotypes, consisting of multiple structural variants that lie within 1.5 kb of each transcriptional start site. VP1 contains a SINE element in reverse orientation relative to the transcription of ASIP and an A-rich expansion not found in VP2 (Fig. 2b, left, Supplementary Table 1); the five HCP haplotypes differ according to the number and identity of SINE elements, all in the same orientation as ASIP, as well as additional insertions and deletions (Fig. 2b, right, Supplementary Table 1). All structural variants were precisely delineated with Sanger sequencing.
These results were extended by developing PCR-based genotyping assays for the VP and HCP structural variants, examining their association with different pattern phenotypes in 352 dogs from 34 breeds, and comparing these results to previously published variants (Table 1, Extended Data Fig. 2, Supplementary Table 1, and Supplementary Table 4–7). As depicted in Fig. 2c and Table 1, diplotype combinations of VP1 or VP2 with HCP1, 2, 3, 4, or 5 are correlated perfectly with variation in ASIP pattern phenotype. For example, homozygotes for VP1-HCP1, VP2-HCP1, VP2-HCP2 are dominant yellow, shaded yellow and agouti, respectively (Supplementary Table 4–7). Black saddle and black back dogs differ in their VP configuration, but all carry HCP3, 4, and/or 5 in homozygous or compound heterozygous configurations. Because the level of ASIP activity is directly related to the amount of yellow pigment production, these genetic association results suggest that VP1 has greater activity than VP2, HCP1 has greater activity than HCP2, and HCP3, 4, and 5 all represent loss-of-function, since the HCP4 haplotype includes a large deletion of the hair cycle first exon (Fig. 2b), and fails to complement HCP3 or HCP5 (Fig. 3, Supplementary Table 6). Importantly, increased activity from the ventral promoter (VP1 vs. VP2) correlates with dorsal expansion of yellow pigment in black saddle compared to black back phenotypes (Fig. 1, 2c), which indicates that the VP and HCP haplotypes function separately from each other.
Table 1. ASIP diplotype association with pattern phenotype.
Pattern phenotype | Promoter diplotype | Concordant | Discordant |
---|---|---|---|
Dominant Yellow (n = 114)a | VP1-HCP1 / VP1,2-HCP1,3,4,5 | 113 | 1b |
Shaded Yellow (n = 64)a | VP2-HCP1 / VP2-HCP1,3,5 | 52 | 12b |
Agouti (n = 46) | VP2-HCP2 / VP2-HCP2,3,5 | 46 | 0 |
Black Saddle (n = 53) | VP1-HCP4 / VP1,2-HCP3,4 | 53 | 0 |
Black Back (n = 89) | VP2-HCP3,4,5 / VP2-HCP3,4,5 | 89 | 0 |
a Previous studies did not differentiate between the dominant yellow and the shaded yellow phenotype.
b These dogs had a eumelanistic masking pattern, which prevented reliable phenotype distinction between dominant yellow and shaded yellow.
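The diplotype rules in Table 1 can be read as a small lookup. The following is a minimal sketch, not the authors' software: it transcribes each table row literally (one haplotype that must be present, plus the VP and HCP alleles permitted on the other chromosome) and returns `None` for any combination the table does not list. The names `TABLE_1` and `predict_pattern` are hypothetical and introduced only for illustration.

```python
from typing import Optional, Tuple

# A haplotype is (ventral promoter allele, hair cycle promoter allele).
Haplotype = Tuple[str, str]

# Literal transcription of Table 1 (illustrative structure, not from the paper):
# (phenotype, haplotypes accepted on one chromosome,
#  VP alleles accepted on the other, HCP alleles accepted on the other)
TABLE_1 = [
    ("dominant yellow", {("VP1", "HCP1")},
     {"VP1", "VP2"}, {"HCP1", "HCP3", "HCP4", "HCP5"}),
    ("shaded yellow", {("VP2", "HCP1")},
     {"VP2"}, {"HCP1", "HCP3", "HCP5"}),
    ("agouti", {("VP2", "HCP2")},
     {"VP2"}, {"HCP2", "HCP3", "HCP5"}),
    ("black saddle", {("VP1", "HCP4")},
     {"VP1", "VP2"}, {"HCP3", "HCP4"}),
    ("black back", {("VP2", h) for h in ("HCP3", "HCP4", "HCP5")},
     {"VP2"}, {"HCP3", "HCP4", "HCP5"}),
]


def predict_pattern(hap_a: Haplotype, hap_b: Haplotype) -> Optional[str]:
    """Return the Table 1 phenotype for a diplotype, or None if unlisted."""
    for phenotype, first, vps, hcps in TABLE_1:
        # Chromosome order is arbitrary, so try both assignments.
        for x, y in ((hap_a, hap_b), (hap_b, hap_a)):
            if x in first and y[0] in vps and y[1] in hcps:
                return phenotype
    return None
```

Returning `None` for unlisted diplotypes is a deliberate choice here: the table reports observed combinations, and the sketch avoids extrapolating a phenotype for combinations that were not reported.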
The relationship between structural variation that delineates the different VP and HCP haplotypes and ASIP transcriptional activity was explored more directly in RNA-seq data from biopsies of dorsal and ventral dog skin (Supplementary Table 8, Extended Data Fig. 1). Read counts from the RNA-seq data were consistent with expectations from the genetic association results: VP1 has greater transcriptional activity and is spatially broadened relative to VP2 (which is only expressed ventrally), HCP1 has greater transcriptional activity relative to HCP2, and no reads are detected from HCP3 or HCP4 (Fig. 2b, Extended Data Fig. 1). Taken together, these results provide a molecular explanation for ASIP pattern variation in dogs in which the VP and HCP function independently and for which structural variants in close proximity to VP and HCP modulate promoter activity.
Genetic relationships between variant ASIP regulatory modules were examined by comparing haplotypes in 18 homozygous dogs (for the structural variants at the VP and HCP and coding sequences) to those from 10 contemporary grey wolves (Fig. 4a, Supplementary Table 9). Overall, agouti dog haplotypes were similar to those from grey wolves. However, dominant yellow and, to a lesser extent, shaded yellow dog haplotypes were similar to those from arctic grey wolves from Ellesmere Island and Greenland, where all wolves are white (Fig. 4a–c). Notably, white coat color in wolves represents pale pheomelanin, as in Kermode bears or snowshoe hares 7,8. In the 64 kb segment that contains the VP, HCP, and coding sequence, the arctic grey wolf haplotypes are identical except for one polymorphic site (Fig. 4a, chr24: 23,337,523), and are distinguished from dog dominant yellow haplotypes by only 6 SNVs (Supplementary Table 10). Taken together, these observations suggest a common origin of dominant yellow in dogs and white coat color in wolves without recent genetic exchange.
The evolutionary origin of ASIP haplotypes was explored further by constructing maximum likelihood phylogenetic trees for dogs, wolves, and 8 additional canid species (Supplementary Table 9). Based on differences in SNV frequency, the 48 kb VP segment was considered separately from the 16 kb HCP-exon 2/3/4 segment (see Supplementary text, Fig. 4a). In the VP tree, all dogs and grey wolves form a single clade, consistent with known species relationships 9. However, in the HCP tree, the dominant yellow and shaded yellow dogs lie in a separate clade together with arctic grey wolves; remarkably, this clade is basal to the golden jackal and distinct from other canid species (Fig. 4b, Extended Data Fig. 3, 4).
The pattern of derived allele sharing provides additional insight (Fig. 4d and Extended Data Fig. 5). As depicted in Fig. 2c and 4d, HCP2 is characterized by three small repeat elements that are shared by all canids and is therefore the ancestral form. In the branch leading to core wolf-like canids (golden jackal, coyote, Ethiopian wolf, and grey wolf), there are nine derived SNV alleles within the HCP2-exon 2/3/4 segment (Extended Data Fig. 5, Supplementary Table 11), four of which flank the repeat elements close to HCP2 (Fig. 4d, Extended Data Fig. 5). None of the nine derived alleles are present in the dominant yellow HCP1-exon 2/3/4 segment haplotype which also carries an additional SINE close to HCP1; therefore, this haplotype must have originated prior to the last common ancestor of golden jackals and other wolf-like canids >2 Mybp 10. Although the 16 kb HCP1-exon 2/3/4 segment haplotype could have originated on a branch leading to the core wolf-like canids, it would have had to persist via incomplete lineage sorting and absence of recombination for more than 2 million years and through three speciation events (Supplementary text). A more likely scenario is that HCP1 represents a ghost lineage from an extinct canid (Fig. 4d, 5b) that was introduced by hybridization with grey wolves during the Pleistocene (see below), as has been suggested for an ancestor of the grey wolf and coyote 9, and in high altitude Tibetan and Himalayan wolves 11.
We expanded our analysis of VP and HCP haplotypes to a total of 45 North American and 23 Eurasian wolves. The VP1-HCP1 haplotype combination is found mostly in the North American Arctic in a distribution parallel to that of white coat color (Extended Data Fig. 6a) 12 and is not observed in Eurasia. We also identified an ancestral HCP1 haplotype variant, referred to hereafter as HCP1A, that does not extend to exons 2/3/4 and lacks the 24 bp insertion found in arctic grey wolves and dominant yellow dogs (Fig. 4d, Extended Data Fig. 7). A haplotype combination similar to shaded yellow, VP2-HCP1A, was observed in seven light-colored wolves from Tibet or Inner Mongolia, representative of a distinct, high-altitude grey wolf population that is notably lighter than other Eurasian populations (Fig. 4d, Extended Data Fig. 6b) 13.
Additional insight into the demographic history of these haplotypes emerges from the analysis of ancient dog (n=5) and grey wolf (n=2) WGS data, dated 4,000 – 35,000 ybp (Supplementary text and Supplementary Table 12), in which both forms of the VP (VP1 and VP2), and four forms of the HCP (HCP1A, HCP1, HCP2, HCP4) were observed in various combinations (Fig. 5a, Extended Data Fig. 7). Ancient wolves from the Lake Taimyr and Yana River areas of Arctic Siberia had at least one HCP1 haplotype, while ancient dogs from central Europe, Ireland, and Siberia carried HCP1A, HCP1, and HCP4, respectively (Supplementary Table 12). Thus, diversity in ASIP regulatory sequences responsible for color variation today was apparent by 35,000 ybp in ancient wolves and by 9,500 ybp in ancient dogs.
Together with our phylogenetic results, comparative analysis of wolf and dog ASIP haplotypes suggests an evolutionary history in which multiple derivative haplotypes and associated color patterns arose by recombination and mutation from two ancestral configurations corresponding to a white wolf (VP1-HCP1) and a grey wolf (VP2-HCP2), both present in the late Pleistocene (Fig. 5a, Extended Data Fig. 7). The distribution of derivative haplotypes explains color pattern diversity not only in dogs but also in modern wolf populations across the Holarctic, including white wolves in the North American Arctic (VP1-HCP1) and yellow wolves in the Tibetan highlands (VP2-HCP1A), and is consistent with natural selection for light coat color. A likely timeline for the origin of modules driving high levels of ASIP expression is depicted in Fig. 5b and indicates a dual origin. The HCP1 haplotype represents introgression into Pleistocene grey wolves from an extinct canid lineage that diverged from grey wolves more than 2 Mybp. This introgression as well as the mutation from VP2 to VP1 occurred prior to 33,500 ybp, based on direct observation from an ancient wolf sample (Fig. 5a).
Discussion
A relationship between ASIP and dog color pattern was recognized more than a century ago by Sewall Wright 14, and explored in depth by the work of C. C. Little in the decades that followed 15. Previous studies have reported molecular variation in or around the ASIP region associated with some dog color patterns, including a 16 bp non-coding duplication associated with black saddle 16, a SINE insertion associated with black back and black saddle 16, and missense variants A82S and R83H associated with dominant yellow 17 (Fig. 2a, Extended Data Fig. 2). As shown here, availability of a broader dataset indicates that these previously reported associations represent linkage disequilibrium and/or breed structure rather than causal variation (Supplementary Table 7). Instead, our WGS-based comprehensive annotation of the region together with RNA expression data reveals a series of structural variants that define distinct haplotypes for each of two promoters that, in combination, explain five different pattern types (Fig. 2c).
In dogs, the key differences between VP1 and VP2 are a SINE element and a small insertion; similarly, the key differences between HCP1 and HCP2 are multiple SINE elements and a small insertion (Fig. 2b). In each case, we do not yet know if the transcriptional differences (VP1 > VP2, HCP1 > HCP2) are caused by the SINE element, the small insertion, or both. We note, however, that modularity of ASIP regulatory variation is a general theme in vertebrates, with non-coding changes driving adaptation in natural populations of deer mice 3, mountain hares 18, snowshoe hares 8, and several species of parulid warblers 19-22. Likewise, artificial selection in goats 23, domestic rabbits 24,25 and laboratory mice 26 is associated with structural variation in ASIP regulatory regions that may lead to acquisition of novel promoters that modulate region-specific expression of ASIP.
ASIP color pattern diversification was likely an early event during dog domestication, since our analysis of ancient DNA data reveals several different VP and HCP haplotypes in Eurasia by 4,800 ybp. This is consistent with the wide distribution of dominant yellow across modern dog breeds from diverse locations, as well as the dingo (Supplementary Table 9), a feral domesticate, frequently dominant yellow, introduced to Australia at least 3,500 ybp 27. Of particular interest is the Zhokhov Island dog from Siberia 28,29. Based on a haplotype combination of VP2-HCP4, this sled dog, which lived 9,500 years ago, exhibited a black back color pattern, allowing it to be easily distinguished from white-colored wolves in an arctic environment.
In wolves, natural selection for VP1 and HCP1 is a likely consequence of Pleistocene adaptation to arctic environments and genetic exchange in glacial refugia, driven by canid and megafaunal dispersal during interglacial periods. Modern grey wolves are thought to have arisen from a single source ~25,000 ybp close to the last glacial maximum 30,31; during the North American glacial retreat that followed, the VP1-HCP1 haplotype combination was selected for in today’s white-colored arctic wolves. Our results show how introgression, demographic history, and the genetic legacy of extinct canids played key roles in shaping diversity in dogs and modern grey wolves.
Methods
Ethics Statement
All animal experiments were done in accordance with the local regulations. Experiments were approved by the “Cantonal Committee For Animal Experiments” (Canton of Bern; permits 48/13, 75/16 and 71/19).
Skin biopsies and total RNA extraction
Skin biopsies (6 mm punch) were recovered from three dogs (a black back Miniature Pinscher and a dominant yellow Border Terrier and Irish Terrier) at necropsy and/or surgery for reasons unrelated to this study. Biopsies were recovered from the ventral abdomen and dorsal thorax, and are not matched for age or hair growth cycle. The biopsies were immediately put in RNAlater (Qiagen) for at least 24 h and then frozen at −20°C. Prior to RNA extraction, the skin biopsies were homogenized mechanically with the TissueLyser II device from Qiagen. Total RNA was extracted from the homogenized tissue using the RNeasy Fibrous Tissue Mini Kit (Qiagen) according to the manufacturer’s instructions. RNA quality was assessed with a FragmentAnalyzer (Agilent) and the concentration was measured using a Qubit Fluorometer (ThermoFisher Scientific).
Whole transcriptome sequencing (RNA-seq)
From each sample, 1 μg of high quality total RNA (RIN >9) was used for library preparation with the Illumina TruSeq Stranded mRNA kit. The libraries were individually barcoded, pooled, and sequenced on an S1 flow cell with 2x50 bp paired-end sequencing using an Illumina NovaSeq 6000 instrument. On average, 31.5 million paired-end reads per sample were collected. One publicly available Beagle sample was used (SRX1884098). All accession numbers and descriptive read statistics are given in Supplementary Table 8. All reads that passed quality control were mapped to the CanFam3.1 reference genome assembly using the STAR aligner (version 2.6.0c) 34.
Transcript coordinates
The STAR-aligned bam files were visualized in the IGV browser 35. Three different alternate untranslated first exons with splice junctions to the coding exons of ASIP were defined from the read alignments of the RNA-seq data described above, as visualized in IGV. These exact transcripts have not been documented in NCBI; however, the three transcripts of NCBI annotation release 105 are virtually identical except for minor differences in the length of the 5’-UTRs (XM_014106843.2, transcription start site (TSS) 22 nucleotides upstream compared to our annotation; NM_001007263.1, VP1-TSS 98 bp downstream of our annotation; XM_022408819.1, HCP-TSS 36 bp downstream of our annotation). Our visually curated gene models are given in Supplementary Table 2.
Identification of genomic variants
WGS data from 71 dogs and 6 wolves was used for variant discovery (Supplementary Table 3). They included 15 agouti dogs and wolves, 25 black back dogs, 11 black saddle dogs, 14 dominant yellow dogs, 11 shaded yellow dogs and one white wolf. The genomes were either publicly available or sequenced as part of related projects in our group 36. SNVs and small indels were called as described 36. The IGV software 35 was used for visual inspection of the three promoter regions based on the transcripts identified in the RNA sequencing data. Structural variants were identified and association with coat color phenotypes was verified by visual inspection in IGV. The pattern of CNV at the third promoter did not associate with the coat patterns as defined in Fig. 1.
DNA samples for Sanger sequencing and genotyping
Samples for variant discovery included two dogs from each color phenotype and are designated with asterisks in Supplementary Table 3. Samples from dogs listed in Supplementary Table 4 were used for genotyping. The coat color phenotype of all animals (Supplementary Tables 3 and 4) was assigned based on breed-specific coat color standards or photographs or owner reporting. Genomic DNA was isolated from EDTA blood samples using the Maxwell RSC Whole Blood DNA kit (Promega).
Sequencing of promoter regions
Sanger sequencing of PCR amplicons was carried out to validate and characterize structural variants in the promoter regions at the sequence level. All primer sequences and polymerases used are listed in Supplementary Table 5. PCR products amplified using LA Taq polymerase (Takara) or Multiplex PCR Kit (Qiagen) were directly sequenced on an ABI 3730 capillary sequencer after treatment with exonuclease I and shrimp alkaline phosphatase. Sequence data were analyzed with Sequencher 5.1 (GeneCodes). Interspersed repeat insertions were classified with the RepeatMasker program 37. Multiple copies of SINE elements from the same and different families were resolved this way. The CanFam3.1 reference genome assembly is derived from the Boxer Tasha, a dominant yellow dog, and represents a DY haplotype, VP1-HCP1, of the ASIP gene. Descriptions of the promoter variants and Genbank accession numbers for HCP2-5 are in Supplementary Table 1. Supplementary Table 1 lists the 7 combinations of VP and HCP regulatory modules observed in dogs. As HCP3, HCP4, and HCP5 all represent functionally equivalent loss-of-function alleles, the 7 listed combinations correspond to only 5 distinct phenotypes.
Genotyping assays
Five PCR assays (ventral promoter assays 1, 2; hair cycle promoter assays 1, 2, 3) were required to unambiguously determine the VP and HCP haplotypes (Supplementary Table 5). The previously reported SINE insertion 32 was genotyped by fragment size analysis on an ABI 3730 capillary sequencer and analyzed with the GeneMapper 4.0 software (Applied Biosystems). The previously reported ASIP coding variants 17 were genotyped by Sanger sequencing of PCR products. The previously reported RALY intronic duplication 16 was genotyped by size differentiation of PCR products on a Fragment Analyzer (Agilent). Another primer pair was used for the amplification of the entire HCP (Supplementary Table 5). Genotyping results for all samples are shown in Supplementary Table 4. There is a perfect genotype-phenotype association in 352 dogs (see Table 1 and Supplementary Table 7). In the remaining 14 dogs, the presence of a eumelanistic mask prevented the reliable phenotypic differentiation of dominant yellow and shaded yellow dogs. Breeds and the different promoter haplotype combinations identified within each breed are indicated in Supplementary Table 6. In a few dogs that were heterozygous at both VP and HCP, the phasing of the VP and HCP haplotype combinations was performed based on haplotype frequency within the same breed, as noted. A family of Chinooks was used to determine the segregation of extended haplotypes and the phenotypic equivalency of HCP3 and HCP5 (Fig. 3). A summary of genotyping results and the exclusion of previously associated variants is shown in Table 1 and Supplementary Table 7. The latter table lists the genotype-phenotype association in aggregated form and also contains the genotypes for variants that were previously reported to be associated with pattern phenotypes 16,17,32.
Comparison of promoter haplotype effects on transcripts
Transcript data was generated from a second set of samples. Sample descriptions and colors are shown in Supplementary Table 8 for all RNA experiments. Skin samples were collected from a male Swedish Elkhound (agouti), a female German Pinscher (dominant yellow) and a male Rottweiler (black back) after euthanasia that was conducted due to behavioral or health problems not related to skin. Samples were collected in RNAlater Stabilization Solution and stored at –80°C. RNA was extracted using the RNeasy Fibrous Tissue Mini Kit (Qiagen) according to the manufacturer’s instructions. Integrity of RNA was evaluated with an Agilent 2100 Bioanalyzer or TapeStation system (Agilent) and concentration was measured with a DeNovix DS-11 Spectrophotometer (DeNovix Inc.). The libraries for STRT (single-cell tagged reverse transcription) RNA sequencing were prepared using the STRT method with unique molecular identifiers (UMIs) 38 and modifications including longer UMIs of 8 bp, addition of spike-in ERCC control RNA for normalization of expression, and the Globin lock method 39 with LNA primers for the canine alpha- and beta-globin genes. The libraries were sequenced with an Illumina NextSeq 500. Reads were mapped to the CanFam3.1 genome build using the HISAT2 mapper version 2.1.0 40.
The alignment-free quantification method Kallisto (version 0.46.0) 41 was used to estimate transcript abundance, expressed as transcripts per million (TPM), based on an index built from the CanFam3.1 Ensembl transcriptome (release 99). The curated ASIP transcript isoform models based upon alignment visualizations in the IGV browser 35 were also included in the transcriptome. Results grouped by promoter haplotype genotype are displayed in Extended Data Fig. 1 as TPM.
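For readers unfamiliar with the TPM unit, the sketch below shows the standard definition: each transcript's estimated count is divided by its effective length, and the resulting rates are rescaled to sum to one million. This illustrates the unit only and is not a description of Kallisto's internal estimation; the function and parameter names (`tpm`, `est_counts`, `eff_lengths`) are chosen here for illustration.

```python
from typing import List, Sequence


def tpm(est_counts: Sequence[float], eff_lengths: Sequence[float]) -> List[float]:
    """Convert per-transcript counts to transcripts per million (TPM)."""
    if len(est_counts) != len(eff_lengths):
        raise ValueError("counts and lengths must have the same length")
    if any(length <= 0 for length in eff_lengths):
        raise ValueError("effective lengths must be positive")
    # Length-normalized rate for each transcript.
    rates = [c / l for c, l in zip(est_counts, eff_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that all values sum to one million.
    return [r / total * 1e6 for r in rates]
```

Because TPM values always sum to the same total within a sample, they describe relative abundance; comparisons of a single promoter-specific transcript across samples therefore depend on the composition of each sample.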
Haplotype construction
Haplotypes were constructed from two publicly available VCF files PRJEB32865 and PRJNA448733. The VCFs for selected dogs were merged using BCFtools merge tool (http://samtools.github.io/bcftools/) with the parameter --missing-to-ref, which assumed genotypes at missing sites are homozygous reference type 0/0. Only dogs homozygous for ASIP haplotypes (VP, HCP and coding exons) were used to visualize haplotypes (Supplementary Table 3). SNVs that had 100% call rate in these samples were color coded and displayed relative to the genome assembly and previously associated variants (Extended Data Fig. 2).
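The effect of the `--missing-to-ref` assumption can be illustrated with a toy merge. This is a sketch of the semantics described above, not of BCFtools itself: per-sample genotype calls are combined over the union of sites, and any site absent from a sample is filled in as homozygous reference (`0/0`). The data layout and the name `merge_missing_to_ref` are hypothetical.

```python
from typing import Dict


def merge_missing_to_ref(
    samples: Dict[str, Dict[int, str]]
) -> Dict[int, Dict[str, str]]:
    """Merge per-sample genotype calls, treating absent sites as 0/0.

    `samples` maps a sample name to {position: genotype string}.
    The result maps each position to {sample name: genotype string}.
    """
    # Union of all positions called in at least one sample.
    positions = sorted(set().union(*[calls.keys() for calls in samples.values()]))
    return {
        pos: {name: calls.get(pos, "0/0") for name, calls in samples.items()}
        for pos in positions
    }
```

The trade-off is that a site that was simply not called in one sample (for example, because of low coverage) becomes indistinguishable from a true reference genotype, which is one reason to restrict the display to SNVs with a complete call rate.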
ASIP phylogenetic analysis in canids
Illumina whole genome sequence data for 36 canids, including seven extant species and the dog, were downloaded from the NCBI short read archive as aligned (bam format) or unaligned (fastq format) reads (Supplementary Table 9). Fastq data were aligned to the dog genome (CanFam3.1) using BWA (v.0.7.17) 42 after trimming with Trim Galore (v.0.6.4). SNVs within a 110 kb interval (chr24:23,300,000-23,410,000), which includes the ASIP transcriptional unit and regulatory sequences, were identified with Platypus (v.0.8.1) 43 and filtered with VCFtools (v.0.1.15) 44 to include 2008 biallelic SNVs. Phasing was inferred with BEAGLE (v.4.1) 45.
For phylogenetic analysis, the ASIP interval was partitioned in two regions, based on dog SNV density (Fig. 4a) and ASIP gene structure: a 48 kb region including the ventral first exon, extending to but excluding the hair cycle first exon (chr24:23,330,000-23,378,000), and a 16 kb region including the hair cycle first exon, extending to and including ASIP coding exons 2-4 (chr24:23,378,001-23,394,000). Consensus sequences of equal length were constructed for each inferred canid haplotype using BCFtools (v.1.9). Phylogenies were inferred using Maximum Likelihood method and Tamura-Nei model with 250 bootstrap replications, implemented in MEGAX 46,47, and including 34 canids (Fig. 4b, Extended Data Fig. 3, 4). For 34 of 36 individuals, consensus haplotype pairs were adjacent to each other or, in the case of a few wolf/dog haplotypes, were positioned in neighboring branches with weak bootstrap support. The exceptions were the African golden wolf, a species derived by recent hybridization of the grey wolf and Ethiopian wolf 9, and an eastern grey wolf from the Great Lakes region, which was also reported to have recent admixture with the coyote 48. The African golden wolf and the eastern grey wolf were removed from the alignments, and a single haplotype for each individual was selected arbitrarily for tree building and display.
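The two-way partition of the ASIP interval described above amounts to a coordinate lookup. The sketch below uses the chr24 CanFam3.1 boundaries stated in the text; the constant and function names are hypothetical and serve only to make the segment definitions explicit.

```python
from typing import Dict, Iterable, List, Optional

# Segment boundaries on chr24 (CanFam3.1), inclusive, as given in the text.
VP_SEGMENT = (23_330_000, 23_378_000)   # 48 kb, ventral first exon side
HCP_SEGMENT = (23_378_001, 23_394_000)  # 16 kb, hair cycle first exon to exon 4


def assign_segment(pos: int) -> Optional[str]:
    """Return 'VP', 'HCP', or None for a chr24 position."""
    if VP_SEGMENT[0] <= pos <= VP_SEGMENT[1]:
        return "VP"
    if HCP_SEGMENT[0] <= pos <= HCP_SEGMENT[1]:
        return "HCP"
    return None


def split_snvs(positions: Iterable[int]) -> Dict[str, List[int]]:
    """Group SNV positions by segment, dropping positions outside both."""
    groups: Dict[str, List[int]] = {"VP": [], "HCP": []}
    for pos in positions:
        segment = assign_segment(pos)
        if segment is not None:
            groups[segment].append(pos)
    return groups
```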
Haplotype analysis of ASIP locus in ancient dogs and wolves
Whole genome sequencing data from several recent studies 9,13,29,49–53, including five ancient dogs, two ancient grey wolves, and 68 modern grey wolves (Supplementary Table 12) were downloaded as aligned (bam format) or unaligned (fastq format) reads. Fastq data was aligned to the dog genome (CanFam3.1) using BWA-MEM (v.0.7.17) 42, after trimming with Trim Galore (v.0.6.4). Coverage depth for each sample ranged from 1–78x (Supplementary Table 12). Genotypes at five structural variants and six SNVs were determined by visual inspection using the IGV browser (Supplementary Table 1 and 12). Variants in or near the ventral promoter (n=2), the hair cycle promoter (n=6), and the coding exons (n=3) distinguished ventral and hair cycle promoter haplotypes (Supplementary Table 12). SNV genotypes were determined by allele counts; structural variants were genotyped by split reads at breakpoint junctions. The base maps used for plotting the geographic distribution of haplotypes (Fig. 5, Supplementary Fig. 6) were generated in R (v. 4.0.3) with ‘maps’ and ‘ggplot2’ packages.
For 67 of 75 wolves (or ancient dogs), the phase of ventral and hair cycle promoter haplotypes was unambiguous. Seven wolves and one ancient dog were heterozygous with respect to both the ventral and hair cycle promoter haplotypes, and for these samples, haplotype phase was inferred based on the linkage disequilibrium in the 67 unambiguous individuals.
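One simple way to resolve phase in a double heterozygote is sketched below, under the assumption that the phasing whose two haplotypes are more frequent among the unambiguous individuals is preferred. The text states only that phase was inferred from linkage disequilibrium in those individuals, so the scoring rule (product of observed haplotype counts) and the name `infer_phase` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Haplotype = Tuple[str, str]  # (VP allele, HCP allele)


def infer_phase(
    vp_alleles: Tuple[str, str],
    hcp_alleles: Tuple[str, str],
    reference_haplotypes: Iterable[Haplotype],
) -> Optional[Tuple[Haplotype, Haplotype]]:
    """Pick the more likely phasing of a VP/HCP double heterozygote.

    `reference_haplotypes` lists haplotypes observed in individuals
    whose phase is unambiguous. Returns None when both phasings score
    equally and the data cannot discriminate between them.
    """
    counts = Counter(reference_haplotypes)
    (v1, v2), (h1, h2) = vp_alleles, hcp_alleles
    # The only two possible arrangements of two VP and two HCP alleles.
    option_a = ((v1, h1), (v2, h2))
    option_b = ((v1, h2), (v2, h1))

    def score(pair: Tuple[Haplotype, Haplotype]) -> int:
        # Product of observed counts of the two haplotypes in the pair.
        return counts[pair[0]] * counts[pair[1]]

    if score(option_a) == score(option_b):
        return None
    return option_a if score(option_a) > score(option_b) else option_b
```

Returning `None` on ties keeps ambiguous cases visible rather than forcing an arbitrary choice.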
Acknowledgements
This research was supported by grant no. 31003A_172964 from the Swiss National Science Foundation (TL), Maxine Adler Endowed Chair Funds (DB), the Jane and Aatos Erkko Foundation (HL), and the Academy of Finland (HL). We would like to acknowledge the Next Generation Sequencing Platform of the University of Bern and Biomedicum Functional Genomics Unit (FuGU), University of Helsinki, for sequencing services and the Interfaculty Bioinformatics Unit of the University of Bern and IT Center For Science Ltd. (CSC, Finland) for providing high performance computing infrastructure. We thank resources and members of the Dog Biomedical Variant Database Consortium and all other canine researchers who deposited genome sequencing data into public databases. We thank Tim Melling who provided the Tibetan wolf photograph.
Footnotes
Author Contributions
DLB: conceptualization, investigation, writing, visualization, formal analysis, CBK: investigation, visualization, formal analysis, writing, AL, PH, RL: validation, resources, VJ: software, PR, JH: validation, KMM and JRM: resources, MKH, AM, HL, MA, DoGA consortium: resources, STRT analyses, CD: supervision and resources, GSB: supervision, writing-review and editing, TL: conceptualization, funding acquisition, investigation, supervision, resources, writing-review and editing.
Competing Interest Declaration
Authors declare no competing interests except RL who is associated with a commercial laboratory that offers canine genetic testing.
Data Availability Statement
All data generated or analyzed during this study are included in this published article (and its supplementary information files). GenBank accession numbers for promoter sequence variants are MT319114.1, MT319115.1, MT319116.1, MT319117.1.