Evolution Letters. Letter. 2017 Jun 16;1(3):155–168. doi: 10.1002/evl3.15

Genetic differences between willow warbler migratory phenotypes are few and cluster in large haplotype blocks

Max Lundberg 1, Miriam Liedvogel 1,2, Keith Larson 3, Hanna Sigeman 1, Mats Grahn 4, Anthony Wright 5, Susanne Åkesson 1, Staffan Bensch 1
PMCID: PMC6121796  PMID: 30283646

Abstract

It is well established that differences in migratory behavior between populations of songbirds have a genetic basis but the actual genes underlying these traits remain largely unknown. In an attempt to identify such candidate genes we de novo assembled the genome of the willow warbler Phylloscopus trochilus, and used whole‐genome resequencing and a SNP array to associate genomic variation with migratory phenotypes across two migratory divides around the Baltic Sea that separate SW migrating P. t. trochilus wintering in western Africa and SSE migrating P. t. acredula wintering in eastern and southern Africa. We found that the genomes of the two migratory phenotypes lack clear differences except for three highly differentiated regions located on chromosomes 1, 3, and 5 (containing 146, 135, and 53 genes, respectively). Within each migratory phenotype we found virtually no differences in allele frequencies for thousands of SNPs, even when comparing geographically distant populations breeding in Scandinavia and Far East Russia (>6000 km). In each of the three differentiated regions, multidimensional scaling‐based clustering of SNP genotypes from more than 1100 individuals demonstrates the presence of distinct haplotype clusters that are associated with each migratory phenotype. In turn, this suggests that recombination is absent or rare between haplotypes, which could be explained by inversion polymorphisms. Whereas SNP alleles on chromosome 3 correlate with breeding altitude and latitude, the allele distribution within the regions on chromosomes 1 and 5 perfectly matches the geographical distribution of the migratory phenotypes. The most differentiated 10 kb windows and missense mutations within these differentiated regions are associated with genes involved in fatty acid synthesis, possibly representing physiological adaptations to the different migratory strategies. The ∼200 genes in these regions, of which several lack described function, will direct future experimental and comparative studies in the search for genes that underlie important migratory traits.

Keywords: Divergent chromosome region, local adaptation, migration


Impact Summary.

How animals find their way when migrating between continents is one of the most fascinating phenomena in nature. It is well established that migratory behavior has a strong genetic basis in many bird species, and different routes and wintering areas are also likely to select for adaptations related to optimal migratory performance, such as changes in physiology. However, virtually nothing is known about the specific genes underlying these traits. Here we aim to detect migration genes by contrasting the genomes of two recently diverged populations of a small migratory songbird, the willow warbler, which are very similar in appearance but differ markedly in migration routes and wintering areas in Africa. By assaying variation throughout the genomes of 18 willow warblers, and a fraction of the genome in >1100 samples, we found very few highly differentiated loci between the populations and extremely low genome‐wide differentiation even across samples collected over vast geographical distances. The few highly differentiated loci were almost exclusively localized within three large regions on chromosomes 1, 3, and 5, containing 146, 135, and 53 genes, respectively. The genetic variation on chromosome 3 was associated with breeding altitude and latitude, but not with the migratory phenotypes. The genetic variation on chromosomes 1 and 5 perfectly matched the geographical distribution of the migratory phenotypes, and these regions are thus likely to contain genes important in shaping migratory traits. We find some evidence of divergent selection in genes related to fatty acid synthesis, which could represent physiological adaptations to the different migratory routes of the willow warblers. Future studies should investigate the genes within these regions in other closely related but differentially migrating bird populations. These analyses will address whether these large migration‐related genomic regions are unique to the willow warbler or whether the genes within them have a general effect on migration in birds.

Introduction

The seasonal migration between breeding and wintering areas, and in particular those journeys across continents undertaken by various animals such as birds and insects, represents one of the most fascinating phenomena in nature. Selective breeding and displacement experiments have clearly shown that the migratory behavior in songbirds must be encoded as an innate set of migratory directions and a schedule that provides sufficient information to reach a specific wintering area (Perdeck 1958; Berthold 1990; Helbig 1996). In addition to differences in timing and direction, differences in migratory routes and wintering areas may also select for several morphological and physiological adaptations (Pulido 2007). Even though migration‐related adaptations have generally been shown to have a strong genetic component, their underlying genetic architecture remains largely unknown (Liedvogel et al. 2011). Studies of birds using candidate gene approaches or genomic scans have found associations of several genes or genomic regions with differentially migrating populations (Mueller et al. 2011; Lundberg et al. 2013; Ruegg et al. 2014; Delmore et al. 2016) but there is so far no association that has been consistently shared across studies.

In birds, seasonal long‐distance migration has evolved and been lost repeatedly over the course of evolution (Rolland et al. 2014), but as long as we have not identified the actual genes underlying migratory traits, we cannot address whether these apparent parallel changes involve the same or lineage‐specific sets of genes. Migration genes could, in principle, be identified by comparing the genomes of closely related populations that significantly differ in their migratory strategy, such as differences in migratory directions exhibited across migratory divides. In this case different adaptations in migratory traits in each of the populations would be expected to lead to excessive allele frequency differences in the underlying loci as compared to the genome on average. Such excessive differences are predicted to be easier to detect when the background differentiation is low (Crawford and Nielsen 2013).

The willow warbler Phylloscopus trochilus and its two subspecies offer an excellent model for exploring the genetics of migratory traits. The subspecies breeding in western Europe (P. t. trochilus) migrates SW to wintering areas in West Africa, while birds breeding in northern and eastern Europe (P. t. acredula) migrate SSE to wintering areas in Eastern and Southern Africa (Bensch et al. 2006). Plumage color and size were originally used to describe the subspecies but these traits show extensive variation and overlap between the subspecies (Bensch et al. 2009). As differences in migratory strategy clearly separate both subspecies, we will below refer to them as southern (trochilus) and northern (acredula) migratory phenotypes. Previous studies have indicated that the willow warbler migratory phenotypes are extremely similar genetically (Lundberg et al. 2011; Lundberg et al. 2013). The lack of other apparent phenotypic and ecological differences suggests that genetic differences between the migratory phenotypes are likely to be enriched for adaptations associated with their different migratory strategies.

The most comprehensive study characterizing genetic variation in the willow warbler so far used transcriptome sequencing to identify sequence differences between the migratory phenotypes (Lundberg et al. 2013). Only a small fraction of SNPs, which clustered in three regions on chromosomes 1, 3, and 5, was found to be highly differentiated between the two phenotypes. However, the low number of samples from each migratory phenotype, the pooling of samples prior to sequencing, and the large variation in coverage between different genes made it difficult to quantify allele‐frequency differences between the populations and hence detect potential migration genes. Here, we expanded on this approach by de novo assembling a reference genome for the willow warbler, performing whole‐genome resequencing of nine individuals from each migratory phenotype, and by designing a 6000 SNP array based on willow warbler transcriptome and restriction‐associated DNA (RAD) sequencing to explore genetic variation in more than 1100 willow warblers from Northern Europe and Asia. This comprehensive dataset allowed us to (i) precisely delineate the location and size of the chromosome regions that contained highly differentiated loci in previous studies, (ii) identify further sequence differences between the migratory phenotypes, (iii) quantify the association of these differences with phenotypic traits, including stable isotope ratios in feathers, which are a proxy for migratory direction, and (iv) detect structural genomic differences between the migratory phenotypes.

The combined data from whole‐genome resequencing and the SNP array demonstrated that the genomes of the migratory phenotypes were virtually undifferentiated except for three differentiated chromosome regions (4.0–13.1 Mb) that previously had been shown to contain peaks of high differentiation in the transcriptome sequencing study (Lundberg et al. 2013). Here, we show that each of the differentiated regions contained two clusters of highly divergent haplotypes, whose frequencies for the regions on chromosome 1 and 5 matched perfectly with the geographic distribution of the migratory phenotypes.

Methods

DE NOVO GENOME SEQUENCING AND ASSEMBLY

We created a de novo assembly of the willow warbler genome in order to obtain a more unbiased reference than directly relying on the zebra finch genome, which had been the case in the previous transcriptome study, and to be able to efficiently map and assay variation in whole‐genome resequencing data. For de novo genome sequencing we used DNA from a single northern male caught in Stordalen, Northern Sweden (68.3°N, 19.1°E) in 2001. This DNA had been extracted from blood using a phenol‐chloroform protocol (Sambrook et al. 1989). Four different libraries were prepared with targeted insert sizes of 180 bp, 650 bp, 3 kb, and 7 kb, respectively, and sequenced on a Hiseq 2500 (Illumina, CA, USA). An assembly was constructed from 147 Gb trimmed sequence data using ALLPATHS‐LG (Gnerre et al. 2011) and subsequent gap closure and contamination removal (Supplementary methods). To arrange the scaffolds in a putative order along chromosomes in the willow warbler genome, we aligned them to the genomes of the zebra finch, collared flycatcher, and chicken using SatsumaSynteny (Grabherr et al. 2010).

ANNOTATION

A preliminary set of protein‐coding gene models was synthesized in MAKER (Holt and Yandell 2011) using gene predictions from augustus (Stanke et al. 2006) with chicken parameters and evidence from previous RNA sequence data (Lundberg et al. 2013) and protein and transcript sequences from other bird species (Supplementary methods). The gene models were functionally annotated based on homology to other vertebrate proteins or presence of known functional domains. Gene models that did not show homology to other proteins or lacked assignment to any functional domains were discarded.

WHOLE GENOME RESEQUENCING

Nine samples from each migratory phenotype were used for whole‐genome resequencing (Table S1). For each individual we sequenced libraries with a targeted insert size of 650 bp on a Hiseq2500 (Illumina, CA, USA), and mapped the reads to the reference genome using bwa mem (Li and Durbin 2009). Following removal of read duplicates using picardtools (http://broadinstitute.github.io/picard/), we used freebayes (Garrison and Marth 2012) to call genotypes from the filtered whole‐genome resequencing data, and a combination of vcftools (Danecek et al. 2011) and bcftools (Li 2011) to filter the raw genotype dataset based on SNP quality score, overlap with annotated repeats, site, and sample coverage and missingness of genotypes per site (Supplementary methods).

SNP ARRAY

A SNP array was designed prior to the willow warbler genome assembly to genotype 6000 SNPs in 1152 samples. These samples were primarily collected across the hybrid zones between the migratory phenotypes in central Sweden (N = 702) and in Poland/Lithuania (N = 106) but also included reference samples from each migratory phenotype from southern Scandinavia and Scotland (N = 111), Northern Scandinavia, Finland, and Estonia (N = 192) (Table S2). Our dataset further included four samples each from Yekaterinburg and Anadyr in Russia. This data allowed us to accurately estimate allele frequencies in allopatric populations and the change in allele frequencies across both of the hybrid zones.

The majority of probes (N = 5839) on the SNP array was designed from transcriptome reads (Lundberg et al. 2013) mapped to the zebra finch genome. We used a customized pipeline to extract and filter potential probe sequences surrounding each SNP. The remaining probes were manually designed from restriction‐associated DNA (RAD) data mapped to the zebra finch genome, from Sanger sequences from previous studies (Lundberg et al. 2011; Lundberg et al. 2013), or included as replicates. The raw array data was quality‐trimmed based on minor allele frequency and missingness per locus and genotype. Once the willow warbler genome assembly had been completed, we included only SNPs that had a probe sequence that could be successfully mapped to the assembly. With the filtered genotype data, we explored genetic structure among the sample locations using a PCA‐based clustering with smartpca in the Eigensoft package (Patterson et al. 2006) and by calculating FST (Weir and Cockerham 1984) in hierfstat (Goudet 2005).

POPULATION GENOMICS ANALYSES

For the whole‐genome resequencing data we estimated genetic differentiation and diversity within and between migratory phenotypes for each variant (SNPs and indels) and for biallelic SNPs in windows of 10 kb (see Supplementary methods for further details). The window estimates were filtered based on their number of callable sites, that is the number of sites for which a polymorphism could have been detected following variant filtering criteria (e.g., coverage). Genetic differentiation was quantified as FST (Weir and Cockerham 1984) in vcftools, with a weighted average over 10 kb windows. We further explored which genes and gene functions were associated with variants with FST≥0.7, the top 1% most differentiated windows and array SNPs with a FST≥0.1. Individual variants were annotated by using SNPeff (Cingolani et al. 2012) and the position of 10 kb windows relative to annotated genes was determined using bedtools (Quinlan and Hall 2010).
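The distinction between a weighted window estimate and a plain mean of per‐SNP values can be illustrated with a minimal Python sketch. It uses Hudson's FST estimator as a simpler stand‐in for the Weir and Cockerham estimator computed in vcftools for the study; the function names and the toy allele frequencies are illustrative assumptions, not values from the data.

```python
def hudson_components(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's FST for one biallelic SNP.

    p1, p2: frequency of the same allele in populations 1 and 2.
    n1, n2: number of sampled chromosomes (2 x individuals for diploids).
    NOTE: a stand-in for the Weir and Cockerham estimator used in the study.
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def window_fst(snps):
    """Return (weighted, unweighted) FST for the SNPs in one window.

    snps: list of (p1, n1, p2, n2) tuples.
    Weighted = ratio of summed numerators to summed denominators;
    unweighted = mean of per-SNP ratios. Monomorphic sites (den == 0) are skipped;
    a window without any polymorphic site would raise ZeroDivisionError.
    """
    parts = [hudson_components(*s) for s in snps]
    parts = [(num, den) for num, den in parts if den > 0]
    weighted = sum(num for num, _ in parts) / sum(den for _, den in parts)
    unweighted = sum(num / den for num, den in parts) / len(parts)
    return weighted, unweighted


# Toy window: one fixed difference and one SNP at equal frequency,
# with nine diploid individuals (18 chromosomes) per group.
toy_window = [(1.0, 18, 0.0, 18), (0.5, 18, 0.5, 18)]
weighted, unweighted = window_fst(toy_window)
```

The ratio‐of‐sums form gives less influence to SNPs with little total variation, which is why window estimates are usually reported in weighted form.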

GENETIC STRUCTURE WITHIN DIFFERENTIATED REGIONS

We used a multidimensional scaling (MDS)‐based method in the R package invclust (Caceres and Gonzalez 2015) to visualize the genetic structure between southern and northern haplotypes (i.e., haplotypes most common in either of the migratory phenotypes) in each differentiated region. With very restricted or absent recombination between southern and northern haplotypes, samples will fall into one of three equidistant clusters, with homozygous carriers of southern (SS) and northern (NN) haplotypes at each end, respectively, and heterozygous carriers (NS) clustering in the middle. This clustering is conceptually similar to the genetic structure of two genetically distinct populations that hybridize but only give rise to F1 hybrids, which form a distinct equidistant cluster in between the parental populations. The method was developed to identify inversion polymorphisms in humans, that is regions of reduced recombination between distinct groups of haplotypes, by limiting the analysis to particular chromosome regions, and to classify individuals as carriers of collinear or inverted haplotypes. For this analysis, SNP genotypes within each differentiated chromosome region were analyzed jointly to provide coordinates of each sample along the two major axes of variation.
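The expectation of three equidistant clusters can be reproduced with a small Python sketch of classical MDS. This is not the invclust implementation; it is a generic metric MDS on Euclidean distances between 0/1/2‐coded genotype vectors, and the toy genotype matrix (perfectly linked SNPs, no recombination) is an assumption made for illustration.

```python
import numpy as np


def classical_mds(genotypes, n_axes=2):
    """Classical (metric) MDS on Euclidean distances between genotype vectors.

    genotypes: individuals x SNPs array coded 0/1/2 (copies of one allele).
    Returns an individuals x n_axes array of coordinates.
    """
    g = np.asarray(genotypes, dtype=float)
    # Squared Euclidean distance between every pair of individuals.
    diff = g[:, None, :] - g[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # Double-centre the squared distances to obtain the Gram matrix.
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Coordinates are eigenvectors scaled by the square root of their eigenvalues.
    evals, evecs = np.linalg.eigh(gram)
    top = np.argsort(evals)[::-1][:n_axes]
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))


# Toy region with four perfectly linked SNPs: two SS, two NS, two NN individuals.
toy = np.array([
    [0, 0, 0, 0], [0, 0, 0, 0],   # SS homozygotes
    [1, 1, 1, 1], [1, 1, 1, 1],   # NS heterozygotes
    [2, 2, 2, 2], [2, 2, 2, 2],   # NN homozygotes
])
axis1 = classical_mds(toy)[:, 0]
```

In this idealized case the heterozygotes sit exactly midway between the two homozygote clusters on the first axis; recombinant haplotypes would instead fill in the space between clusters.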

Once samples had been assigned a genotype of southern and northern haplotypes (SS, NS, or NN) for each of the three divergent regions, we investigated the geographical distribution of each haplotype within each chromosome region. Specifically, we fitted geographical clines to the frequency of the northern haplotypes at each sampling site using the R package HZAR (Derryberry et al. 2014). We also examined whether the migratory phenotypes are more strongly associated with variation in any of the three differentiated regions, and more closely associated with these regions than other phenotypic traits (e.g., size measurements) that show some difference between the willow warbler subspecies. For this purpose we calculated the frequency of the northern haplotypes for each geographical sampling site and differentiated chromosome regions, as well as the sampling site mean of the following routinely measured phenotypic traits: the ratio of stable nitrogen isotopes in feathers (proxy for migratory phenotype), wing length, tarsus length, bill‐head length, and a color score (1–9) quantifying the whiteness on the breast relative to three reference specimens (see Bensch et al. 2009 for details). Because of the high correlation between the three length measurements we combined them into a single size measurement using the first principal component (58% of variation) from a PCA. We restricted the analyses to males (N = 1029, Table S3) as females differ from males in size measurements. The strength of the relationship between allele frequency and phenotypes was quantified using a Pearson correlation coefficient in R. These analyses are based on mean values per sampling sites rather than on the individual samples because the migratory trait (measured from feather stable isotopes) shows large individual variation within each migratory phenotype (Bensch et al. 2006). 
However, this approach is accurate enough to assign a population sample to either of the two different African wintering areas, or to a population with either mixed or intermediate migration strategies (Bensch et al. 2009).
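The site‐level comparison described above can be sketched in Python as follows. The study used R for the correlations; the function names, the site labels, and all numerical values in the toy data are invented for illustration only.

```python
import numpy as np


def northern_frequency(genotypes):
    """Frequency of the northern haplotype among SS/NS/NN genotype calls."""
    copies = {"SS": 0, "NS": 1, "NN": 2}
    return sum(copies[g] for g in genotypes) / (2 * len(genotypes))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


# Hypothetical sampling sites: genotype calls for one region and the site
# mean of a phenotypic trait (values are made up, not from the study).
sites = {
    "site_A": (["SS", "SS", "SS", "NS"], 7.9),
    "site_B": (["SS", "NS", "NS", "NN"], 9.1),
    "site_C": (["NN", "NN", "NS", "NN"], 10.4),
}
freqs = [northern_frequency(calls) for calls, _ in sites.values()]
trait_means = [mean for _, mean in sites.values()]
r = pearson_r(freqs, trait_means)
```

Working with one frequency and one trait mean per site, rather than with individuals, mirrors the reasoning in the text: individual isotope values are noisy, but site means are informative.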

GENETIC VARIATION WITHIN DIFFERENTIATED CHROMOSOME REGIONS

The pattern of genetic variation within each group of northern and southern haplotypes combined with the genetic divergence between them provides additional information about the effect of selection, recombination rate, and time since the haplotype groups separated. For example, reduced variation in these regions compared to the rest of the genome could be indicative of a low recombination rate, which would make the variation‐reducing effect of background selection more pronounced. Similarly, a low recombination rate could also lead to reduced variation by increasing the magnitude of selective sweeps arising from positive selection. We performed these analyses by dividing the resequencing data into two sets of individuals that were consistently southern (N = 8) or northern homozygous (N = 6) for all three differentiated chromosome regions (Table S1). This division was based on the same MDS approach as used above, but restricted to a subset of array SNPs that had been genotyped in the resequencing data. Once the resequenced samples had been assigned to groups of pure northern and southern individuals, we used vcftools to estimate nucleotide diversity and Tajima's D, and customized perl scripts to calculate dXY and dA in windows of 10 kb. We tested for significance between mean genetic variation estimates for each of the chromosome regions and the rest of the genome using an ANOVA, followed by Tukey's HSD test for pairwise significance between estimates, in R (R Development Core Team 2014).
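How nucleotide diversity, dXY, and dA relate to one another within a window can be sketched in Python. The study used vcftools and custom perl scripts; the functions below are a simplified assumption based on the standard per‐site definitions for biallelic SNPs, with dA taken as dXY minus the mean within‐group diversity.

```python
def site_pi(alt_count, n_chrom):
    """Per-site nucleotide diversity from an allele count (unbiased estimator)."""
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def site_dxy(alt1, n1, alt2, n2):
    """Per-site probability that two chromosomes, one per group, differ."""
    p1, p2 = alt1 / n1, alt2 / n2
    return p1 * (1 - p2) + p2 * (1 - p1)


def window_stats(sites, callable_sites):
    """Summarize one window.

    sites: list of (alt1, n1, alt2, n2) for the variable sites in the window.
    callable_sites: number of sites where a variant could have been called;
    invariant callable sites contribute zero to every sum.
    """
    pi1 = sum(site_pi(a1, n1) for a1, n1, _, _ in sites) / callable_sites
    pi2 = sum(site_pi(a2, n2) for _, _, a2, n2 in sites) / callable_sites
    dxy = sum(site_dxy(*s) for s in sites) / callable_sites
    return {"pi1": pi1, "pi2": pi2, "dxy": dxy, "da": dxy - (pi1 + pi2) / 2}


# Toy window: one fixed difference between 8 southern (16 chromosomes)
# and 6 northern (12 chromosomes) homozygotes, among 1000 callable sites.
stats = window_stats([(16, 16, 0, 12)], callable_sites=1000)
```

A fixed difference raises dXY and dA without adding to diversity within either group, whereas a polymorphism shared at equal frequency raises dXY and both diversities together and leaves dA near zero.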

BREAKPOINT ANALYSES AND DETECTION OF STRUCTURAL VARIANTS

We used delly version 0.65 (Rausch et al. 2012) with default settings to identify structural variants from discordantly aligned read pairs and split reads. Structural variant calls were further quality filtered using the delly script populationFilter.py.

Breakpoints could potentially be located in nonassembled parts of the genome between the scaffolds at the ends of the regions and adjacent scaffolds. We therefore designed primers that could amplify across gaps between the ends of the regions and their neighboring scaffolds as determined from alignments to other bird genomes. For this purpose, we used a long‐range PCR kit (Qiagen, CA, USA) following instructions from the manufacturer. For the PCR assays we used DNA from two northern samples (01L/19 and 01L/20) and one southern sample (01P/02) that were included in the whole‐genome sequencing (Supplementary methods).

Results

GENOME ASSEMBLY

The final assembly comprised 5996 scaffolds with a scaffold N50 of 3.2 Mb and total length of 1.07 Gb. Most of the larger scaffolds (N = 956, sum = 1.03 Gb) could at least partially be mapped to the genomes of chicken, zebra finch, and collared flycatcher, and many of them mapped to all three genomes (N = 667, sum = 994 Mb). Whole‐genome alignments demonstrated high conservation in chromosomal position between these species and suggested few large‐scale changes in genome structure that were unique to the willow warbler.
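For readers unfamiliar with the scaffold N50 statistic quoted above, a minimal Python sketch of its usual definition follows: the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings the running sum to >= 50%.
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one scaffold length")


# Toy example: total length 20, half is 10; 6 + 5 = 11 reaches it.
toy_n50 = n50([2, 3, 4, 5, 6])
```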

WHOLE‐GENOME RESEQUENCING

With ∼9× coverage whole‐genome resequencing data (excluding sequence duplicates) of nine individuals per migratory phenotype mapped to the assembly, we were able to detect 40 million SNPs or indels. The vast majority of variants (35.7 million) were biallelic SNPs, of which 14.2 million (40%) were present as singletons. Overall genetic differentiation between the migratory phenotypes was very low with a total weighted average FST of 0.007 among biallelic SNPs and a mean FST among nonoverlapping 10 kb windows of 0.0107 (95% CI: 0.0105–0.0109). Highly differentiated variants were clustered in three regions on chromosomes 1, 3, and 5 (Fig. 1), comprised of two (11.6 Mb), one (13.1 Mb), and at least 10 scaffolds (4.0 Mb), respectively (Fig. S1). Together, these regions comprise approximately 2.7% of the assembled genome. The highly differentiated regions on chromosomes 3 and 5 were located close to predicted centromeres based on synteny to the zebra finch and chicken genomes, while on chromosome 1, the centromere was predicted to be ∼40 Mb further downstream of the differentiated region.
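The two percentages in this paragraph follow directly from the reported figures, as the short Python check below shows. It assumes the total assembly length of 1.07 Gb given in the genome assembly results; the variable names are illustrative.

```python
# Region sizes reported in the text, in Mb.
region_mb = {"chr1": 11.6, "chr3": 13.1, "chr5": 4.0}
assembly_mb = 1070.0  # 1.07 Gb total assembly length

# Share of the assembly covered by the three differentiated regions.
region_percent = 100 * sum(region_mb.values()) / assembly_mb

# Share of biallelic SNPs (35.7 million) that are singletons (14.2 million).
singleton_percent = 100 * 14.2 / 35.7
```

Rounded, these give about 2.7% of the assembly and about 40% singletons, matching the values stated above.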

Figure 1.

Genetic differentiation (FST) across scaffolds ordered according to alignments to zebra finch chromosomes. Scaffolds have been assigned to a particular chromosome if the mapped interval spans at least 40% of their length and have otherwise been concatenated into an unplaced chromosome (Un). Top: estimates for 4063 SNPs included on the array, calculated between 111 southern and 181 northern individuals. Bottom: weighted average for nonoverlapping 10 kb windows between nine southern and nine northern resequenced individuals.

Only 33 SNPs and indels with FST≥0.7 were found outside the three differentiated chromosome regions and located on 29 scaffolds (Table S4), compared to 10,934 found in the three regions. The top 1% of the most differentiated filtered 10 kb windows (N = 690) were restricted to the differentiated chromosome regions. The most differentiated window outside the differentiated regions (FST = 0.39) was located in a scaffold that maps further downstream on chromosome 3 in the zebra finch. In the mitochondrial scaffold we identified 223 variable sites, all of which were undifferentiated (FST = 0) between the migratory phenotypes.

SNP ARRAY DATA

We obtained genotypes of 4063 SNPs from 1108 individuals sampled from a total of 74 locations with good representation of both migratory phenotypes and putative hybrids (Table S2). Overall genetic differentiation between the migratory phenotypes was very low, with a weighted average FST of 0.0157. Highly differentiated SNPs were extremely few (e.g., FST≥0.5, N = 24) and localized in the three differentiated chromosome regions identified in the resequencing data (Fig. 1). The highest FST estimate outside of the differentiated chromosome regions was 0.03. The PCA‐based population clustering separated migratory phenotypes along the first principal component, which was driven by variation in the three differentiated chromosome regions (Fig. 2). The second principal component separated samples within migratory phenotypes based on their genotypes in the differentiated region on chromosome 3, which from the top to bottom represent samples homozygous for the southern haplotype, heterozygotes, and homozygous northern haplotypes, respectively. The SNP array data further show that the clustering of genotypes of willow warblers in Northern Europe and Asia is driven by three highly differentiated chromosome regions (Fig. 2). If these regions are not considered, there is no apparent genetic difference between even the most distantly located sample sites in Britain and Far Eastern Russia (>6000 km apart) (Fig. 2).

Figure 2.

Principal component‐based clustering of genotypes from 4063 SNPs on the array. (A) Only samples outside of the hybrid zones (N = 416). (B) All samples (N = 1108). (C) Clustering of samples outside of the hybrid zones only including SNPs outside the divergent chromosome regions (N = 3777). (D) All samples clustered based on SNPs only outside of the divergent chromosome regions.

DETECTION OF STRUCTURAL VARIATION

Over the whole genome we identified 4817 deletions (median length: 1018 bp, range: 501–2,111,000 bp), 774 duplications (median length: 1682 bp, range: 505–4,511,000 bp), 449 inversions (median length: 1896 bp, range: 501–5,181,000 bp) and 487 translocations between scaffolds. The most differentiated structural variants (FST≥0.7, N = 4) were limited to the differentiated regions on chromosomes 1 and 5. However, none of the structural variants coincided with the ends of the differentiated regions. For the differentiated regions on both chromosome 1 and chromosome 3, long‐range PCR resulted in amplification across the gap formed by one of the ends and an adjacent scaffold, but failed at the other end. We were only able to get PCR products of the expected gap sizes in the two northern samples, but in this case the amplification had been unspecific and resulted for each locus in multiple bands on an agarose gel. The sequences of these products could not be recovered with Sanger sequencing, and thus did not allow for a localization of the breakpoints.

GEOGRAPHICAL DISTRIBUTION OF DIFFERENTIATED HAPLOTYPES

We applied a multidimensional scaling (MDS) approach to the SNP array data to genotype the individuals for southern and northern haplotypes in each of the differentiated chromosome regions. This analysis revealed three distinct and equidistant genotype clusters for each chromosome region, with homozygotes for each haplotype group at the ends, and heterozygotes in between (Fig. 3). The clearly defined groups suggest very restricted recombination between haplotypes of each cluster.

Figure 3.

Multidimensional scaling of array genotypes from SNPs located within each of the three divergent chromosome regions. Each genotype cluster is based on the geographical origin of most of its samples and has been assigned as either southern homozygous (SS), northern homozygous (NN), or heterozygous (NS). Symbols referring to the specific locations are listed in the legend, in brief for reference: Sweden/Denmark (green circle), Hybrid Zone Sweden (green plus), Northern Sweden (blue circle), Finland, Norway, and Estonia (purple triangle), Hybrid zone Lithuania/Poland (red plus), Scotland (orange circle), Yekaterinburg (light blue circle), and Anadyr (black circle).

Variation in the regions on chromosomes 1 and 5 shows the strongest correlation with the migratory phenotypes (measured as the site mean of stable nitrogen isotope ratios in feathers) compared to any of the other measured phenotypic traits that show some difference between the subspecies (Fig. S2, Table S5). Also, these correlations are stronger for the regions on chromosomes 1 and 5 compared to chromosome 3, and remained consistent when analyzed separately for the sampling sites in Sweden and east/south of the Baltic, that is along transects intersecting each of the two migratory divides located in central Scandinavia and eastern Poland, respectively (Fig. 4). In contrast, the geographical distribution of northern alleles in the region on chromosome 3 is more associated with high altitude and latitude environments (Fig. 4).

Figure 4.

Geographical distribution of southern (green) and northern (blue) haplotypes for each of the three divergent chromosome regions. White lines represent approximate borders between the distributions of the migratory phenotypes. The upper panel represents the North European distribution and the lower panel distribution found at two Russian sites. The breeding distribution of the species covers most of Eurasia north of 40 degree latitude.

All three chromosome regions showed sigmoid frequency clines of southern and northern haplotypes across the hybrid zone in central Sweden (Fig. S3, Table S6) that were narrow (271–688 km) compared to the width expected under neutral diffusion given a dispersal distance of 50 km and 100 generations since secondary contact (Bensch et al. 1999). The migratory phenotype‐specific regions on chromosomes 1 and 5 showed a similar pattern in the hybrid zone in Poland and Lithuania with cline widths of 373 and 249 km, respectively. Many individuals in the hybrid zones were heterozygous for both of the regions on chromosomes 1 and 5, which suggests that hybridization between migratory phenotypes is frequent. All other possible genotype combinations were also present, which is suggestive of later generation hybrids and backcrosses (Fig. S4).
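The comparison with neutral diffusion can be made concrete with a short Python sketch. It assumes the simple sigmoid cline without tails, in which width is the inverse of the maximum slope, and the commonly used diffusion approximation for a neutral cline after secondary contact, w ≈ √(2π)·σ·√t. Whether the authors used exactly this approximation is not stated in this section, and the function names are illustrative.

```python
import math


def sigmoid_cline(x_km, center_km, width_km):
    """Expected haplotype frequency at position x along a transect.

    Width is defined as 1 / (maximum slope), so the slope at the
    cline centre equals 1 / width_km.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * (x_km - center_km) / width_km))


def neutral_cline_width(sigma_km, generations):
    """Approximate width of a neutral cline under diffusion.

    ASSUMPTION: w = sqrt(2 * pi) * sigma * sqrt(t), with sigma the
    per-generation dispersal distance and t the number of generations
    since secondary contact.
    """
    return math.sqrt(2.0 * math.pi) * sigma_km * math.sqrt(generations)


# Dispersal distance and time since contact as given in the text.
expected_neutral_width = neutral_cline_width(sigma_km=50.0, generations=100)
```

Under these assumptions the neutral expectation is roughly 1250 km, so the observed widths of 249–688 km are considerably narrower.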

GENETIC VARIATION WITHIN DIFFERENTIATED CHROMOSOME REGIONS

We characterized the pattern of genetic variation in and between each of the southern and northern haplotype groups for each divergent region to infer the potential effects of recombination and selection in the regions. Here, we used a subset of resequenced individuals that through MDS clustering had been genotyped as either homozygous northern (N = 6) or southern (N = 8) for all the three regions. Outside of the differentiated chromosome regions, the average nucleotide diversity in 10 kb windows was 0.005 in both migratory phenotypes, which is about twice as high as in other migratory songbirds (Ellegren et al. 2012; Irwin et al. 2016). Within the differentiated regions on chromosomes 1 and 5 the northern migratory phenotype showed slightly higher nucleotide diversity (0.007) than the southern migratory phenotype (0.005) and the genome on average (0.005) (Fig. 5, Fig. S4, Table S5). In the differentiated region on chromosome 3, individuals of both migratory phenotypes showed levels of diversity similar to the rest of the genome. There was a general excess of rare alleles, with an average Tajima's D of –0.87 in both migratory phenotypes. In the differentiated regions, Tajima's D was clearly higher for southern individuals (Fig. 5, Fig. S5, Table S7).

Figure 5.

Genetic variation (from the top: FST for resequencing data, FST for array SNPs, nucleotide diversity (π), Tajima's D, and dXY, respectively) within each of the three divergent chromosome regions (within dashed lines) and the nearest surrounding scaffolds. For nucleotide diversity and Tajima's D, estimates for southern and northern individuals have been marked with green and blue, respectively. With the exception of FST for SNPs on the array, estimates have been summarized for nonoverlapping 10 kb windows and been filtered for a minimum of 5000 callable sites, and in the case of FST, also for a minimum of 25 SNPs. The distance in between scaffolds (solid black lines) has been added for illustrative purposes. Due to the uncertainty regarding which scaffold is located upstream of the region, the plot of chromosome 5 starts with the first scaffold in the divergent region.

Absolute divergence (dXY) between migratory phenotypes was higher on average in the differentiated chromosome regions than in the rest of the genome, most prominently in the regions on chromosomes 1 and 5 (Fig. 5, Fig. S5). Similarly, the net number of nucleotide substitutions (dA) was higher on average in the differentiated regions than in the rest of the genome (Fig. S5, Table S7). The highest window‐based estimates (0.005) were obtained for the regions on chromosomes 1 and 3 and would, using substitution rates from other bird genomes, correspond to a rough divergence time between the haplotype groups of 0.75–1.6 Myr, that is, long before the last glaciation.
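As a hedged illustration of these quantities, the sketch below computes dXY and dA for toy haplotypes and then back-calculates the substitution rates implied by the reported numbers. Two assumptions are made here that the text does not state: dA is taken as dXY minus the mean within-group diversity (the usual definition of net divergence), and divergence time is related to divergence by T = d / (2μ). Under that relation, a divergence of 0.005 and times of 0.75 and 1.6 Myr imply rates of roughly 3.3 × 10⁻⁹ and 1.6 × 10⁻⁹ substitutions per site per year; these are derived values, not rates quoted by the authors. All function names and haplotypes are hypothetical.

```python
from itertools import combinations, product


def count_differences(h1, h2):
    """Number of sites at which two haplotypes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def dxy(group_x, group_y, callable_sites):
    """Mean between-group pairwise differences per callable site."""
    pairs = list(product(group_x, group_y))
    return sum(count_differences(a, b) for a, b in pairs) / len(pairs) / callable_sites


def pi(group, callable_sites):
    """Mean within-group pairwise differences per callable site."""
    pairs = list(combinations(group, 2))
    return sum(count_differences(a, b) for a, b in pairs) / len(pairs) / callable_sites


def net_divergence(group_x, group_y, callable_sites):
    """dA = dXY - (pi_X + pi_Y) / 2 (assumed definition)."""
    within = (pi(group_x, callable_sites) + pi(group_y, callable_sites)) / 2
    return dxy(group_x, group_y, callable_sites) - within


def implied_rate(divergence, time_years):
    """Per-site, per-year rate implied by the assumed relation T = d / (2 * mu)."""
    return divergence / (2 * time_years)


# Toy haplotypes (hypothetical data) in a window with 1000 callable sites.
north = ["0000", "0001"]
south = ["1100", "1101"]
print(dxy(north, south, 1000), net_divergence(north, south, 1000))
print(implied_rate(0.005, 0.75e6), implied_rate(0.005, 1.6e6))
```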

FUNCTIONAL VARIATION

We were able to annotate 17,294 protein‐coding genes based on homology to proteins and transcripts from other birds, RNAseq data and ab initio predictions. The differentiated regions on chromosomes 1, 3, and 5 contained 146, 135, and 53 genes, respectively. The most differentiated windows on chromosome 1 (FST = 0.80) were located 30–50 kb upstream of the FRAS1‐related extracellular matrix protein 2 (FREM2) gene, on chromosome 3 within the estrogen receptor (ESR1) gene and Ryanodine receptor 2 (RYR2) gene (FST = 0.62), and on chromosome 5, 23 kb downstream of the olfactomedin‐like 1 (OLFML1) gene (FST = 0.85) (Table S8). SNPs and indels with FST ≥ 0.7 (N = 10,934) were mostly located several thousand bp or more from the closest gene, or within introns, and thus have unknown functional consequences. A very small subset of the variants (N = 30) was predicted to lead to missense mutations or in‐frame deletions in the divergent regions (Table S4). These mutations were associated with 28 different genes, which were mainly found in the region on chromosome 5 and associated with fatty acid synthesis or olfaction.

Among the most differentiated SNPs included on the array (Table S8), the SNP with the highest differentiation (FST = 0.80), considerably higher than for any other SNP on the array, was located on chromosome 1, 32 kb downstream of the Forkhead Box O1 (FOXO1) gene. On chromosome 3 the highest differentiation (FST = 0.41) was found in an intron of the Regulator of G‐protein signaling 7 (RGS7) gene, and on chromosome 5, the most differentiated SNP was located 22 kb upstream of the olfactomedin‐like 1 (OLFML1) gene (FST = 0.67).

Discussion

By using the willow warbler de novo genome assembly, whole‐genome resequencing and large‐scale genotyping with a SNP array we have obtained a vastly improved resolution of the genomes of the two willow warbler migratory phenotypes. With these data we were able to identify two chromosome regions that are more strongly associated with the migratory phenotypes than with other measured traits and one chromosome region that has genetic variation correlated with altitude and latitude. Each of these chromosome regions shows a genetic structure that suggests recombination suppression from inversion polymorphisms and contrasts with the overall genome, which is extremely similar across the full distribution range of the species.

POPULATION STRUCTURE OF WILLOW WARBLERS

A striking result is the contrasting pattern found between the three highly differentiated chromosome regions and the otherwise virtually undifferentiated genome (Fig. 1). The absence of differentiation for most of the genome is also apparent across the full breeding distribution range of the species. For example, the 3,777 loci on the SNP array that were from genomic regions outside of the differentiated regions could not distinguish between samples from Scotland and eastern Siberia, even though the populations are located more than 6000 km from each other (Fig. 2). The overall lack of population structure is also reflected in the mitochondrial genome, which typically differentiates faster between geographically isolated populations than nuclear loci (Avise et al. 1987; Zink and Barrowclough 2008). Despite being able to assess genetic variation for the entire mitochondrial genome, we were not able to detect any sequence differences between the migratory phenotypes. This absence of population structure in most of the genome is likely the result of a recent range expansion involving large effective population sizes and facilitated by high gene flow among expanding populations, although their relative effects remain to be investigated. However, the overall homogeneous pattern strongly contrasts with the distinct geographic distributions of the haplotypes at the three differentiated chromosome regions, patterns that are unlikely to be maintained unless selection is operating. The high divergence times separating the haplotype groups (estimated to be 0.75–1.6 million years) suggest that they must have existed together during several glacial episodes and raise further questions about their origin.

In contrast to the willow warbler, two differentially migrating subspecies of Swainson's Thrush Catharus ustulatus and two species of Ficedula flycatchers show much more elevated levels of genomic background differentiation and a larger number of highly differentiated regions (Ellegren et al. 2012; Ruegg et al. 2014). The lack of differentiation over most of the genome suggests that the willow warbler migratory phenotypes are at a very early stage of divergence, which is reminiscent of the pattern seen between hooded crows Corvus cornix and carrion crows Corvus corone where low genome‐wide differentiation contrasts with a 2 Mb highly differentiated region on chromosome 18 (Poelstra et al. 2014). Among migratory birds, differentially migrating barn swallow Hirundo rustica populations in Europe show comparable levels of genome‐wide differentiation to the willow warblers (von Rönn et al. 2016). Further worthy of note is that the Z chromosome, which is believed to play an important role in reproductive isolation among birds (Price and Bouvier 2002; Qvarnström and Bailey 2009), did not show any evidence of increased differentiation between the migratory phenotypes. Finally, the early stage of divergence between the migratory phenotypes is further supported by a lack of assortative mating (Liedvogel et al. 2014) and the extensive mixing in the hybrid zones including putative backcrosses and later generation hybrids (Fig. S4).

ASSOCIATION BETWEEN GENETIC VARIATION AND MIGRATORY PHENOTYPES

We did not find any differences related to previously suggested candidate genes or chromosome regions for migration, such as ADCYAP1 in blackcaps (Mueller et al. 2011) or a large region (∼30 Mb) on chromosome 4 identified in the Swainson's Thrush (Delmore et al. 2016). With the new genomic dataset we were able to precisely delineate the three regions located on chromosomes 1, 3, and 5 that in previous studies had been shown to contain loci with exceptionally high differentiation (Lundberg et al. 2011; Lundberg et al. 2013). Each differentiated region spans 4.0 to 13.1 Mb and contains between 53 and 146 protein‐coding genes. Genotypes from a large number of samples around the Baltic Sea suggest that the spatial distribution of alleles within the regions on chromosomes 1 and 5 mirrors the geographical breeding distribution of the migratory phenotypes (Fig. 4). This relationship is further strengthened by correlations between allele frequency and mean phenotypic measurements at each site (Fig. S2, Table S5). Both chromosome regions show the strongest correlation with stable nitrogen isotope ratios in feathers, which is a proxy for wintering area. Importantly, this strong association with migratory phenotype is also observed when restricting the analysis to sites on the eastern/southern side of the Baltic Sea, where the phenotypic traits are known to differ more in cline shape than on the western side (Bensch et al. 2009). The region on chromosome 3 also shows high differentiation between the migratory phenotypes across the migratory divide in Sweden, albeit lower than in the other two regions. However, it is not correlated with the migratory phenotypes across the migratory divide located on the eastern/southern side of the Baltic. Instead, the variation within this chromosome region shows the highest correlation with altitude and latitude (Fig. 4). This environmental correlation was previously demonstrated for a single biallelic marker located within this chromosome region (Larson et al. 2014).

GENETIC STRUCTURE WITHIN DIFFERENTIATED REGIONS

The large size of the differentiated regions, each comprising several Mb and containing many coding genes, is intriguing. “Genomic islands of differentiation” have been attributed to processes such as background or positive selection in regions of low recombination, for example close to centromeres (Cruickshank and Hahn 2014), which will lead to decreased within‐population diversity and possibly to increased relative differentiation (e.g., FST), but not increased absolute divergence (dXY). These processes have been associated with the formation of large differentiated regions in several lineages of birds (Burri et al. 2015; Irwin et al. 2016). Indeed, two of the regions (on chromosomes 3 and 5) correspond to regions of the zebra finch genome located close to, or overlapping with, centromeres (Fig. S1). However, we did not observe any clear reduction in nucleotide diversity or excess of rare alleles within the divergent regions in either migratory phenotype, suggesting that the recombination rate (within the migratory phenotypes) is not reduced in these regions. All three regions showed elevated mean absolute divergence (dXY) between migratory phenotypes compared to the average of the genome, and this was most pronounced for the regions on chromosomes 1 and 5. Taken together, these patterns suggest that it is unlikely that low recombination rate combined with background or positive selection is the main driving process behind the differentiated regions. The strong “block‐like” inheritance of southern and northern alleles rather suggests the presence of inversion polymorphisms, which would result in limited recombination between inverted and noninverted haplotypes, but not within each group of haplotypes.

The lack of recombination between southern and northern haplotypes is strongly supported by the well‐defined and equidistant genotype clusters (Fig. 3). Similar patterns have been observed in some experimentally validated inversions in humans (Caceres and Gonzalez 2015) as well as in PCA‐based clustering of SNP genotypes in an experimentally validated inversion on chromosome 5 in the zebra finch genome (Knief et al. 2016). We were not able to confirm or refute the presence of breakpoints through our analyses of discordantly mapped reads and long‐range PCR across scaffolds. However, since many inversion breakpoints are located within repetitive regions, the applicability of these methods is limited (Lucas Lledo and Caceres 2013).
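Why non-recombining haplotype groups produce three equidistant genotype clusters can be illustrated with a small simulation. The sketch below is a simplified stand-in for the MDS analysis, not a reproduction of it: it simulates diploid genotypes built from two haplotype groups that differ at fixed "diagnostic" SNPs, and compares mean pairwise Manhattan distances between the three genotype classes rather than running MDS itself. The numbers of SNPs, the allele frequency at shared sites and all function names are hypothetical.

```python
import random
from itertools import product


def make_haplotype(group, n_diag, n_shared, rng, freq=0.1):
    """Diagnostic sites are fixed within a group; shared sites vary in both groups."""
    diagnostic = [1 if group == "N" else 0] * n_diag
    shared = [1 if rng.random() < freq else 0 for _ in range(n_shared)]
    return diagnostic + shared


def make_genotype(karyotype, n_diag, n_shared, rng):
    """Diploid genotype (0/1/2 per SNP) from two non-recombining haplotypes."""
    first, second = (make_haplotype(g, n_diag, n_shared, rng) for g in karyotype)
    return [a + b for a, b in zip(first, second)]


def mean_distance(set_a, set_b):
    """Mean Manhattan distance between the genotypes of two clusters."""
    total = sum(sum(abs(x - y) for x, y in zip(a, b))
                for a, b in product(set_a, set_b))
    return total / (len(set_a) * len(set_b))


def simulate(n_per_class=10, n_diag=100, n_shared=50, seed=1):
    """Genotypes for southern homozygotes (SS), heterozygotes (SN) and northern homozygotes (NN)."""
    rng = random.Random(seed)
    return {k: [make_genotype(k, n_diag, n_shared, rng) for _ in range(n_per_class)]
            for k in ("SS", "SN", "NN")}


clusters = simulate()
d_ss_sn = mean_distance(clusters["SS"], clusters["SN"])
d_sn_nn = mean_distance(clusters["SN"], clusters["NN"])
d_ss_nn = mean_distance(clusters["SS"], clusters["NN"])
print(round(d_ss_sn, 1), round(d_sn_nn, 1), round(d_ss_nn, 1))
```

Because heterozygotes carry one copy of each haplotype group, they sit about equally far from both homozygous classes, and the two homozygous classes are the most distant pair. Recombination between the groups would instead create intermediate genotypes and blur the clusters.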

Inversions have recently received attention in evolutionary biology due to their potential for maintaining adaptations despite ongoing gene flow (Hoffmann and Rieseberg 2008). If gene flow is present between two populations that are locally adapted, and certain combinations of alleles at some linked loci are favored in each of them, an inversion spanning these loci would suppress recombination and ensure that the favorable alleles are inherited together, which consequently would drive the inversion to high frequency (Kirkpatrick and Barton 2006). Interestingly, inversion polymorphisms have been associated with migratory phenotypes in cod Gadus morhua (Kirubakaran et al. 2016) and mating strategies in ruff Philomachus pugnax (Lamichhaney et al. 2016) and white‐throated sparrows Zonotrichia albicollis (Thomas et al. 2008). Hence, it is tempting to speculate that several of the ∼200 genes located within the putative inversions on chromosomes 1 and 5 could be associated with the correlated suite of behavioral and physiological traits that are observed in migratory birds (Piersma et al. 2005) and facilitate their coordinated evolution to different migratory routes and wintering areas. The association between variation in the differentiated region on chromosome 3 and high altitude and latitude is also highly interesting as inversion polymorphisms have been associated with environmental gradients in insects (Kapun et al. 2016; Cheng et al. 2012). This is to our knowledge the first time a putative inversion has been associated with an environmental gradient in a bird population.

FUNCTIONAL VARIATION WITHIN DIFFERENTIATED REGIONS

Due to the high linkage disequilibrium found within the differentiated regions, it is difficult to pinpoint specific targets of selection. Nonetheless, in the case of inversions, limited recombination between haplotypes from each cluster may over time homogenize parts of the regions, leaving the locally adapted genes differentiated (Guerrero et al. 2012). To identify potential candidate genes under selection in each region, we explored the position of the most differentiated windows in relation to annotated genes and highly differentiated nonsynonymous mutations. On chromosome 5, one of the most differentiated 10 kb windows overlaps the two fatty acid desaturase genes FADS1 and FADS2 (Table S8), which are involved in the biosynthesis of essential fatty acids and have been associated with putative dietary adaptations in humans (Ameur et al. 2012). In addition, several of the genes associated with the small number of highly differentiated missense mutations were involved in fatty acid synthesis or carbohydrate and fatty acid metabolic processes (Table S4). The most differentiated array SNP on chromosome 1, which shows noticeably higher differentiation than any other in the genome, is located 32 kb downstream of the Forkhead Box O1 (FOXO1) gene (Table S8). This gene encodes a transcription factor that has been demonstrated to play an important role in gluconeogenesis and adipogenesis in mice (Gross et al. 2008). Long‐distance migratory birds primarily use fat as energy and the composition of fatty acids has been associated with flight performance (Price 2010). The migratory phenotypes of the willow warbler differ significantly in the distance they migrate and it is tempting to speculate that these differences represent adaptations in fueling to their different routes.

The most differentiated 10 kb windows and array SNPs on chromosome 3 are located within the RYR2 gene. This gene encodes calcium channels that are involved in heart muscle contraction and has been shown to be under selection in high‐altitude vertebrates (Zhang et al. 2014; Wang et al. 2015). The authors of those studies explained this as an adaptation to hypoxia, which should not be a selective force in willow warblers as the breeding altitudes are typically below 1000 meters, but it is possible that changes within this gene reflect adaptations to increased metabolic demands necessary to cope with the colder climate at higher altitudes and latitudes. One of the two highly differentiated missense mutations in this chromosome region is located within the mitochondrial fission regulator 2 (MTFR2) gene, which is involved in aerobic respiration (Table S4).

CONCLUSION AND FUTURE DIRECTIONS

Our study provides important and previously inaccessible genomic insights into how local adaptations could form and persist between recently diverged and hybridizing taxa, and particularly in the context of migratory divides. To determine whether high differentiation in the three chromosome regions is maintained by inversion polymorphisms will require further evidence, and promising approaches include long read sequencing technologies and optical maps. It is difficult to pinpoint specific targets of selection due to the high linkage disequilibrium in the regions. Nonetheless, the most differentiated intervals and missense mutations within the differentiated regions provide candidate genes involved in migration physiology and cold tolerance in birds, which are poorly understood from a genetic perspective. Future studies on the genomes of other birds showing similar adaptations, but that do not possess the same large differentiated regions, will be useful to further highlight particular genes and pathways that could have been involved in shaping the differentiated regions in the willow warbler.

Editor in Chief: J. Slate

Supporting information

Fig. S1. Order and orientation of scaffolds included in each of the three differentiated regions based on their synteny to the zebra finch, collared flycatcher and chicken genome.

Fig. S2. Correlation between the frequency of northern haplotype in each of the three differentiated chromosome regions and the mean of phenotypic traits (nitrogen isotope ratios, size and color) per sampling site.

Fig. S3. Geographic clines for the northern haplotype frequency across the migratory divides in central Sweden (differentiated regions of chromosomes 1, 3 and 5) and Poland/Lithuania (chromosomes 1 and 5).

Fig. S4. Frequency distribution of the combined genotypes at the differentiated regions of chromosome 1 ‐ chromosome 5 in allopatric populations of the migratory phenotypes (A) and in the two hybrid zones (B).

Fig. S5. Genetic variation within each of the three differentiated chromosome regions compared to the rest of the genome.

Table S1. Samples used for whole‐genome resequencing and RAD sequencing, respectively. Average coverage refers to coverage at positions in the genome that are covered by at least one read from the sample and has been calculated following removal of sequence duplicates.

Table S2. Sampling sites of willow warblers used for the SNP array. All samples were collected from birds caught on breeding territories.

Table S3. Frequency of northern haplotype for each differentiated chromosome region and the mean of phenotypic measurements per sampling site.

Table S4. Highly differentiated missense mutations and in‐frame deletions detected in the three divergent chromosome regions and all highly differentiated SNPs and indels located on scaffolds outside of the divergent regions.

Table S5. Results of correlation analyses between mean phenotypic trait estimates and differentiated chromosome region northern haplotype frequency per sampling site.

Table S6. Estimated parameters for geographical clines of northern haplotypes in each of the three differentiated chromosome regions.

Table S7. Statistical significance from testing differences in the mean 10 kb window estimates of genetic variation (nucleotide diversity, Tajima's D, dXY and dA) among each differentiated chromosome region and the rest of the genome.

Table S8. Annotation of the 10 most differentiated filtered 10 kb windows and array SNPs in each of the divergent chromosome regions. Up‐ and downstream refer to the orientation relative to the transcription of the gene.

Table S9. Primers used for long‐range PCR across the ends of the differentiated regions on chromosomes 1 and 3 and their adjacent scaffolds determined from whole‐genome alignments to other bird genomes (Fig. S1).

AUTHOR CONTRIBUTIONS

M.Lu, S.B., M.G., A.W., and M.Li planned the project. M.Lu, M.Li, S.B., M.G., K.L., and S.Å. collected blood samples in the field. M.Lu designed the SNP array, prepared samples for the SNP array and whole genome sequencing, analyzed the SNP array data and sequence data and drafted the manuscript. M.Li prepared samples for RAD sequencing and was involved in the analysis of RAD sequence data. H.S. assisted in sequence analyses of the divergent chromosome regions. All authors provided comments on the manuscript. The authors declare no conflict of interest.

ACKNOWLEDGMENTS

We are grateful to the Science for Life Laboratory, the national genomics infrastructure (NGI) and Uppmax for providing sequencing and computational infrastructure. Jacques Dainat and Francesco Vezzi, both associated with Science for Life Laboratory, provided useful suggestions about genome annotation and assembly, respectively. We would also like to thank the staff at SCIBLU, Lund, for performing the hybridization of samples to the array, Ulrich Knief for providing information on the position of centromeres in the zebra finch genome and three anonymous reviewers for comments on earlier versions of the manuscript. Samples were acquired with ethical permission from Malmö/Lund Committee for Animal Experiment Ethics (M27‐08, M94‐08, M1601‐11), the Swedish Environmental Protection Agency and the Swedish Ringing Centre for capturing birds. Samples from Anadyr were collected within the research program Beringia 2005 supported by the Swedish Polar Research Secretariat. Funding was provided by the Royal Physiographic Society (MLu), Lars Johan Hiertas memorial foundation (MLu), the Swedish Research Council (SB: 621‐2013‐4839, SÅ: 621‐2010‐5584), the Foundation for Baltic and East European Studies (MG, AW: A067‐08), a Feodor Lynen Fellowship from the Alexander von Humboldt Foundation (MLi) and the Centre of Animal Movement Research (CanMove) financed by a Linnaeus grant (349‐2007‐8690) from the Swedish Research Council and Lund University.

DATA DEPOSITION

Sequence data from this study have been deposited in the Sequence Read Archive (SRA) under BioProject PRJNA319295. The genome assembly has been deposited at DDBJ/ENA/GenBank under the accession LXGM00000000.

Contributor Information

Max Lundberg, Email: max.lundberg@biol.lu.se.

Staffan Bensch, Email: staffan.bensch@biol.lu.se.

LITERATURE CITED

  1. Ameur, A. , Enroth S., Johansson A., Zaboli G., Igl W., Johansson A. C., Rivas M. A., Daly M. J., Schmitz G., Hicks A. A., Meitinger T., Feuk L., van Duijn C., Oostra B., Pramstaller P. P., Rudan I., Wright A. F., Wilson J. F., Campbell H., and Gyllensten U.. 2012. Genetic adaptation of fatty‐acid metabolism: a human‐specific haplotype increasing the biosynthesis of long‐chain omega‐3 and omega‐6 fatty acids. Am. J. Hum. Genet. 90:809–820. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Avise, J. C. , Arnold J., Ball R. M., Bermingham E., Lamb T., Neigel J. E., et al 1987. Intraspecific phylogeography—the mitochondrial‐DNA bridge between population‐genetics and systematics. Ann. Rev. Ecol. Evol. S 18:489–522. [Google Scholar]
  3. Bensch, S. , Andersson T., and Åkesson S.. 1999. Morphological and molecular variation across a migratory divide in willow warblers Phylloscopus trochilus . Evolution 53:1925–1935. [DOI] [PubMed] [Google Scholar]
  4. Bensch, S. , Bengtsson G., and Åkesson S.. 2006. Patterns of stable isotope signatures in willow warbler Phylloscopus trochilus feathers collected in Africa. J. Avian Biol. 37:323–330. [Google Scholar]
  5. Bensch, S. , Grahn M., Müller N., Gay L., and Åkesson S.. 2009. Genetic, morphological, and feather isotope variation of migratory willow warblers show gradual divergence in a ring. Mol. Ecol. 18:3087–3096. [DOI] [PubMed] [Google Scholar]
  6. Berthold, P. 1990. Spatiotemporal programs and the genetics of migration. Experentia 46:363–371. [Google Scholar]
  7. Burri, R. , Nater A., Kawakami T., Mugal C. F., Olason P. I., Smeds L., Suh A., Dutoit L., Bures S., Garamszegi L. Z., Hogner S., Moreno J., Qvarnstrom A., Ruzic M., Saether S. A., Saetre G. P., Torok J., and Ellegren H.. 2015. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers . Genome Res. 25:1656–1665. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Caceres, A. , and Gonzalez J. R. 2015. Following the footprints of polymorphic inversions on SNP data: from detection to association tests. Nucleic Acids Res 43:e53. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Cheng, C. , White B. J., Kamdem C., Mockaitis K., Costantini C., Hahn M. W., and Besansky N. J.. 2012. Ecological genomics of Anopheles gambiae along a latitudinal cline: a population‐resequencing approach. Genetics 190:1417–1432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Cingolani, P. , Platts A., Wang le L., Coon M., Nguyen T., Wang L., Land S. J., Lu X., and Ruden D. M.. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso‐2; iso‐3. Fly (Austin) 6:80–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Crawford, J. E. , and Nielsen R.. 2013. Detecting adaptive trait loci in nonmodel systems: divergence or admixture mapping? Mol. Ecol. 22:6131–6148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Cruickshank, T. E. , and Hahn M. W.. 2014. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Mol. Ecol. 23:3133–3157. [DOI] [PubMed] [Google Scholar]
  13. Danecek, P. , Auton A., Abecasis G., Albers C. A., Banks E., DePristo M. A., Handsaker R. E., Lunter G., Marth G. T., Sherry S. T., McVean G., Durbin R., and Genomes Project Analysis G.. 2011. The variant call format and VCFtools. Bioinformatics 27:2156–2158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Delmore, K. E. , Toews D. P., Germain R. R., Owens G. L., and Irwin D. E.. 2016. The genetics of seasonal migration and plumage color. Curr. Biol. 26:2167–2173. [DOI] [PubMed] [Google Scholar]
  15. Derryberry, E. P. , Derryberry G. E., Maley J. M., and Brumfield R. T.. 2014. HZAR: hybrid zone analysis using an R software package. Mol. Ecol. Resour. 14:652–663. [DOI] [PubMed] [Google Scholar]
  16. Ellegren, H. , Smeds L., Burri R., Olason P. I., Backstrom N., Kawakami T., Kunstner A., Makinen H., Nadachowska-Brzyska K., Qvarnstrom A., Uebbing S., and Wolf J. B.. 2012. The genomic landscape of species divergence in Ficedula flycatchers . Nature 491:756–760. [DOI] [PubMed] [Google Scholar]
  17. Garrison, E. , and Marth G.. 2012. Haplotype‐based variant detection from short‐read sequencing. arXiv 1207–3907. [Google Scholar]
  18. Gnerre, S. , Maccallum I., Przybylski D., Ribeiro F. J., Burton J. N., Walker B. J., Sharpe T., Hall G., Shea T. P., Sykes S., Berlin A. M., Aird D., Costello M., Daza R., Williams L., Nicol R., Gnirke A., Nusbaum C., Lander E. S., and Jaffe D. B.. 2011. High‐quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc. Natl. Acad. Sci. USA 108:1513–1518. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Goudet, J. 2005. HIERFSTAT, a package for R to compute and test hierarchical F‐statistics. Mol. Ecol. Notes 5:184–186. [Google Scholar]
  20. Grabherr, M. G. , Russell P., Meyer M., Mauceli E., Alfoldi J., Di Palma F., and Lindblad‐Toh K.. 2010. Genome‐wide synteny through highly sensitive sequence alignment: Satsuma. Bioinformatics 26:1145–1151. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Gross, D. N. , van den Heuvel A. P., and Birnbaum M. J.. 2008. The role of FoxO in the regulation of metabolism. Oncogene 27:2320–2336. [DOI] [PubMed] [Google Scholar]
  22. Guerrero, R. F. , Rousset F., and Kirkpatrick M. 2012. Coalescent patterns for chromosomal inversions in divergent populations. Philos. Trans. R Soc. Lond. B Biol. Sci. 367:430–438. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Helbig, A. J. 1996. Genetic basis, mode of inheritance and evolutionary changes of migratory directions in palearctic warblers (Aves: Sylviidae). J. Exp. Biol. 199:49–55. [DOI] [PubMed] [Google Scholar]
  24. Hoffmann, A. A. , and Rieseberg L. H.. 2008. Revisiting the impact of inversions in evolution: from population genetic markers to drivers of adaptive shifts and speciation? Annu. Rev. Ecol. Syst. 39:21–42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Holt, C. , and Yandell M.. 2011. MAKER2: an annotation pipeline and genome‐database management tool for second‐generation genome projects. BMC Bioinformatics 12:491. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Irwin, D. E. , Alcaide M., Delmore K. E., Irwin J. H., and Owens G. L.. 2016. Recurrent selection explains parallel evolution of genomic regions of high relative but low absolute differentiation in a ring species. Mol. Ecol. 25:4488–4507. [DOI] [PubMed] [Google Scholar]
  27. Kapun, M. , Fabian D. K., Goudet J., and Flatt T.. 2016. Genomic evidence for adaptive inversion clines in Drosophila melanogaster . Mol. Biol. Evol 33:1317–1336 [DOI] [PubMed] [Google Scholar]
  28. Kirkpatrick, M. , and Barton N.. 2006. Chromosome inversions, local adaptation and speciation. Genetics 173:419–434. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Kirubakaran, T. G. , Grove H., Kent M. P., Sandve S. R., Baranski M., Nome T., M. C. De Rosa, B. Righino, T. Johansen, H. Ottera, A. Sonesson, S. Lien and O. Andersen. 2016. Two adjacent inversions maintain genomic differentiation between migratory and stationary ecotypes of Atlantic cod. Mol. Ecol 25:2130–2143. [DOI] [PubMed] [Google Scholar]
  30. Knief, U. , Hemmrich‐Stanisak G., Wittig M., Franke A., Griffith S. C., Kempenaers B., and Forstmeier W., 2016. Fitness consequences of polymorphic inversions in the zebra finch genome. Genome Biol. 17:199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Lamichhaney, S. , Fan G., Widemo F., Gunnarsson U., Thalmann D. S., Hoeppner M. P., Kerje S., Gustafson U., Shi C., Zhang H., Chen W., Liang X., Huang L., Wang J., Liang E., Wu Q., Lee S. M., Xu X., Hoglund J., Liu X., and Andersson L.. 2016. Structural genomic changes underlie alternative reproductive strategies in the ruff (Philomachus pugnax). Nat. Genet. 48:84–88. [DOI] [PubMed] [Google Scholar]
  32. Larson, K. W. , Liedvogel M., Addison B., Kleven O., Laskemoen T., Lifjeld J. T., M. Lundberg, S. Akesson, and S. Bensch, 2014. Allelic variation in a willow warbler genomic region is associated with climate clines. PLoS One 9:e95252. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Li, H. 2011. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–2993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Li, H. , and Durbin R.. 2009. Fast and accurate short read alignment with Burrows‐Wheeler transform. Bioinformatics 25:1754–1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Liedvogel, M. , Åkesson S., and Bensch S.. 2011. The genetics of migration on the move. Trends Ecol. Evol. 26:561–569. [DOI] [PubMed] [Google Scholar]
  36. Liedvogel, M. , Larson K. W., Lundberg M., Gursoy A., Wassenaar L. I., Hobson K. A., and Åkesson S.. 2014. No evidence for assortative mating within a willow warbler migratory divide. Front Zool. 11:52. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Lucas Lledo, J. I. , and Caceres M.. 2013. On the power and the systematic biases of the detection of chromosomal inversions by paired‐end genome sequencing. PLoS One 8:e61292. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Lundberg, M. , Åkesson S., and Bensch S.. 2011. Characterisation of a divergent chromosome region in the willow warbler Phylloscopus trochilus using avian genomic resources. J. Evol. Biol. 24:1241–1253. [DOI] [PubMed] [Google Scholar]
  39. Lundberg, M. , Boss J., Canbäck B., Liedvogel M., Larson K. W., Grahn M., Åkesson S., Bensch, S. and Wright, A. 2013. Characterisation of a transcriptome to find sequence differences between two differentially migrating subspecies of the willow warbler Phylloscopus trochilus . BMC Genomics 14:330. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Mueller, J. C. , Pulido F., and Kempenaers B.. 2011. Identification of a gene associated with avian migratory behaviour. Proc. R Soc. B Biol. Sci. 278:2848–2856. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Patterson, N. , Price A. L., and Reich D.. 2006. Population structure and eigenanalysis. PLoS Genet. 2:e190. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Perdeck, A. C. 1958. Two types of orientation in migrating starlings, Sturnus vulgaris L., and chaffinches, Fringilla coelebs, L. as revealed by displacement experiments. Ardea 46:1–37. [Google Scholar]
  43. Piersma, T., Perez‐Tris J., Mouritsen H., Bauchinger U., and Bairlein F. 2005. Is there a "migratory syndrome" common to all migrant birds? Ann. NY Acad. Sci. 1046:282–293.
  44. Poelstra, J. W., Vijay N., Bossu C. M., Lantz H., Ryll B., Muller I., Baglione V., Unneberg P., Wikelski M., Grabherr M. G., and Wolf J. B. 2014. The genomic landscape underlying phenotypic integrity in the face of gene flow in crows. Science 344:1410–1414.
  45. Price, E. R. 2010. Dietary lipid composition and avian migratory flight performance: development of a theoretical framework for avian fat storage. Comp. Biochem. Physiol. A Mol. Integr. Physiol. 157:297–309.
  46. Price, T., and Bouvier M. M. 2002. The evolution of F1 postzygotic incompatibilities in birds. Evolution 56:2083–2089.
  47. Pulido, F. 2007. The genetics and evolution of avian migration. Bioscience 57:165–174.
  48. Quinlan, A. R., and Hall I. M. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842.
  49. Qvarnström, A., and Bailey R. I. 2009. Speciation through evolution of sex‐linked genes. Heredity 102:4–15.
  50. R Development Core Team. 2014. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
  51. Rausch, T., Zichner T., Schlattl A., Stutz A. M., Benes V., and Korbel J. O. 2012. DELLY: structural variant discovery by integrated paired‐end and split‐read analysis. Bioinformatics 28:I333–I339.
  52. Rolland, J., Jiguet F., Jonsson K. A., Condamine F. L., and Morlon H. 2014. Settling down of seasonal migrants promotes bird diversification. Proc. Biol. Sci. 281:20140473.
  53. Ruegg, K., Anderson E. C., Boone J., Pouls J., and Smith T. B. 2014. A role for migration‐linked genes and genomic islands in divergence of a songbird. Mol. Ecol.
  54. Sambrook, J., Fritsch E. F., and Maniatis T. 1989. Molecular cloning, a laboratory manual. Vol. 2. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
  55. Stanke, M., Schoffmann O., Morgenstern B., and Waack S. 2006. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7:62.
  56. Thomas, J. W., Caceres M., Lowman J. J., Morehouse C. B., Short M. E., Baldwin E. L., Maney D. L., and Martin C. L. 2008. The chromosomal polymorphism linked to variation in social behavior in the white‐throated sparrow (Zonotrichia albicollis) is a complex rearrangement and suppressor of recombination. Genetics 179:1455–1468.
  57. von Rönn, J. A., Shafer A. B., and Wolf J. B. 2016. Disruptive selection without genome‐wide evolution across a migratory divide. Mol. Ecol. 25:2529–2541.
  58. Wang, M. S., Li Y., Peng M. S., Zhong L., Wang Z. J., Li Q. Y., Tu X. L., Dong Y., Zhu C. L., Wang L., Yang M. M., Wu S. F., Miao Y. W., Liu J. P., Irwin D. M., Wang W., Wu D. D., and Zhang Y. P. 2015. Genomic analyses reveal potential independent adaptation to high altitude in Tibetan chickens. Mol. Biol. Evol. 32:1880–1889.
  59. Weir, B. S., and Cockerham C. C. 1984. Estimating F‐statistics for the analysis of population‐structure. Evolution 38:1358–1370.
  60. Zhang, W., Fan Z., Han E., Hou R., Zhang L., Galaverni M., Huang J., Liu H., Silva P., Li P., Pollinger J. P., Du L., Zhang X., Yue B., Wayne R. K., and Zhang Z. 2014. Hypoxia adaptations in the Grey Wolf (Canis lupus chanco) from Qinghai‐Tibet Plateau. PLoS Genet. 10:e1004466.
  61. Zink, R. M., and Barrowclough G. F. 2008. Mitochondrial DNA under siege in avian phylogeography. Mol. Ecol. 17:2107–2121.

Associated Data


Supplementary Materials

Fig. S1. Order and orientation of scaffolds included in each of the three differentiated regions based on their synteny to the zebra finch, collared flycatcher and chicken genomes.

Fig. S2. Correlation between the frequency of northern haplotype in each of the three differentiated chromosome regions and the mean of phenotypic traits (nitrogen isotope ratios, size and color) per sampling site.

Fig. S3. Geographic clines for the northern haplotype frequency across the migratory divides in central Sweden (differentiated regions of chromosomes 1, 3 and 5) and Poland/Lithuania (chromosomes 1 and 5).

Fig. S4. Frequency distribution of the combined genotypes at the differentiated regions of chromosome 1 ‐ chromosome 5 in allopatric populations of the migratory phenotypes (A) and in the two hybrid zones (B).

Fig. S5. Genetic variation within each of the three differentiated chromosome regions compared to the rest of the genome.

Table S1. Samples used for whole‐genome resequencing and RAD sequencing, respectively. Average coverage refers to coverage at positions in the genome that are covered by at least one read from the sample and was calculated after removal of sequence duplicates.

Table S2. Sampling sites of willow warblers used for the SNP array. All samples were collected from birds caught on breeding territories.

Table S3. Frequency of the northern haplotype for each differentiated chromosome region and the mean of phenotypic measurements per sampling site.

Table S4. Highly differentiated missense mutations and in‐frame deletions detected in the three divergent chromosome regions and all highly differentiated SNPs and indels located on scaffolds outside of the divergent regions.

Table S5. Results of correlation analyses between mean phenotypic trait estimates and differentiated chromosome region northern haplotype frequency per sampling site.

Table S6. Estimated parameters for geographical clines of northern haplotypes in each of the three differentiated chromosome regions.

Table S7. Statistical significance from testing differences in the mean 10 kb window estimates of genetic variation (nucleotide diversity, Tajima's D, dXY and dA) between each differentiated chromosome region and the rest of the genome.

Table S8. Annotation of the 10 most differentiated filtered 10 kb windows and array SNPs in each of the divergent chromosome regions. Up‐ and downstream refer to the orientation relative to the transcription of the gene.

Table S9. Primers used for long‐range PCR across the ends of the differentiated regions on chromosomes 1 and 3 and their adjacent scaffolds determined from whole‐genome alignments to other bird genomes (Fig. S1).


Articles from Evolution Letters are provided here courtesy of Oxford University Press
