Abstract
Urbanization results in pervasive habitat fragmentation and reduces standing genetic variation through bottlenecks and drift. Loss of genomewide variation may ultimately reduce the evolutionary potential of animal populations experiencing rapidly changing conditions. In this study, we examined genomewide variation among 23 white‐footed mouse (Peromyscus leucopus) populations sampled along an urbanization gradient in the New York City metropolitan area. Genomewide variation was estimated as a proxy for evolutionary potential using more than 10 000 single nucleotide polymorphism (SNP) markers generated by ddRAD‐Seq. We found that genomewide variation is inversely related to urbanization as measured by percent impervious surface cover, and to a lesser extent, human population density. We also report that urbanization results in enhanced genomewide differentiation between populations in cities. There was no pattern of isolation by distance among these populations, but an isolation by resistance model based on impervious surface significantly explained patterns of genetic differentiation. Isolation by environment modeling also indicated that urban populations deviate much more strongly from global allele frequencies than suburban or rural populations. This study is the first to examine loss of genomewide SNP variation along an urban‐to‐rural gradient and quantify urbanization as a driver of population genomic patterns.
Keywords: ddRADSeq, landscape genetics, Peromyscus, population genomics, rapid evolution, single nucleotide polymorphism, urbanization
Introduction
Humans exert an outsized influence on ecosystems (Vitousek et al. 1997) in the Anthropocene. This era began sometime between the late Pleistocene (Ellis et al. 2013; Ruddiman et al. 2015) and industrialization in the last few centuries (Steffen et al. 2007), but is always characterized by a global increase in human influence on biological and geochemical processes. Rapid urbanization is a key characteristic of the contemporary Anthropocene. The proportion of humans in cities increased from 16% to 50% in the last century and is projected to reach 70% by 2050 (Heilig 2011). Urban land conversion may occur at an even faster rate than population growth, thus resulting in accelerating encroachment of cities on reservoirs of biodiversity (Seto et al. 2012). Urban areas become ecologically homogeneous (Groffman et al. 2014) due to loss of vulnerable species that in turn enhances the probability of abrupt state shifts (Barnosky et al. 2012).
Most species are ‘urban avoiders’ that do not persist after urbanization, but ‘urban adapters’ and ‘exploiters’ are facultative or obligate users, respectively, of human‐dominated habitats (Blair 2001; McKinney 2002). Although Blair's (2001) conception of urban adapters did not explicitly include evolution after urbanization, local adaptation may have enhanced the ability of urban adapters to exploit human subsidies. Urban habitats are ‘novel ecosystems’ composed of unique ecological communities and processes (Hobbs et al. 2006, 2009), and thus, many species likely face new, strong selection pressures from urbanization. Humans have undoubtedly altered the evolutionary trajectory of crop, pest, and disease species (Palumbi 2001), but conclusive cases of human‐driven evolution in organisms that are not human commensals or pathogens have been difficult to identify (Merilä and Hendry 2014) outside of a few recent examples (Donihue and Lambert 2014).
Urbanization results in severe habitat fragmentation (Zipperer et al. 2012), a process that reduces genetic variation among animal populations (Keyghobadi 2007; Rivera‐Ortíz et al. 2014). Loss of variation may in turn reduce the evolutionary potential of populations experiencing rapidly changing ecological conditions (Etterson and Shaw 2001; Hoffmann and Sgrò 2011; Oakley 2013). Definitive evidence of adaptive evolution from standing variation in wild populations is still relatively scarce (Colosimo et al. 2005; Renaut et al. 2011; Domingues et al. 2012), but new user‐friendly approaches to generating large population genomic datasets have improved prospects for quantifying and analyzing genetic variation in the wild (Narum et al. 2013). Evolution can proceed at the nucleotide level through new mutations or changes in frequency of standing genetic variants (Orr 2005), but it is likely that standing variation is more important in cases of rapid evolution (Barrett and Schluter 2008). Results from laboratory experiments (Teotónio et al. 2009; Burke et al. 2010), ancestral variation in model organisms (Rockman 2008; Scarcelli and Kover 2009), humans (Pritchard et al. 2010), artificial selection in crops (Gibson and Dworkin 2004), viral pathogens (Pennings 2012), and invasive species (Prentis et al. 2008) all support this association.
Predicting the loci, genetic architecture, and additive effects involved in adaptive responses to urbanization will be difficult in most cases, but estimating genomewide variation as a general proxy for evolutionary potential is a useful alternative. Measures of allelic diversity and heterozygosity from genomewide markers have the advantages of accounting for traits influenced by many loci of small effect, and for predicting how much variation will be available for future adaptive responses to unknown selection pressures (Harrisson et al. 2014). One criticism of using genomewide single nucleotide polymorphism (SNP) datasets to estimate evolutionary potential is that many loci are not associated with functional genomic regions. However, average allele frequency divergence predicts the most extreme F ST outliers, and the geographic structure of neutral and selected alleles is nearly identical, in humans (Coop et al. 2009). Genomewide SNPs have also been successfully used to predict phenotypic improvement through ‘genomic selection’ methods in artificially selected species such as livestock (Meuwissen et al. 2013). The increased accessibility of genomewide markers for nonmodel organisms thus provides many opportunities for measuring and predicting evolutionary potential in an urbanizing world (Harrisson et al. 2014). Here, we examine genomewide variation in white‐footed mouse populations sampled along an urban‐to‐rural gradient and robustly quantify urbanization as a driver of population genomic patterns.
Previous microsatellite‐based analyses on this system showed substantial genetic structure between white‐footed mouse populations in NYC's forest fragments (Munshi‐South and Kharchenko 2010). Park area, age, or the extent of habitat within parks did not explain levels of genetic variation among these populations (Munshi‐South and Nagy 2014), but these studies focused exclusively on forest fragments within NYC that were highly isolated by surrounding urbanization. In a separate analysis, we identified SNPs from transcriptomes sequenced from urban and rural populations, and found that population structure was greater among the urban than the rural populations (Harris et al. 2015aa). This analysis was limited to only six sampling sites, however, and SNPs from coding regions may often deviate from neutral expectations. In this study, we expand these investigations using large, genomewide SNP datasets, and include populations sampled along an urban‐to‐rural gradient spanning 142 km from the urban core to extensive, rural protected areas. We specifically used a double‐digest RADseq protocol (Peterson et al. 2012) to generate over 10 000 SNPs for analyzing the population genomics of white‐footed mice.
Land use transformation and anthropogenic barriers may ultimately reduce gene flow between populations (Epps et al. 2007; Balkenhol and Waits 2009; Jha 2015), leading to greater genetic structuring and loss of genomewide variation in urbanized areas. Several statistical approaches have recently been developed to investigate the influence of landscapes on genetic structure, although they vary in ability to distinguish between isolation by distance (IBD) and ecological factors. Isolation‐by‐resistance (IBR) models examine the statistical association between genetic and ‘resistance’ distances, where the latter represent probabilities that individuals disperse between populations given the landscape ‘friction’ to dispersal (McRae 2006). The friction values are inferred from empirical movement data or optimized using model selection approaches. One drawback of IBR is that resistance distances are calculated across all paths that individuals may take between populations, and thus are not truly independent from IBD. Isolation‐by‐environment (IBE) processes, in contrast, are characterized by positive correlations between environmental differences around sites and genetic distances that are independent of the effects of geographic distance (Wang and Bradburd 2014). Both IBR and IBE patterns are influenced by the same biological processes that limit migration across landscapes (such as costs of movement or selection against dispersing genotypes), but IBE patterns are defined by their independence from IBD. IBR and IBE also differ in that the former is focused primarily on the influence of the landscape between sites or individuals, whereas the latter is concerned with the influence of the environment in and around sites.
We previously reported that gene flow between populations of white‐footed mice could be explained by IBR models based on patterns of urban vegetation cover in NYC (Munshi‐South 2012). Here, we investigate IBR in a much broader range of landscape conditions along an urban‐to‐rural gradient. In our earlier work, we used partial Mantel tests and a causal modeling approach to identify ecological distances between populations (based on vegetation cover) that best explained gene flow after factoring out IBD (Cushman and Landguth 2010). While this approach can be successful for landscapes composed of high‐contrast cover types (such as NYC), several authors have argued that partial Mantel tests have low statistical power and are prone to false positives (Legendre and Fortin 2010; Graves et al. 2013). Here, we use a partial Mantel test for our IBR model, as well as a new statistical IBE approach to quantify the relative contributions of urbanization and IBD to genetic differentiation between white‐footed mouse populations (Bradburd et al. 2013). Specifically, we model the strength of covariance in allele frequencies between populations as a function of IBD and IBE due to urbanization of the landscape.
In this study, we test the following interrelated predictions about the population genomics of white‐footed mice (Peromyscus leucopus Rafinesque) in the New York City (NYC) metropolitan area:
Genomewide variation within populations is inversely related to urbanization of the surrounding landscape.
Urbanization of the landscape results in greater genomic structure and differentiation between populations.
‘Resistance distances’ and ‘ecological distances’ resulting from urbanization are better predictors of genetic differentiation than geographic distances between populations.
Methods
Study species and sampling sites
White‐footed mice are one of the most widespread and abundant small mammals in eastern North America, and occupy a broad range of forest, meadow, and secondary growth habitats. They have served as model systems for population ecology for decades (Vessey and Vessey 2007; Brunner et al. 2013) because of their ubiquity and easy trappability. Peromyscus spp. more broadly have emerged as model systems for the genomics of adaptation (Bedford and Hoekstra 2015), and the first reference genomes are currently being assembled (Kenney‐Hunt et al. 2014).
For this study, we sampled white‐footed mice from 23 sites in the NYC metropolitan area (Table 1). The 12 sites within NYC limits were the same as in previous studies, although here we combined samples from Pelham Bay and Hunters Island in the Bronx, and Alley Pond and Cunningham Parks in Queens, because evolutionary clustering results showed that these pairs of sites were not strongly differentiated from each other (Munshi‐South and Kharchenko 2010). These urban sites contained secondary forest typically dominated by oaks, hickories, maples, and/or tulip trees, with very thick understories composed primarily of invasive plants. Suburban and rural sites contained similar canopies, but the understories were largely cleared of thick vegetation by rampant deer herbivory.
Table 1.
Characteristics of numbers of white‐footed mice sampled at 23 locations along an urban‐to‐rural gradient
| Site | Code | Type | N | Latitude | Longitude | Human population size | % Impervious surface |
|---|---|---|---|---|---|---|---|
| Alley Pond/Cunningham | AP | City | 13 | 40.747605 | −73.742505 | 1298.3, 59880.1 | 21.2, 50.8 |
| Central Park | CP | City | 9 | 40.798231 | −73.956198 | 15908.3, 351698.8 | 48.6, 60.2 |
| Flushing Meadows | FM | City | 10 | 40.721442 | −73.830805 | 1577, 125893.1 | 32.3, 58 |
| Forest Park | FP | City | 6 | 40.703356 | −73.850889 | 268.3, 123722 | 6.8, 61.3 |
| Fort Tilden | FT | City | 4 | 40.561031 | −73.887552 | 17.2, 2357.5 | 5.5, 8.5 |
| Inwood Hill Park | IP | City | 6 | 40.873295 | −73.925005 | 1073.6, 121354.2 | 7.5, 30 |
| Jamaica Bay | JB | City | 5 | 40.623063 | −73.824582 | 0, 1438.4 | 3.6, 3.2 |
| Kissena Park | KP | City | 9 | 40.746547 | −73.811192 | 2039.9, 107273.1 | 28.5, 62.3 |
| NY Botanical Garden | NYBG | City | 11 | 40.871613 | −73.874079 | 6875, 256359.1 | 44.4, 60.9 |
| Pelham Bay | PB | City | 11 | 40.879895 | −73.804063 | 1.8, 3508.1 | 0, 9.5 |
| Ridgewood Reservoir | RR | City | 10 | 40.687347 | −73.88711 | 2725.8, 143223.9 | 21.7, 66.9 |
| Van Cortlandt Park | VC | City | 6 | 40.902086 | −73.882341 | 0, 77541.7 | 10, 27.7 |
| Pleasant Valley | CPV | Suburb | 7 | 41.707149 | −73.796406 | 45.7, 720 | 0.5, 1 |
| Louis Calder Center | LCC | Suburb | 13 | 41.128624 | −73.73042 | 235.8, 3118.6 | 3.1, 10.6 |
| Mianus River Gorge | MRG | Suburb | 6 | 41.185234 | −73.622813 | 76.4, 1681.4 | 0.3, 0.6 |
| Saxon Woods Park | SW | Suburb | 7 | 40.988344 | −73.753852 | 45.7, 8164.7 | 0.6, 17.6 |
| C. Fahnestock St. Park | CFP | Rural | 9 | 41.468662 | −73.843355 | 1.9, 56 | 0.1, 0.1 |
| Cary Institute | CIE | Rural | 7 | 41.784468 | −73.735476 | 5.6, 263.8 | 0.5, 1.3 |
| Highpoint State Park | HIP | Rural | 8 | 41.262754 | −74.703121 | 4.2, 157.5 | 0.8, 0.4 |
| Harriman State Park | HP | Rural | 9 | 41.282946 | −74.068317 | 0, 2.8 | 0.1, 0.2 |
| Cornwall, CT | MH | Rural | 8 | 41.787584 | −73.385292 | 5.7, 183.5 | 2.2, 0.9 |
| Minnewaska Reserve | MR | Rural | 8 | 41.727843 | −74.260149 | 0, 9 | 0, 0.1 |
| Wildwood State Park | WW | Rural | 9 | 40.939296 | −72.828573 | 13.1, 2080 | 0.9, 3.5 |
Human population size and percent impervious surface were measured at two buffer sizes around these sites: 500 m and 2.0 km.
At each site, we trapped white‐footed mice over a period of one to three nights using two or more 7 x 7 grids of Sherman live traps (9″×3″×3″). Traps within grids were placed 15 m apart, and grids were located several hundred meters apart to avoid trapping close relatives. We collected ear punches or tail clippings from up to 25 mice at each site, and stored the tissue in 80% EtOH before transfer to a ‐20 C freezer in the laboratory. Animal handling procedures were approved by Fordham University's Institutional Animal Care and Use Committee (Protocol No. JMS‐13‐03).
We chose sites that qualitatively represented typical urban (n = 12 sites), suburban (n = 4), and rural (n = 7) sites in the NYC metropolitan area based on the levels of development (Fig. 1). However, suburban counties adjacent to NYC have higher human population densities than major cities in other parts of North America. To facilitate comparisons with other urban areas, we calculated the size of the human population and percent impervious surface in geographic buffers around each site as proxies for relative urbanization (Fig. 1B).
Figure 1.

(A) Geographic locations of the 23 sites at which we sampled white‐footed mice for this study. The map contains land cover categories and impervious surface levels from the 2011 National Land Cover Database at 30 m resolution. Red to purple colors represent increasing percentages of impervious surface cover. Site abbreviations correspond to Table 1. (B) Scatterplot of percent impervious surface cover vs. human population size for the 23 sampling sites. Both variables were log‐transformed to improve linearity, and were measured using 2‐km buffers around study sites as described in the text. Red dots represent urban sites, green dots represent suburban sites, and black dots represent rural sites.
We first incorporated GPS coordinates for our study sites into base layers (2010 TIGER/Line® Shapefiles—County or States and equivalent) available from the US Census Bureau. Buffers were created in ArcGIS v10.1 (ESRI, Redlands, CA, USA) around each site's GPS coordinates with radii of 500 m, 1000 m, 1500 m, and 2000 m. These buffers were chosen because they are relevant to the typical lifetime dispersal distance of many white‐footed mice; for example, no mice over a 40‐year study of one woodlot dispersed to the nearest neighboring woodlot <1.5 km away (Vessey and Vessey 2007). To determine human population size inside each buffer, we used US Census Blocks as these provide the smallest geographic unit with 100% census data. We first calculated the area of each census block and then intersected the census block and buffer layers. We interpolated human population size within each buffer based on the percentage of area for each census block that intercepted each buffer ([area of census block within buffer/area of census block] × population of census block). We then summed all of the interpolated population sizes that fell within each buffer.
To determine percent impervious surface within each study site buffer, we used the 2011 Percent Developed Imperviousness layer from USGS National Land Cover Data (Xian et al. 2011). We then calculated zonal statistics to measure the geometry of the raster file and summarize the cell values of the raster that fell within each buffer. From these data, we calculated the average percent impervious surface within each landscape buffer for each site.
These estimates of impervious surface and human population size could be highly correlated at the different spatial buffers, so we calculated Pearson correlation coefficients between all pairs of values (Table S1). To avoid including redundant information in downstream analyses, we removed any variables that exhibited r > 0.90 in pairwise comparisons with the other variables. The 1‐km and 1.5‐km buffers were removed because they were highly correlated with other estimates, and percent impervious surface and human population size estimated for the 500‐m and 2‐km buffers were retained.
ddRADSeq, SNP genotyping, and population genomic statistics
We generated SNP genotypes for 233 individuals using a double‐digest RADseq protocol (Peterson et al. 2012). We aimed to obtain genotypes from 10 individuals from each site, but sample dropout due to poor DNA yield during library preparation resulted in variability in the sample size for each site (Table 1). In brief, we extracted DNA from tail snips or ear punches using Qiagen DNEasy tissue kits (Qiagen, Valencia, CA, USA) with an RNAse treatment, and then digested 500–1000 ng of DNA using the enzyme combination of SphI‐HF and MluCI for 1 h following the manufacturer's instructions (New England Biolabs, Ipswich, MA, USA). A Qubit 2.0 fluorometer (Life Technologies, Norwalk, CT, USA) was used to quantify DNA concentration at several steps in the library preparation. Next, digested DNA was cleaned with 1.5X AMPure XP magnetic beads (Beckman‐Coulter, Brea, CA, USA) and custom in‐line ‘flex’ barcodes and P1/P2 adapters were ligated to 200–400 ng of digested DNA for each sample (Peterson et al. 2012). Up to 48 individual libraries with unique barcodes were then pooled in equimolar amounts and cleaned with 1.5X AMPure XP beads. Next we selected DNA fragments of known sizes (376–412 bp) from the pooled libraries using a Pippin Prep (Sage Science, Beverly, MA, USA). We then conducted multiple PCR amplifications using 20 ng of size‐selected DNA and Phusion High‐fidelity PCR reagents with manufacturers’ PCR conditions (New England Biolabs). This PCR step added a second, unique index sequence and Illumina sequencing primers to the pooled libraries so each individual sample contained a unique combination of the in‐line barcode and index. We then pooled the PCRs for each size‐selected library, cleaned the pools using 1.5X AMPure XP beads, and checked the libraries using an Agilent BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA) for DNA concentration and the correct distribution of fragment size. The libraries were sequenced using three lanes of Illumina HiSeq 2000 2 × 100 bp paired‐end sequencing at the New York University Center for Genomics and Systems Biology (New York, NY, USA).
As an initial check on the quality of our Illumina sequence data, we analyzed the raw reads in FastQC (Andrews 2010). Subsequent demultiplexing, quality filtering, and de novo SNP calling were conducted using the Stacks 1.21 pipeline (Catchen et al. 2013). First, we used the process_radtags script to filter out low‐quality reads and demultiplex the remaining reads according to their unique combination of in‐line barcode and index. Based on FastQC results, we trimmed all reads to 96 bp to remove poor‐quality base calls at the ends of reads. Next, we concatenated the single‐ and paired‐end reads for each individual into one fastq file because the two paired reads did not overlap and were not aligned to a reference genome. To identify RAD loci and call SNPs, we used the denovo_map.pl script in Stacks with default settings except for the minimum number of identical reads required to create a ‘stack’ (m = 7), number of mismatches allowed between RAD loci for a single individual (M = 3), and the number of mismatches allowed when building the catalog (n = 2). After building the initial catalog of loci, we used the populations script in Stacks to filter loci for those that occurred in at least 22/23 sampling sites (P = 22) and in at least 50% of individuals at each site (r = 0.5) to a minimum coverage of 7X (m = 7). These parameter values imply that each base pair position was sequenced at least 668X across the entire dataset. Stacks calculates the likelihood of the two most common genotypes at each site given a maximum‐likelihood estimate of the sequencing error rate, and then uses a likelihood ratio test to identify the most likely genotype at that position. Full details of the models underlying Stacks are given in Catchen et al. (2013). The populations script also produced genotype output in multiple formats (i.e., Genepop, Structure) with one randomly selected SNP from each locus (–write_random_SNP), and generated summary statistics such as observed heterozygosity, nucleotide diversity, and pairwise FST between all populations. These summary statistics were calculated for the sites that were polymorphic in at least one population (i.e., variant sites; Table 2), as well as for all the sites (Table S2). We used summary statistics for the variant sites in downstream analyses, although the two estimates were typically highly correlated (i.e., r < 0.99).
Table 2.
Summary genetic diversity statistics calculated by STACKS for nucleotide positions that were polymorphic in at least one population
| Site | Type | N | Sites | %Poly | Private | P | Hobs | π |
|---|---|---|---|---|---|---|---|---|
| AP | City | 10.4 | 14 859 | 0.388 | 154 | 0.944 | 0.080 | 0.089 |
| CP | City | 6.6 | 14 031 | 0.321 | 152 | 0.936 | 0.095 | 0.102 |
| FM | City | 7.7 | 14 812 | 0.229 | 90 | 0.953 | 0.061 | 0.071 |
| FP | City | 5.4 | 14 904 | 0.224 | 92 | 0.951 | 0.066 | 0.076 |
| FT | City | 3.7 | 14 880 | 0.235 | 78 | 0.948 | 0.082 | 0.086 |
| IP | City | 5.2 | 14 879 | 0.304 | 146 | 0.944 | 0.083 | 0.091 |
| JB | City | 4.3 | 14 750 | 0.230 | 54 | 0.952 | 0.075 | 0.079 |
| KP | City | 7.8 | 14 872 | 0.278 | 139 | 0.951 | 0.069 | 0.077 |
| NYBG | City | 7.9 | 12 657 | 0.321 | 138 | 0.925 | 0.117 | 0.106 |
| PB | City | 9.1 | 14 788 | 0.414 | 269 | 0.899 | 0.164 | 0.143 |
| RR | City | 7.7 | 14 648 | 0.209 | 88 | 0.957 | 0.054 | 0.066 |
| VC | City | 5.5 | 14 909 | 0.350 | 159 | 0.928 | 0.115 | 0.116 |
| CPV | Suburb | 6.0 | 14 712 | 0.383 | 273 | 0.903 | 0.164 | 0.142 |
| LCC | Suburb | 10.3 | 14 670 | 0.450 | 431 | 0.904 | 0.158 | 0.138 |
| MRG | Suburb | 5.1 | 14 830 | 0.371 | 198 | 0.903 | 0.165 | 0.146 |
| SW | Suburb | 5.7 | 14 698 | 0.303 | 126 | 0.945 | 0.081 | 0.090 |
| CFP | Rural | 7.0 | 14 814 | 0.380 | 269 | 0.914 | 0.141 | 0.122 |
| CIE | Rural | 5.6 | 14 393 | 0.300 | 134 | 0.946 | 0.078 | 0.088 |
| HIP | Rural | 7.0 | 14 886 | 0.416 | 515 | 0.910 | 0.148 | 0.129 |
| HP | Rural | 7.1 | 14 797 | 0.452 | 516 | 0.915 | 0.137 | 0.135 |
| MH | Rural | 6.9 | 14 906 | 0.432 | 362 | 0.894 | 0.181 | 0.150 |
| MR | Rural | 6.3 | 14 856 | 0.295 | 381 | 0.947 | 0.074 | 0.085 |
| WW | Rural | 6.6 | 14 630 | 0.327 | 131 | 0.920 | 0.132 | 0.113 |
| Mean | 6.7 | 14 660 | 0.331 | 212.8 | 0.930 | 0.110 | 0.106 |
N = average number of individuals genotyped at each locus; Sites = number of polymorphic nucleotide sites across the dataset; %Poly = percentage of polymorphic loci; Private = number of variable sites unique to each population; P = average frequency of the major allele; H obs = average observed heterozygosity per locus; π = average nucleotide diversity.
We loaded the RAD loci and individual data from Stacks into a MySQL database and visualized the output using the Stacks webserver. Based on the results of the first pipeline run, we removed 26 individuals because their small number of reads resulted in very small SNP datasets and excessive missing genotypes compared to other samples. Highly related individuals in our dataset could also bias downstream analyses. We avoided relatives at our urban sites using relatedness values from previous microsatellite studies of the same individuals (Munshi‐South and Kharchenko 2010). For the SNP dataset, we identified highly related individuals by calculating kinship coefficients in the software package KING (Manichaikul et al. 2010). Kinship analysis identified 16 pairs of individuals that were related at the half‐sib level or greater. We removed one of the pair members from the dataset, resulting in a final dataset comprised of 191 of the original 233 individuals. We then reran the Stacks pipeline on the screened dataset of 191 individuals.
Besides the filters applied by Stacks and the removal of highly related individuals, we also omitted outlier loci detected using the approach in BayeScan 2.1 (Foll and Gaggiotti 2008). Operating with default parameters and a false discovery rate <0.01, we identified 200 outliers using BayeScan. Five of these outliers showed a signature of positive selection, whereas 195 showed signatures of balancing selection. We omitted these outliers from the dataset because they did not fit assumptions of neutrality inherent to our downstream analyses.
Modeling genomewide variation within populations
We modeled four measures of genomewide variation within populations against our two urbanization proxies to test the hypothesis that urbanization is associated with reduced genetic variation. Summary statistics for each population included observed heterozygosity (H O), nucleotide diversity (π), number of private loci, and the percent of polymorphic loci. We examined genetic (dependent) variables using seven candidate general linear models (GLMs) consisting of combinations of human population size and percent impervious surface estimated using different geographic buffers as described above (500 m and 2000 m): an intercept‐only model, four univariate models with one variable measured at one buffer size, and two bivariate models including both human population and impervious surface estimated for the same buffer size. Human population size and percent impervious surface were ln‐transformed prior to analysis to improve normality. If a model performed substantially better than the intercept‐only model, then we interpreted that result as evidence of a statistical effect of the urbanization variable(s) on genetic diversity within white‐footed mouse populations. We calculated maximum‐likelihood estimates of model parameters for each model, and then ranked models using values of AICc, the corrected Akaike's information criterion (Burnham and Anderson 2002). The relative quality of models was further assessed based on ΔAICc, and the relative weight (w i) of each model. For the best GLMs, we examined the statistical significance of the urbanization coefficients. We also examined scatterplots (Fig. 2) and fitted regression lines to confirm that the model explained variation in the genetic parameter of interest. We conducted all statistical analyses in R 3.2.1 (R Development Core Team 2008), and used GLM regression in the AICcmodavg package for model selection (Mazerolle 2015).
Figure 2.

Scatterplots and trend lines for best GLMs identified using AIC c modeling that describe the relationship between urbanization and (A) heterozygosity, (B) nucleotide diversity, (C) percent polymorphic loci, and (D) number of private alleles for the 23 populations. Red dots represent urban sites, green dots represent suburban sites, and black dots represent rural sites. Site abbreviations correspond to Table 1.
Genetic structure and population differentiation
To investigate whether urbanization results in greater genetic differentiation between populations, we used discriminant analysis of principal components (DAPC) in the R package adegenet (Jombart and Ahmed 2011) to identify evolutionary clusters among 191 white‐footed mice. DAPC first reduces total genetic variation (i.e., variance in allele frequencies) into principal components, and then identifies discriminant functions that maximize differences between clusters while minimizing variation within clusters. We used cross‐validation in adegenet to identify the optimal number of principal components. This procedure uses a randomly generated training set and validation set of individuals to identify the optimal number of principal components that accurately predict group membership without overfitting. To visualize clusters, we used DAPC scatterplots and barplots (compoplot command in adegenet) to visualize membership of individuals in different clusters. Genomewide SNP datasets often have power to discriminate between all groups, and results from analyses such as DAPC may reflect hierarchical structure. Thus, we ran DAPC on the full dataset as well as subsets of sampling sites to investigate hierarchical structure.
We also used the model‐based evolutionary clustering approaches in fastSTRUCTURE (Raj et al. 2014) and ADMIXTURE (Alexander et al. 2009). fastSTRUCTURE uses approximate inference of the Bayesian model in the original STRUCTURE (J. Pritchard et al. 2000) whereas ADMIXTURE computes maximum‐likelihood estimates of parameters to estimate the most likely number of evolutionary clusters, K. We ran fastSTRUCTURE on our data for each value of K from 1 to 23 using the standard model with a simple prior. A fastSTRUCTURE script (chooseK.py) also calculates heuristic scores for detecting the range of the most likely values of K. After identifying likely values of K, we reran fastSTRUCTURE with the more computationally demanding logistic prior model on a smaller subset of K values. The most likely values of K were determined in ADMIXTURE using its cross‐validation procedure (Alexander and Lange 2011). To compare clustering results from fastSTRUCTURE and ADMIXTURE, we used the CLUMPAK (Cluster Markov Packager Across K) web server (Kopelman et al. 2015) to align and visualize bar plots for both programs at multiple values of K.
Landscape genomics
To investigate IBD, IBR, and IBE between white‐footed mouse populations, we first tested for IBD using a Mantel test in the ecodist package in R. The data were bootstrapped 10 000 times to generate 95% confidence intervals for the Mantel P value. We then used the BEDASSLE package (Bradburd 2014) in R to estimate the relative contributions of IBD and IBE to genetic differentiation between the sampling sites. We computed allele counts and sample sizes for each population using the—counts function in VCFtools 0.1.12b (Danecek et al. 2011). We examined the following IBE models using BEDASSLE: (i) a simple binary matrix indicating whether the population was located in NYC or outside the city; and (ii) a pairwise matrix of ‘resistance distances’ calculated between populations using the IBR approach in Circuitscape 4.0 (McRae and Beier 2007).
Circuitscape calculates a pairwise matrix of resistance distances between populations based on the ability of a simulated electrical current to flow between adjacent landscape cells connected by resistors with user‐defined resistance values. We used the 2011 Percent Impervious Surface layer from USGS National Landcover Data (Xian et al. 2011), and set the resistance value for each 30‐m cell as equivalent to its percent impervious surface unless the cell exceeded 70% impervious surface. For cells exceeding 70%, we set the resistance level to 100; in other words, any cell with >70% impervious surface was assumed to be 100X more resistant to migration than a cell with 1% impervious surface. The 70% cutoff for relatively high, quasi‐barrier resistance was based on results of an earlier IBR analysis we conducted in NYC (Munshi‐South 2012). To calculate resistance distances between all populations, we ran Circuitscape in pairwise mode with raster cells connected to all eight neighboring cells. The analysis also produced a cumulative current map to visualize hypothesized migration between all populations. As an initial check on the success of the resistance distances at explaining variation in pairwise F ST, we conducted a partial Mantel test with 10 000 bootstraps in ecodist that factored out the effects of Euclidean geographic distance. We then used these resistance distances for the IBE model in BEDASSLE.
BEDASSLE uses a Markov Chain Monte Carlo (MCMC) approach and includes several graphing functions for evaluating success of the MCMC posterior parameter estimation. To confirm adequate mixing and convergence of chains, we examined traces and marginal distributions for all parameters. We also examined acceptance rates for MCMC parameter estimation, and when these rates were too high or low, we adjusted the tuning parameters and reran the analysis for 5–10 million steps. For the final runs, we calculated the median and 95% credible intervals for the α E: α D ratio after discarding the first 20% as burn‐in. This ratio represents the effect size of ecological distance relative to geographic distance.
The simple BEDASSLE model assumes identical variance of allele frequencies about the global mean allele frequency. However, populations deviate from the global mean for a number of demographic reasons (i.e., bottlenecks, inbreeding), and outlier populations can have a strong influence on posterior distributions. To account for this variation, BEDASSLE includes a beta‐binomial model that estimates an additional parameter, ΦK, that measures the strength of drift and lack of fit of each population to the model. To address population history, we also ran the beta‐binomial model for the two IBE scenarios above. For these models, we examined the posterior distributions of the ΦK values (recalculated as FK = 1/1 + ΦK) to identify outlier populations.
To test the relative fit of the models to our data, we generated 1000 posterior predictive samples for each model and compared the simulated data to our observed data in BEDASSLE. BEDASSLE includes a function that randomly draws parameter values from the posterior distributions and simulates new datasets. These simulated datasets are then used to calculate FST between all pairs of populations. Simulated FST values can then be compared to the observed FST values to examine the models’ ability to describe patterns in the real data.
Results
ddRADSeq dataset and summary population genomic statistics
Three lanes of Illumina HiSeq 2 × 100 PE sequencing produced a total of 1.16 billion reads, of which 936.64 million reads (80.9%) passed initial quality filters. Reads were filtered out due to ambiguous barcode sequences (56.3%), low‐quality scores (40.9%), and ambiguous RAD tags (2.8%). The catalog generated by STACKS included 880 898 RAD loci with at least 7X coverage, but this number was reduced to 14 930 after requiring loci to be present in 22/23 populations and at least 50% of individuals within each population. All populations were well‐represented in the final dataset, with averages of 4 to 10 individuals genotyped per locus across the 23 populations (median = 6.6; Table 2).
For all loci that were polymorphic in at least one population, the major allele frequency (P) ranged from 0.894 to 0.957, and the average observed heterozygosity from 0.054 to 0.181 (Table 2). If all invariant positions are included, then the major allele frequency exceeded 0.999 in all populations and the heterozygosity ranged only from 0.001 to 0.002 (Table S2). Nearly all of the populations located within NYC exhibited lower genetic variation than suburban or rural populations as measured by the percentage of polymorphic sites, numbers of private alleles, major allele frequency, observed heterozygosity, and nucleotide diversity (π). The only exceptions to these trends were relatively high diversity values for the Pelham Bay (PB) population in NYC, and low values for suburban Saxon Woods (SW), the rural Cary Institute (CIE) and Minnewaska Reserve (MR; Table 2). Excluding those four outliers, heterozygosity ranged from 0.054 to 0.117 in the urban populations and 0.158 to 0.181 in the suburban and rural populations. Nucleotide diversity similarly ranged from 0.066 to 0.116 in NYC and 0.138 to 0.15 in the suburban and rural populations.
Modeling urbanization and genomewide variation within populations
Sampling sites could be distinguished by their combinations of human population size and percent impervious surface (Fig. 1B). The rural sites all clustered at very low values for both urbanization variables. Urban and rural sites exhibited little overlap in the scatterplot, with the exception of Fort Tilden (FT), Jamaica Bay (JB), and Pelham Bay in NYC. Figure 1B shows the relationship between the two urbanization variables measured for a 2‐km buffer around sampling sites, but results were qualitatively similar for the 500‐m buffer.
Model selection based on AIC c confirmed that nearly all GLMs with one or more urbanization variables described variation in genomic diversity better than intercept‐only models, except for human population size estimated at a 500‐m buffer (Table 3). Impervious surface cover, estimated at the 2‐km buffer, was the highest ranked model for all four genetic variables, with the second best models all including both impervious surface cover and human population size at the 2‐km buffer. The Akaike weights for the best models were all equal to or >0.55, and the delta AICc for moving from the second to third model were all >3.0. Heterozygosity was negatively associated with percent impervious surface, with the exception of the outlier populations mentioned above (i.e., rural populations with low genetic diversity; Fig. 2A). Nucleotide diversity was also negatively associated with percent impervious surface, and exhibited the same outlier populations as the heterozygosity scatterplot (Fig. 2B). The number of private alleles and percent polymorphic loci were negatively associated with percent impervious surface (Fig. 2C,D). All urbanization coefficients from the top GLMs were also significant at P < 0.05, with the coefficients for nucleotide diversity and number of private alleles significant at P < 0.001.
Table 3.
Results of selection among general linear models describing the influence of percent impervious surface cover (imprv) and human population size (pop) at 500‐m and 2‐km buffer sizes on a) observed heterozygosity, b) nucleotide diversity, c) the number of private alleles within each population, and d) the percentage of polymorphic loci within each population
| Models | AIC c | Δi | w i | Models | AIC c | Δi | w i |
|---|---|---|---|---|---|---|---|
| a) observed heterozygosity | b) nucleotide diversity (pi) | ||||||
| imprv2 km | −85.98 | 0.00 | 0.62 | imprv2 km | −104.59 | 0.00 | 0.57 |
| pop2 km + imprv2 km | −84.46 | 1.52 | 0.29 | pop2 km + imprv2 km | −103.55 | 1.04 | 0.34 |
| imprv500 m | −79.89 | 6.09 | 0.03 | imprv500 m | −98.43 | 6.17 | 0.03 |
| pop2 km | −79.47 | 6.52 | 0.02 | pop500 m + imprv500 m | −98.36 | 6.23 | 0.03 |
| pop500 m + imprv500 m | −78.95 | 7.04 | 0.02 | pop2 km | −97.61 | 6.98 | 0.02 |
| intercept | −78.57 | 7.41 | 0.02 | intercept | −96.94 | 7.66 | 0.01 |
| pop500 m | −76.65 | 9.34 | 0.01 | pop500 m | −94.84 | 9.75 | 0.00 |
| c) number of private alleles | d) percentage of polymorphic loci | ||||||
| imprv2 km | 291.03 | 0.00 | 0.56 | imprv2 km | −53.85 | 0.00 | 0.55 |
| pop2 km + imprv2 km | 293.82 | 2.79 | 0.14 | pop2 km + imprv2 km | −51.37 | 2.49 | 0.16 |
| imprv500 m | 294.51 | 3.47 | 0.10 | pop2 km | −50.09 | 3.76 | 0.08 |
| pop2 km | 294.67 | 3.64 | 0.09 | intercept | −49.89 | 3.96 | 0.08 |
| intercept | 295.90 | 4.86 | 0.05 | Imprv500 m | −49.82 | 4.03 | 0.07 |
| pop500 m + imprv500 m | 296.25 | 5.22 | 0.04 | pop500 m + imprv500 | −48.39 | 5.46 | 0.04 |
| pop500 m | 297.49 | 6.45 | 0.02 | pop500 m | −47.63 | 6.22 | 0.02 |
The best models were chosen based on the second‐order Akaike's information criterion (AIC c), the rate of change in AICc (Δi), and the Akaike weights (w i).
Genetic structure and population differentiation
Pairwise F ST calculated using STACKS ranged from a low of 0.033 between a rural (MH) and suburban (LCC) population, and a high of 0.145 between a suburban (MRG) and urban (RR) population. Most values were between 0.05 and 0.10. Some NYC populations exhibited many F ST > 0.10, particularly RR and JB, as did one suburban population (MRG). A simple Mantel test revealed no significant IBD (Mantel r = 0.041, 95% CI = −0.046–0.124, P = 0.32).
Cross‐validation identified n = 23 as the optimal number of principal components to retain for DAPC analysis (Figure S1). The first two discriminant functions distinguished two isolated NYC sites, Jamaica Bay (JB) and Fort Tilden (FT), from the other populations. Discriminant function three separates out a cluster of the other populations on Long Island (AP, FM, FP, KP, RR, WW), a cluster of NYC populations in the middle (IP, NYBG, PB, VC) with suburban populations just below them (LCC, MRG, SW, CPV), and rural populations (CIE, CFP, HIP, HP, MH, MR) at the bottom (Fig. 3A). This clustering largely recapitulates the spatial orientation of these populations along a north–south axis. One exception is the most urban population, Central Park (CP), which does not occur where it would be expected based on geography. This placement indicates that Central Park is one of the most isolated and unique NYC populations, along with Fort Tilden and Jamaica Bay. The DAPC compoplot (i.e., a barplot of membership probability) indicated that nearly all individuals could be assigned to their sampling site with high probability (Figure S2).
Figure 3.

Scatterplots resulting from discriminant analysis of principal components (DAPC) for (A) all 23 sampling sites, and (B) a subset of sites occurring on Manhattan and mainland North America. Insets represent the eigenvalues of retained principal components (top left), and the eigenvalues of discriminant functions portrayed in the scatterplots (bottom left). Site abbreviations correspond to Table 1.
To clarify relationships between major clusters of sampling sites, we reran the DAPC analysis on two subsets: all populations on the mainland and Manhattan, and all populations located on Long Island. Twelve principal components were retained for the mainland–Manhattan analysis, and the scatterplot of the first two discriminant functions revealed a major, central cluster that recapitulated the geography of the populations. However, three isolated, urban populations were distinct from this cluster (Fig. 3B): Central Park (CP), Inwood Hill Park (IP), and Van Cortland Park (VC). The Long Island analysis retained seven principal components, and confirmed that JB and FT are highly distinct populations, as well as the Ridgewood Reservoir (RR; Figure S3).
Both the heuristic analysis in fastSTRUCTURE (using the simple prior) and cross‐validation in ADMIXTURE identified K = 2 as the most likely number of evolutionary clusters among the 23 white‐footed mouse populations we sampled. CLUMPAK confirmed that individual assignment to the two clusters was highly correlated across the two methods (r = 0.93; Fig. 4A,B). One cluster (blue in Fig. 4A,B) contained all the NYC populations on Long Island, the two populations on Manhattan (CP and IP), and three other suburban/rural populations with atypically low genetic variation (CIE, MR, and SW). All other populations were assigned to the other cluster (orange in Fig. 4A,B), except for WW located in rural Long Island which was an admixture of the two clusters in almost equal proportions. When we reran the fastSTRUCTURE analysis for K = 1–4 using the more accurate logistic prior, the heuristic analysis identified K = 3 as the upper bound on the likely number of evolutionary clusters. Cross‐validation in ADMIXTURE also identified K = 3–5 as only slightly worse than K = 2 (Figure S4), so we created barplots for these numbers of evolutionary clusters. For K = 3, the blue cluster of urban Long Island, Manhattan, and outlier rural populations was maintained, but the other cluster was split into two largely based on location east or west of a north–south axis (Fig. 4C). The fourth cluster in the K = 4 ADMIXTURE analysis included a new cluster (Fig. 4D) with the two Manhattan populations, a Bronx population (NYBG), and a few distant populations adjacent to or west of the Hudson River (CFP and HIP). The additional cluster for K = 5 included two suburban populations in relative proximity (CPV and LCC; Fig. 4E).
Figure 4.

Bar plots resulting from evolutionary clustering analyses using (A) fastSTRUCTURE assuming K = 2, and (B) using ADMIXTURE assuming K = 2–5. Site abbreviations correspond to Table 1 and are ordered geographically in the bar plots. The first five sites are on Long Island (AP‐WW), the next two on Manhattan (CP and IP), the next 10 east of the Hudson River (VC‐MH, organized from North to South), and the last three west of the Hudson River (HP‐MR).
Landscape genomics
For the simple BEDASSLE model examining presence or absence in the city, the median α E: α D ratio was 7.14 and the 95% credible set was 6.90 to 7.36. The interpretation of this result is that being located in NYC has an impact of approximately 7 km of extra pairwise geographic distance on genetic differentiation. Comparison of posterior predictive samples for the simple and beta‐binomial model confirms that the latter is a much better fit to the data (Fig. 5A,B). For the beta‐binomial model, the median α E: α D ratio was 0.006 and the 95% credible set was 0.0002 to 0.038. This model indicates that presence in NYC has virtually no additional impact over geographic distance on genetic differentiation. However, values of F k were elevated for most of the urban populations (range in medians from 0.04–0.25; mean = 0.12) relative to the suburban (range = 0.01–0.06; mean = 0.03) and rural populations (range = 0.001–0.11; mean = 0.03), with the exception of two rural populations that also had high values (CIE and MR; Fig. 5D). These F k values indicate that the urban populations deviated substantially more from global allele frequency estimates than the suburban and rural populations.
Figure 5.

Posterior predictive sampling with 1000 simulated datasets in BEDASSLE, using pairwise FST as a summary statistic for (A) the simple IBE model with presence in or outside NYC as the environmental variable; (B) the beta‐binomial IBE model for presence in or outside NYC; and (C) the beta‐binomial IBE model with resistance distances based on percent impervious surface as the environmental variable. (D) Map of study sites with points scaled using the values of FK estimated by the beta‐binomial model in BEDASSLE. FK estimates the deviation of each population from global mean allele frequencies, and was highest for populations located within NYC.
Pairwise resistance distances from the IBR model (Figure S5) were significantly associated with FST, even after factoring out the effects of geographic distance (partial Mantel r = 0.521, 95% confidence interval = 0.438–0.621, P < 0.0001; Table S3). The simple IBE model using resistance distances did not converge after trying many different combinations of prior values. The beta‐binomial model did converge, producing a median α E: α D ratio of 0.0001 and 95% negligible impact on genetic differentiation compared to geographic distances, which contradicts the IBR results. Posterior predictive sampling indicated that this beta‐binomial model performed moderately well, but systematically underpredicted observed FST < 0.05 (Fig. 5C).
Discussion
This study is the first to our knowledge to examine the influence of urbanization on genomewide SNP variation in a city‐dwelling species. We found that populations of white‐footed mice along an urban‐to‐rural gradient exhibit a negative correlation between genomic variation and urbanization in and around their habitat. Populations within NYC also exhibited greater genetic differentiation from one another than pairs of populations in rural areas. IBR and IBE models based on urbanization explained a greater proportion of pairwise population differentiation overall than IBD by some metrics. NYC populations deviated more strongly from global mean allele frequencies than rural populations, indicating that urbanization has substantially altered the evolutionary trajectories of urban wildlife (Donihue and Lambert 2014).
Many recent studies have documented reduced migration and loss of heterozygosity at microsatellite loci among populations in urbanized landscapes (Gortat et al. 2014; Barr et al. 2015; Jha 2015). However, for logistical reasons the most variable microsatellite loci are often chosen for population genetic analysis, and thus may not represent unbiased samples of genomewide diversity (Väli et al. 2008). Heterozygosity measured using microsatellites and traits related to fitness are also often weakly correlated, even though there is a publication bias toward reporting only high heterozygosity–fitness correlations (Chapman et al. 2009). New approaches such as ddRAD‐Seq used here produce genomewide SNP markers that are more appropriate for assessing genomewide variation in relation to ecological factors such as urbanization.
Population genomic variation is a key to understanding and predicting evolutionary responses to environmental transformation. While urbanization may not directly cause extinctions of most species, it is likely to decrease the evolutionary potential of populations. Results from native species in cities are few. Urban bobcats maintained variation at immune‐linked loci due to balancing selection from disease pressures, even after a population bottleneck and population subdivision by freeways (Serieys et al. 2015). Blackbirds colonizing cities also exhibited polymorphisms in a candidate behavioral gene that was strongly associated with urban habitats (Mueller et al. 2013), indicating that functional variation is important for the evolutionary success of urban wildlife. We previously identified candidate genes that may be under selection in urban white‐footed mice (Harris et al. 2013), but our results here indicate that these populations have lost as much as half of their genetic diversity compared to nearby rural populations. While candidate genes of large effect may be relevant to understanding some responses to landscape change, screening genomewide variation offers many advantages for measuring evolutionary potential. Most adaptive traits are polygenic and influenced by many loci of small effect. Information on genomewide variation will also capture cryptic variation and quantify the amount of standing variation available for responses to future change (Harrisson et al. 2014). Loss of standing variation will thus make it less likely that urban populations will be able to adapt to local conditions, or to global phenomena such as climate change (Franks et al. 2014). Mutation is likely too slow of a process for rapid recovery of evolutionary potential in fragmented urban populations. However, increasing connectivity in cities would immediately boost genetic variation if migration between genetically differentiated populations could be re‐established.
Modeling urbanization and genomewide diversity
We found that percent impervious surface cover, and to a lesser extent human population size, estimated at 2‐km buffers around study sites were highly correlated with levels of genomewide diversity. Gradient studies have predominated in urban ecology, but terms such as ‘urban’ and ‘suburban’ have been used in many different contexts without standardization (Magle et al. 2012). Study sites are often defined based on subjective criteria, or chosen simply to represent a linear geographic gradient regardless of the actual pattern of urbanization (Ramalho and Hobbs 2012). We advocate that future landscape genetics studies report percent impervious surface cover in and around study sites to facilitate comparisons between landscapes and species. Impervious surface and human population size are highly correlated, but humans may be present in large numbers outside of cities for recreation in protected areas (Monz et al. 2013). Impervious surface can be readily used to track urbanization over time, and has relevance to both terrestrial and aquatic systems (Walsh et al. 2005). Extensive impervious surface in the form of roads, parking lots, and buildings characterizes all cities (Nowak and Greenfield 2012). Roads in particular have well‐characterized negative impacts on the connectivity of wildlife populations (Balkenhol and Waits 2009; Benítez‐López et al. 2010). Nearly 70% of global forest cover is now fragmented, and areas subject to urbanization and high‐intensity agriculture are most severely affected (Haddad et al. 2015). Reporting measures of impervious surface cover will provide much needed standardization. We found that estimates for 2‐km buffers were much better than 500‐m buffers, although the spatial effects will likely vary for different species. In this case, 500 m was likely too small of a buffer to capture the extent of urbanization's influence on study sites, whereas 2 km was near the maximum buffer size we could use around many of our study sites and still retain statistical independence from other study sites.
Population structure and differentiation
Urban and rural populations were differentiated from each other, with many F ST > 0.10. Some populations within NYC had pairwise F ST as high as urban–rural pairs that were much more distant, indicating that isolation within the city is quite high. Higher F ST values reported here are similar to those reported for endangered beach mice (Austin et al. 2015) and Channel Island deer mice (Ozer et al. 2011) that recently experienced strong genetic drift from extirpations and translocations. The lack of IBD reported here, but moderate‐to‐strong genetic population differentiation, suggests that dispersal is limited by barriers and high landscape resistance rather than geographic distance. Thus, evolutionary clustering and IBE (Sexton et al. 2014; Wang and Bradburd 2014) are more appropriate models for understanding patterns of genomewide diversity among these populations.
Discriminant analysis of principal components, and two model‐based clustering analyses, sorted individuals into 23 clusters that largely recapitulated geographic patterns. We identified one major split between Long Island and mainland populations, which historical demographic modeling indicated is likely related to glacial retreat and ecological succession in the region (Harris et al. 2015b). The other striking pattern was that several urban populations were outliers in their genetic divergence from the major clusters of sampling sites. Previous clustering analyses using microsatellites also identified most NYC sites as distinct populations (Munshi‐South and Kharchenko 2010). SNPs evolve more slowly than microsatellites, and there is likely still considerable ancestral variation in isolated urban populations. These NYC populations may also not have reached linkage and Hardy–Weinberg equilibrium at many SNP loci because not enough generations have elapsed since isolation, or the populations have not reached mutation–migration–drift equilibrium.
Landscape genomics
Our IBR and IBE analyses produced mixed results. Overall, IBR and IBE modeling indicated that urbanization drives genetic differentiation to a greater degree than geographic distance alone. Although the relative influence of IBE to IBD was modest in the BEDASSLE analyses, urban populations deviated to a much greater degree from global allele frequencies than suburban or rural populations. The most likely explanation for this deviation is substantial drift due to inbreeding or bottlenecks in isolated urban habitats. This scenario generally conforms to the strong structure observed among some of the urban populations. However, other factors that caused these populations to deviate from the BEDASSLE model cannot be ruled out, such as unsampled environmental variables (Bradburd et al. 2013).
IBR modeling in Circuitscape confirmed that variation in percent impervious surface is highly associated with variation in pairwise F ST, even after factoring out IBD. Partial Mantel tests have several known issues with false positives (Legendre and Fortin 2010), but we previously found that vegetation cover (typically the inverse of impervious surface in NYC) successfully described migration between populations using similar approaches (Munshi‐South 2012). The IBE models did not identify the same clear association between impervious surface and genetic differentiation. As above, the strong deviation of urban populations from the global mean allele frequencies indicated that the IBE model may have underperformed. Resistance distances from Circuitscape are integrated over all possible paths between points on the landscape, and thus, BEDASSLE also counts geographic distance twice with unknown consequences when using resistance distances (G. Bradburd, personal communication).
IBE is an active area of inquiry that holds great promise for understanding the processes that generate genetic variation. However, currently available approaches are relatively new, and several caveats apply to their application (Wang and Bradburd 2014). This study is the first use of BEDASSLE to model the relationship between genetic differentiation and human modification of the environment. In addition to the model adequacy issues raised above, some of the populations analyzed here may not have reached migration–drift equilibrium, or were dominated by idiosyncratic environmental processes. Previous BEDASSLE analyses that detected stronger IBE patterns were conducted over much larger geographic scales among much more strongly differentiated populations, such as sky island birds (Manthey and Moyle 2015), lizards occupying a SE Asian archipelago separated by deep ocean trenches (Barley et al. 2015), and a widespread bird occupying much of the Amazon basin (Harvey and Brumfield 2015). These results suggest that the IBE model in BEDASSLE may currently be best suited to populations that are deeply diverged in both time and space.
Conclusions & future research
The results presented here demonstrate for the first time that urbanization is associated with a pervasive reduction in genomewide variation among animal populations. Peromyscus spp. Are increasingly important models for investigating natural variation (Bedford and Hoekstra 2015). Given unchecked urbanization, particularly in the eastern United States, it is likely that many white‐footed mouse populations in metropolitan areas have experienced similar declines in standing genetic variation. This study examined only a single urbanization gradient. Replicate analyses on white‐footed mice in other metropolitan areas, and studies on additional taxa that may be isolated in urban fragments, are necessary to robustly establish a pattern of declining genomewide variation in urbanizing landscapes.
Preliminary evidence indicates that some loci have experienced selective sweeps in urban white‐footed mice (Harris et al. 2013), but the relative impacts of urbanization on historical demography and natural selection have not been fully disentangled for these populations. Using an expanded transcriptome dataset (Harris et al. 2015a), we recently identified signatures of selection in NYC populations using an approach that accounts for historical demographic patterns (S. E. Harris & J. Munshi‐South, unpublished manuscript). We identified dozens of candidate loci under selection that are associated with metabolic and immune processes. These patterns may reflect changes in diet and biotic pressures (i.e., disease and inflammation) in cities. Thus, the loss of genomewide diversity documented here does not necessarily preclude local adaptation to highly altered, stressful urban environments.
Evolutionary potential of urban populations could be improved by restoring connectivity between urban forest patches. Enhanced gene flow between urban forests, as well as between urban and suburban areas, would increase overall genomewide diversity (although may simultaneously break up local adaptation). Habitat networks that promote gene flow in cities can be constructed from even small gardens and green spaces not explicitly dedicated to biodiversity (Goddard et al. 2010; Vergnes et al. 2012). Microevolutionary processes are rarely integrated into urban conservation due to their perceived complex nature and variation between taxa, but could be useful metrics for assessing landscape connectivity (Stockwell et al. 2003; Kinnison et al. 2007). Converting ‘gray’ infrastructure into ‘green’ networks could also simultaneously address conservation goals while delivering co‐benefits to humans in the form of biodiversity experiences (Tanner et al. 2014). A rich literature indicates that humans benefit in myriad ways from access to nature (Fuller et al. 2007), but integrative approaches to urban wildlife conservation are a major area for future growth (Shwartz et al. 2014).
Data archiving statement
DNA sequencing reads produced in this study are available on NCBI's Short‐read Archive (SRA) under accession number SRP067131. Bi‐allelic SNP genotypes, allele counts, and sample sizes are available in various formats on the Dryad digital repository http://dx.doi.org/10.5061/dryad.d48f9.
Supporting information
Table S1. Correlation coefficients calculated between % impervious surface and human population sizes calculated at buffers around study sites of 500 m, 1 km, 1.5 km, and 2 km.
Table S2. Summary population genomic statistics calculated for all nucleotide positions (variant and fixed).
Table S3. Matrix of pairwise FST (above diagonal) and great‐circle geographic distance (km; below diagonal) calculated between all pairs of 23 populations.
Figure S1. Cross‐validation (i.e., a‐score optimization) to identify the optimal number of principal components to retain for DAPC without overfitting4.
Figure S2. Compoplot/bar plot result from DAPC analysis on all 23 populations.
Figure S3. Scatterplot of first two discriminant functions from DAPC for populations on Long Island.
Figure S4. Cross‐validation of results from ADMIXTURE for K = 1 – 12.
Figure S5. Cumulative current map from isolation by resistance (IBR) modeling in Circuitscape. Lighter areas represent landscape cells with higher cumulative predicted current (i.e., higher movement).
Acknowledgements
We thank Matthew Combs and Nicole Fusco for helpful comments on the manuscript, and Mike Hickerson for access to the Hicker‐cabin and its surrounding white‐footed mouse habitat. Miranda Gonzalez provided substantial assistance in the laboratory, and Gideon Bradburd offered useful advice on landscape genetic analyses. Three anonymous reviewers and the Associate Editor provided suggestions for revisions that improved the manuscript. The New York State Department of Environmental Conservation, New York City Department of Parks and Recreation, New York Botanical Garden, and the Connecticut Department of Energy and Environmental Protection granted permission to collect white‐footed mouse samples. Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R15GM099055 to JM‐S, and by a Graduate Research Fellowship from the National Science Foundation to SEH. The content is solely the responsibility of the authors and does not represent the official views of the National Institutes of Health.
References
- Alexander, D. H. , and Lange K. 2011. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics 12:246. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alexander, D. H. , Novembre J., and Lange K. 2009. Fast model‐based estimation of ancestry in unrelated individuals. Genome Research 19:1655–1664. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Andrews, S. 2010. FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
- Austin, J. D. , Gore J. A., Greene D. U., and Gotteland C. 2015. Conspicuous genetic structure belies recent dispersal in an endangered beach mouse (Peromyscus polionotus trissyllepsis). Conservation Genetics 16:915–928. [Google Scholar]
- Balkenhol, N. , and Waits L. 2009. Molecular road ecology: exploring the potential of genetics for investigating transportation impacts on wildlife. Molecular Ecology 18:4151–4164. [DOI] [PubMed] [Google Scholar]
- Barley, A. J. , Monnahan P. J., Thomson R. C., Grismer L. L., and Brown R. M. 2015. Sun skink landscape genomics: assessing the roles of microevolutionary processes in shaping genetic and phenotypic diversity across a heterogeneous and fragmented landscape. Molecular Ecology 24:1696–1712. [DOI] [PubMed] [Google Scholar]
- Barnosky, A. D. , Hadly E. A., Bascompte J., Berlow E. L., Brown J. H., Fortelius M., Getz W. M. et al. 2012. Approaching a state shift in Earth's biosphere. Nature 486:52–58. [DOI] [PubMed] [Google Scholar]
- Barr, K. R. , Kus B. E., Preston K. L., Howell S., Perkins E., and Vandergast A. G. 2015. Habitat fragmentation in coastal southern California disrupts genetic connectivity in the cactus wren (Campylorhynchus brunneicapillus). Molecular Ecology 24:2349–2363. [DOI] [PubMed] [Google Scholar]
- Barrett, R. D. H. , and Schluter D. 2008. Adaptation from standing genetic variation. Trends in Ecology & Evolution 23:38–44. [DOI] [PubMed] [Google Scholar]
- Bedford, N. L. , and Hoekstra H. E. 2015. Peromyscus mice as a model for studying natural variation. eLife 4:e06813. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Benítez‐López, A. , Alkemade R., and Verweij P. A. 2010. The impacts of roads and other infrastructure on mammal and bird populations: a meta‐analysis. Biological Conservation 143:1307–1316. [Google Scholar]
- Blair, R. B. 2001. Birds and Butterflies Along Urban Gradients in Two Ecoregions of the United States: is Urbanization Creating a Homogeneous Fauna? In Lockwood J. L., and McKinney M. L. eds. Biotic Homogenization, pp. 33–56. Springer US, New York, NY. [Google Scholar]
- Bradburd, G. 2014. BEDASSLE: Quantifies effects of geo/eco distance on genetic differentiation (version 1.5). https://cran.r-project.org/web/packages/BEDASSLE/index.html.
- Bradburd, G. S. , Ralph P. L., and Coop G. M. 2013. Disentangling the effects of geographic and ecological isolation on genetic differentiation. Evolution 67:3258–3273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brunner, J. L. , Duerr S., Keesing F., Killilea M., Vuong H., and Ostfeld R. S. 2013. An experimental test of competition among mice, chipmunks, and squirrels in deciduous forest fragments. PLoS One 8:e66798. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Burke, M. K. , Dunham J. P., Shahrestani P., Thornton K. R., Rose M. R., and Long A. D. 2010. Genome‐wide analysis of a long‐term evolution experiment with Drosophila. Nature 467:587–590. [DOI] [PubMed] [Google Scholar]
- Burnham, K. P. , and Anderson D. R.. 2002. Model Selection and Multi‐Model Inference: A Practical Information‐Theoretic Approach. Springer, New York, NY. [Google Scholar]
- Catchen, J. , Hohenlohe P. A., Bassham S., Amores A., and Cresko W. A. 2013. Stacks: an analysis tool set for population genomics. Molecular Ecology 22:3124–3140. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chapman, J. R. , Nakagawa S., Coltman D. W., Slate J., and Sheldon B. C. 2009. A quantitative review of heterozygosity–fitness correlations in animal populations. Molecular Ecology 18:2746–2765. [DOI] [PubMed] [Google Scholar]
- Colosimo, P. F. , Hosemann K. E., Balabhadra S., Villarreal G. Jr, Dickson M., Grimwood J., Schmutz J. et al. 2005. Widespread parallel evolution in sticklebacks by repeated fixation of ectodysplasin alleles. Science 307:1928–1933. [DOI] [PubMed] [Google Scholar]
- Coop, G. , Pickrell J. K., Novembre J., Kudaravalli S., Li J., Absher D., Myers R. M. et al. 2009. The Role of Geography in Human Adaptation. PLoS Genetics 5:e1000500. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cushman, S. A. , and Landguth E. L. 2010. Spurious correlations and inference in landscape genetics. Molecular Ecology 19:3592–3602. [DOI] [PubMed] [Google Scholar]
- Danecek, P. , Auton A., Abecasis G., Albers C. A., Banks E., DePristo M. A., Handsaker R. E. et al. 2011. The variant call format and VCFtools. Bioinformatics 27:2156–2158. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Domingues, V. S. , Poh Y.‐P., Peterson B. K., Pennings P. S., Jensen J. D., and Hoekstra H. E. 2012. Evidence of adaptation from ancestral variation in young populations of beach mice. Evolution 66:3209–3223. [DOI] [PubMed] [Google Scholar]
- Donihue, C. M. , and Lambert M. R. 2014. Adaptive evolution in urban ecosystems. Ambio 44:194–203. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ellis, E. C. , Kaplan J. O., Fuller D. Q., Vavrus S., Goldewijk K. K., and Verburg P. H. 2013. Used planet: a global history. Proceedings of the National Academy of Sciences 110:7978–7985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Epps, C. W. , Wehausen J. D., Bleich V. C., Torres S. G., and Brashares J. S. 2007. Optimizing dispersal and corridor models using landscape genetics. Journal of Applied Ecology 44:714–724. [Google Scholar]
- Etterson, J. R. , and Shaw R. G. 2001. Constraint to adaptive evolution in response to global warming. Science 294:151–154. [DOI] [PubMed] [Google Scholar]
- Foll, M. , and Gaggiotti O. 2008. A genome‐scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180:977. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Franks, S. J. , Weber J. J., and Aitken S. N. 2014. Evolutionary and plastic responses to climate change in terrestrial plant populations. Evolutionary Applications 7:123–139. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fuller, R. A. , Irvine K. N., Devine‐Wright P., Warren P. H., and Gaston K. J. 2007. Psychological benefits of greenspace increase with biodiversity. Biology Letters 3:390–394. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gibson, G. , and Dworkin I. 2004. Uncovering cryptic genetic variation. Nature Reviews Genetics 5:681–690. [DOI] [PubMed] [Google Scholar]
- Goddard, M. A. , Dougill A. J., and Benton T. G. 2010. Scaling up from gardens: biodiversity conservation in urban environments. Trends in Ecology & Evolution 25:90–98. [DOI] [PubMed] [Google Scholar]
- Gortat, T. , Rutkowski R., Gryczyńska A., Pieniążek A., Kozakiewicz A., and Kozakiewicz M. 2014. Anthropopressure gradients and the population genetic structure of Apodemus agrarius. Conservation Genetics 16:649–659. [Google Scholar]
- Graves, T. A. , Beier P., and Royle J. A. 2013. Current approaches using genetic distances produce poor estimates of landscape resistance to interindividual dispersal. Molecular Ecology 22:3888–3903. [DOI] [PubMed] [Google Scholar]
- Groffman, P. M. , Cavender‐Bares J., Bettez N. D., Grove J. M., Hall S. J., Heffernan J. B., Hobbie S. E. et al. 2014. Ecological homogenization of urban USA. Frontiers in Ecology and the Environment 12:74–81. [Google Scholar]
- Haddad, N. M. , Brudvig L. A., Clobert J., Davies K. F., Gonzalez A., Holt R. D., Lovejoy T. E. et al. 2015. Habitat fragmentation and its lasting impact on Earth's ecosystems. Science Advances 1:e1500052. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Harris, S. E. , Munshi‐South J., Obergfell C., and O'Neill R. 2013. Signatures of rapid evolution in urban and rural transcriptomes of white‐footed mice (Peromyscus leucopus) in the New York Metropolitan Area. PLoS ONE 8:e74938. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Harris, S. E. , O'Neill R. J., and Munshi‐South J. 2015a. Transcriptome resources for the white‐footed mouse (Peromyscus leucopus): new genomic tools for investigating ecologically divergent urban and rural populations. Molecular Ecology Resources 15:382–394. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Harris, S. E. , Xue A. T., Alvarado‐Serrano D., Boehm J. T., Joseph T., Hickerson M. J., and Munshi‐South J. 2015b. Urbanization shapes the demographic history of a city‐dwelling native rodent. BioRxiv. doi:10.1101/032979 [Epub ahead of print]. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Harrisson, K. A. , Pavlova A., Telonis‐Scott M., and Sunnucks P. 2014. Using genomics to characterize evolutionary potential for conservation of wild populations. Evolutionary Applications 7:1008–1025. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Harvey, M. G. , and Brumfield R. T. 2015. Genomic variation in a widespread Neotropical bird (Xenops minutus) reveals divergence, population expansion, and gene flow. Molecular Phylogenetics and Evolution 83:305–316. [DOI] [PubMed] [Google Scholar]
- Heilig, G. K. 2011. World Urbanization Prospects: the 2011 Revision. United Nations, Department of Economic and Social Affairs (DESA), Population Division, Population Estimates and Projections Section, New York, NY. [Google Scholar]
- Hobbs, R. J. , Arico S., Aronson J., Baron J. S., Bridgewater P., Cramer V. A., Epstein P. R. et al. 2006. Novel ecosystems: theoretical and management aspects of the new ecological world order. Global Ecology and Biogeography 15:1–7. [Google Scholar]
- Hobbs, R. J. , Higgs E., and Harris J. A. 2009. Novel ecosystems: implications for conservation and restoration. Trends in Ecology & Evolution 24:599–605. [DOI] [PubMed] [Google Scholar]
- Hoffmann, A. A. , and Sgrò C. M. 2011. Climate change and evolutionary adaptation. Nature 470:479–485. [DOI] [PubMed] [Google Scholar]
- Jha, S. 2015. Contemporary human‐altered landscapes and oceanic barriers reduce bumble bee gene flow. Molecular Ecology 24:993–1006. [DOI] [PubMed] [Google Scholar]
- Jombart, T. , and Ahmed I. 2011. Adegenet 1.3‐1: new tools for the analysis of genome‐wide SNP data. Bioinformatics 27:3070–3071. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kenney‐Hunt, J. , Lewandowski A., Glenn T. C., Glenn J. L., Tsyusko O. V., O'Neill R. J., Brown J. et al. 2014. A genetic map of Peromyscus with chromosomal assignment of linkage groups (a Peromyscus genetic map). Mammalian Genome 25:160–179. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Keyghobadi, N. 2007. The genetic implications of habitat fragmentation for animals. Canadian Journal of Zoology 85:1049–1064. [Google Scholar]
- Kinnison, M. , Hendry A., and Stockwell C. 2007. Contemporary evolution meets conservation biology II: Impediments to integration and application. Ecological Research 22:947–954. [Google Scholar]
- Kopelman, N. M. , Mayzel J., Jakobsson M., Rosenberg N. A., and Mayrose I. 2015. Clumpak: a program for identifying clustering modes and packaging population structure inferences across K. Molecular Ecology Resources 15:1179–1191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Legendre, P. , and Fortin M.‐J. 2010. Comparison of the Mantel test and alternative approaches for detecting complex multivariate relationships in the spatial analysis of genetic data. Molecular Ecology Resources 10:831–844. [DOI] [PubMed] [Google Scholar]
- Magle, S. B. , Hunt V. M., Vernon M., and Crooks K. R. 2012. Urban wildlife research: past, present, and future. Biological Conservation 155:23–32. [Google Scholar]
- Manichaikul, A. , Mychaleckyj J. C., Rich S. S., Daly K., Sale M., and Chen W.‐M. 2010. Robust relationship inference in genome‐wide association studies. Bioinformatics 26:2867–2873. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Manthey, J. D. , and Moyle R. G. 2015. Isolation by environment in White‐breasted Nuthatches (Sitta carolinensis) of the Madrean Archipelago sky islands: a landscape genomics approach. Molecular Ecology 24:3628–3638. [DOI] [PubMed] [Google Scholar]
- Mazerolle, M. J. 2015. AICcmodavg: Model Selection and Multimodel Inference Based on (Q) AIC (c). R Package Version 2.0‐3. [Google Scholar]
- McKinney, M. L. 2002. Urbanization, biodiversity, and conservation. BioScience 52:883–890. [Google Scholar]
- McRae, B. H. 2006. Isolation by resistance. Evolution 60:1551–1561. [PubMed] [Google Scholar]
- McRae, B. H. , and Beier P. 2007. Circuit theory predicts gene flow in plant and animal populations. Proceedings of the National Academy of Sciences of the United States of America 104:19885–19890. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Merilä, J. , and Hendry A. P. 2014. Climate change, adaptation, and phenotypic plasticity: the problem and the evidence. Evolutionary Applications 7:1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meuwissen, T. , Hayes B., and Goddard M. 2013. Accelerating improvement of livestock with genomic selection. Annual Review of Animal Biosciences 1:221–237. [DOI] [PubMed] [Google Scholar]
- Monz, C. A. , Pickering C. M., and Hadwen W. L. 2013. Recent advances in recreation ecology and the implications of different relationships between recreation use and ecological impacts. Frontiers in Ecology and the Environment 11:441–446. [Google Scholar]
- Mueller, J. C. , Partecke J., Hatchwell B. J., Gaston K. J., and Evans K. L. 2013. Candidate gene polymorphisms for behavioural adaptations during urbanization in blackbirds. Molecular Ecology 22:3629–3637. [DOI] [PubMed] [Google Scholar]
- Munshi‐South, J. 2012. Urban landscape genetics: canopy cover predicts gene flow between white‐footed mouse (Peromyscus leucopus) populations in New York City. Molecular Ecology 21:1360–1378. [DOI] [PubMed] [Google Scholar]
- Munshi‐South, J. , and Kharchenko K. 2010. Rapid, pervasive genetic differentiation of urban white‐footed mouse (Peromyscus leucopus) populations in New York City. Molecular Ecology 19:4242–4254. [DOI] [PubMed] [Google Scholar]
- Munshi‐South, J. , and Nagy C. 2014. Urban park characteristics, genetic variation, and historical demography of white‐footed mouse (Peromyscus leucopus) populations in New York City. PeerJ 2(March):e310. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Narum, S. R. , Buerkle C. A., Davey J. W., Miller M. R., and Hohenlohe P. A. 2013. Genotyping‐by‐sequencing in ecological and conservation genomics. Molecular Ecology 22:2841–2847. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nowak, D. J. , and Greenfield E. J. 2012. Tree and impervious cover change in U.S. cities. Urban Forestry & Urban Greening 11:21–30. [Google Scholar]
- Oakley, C. G. 2013. Small effective size limits performance in a novel environment. Evolutionary Applications 6:823–831. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orr, H. A. 2005. The genetic theory of adaptation: a brief history. Nature Reviews Genetics 6:119–127. [DOI] [PubMed] [Google Scholar]
- Ozer, F. , Gellerman H., and Ashley M. V. 2011. Genetic impacts of Anacapa deer mice reintroductions following rat eradication. Molecular Ecology 20:3525–3539. [DOI] [PubMed] [Google Scholar]
- Palumbi, S. 2001. Humans as the world's greatest evolutionary force. Science 293:1786–1790. [DOI] [PubMed] [Google Scholar]
- Pennings, P. S. 2012. Standing genetic variation and the evolution of drug resistance in HIV. PLoS Computational Biology 8:e1002527. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peterson, B. K. , Weber J. N., Kay E. H., Fisher H. S., and Hoekstra H. E. 2012. Double digest RADseq: an inexpensive method for De Novo SNP discovery and genotyping in model and non‐model species. PLoS One 7:e37135. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Prentis, P. J. , Wilson J. R., Dormontt E. E., Richardson D. M., and Lowe A. J. 2008. Adaptive evolution in invasive species. Trends in Plant Science 13:288–294. [DOI] [PubMed] [Google Scholar]
- Pritchard, J. , Stephens M., and Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945–959. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pritchard, J. K. , Pickrell J. K., and Coop G. 2010. The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Current Biology 20:R208–R215. [DOI] [PMC free article] [PubMed] [Google Scholar]
- R Development Core Team . 2008. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria: http://www.R-project.org/. [Google Scholar]
- Raj, A. , Stephens M., and Pritchard J. K. 2014. FastSTRUCTURE: variational Inference of Population Structure in Large SNP Data Sets. Genetics 197:573–589. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ramalho, C. E. , and Hobbs R. J. 2012. Time for a change: dynamic urban ecology. Trends in Ecology & Evolution 27:179–188. [DOI] [PubMed] [Google Scholar]
- Renaut, S. , Nolte A. W., Rogers S. M., Derome N., and Bernatchez L. 2011. SNP signatures of selection on standing genetic variation and their association with adaptive phenotypes along gradients of ecological speciation in lake whitefish species pairs (Coregonus spp.). Molecular Ecology 20:545–559. [DOI] [PubMed] [Google Scholar]
- Rivera‐Ortíz, F. A. , Aguilar R., Arizmendi M. D. C., Quesada M., and Oyama K. 2014. Habitat fragmentation and genetic variability of tetrapod populations. Animal Conservation 18:249–258. [Google Scholar]
- Rockman, M. V. 2008. Reverse engineering the genotype–phenotype map with natural genetic variation. Nature 456:738–744. [DOI] [PubMed] [Google Scholar]
- Ruddiman, W. F. , Ellis E. C., Kaplan J. O., and Fuller D. Q. 2015. Defining the epoch we live in. Science 348:38–39. [DOI] [PubMed] [Google Scholar]
- Scarcelli, N. , and Kover P. X. 2009. Standing genetic variation in FRIGIDA mediates experimental evolution of flowering time in Arabidopsis. Molecular Ecology 18:2039–2049. [DOI] [PubMed] [Google Scholar]
- Serieys, L. E. K. , Lea A., Pollinger J. P., Riley S. P. D., and Wayne R. K. 2015. Disease and freeways drive genetic change in urban bobcat populations. Evolutionary Applications 8:75–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Seto, K. C. , Güneralp B., and Hutyra L. R. 2012. Global forecasts of urban expansion to 2030 and direct impacts on biodiversity and carbon pools. Proceedings of the National Academy of Sciences 109:16083–16088. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sexton, J. P. , Hangartner S. B., and Hoffmann A. A. 2014. Genetic isolation by environment or distance: which pattern of gene flow is most common? Evolution 68:1–15. [DOI] [PubMed] [Google Scholar]
- Shwartz, A. , Turbé A., Julliard R., Simon L., and Prévot A.‐C. 2014. Outstanding challenges for urban conservation research and action. Global Environmental Change 28:39–49. [Google Scholar]
- Steffen, W. , Crutzen P. J., and McNeill J. R. 2007. The Anthropocene: are humans now overwhelming the great forces of nature. Ambio 36:614–621. [DOI] [PubMed] [Google Scholar]
- Stockwell, C. A. , Hendry A. P., and Kinnison M. T. 2003. Contemporary evolution meets conservation biology. Trends in Ecology & Evolution 18:94–101. [Google Scholar]
- Tanner, C. J. , Adler F. R., Grimm N. B., Groffman P. M., Levin S. A., Munshi‐South J., Pataki D. E. et al. 2014. Urban ecology: advancing science and society. Frontiers in Ecology and the Environment 12:574–581. [Google Scholar]
- Teotónio, H. , Chelo I. M., Bradić M., Rose M. R., and Long A. D. 2009. Experimental evolution reveals natural selection on standing genetic variation. Nature Genetics 41:251–257. [DOI] [PubMed] [Google Scholar]
- Väli, Ü. , Einarsson A., Waits L., and Ellegren H. 2008. To what extent do microsatellite markers reflect genome‐wide genetic diversity in natural populations? Molecular Ecology 17:3808–3817. [DOI] [PubMed] [Google Scholar]
- Vergnes, A. , Viol I. L., and Clergeau P. 2012. Green corridors in urban landscapes affect the arthropod communities of domestic gardens. Biological Conservation 145:171–178. [Google Scholar]
- Vessey, S. H. , and Vessey K. B. 2007. Linking behavior, life history and food supply with the population dynamics of white‐footed mice (Peromyscus leucopus). Integrative Zoology 2:123–130. [DOI] [PubMed] [Google Scholar]
- Vitousek, P. , Mooney H., Lubchenco J., and Melillo J. 1997. Human domination of Earth's ecosystems. Science 277:494–499. [Google Scholar]
- Walsh, C. J. , Roy A. H., Feminella J. W., Cottingham P. D., Groffman P. M., and Morgan R. P. II 2005. The urban stream syndrome: current knowledge and the search for a cure. Journal of the North American Benthological Society 24:706–723. [Google Scholar]
- Wang, I. J. , and Bradburd G. S. 2014. Isolation by environment. Molecular Ecology 23:5649–5662. [DOI] [PubMed] [Google Scholar]
- Xian, G. , Homer C., Demitz J., Fry J., Hossain N., and Wickham J. 2011. Change of impervious surface area between 2001 and 2006 in the conterminous United States. Photogrammetric Engineering and Remote Sensing 77:758–762. [Google Scholar]
- Zipperer, W. C. , Foresman T. W., Walker S. P., and Daniel C. T. 2012. Ecological consequences of fragmentation and deforestation in an urban landscape: a case study. Urban Ecosystems 15:533–544. [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Table S1. Correlation coefficients calculated between % impervious surface and human population sizes calculated at buffers around study sites of 500 m, 1 km, 1.5 km, and 2 km.
Table S2. Summary population genomic statistics calculated for all nucleotide positions (variant and fixed).
Table S3. Matrix of pairwise FST (above diagonal) and great‐circle geographic distance (km; below diagonal) calculated between all pairs of 23 populations.
Figure S1. Cross‐validation (i.e., a‐score optimization) to identify the optimal number of principal components to retain for DAPC without overfitting4.
Figure S2. Compoplot/bar plot result from DAPC analysis on all 23 populations.
Figure S3. Scatterplot of first two discriminant functions from DAPC for populations on Long Island.
Figure S4. Cross‐validation of results from ADMIXTURE for K = 1 – 12.
Figure S5. Cumulative current map from isolation by resistance (IBR) modeling in Circuitscape. Lighter areas represent landscape cells with higher cumulative predicted current (i.e., higher movement).
