Abstract
The federal Superfund site at New Bedford Harbor (Massachusetts, USA) is an example of an environment where pollution levels rose quickly and dramatically. Industrial waste containing polychlorinated biphenyls, heavy metals, and other organic pollutants was dumped into the harbor in the mid-20th century. The mummichog (Fundulus heteroclitus) is a widely distributed fish typically susceptible to polychlorinated biphenyl toxicity. However, the population in New Bedford Harbor is one of several that have evolved the ability to tolerate this category of toxicants. Constituents of the aryl hydrocarbon receptor system are linked to this adaptive pollution tolerance. Our population genetic analysis of 444 mummichogs from Massachusetts and Rhode Island estuaries using 55 SNP loci suggests that F. heteroclitus near New Bedford Harbor have large populations and restricted but meaningful levels of gene exchange among adjacent habitats. When comparing polluted to cleaner sites, we find strong evidence of genetic differentiation at a small geographic scale. Populations at the two most polluted sites form a genetically distinct cluster. Much of this differentiation is driven by allele frequency differences at loci associated with the aryl hydrocarbon receptor system. While allele frequencies at loci associated with pollution tolerance vary between clean and polluted habitats, putatively adaptive alleles are present at low frequency elsewhere in our study area.
Keywords: adaptation, polychlorinated biphenyls, aryl hydrocarbon receptor (AHR), Fundulus, mummichog, Atlantic Killifish
Introduction
In many urban environments, chemical pollution represents a novel environmental challenge. The interaction between industrial pollutants and populations of mummichogs (Fundulus heteroclitus) is a well-documented example of adaptation to rapid environmental change due to historical discharges of toxic wastes in several polluted U.S. east coast estuaries (Reid et al. 2016). Mummichogs in urban habitats ranging from the Elizabeth River in Virginia to New Bedford Harbor (NBH) in Massachusetts persist and thrive in environments with high concentrations of organic toxicants and heavy metals (Weis 2002, Whitehead et al. 2017).
In New Bedford Harbor, organic pollutants were released into the upper estuary in the 1940s as industrial waste from electronics manufacturers (Weaver 1982, Whitehead et al. 2017). Environmental polychlorinated biphenyl (PCB) concentrations in ‘hot spots’ were as high as 0.1 g PCB per gram of sediment (dry weight) (Ho et al. 1997, Bergen et al. 2005). Polycyclic aromatic hydrocarbons (PAHs) and heavy metals are also present. The area was declared a federal Superfund site in 1982, and remediation efforts have been continuous since the 1990s (Pesch et al. 2011). As recently as 2019, pre-excavation sediment concentrations at estuary hotspots in NBH were measured at 53,000 ng/g (Rigassio-Smith 2022). PCB levels in mummichog tissue in polluted sites can exceed 1,370 μg/g (Lake et al. 1995). While these pollutants cause severe developmental problems at early life stages, mummichogs persist in these highly polluted environments (Nacci et al. 2010).
Sediment PCBs are a strong selective force for mummichogs, causing high mortalities in embryos (Nacci et al. 2010, 2016). Fish from NBH and other polluted sites can tolerate much higher levels of organic toxicants than fish from cleaner sites. Importantly, this tolerance is heritable; lab-raised F1 and F2 embryos from contaminated sites develop normally when exposed to high levels of PCB, whereas embryos derived from fish from clean sites were sensitive to exposures orders of magnitude lower (Nacci et al. 1999, 2010).
Variants in the aryl hydrocarbon receptor (AHR) system and associated proteins are believed to be one mechanism that allows local adaptation to high levels of PCB contamination (Meyer et al. 2002, Nacci et al. 2010, Oleksiak et al. 2011, Reid et al. 2016). Homologs of this ancient system exist throughout the Bilateria and have several functions (Hahn et al. 2017). One current function of AHR and its numerous paralogs in vertebrates is to activate several loci whose products metabolize certain xenobiotics, including dioxin-like compounds (DLCs), such as PCBs (Hahn et al. 2017). Specifically, the AHR protein is a ligand-activated transcription factor that, when activated by PCBs or other molecules, binds to the aryl hydrocarbon receptor nuclear translocator protein (ARNT) and other proteins. The AHR/ARNT complex regulates the transcription of several stress response enzymes, including Cytochrome P450 1A (CYP1A) (Reitzel et al. 2014, Di Giulio and Clark 2015, Mulero-Navarro and Fernandez-Salguero 2016). In F. heteroclitus, a connection between the ability to complete their life cycle in the presence of organic pollutants and the genetics of the detoxification pathway has been documented (Proestou et al. 2014, Reid et al. 2016). Importantly, tolerance of DLCs is associated with downregulation of the AHR system. Fish from polluted habitats express fewer downstream response proteins like CYP1A when exposed to pollutants (Elskus et al. 1999, Meyer et al. 2002, Oleksiak et al. 2011).
Mummichogs may have a population genetic advantage relative to many other vertebrates when faced with novel environments. Populations are large. Sweeney et al.’s (1998) population estimates approach 30000 individuals in a 300m section of tidal creek. Microsatellite-based estimates of the genetic effective population size (Ne) are large at northern North American habitats, including New Bedford Harbor. Adams et al. (2006) calculated average Ne to be between 5581 and 89958, depending on how many generations of “isolation” were used in their model. In contrast to direct population censuses, Ne reflects both local reproduction and connectivity to other habitats.
Mummichogs are also philopatric. Lotrich (1975) determined an average home range for mummichogs of 36 meters and that migration to and from the local habitat was rare. In a mark-recapture study, Skinner et al. (2005) tagged 4123 fish between May and November and found that 96.6% of fish captured were within 200m of the initial marking site with no sex bias observed in the 3.4% of fish who were recaptured at sites between 600 and 3600m from the original marking site (fish in northern latitudes spawn in May and June). Ehrlich et al. (2021) tagged 2200 fish in May and June at sites in New Jersey (USA). They then trapped 200 fish in August and September at the same habitats and observed that 95% of fish recaptured were found at their initial tagging site.
Population connectivity data are consistent with these observations. For example, McMillan et al. (2006) found small but highly significant differences among populations along the Massachusetts and Rhode Island coasts using an AFLP data set. At a fine (tens of km) scale, Roark et al. (2005) documented a pattern of isolation by distance along the southern coast of Massachusetts and Rhode Island (USA) using allozyme markers, and Brown and Chapman (1991) estimated that there were about a dozen migrants/generation between adjacent populations separated by a kilometer or more. Thus, mummichogs are numerous and philopatric. Their populations harbor large amounts of standing genetic variation while maintaining the potential for local differentiation due to moderate levels of between-habitat migration (Roark et al. 2005, Adams et al. 2006, McMillan et al. 2006).
As a coastal/estuarine species, F. heteroclitus approximates a linear array of populations, similar to Kimura and Weiss’s (1964) stepping stone model in which gene exchange is far more likely to occur between adjacent habitats than it is between more distant habitats. Our null expectation is that most loci will follow a pattern of isolation by distance (IBD). The availability of a complete mummichog genome provides a new context for our understanding of how evolutionarily significant variants are distributed among populations and habitat patches (Miller et al. 2019; GenBank MU-UCD_Fhet_4.1 reference). Based on the AHR and Fundulus literature, we expect variants in AHR-associated loci (AHR and paralogs, ARNT, and CYP1A/3A) to enable local adaptation to PCBs. If so, these loci will deviate from the general pattern of isolation by distance due to local selective pressures at polluted sites. To test this prediction, we sampled 444 individual fish from 11 habitats in southeastern New England (USA) along an 80 km stretch of coastline. We developed a reliable set of 55 SNPs, 11 of which are known to be in coding regions for proteins associated with the AHR system, along with other coding and more “anonymous” SNPs which we expect to reflect broad geographic patterns of divergence in contrast to the AHR-associated SNPs.
Materials and methods
Fish sampling
We sampled F. heteroclitus populations at 11 locations along the south coast of Massachusetts and Rhode Island (USA). Sampling sites were at or near locations previously sampled by Roark et al. (2005), and historical sediment PCB concentrations are known for several of these sites (Table 1). Sampling coincided with local high tide during the summer 2020 and 2021 field seasons because mummichogs are actively feeding during this period.
Table 1.
Fundulus heteroclitus collection site data arranged from west to east.
| Site Name | Site abbreviation | Latitude | Longitude | Sample size | Sediment [PCB] |
|---|---|---|---|---|---|
| Jerusalem, Narragansett, RI | Jer | 41.378633 | −71.520329 | 32 | 0 ng/g |
| Wandsworth, Narragansett, RI | Wan | 41.42546 | −71.493284 | 37 | 2 ng/g |
| Middlebridge, Narragansett, RI | Mid | 41.457453 | −71.452356 | 33 | N/A |
| Horseneck Salt Marsh, Westport, MA | Hor | 41.509331 | −71.055776 | 25 | N/A |
| Slocum’s River | Slo | 41.540297 | −70.97305 | 43 | 7 ng/g |
| Apponagansett, Dartmouth, MA | Apo | 41.5836641 | −70.950807 | 50 | N/A |
| Beach St., Fairhaven, MA | Bch | 41.655335 | −70.914554 | 44 | 3,762 ng/g |
| New Bedford Harbor, Acushnet, MA | NBH | 41.674565 | −70.912979 | 46 | 22,666 ng/g |
| Hacker St, Fairhaven, MA | Hak | 41.63155 | −70.883071 | 57 | 13 ng/g |
| Winsegassett Ave, Fairhaven, MA | Win | 41.596777 | −70.861887 | 31 | 634 ng/g |
| Mattapoisett, MA | Mat | 41.648567 | −70.824761 | 46 | 27 ng/g |
Dry weight PCB sediment concentrations from Nacci et al. (2010) and Roark et al. (2005).
Fish were caught in minnow traps baited with frozen squid or hot dogs. Traps remained in the water for at least 30min. Fish were anesthetized using tricaine, and a small piece of caudal fin tissue was removed on a sterilized cutting board with a sterile razor. Per our collecting permits (MA178809 and RI487) and IACUC protocol (JM201842-A), fish recovered in fresh seawater and were released back to their habitat. Tissue samples were placed on ice in the field and frozen once we returned to the lab.
DNA methods
Fin clips were extracted using a Qiagen DNeasy Blood and Tissue kit following the manufacturer’s instructions (Qiagen Corporation 2020). DNA was eluted in 200μl of AE buffer. DNA samples were quantified using a nanospectrophotometer, and concentrations were adjusted to 40ng/μl.
Fluidigm genotyping
SNP genotypes were collected using a Fluidigm (now Standard BioTools) EP-1 system. Briefly, this system relies on two SNP variant-specific TaqMan-style PCR probes labeled with different fluorescent tags that are amplified within the same microchamber. When PCR amplified, homozygotes have a strong signal for one of the two fluorescent probes, while heterozygotes have equal levels of signal from both. PCR-based assays are mixed, amplified, and interrogated on a single ‘chip’ that allows extremely low volume PCR reactions to be performed on 96 DNA samples using 96 different SNP probe sets on a custom plate with the same footprint as a 96-well PCR plate. This minimizes both cost and handling time.
Normalized DNA samples were preamplified per the manufacturer’s directions using a pool of STA/LSP primers to increase the concentration of target DNA (Fluidigm Corporation 2016).
Primer/probe design and locus selection
Candidate loci were selected from several sources. These include loci previously used in genome mapping projects and genomic sequencing efforts (Oleksiak et al. 2011, Proestou et al. 2014, Waits et al. 2016) and nominal SNPs shared by colleagues (Karchner et al., pers. comm.). Our goal was to find assayable SNPs on as many chromosomes as possible. Because the AHR detoxification pathway has been extensively studied, we were keen to include several SNPs from within coding regions associated with the AHR system.
To create sets of candidate primers and probes, Fluidigm’s D3 design program (Fluidigm Corporation 2020) requires 60 bp of sequence on either side of the SNP to design oligonucleotide sets. In cases where adjacent sequences were short, the available sequence was BLASTed against the F. heteroclitus genome (GenBank MU-UCD_Fhet_4.1) to obtain sufficient flanking sequence. Variation within flanking regions may prevent primers and probes from annealing, so SNPs near another known variable site were excluded early in the selection phase. This strategy allowed us to develop primer/probe sets for 192 candidate loci (two Fluidigm plate sets). Seventy-seven of these loci were in known coding regions. The others were ‘anonymous’ (but see results).
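The two screening rules described above (enough flanking sequence, no other known variant nearby) can be sketched as a simple filter. This is a minimal illustration, not the D3 design program's logic: the function name, the 0-based coordinates, and the choice to treat "near" as "inside either 60 bp flanking window" are assumptions made for the example.

```python
from typing import Iterable

MIN_FLANK = 60  # bp of sequence required on each side of the SNP (from the text)


def snp_is_designable(seq_len: int, snp_pos: int,
                      other_variant_positions: Iterable[int],
                      min_flank: int = MIN_FLANK) -> bool:
    """Hypothetical screen for a candidate SNP.

    seq_len: length of the available sequence in bp
    snp_pos: 0-based index of the SNP within that sequence
    other_variant_positions: 0-based indices of other known variable sites
    """
    left_flank = snp_pos                  # bases upstream of the SNP
    right_flank = seq_len - snp_pos - 1   # bases downstream of the SNP
    if left_flank < min_flank or right_flank < min_flank:
        return False
    # Assumption: "near" means inside either flanking window.
    lo, hi = snp_pos - min_flank, snp_pos + min_flank
    return not any(lo <= p <= hi and p != snp_pos
                   for p in other_variant_positions)
```

A candidate that fails only the flank-length rule corresponds to the case where the text extends the sequence by BLAST against the reference genome before re-screening.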
In an earlier pilot project (data not shown), we attempted to genotype each candidate locus in a panel of 74 F. heteroclitus DNA samples with a broad geographic distribution. These vintage DNA samples were collected between 2000 and 2011 from habitats along the eastern US coast ranging from Elizabeth River, VA to New Bedford Harbor, MA. They were genotyped at the 192 candidate loci. Because the Fluidigm chip can genotype up to 96 individuals simultaneously, several individuals were run twice to test for repeatability. Each plate had one sample slot reserved for a non-template (negative) control.
The 96 most reliable loci from this pilot study were assembled into a new genotyping set, and each of the recent DNA samples collected from southern New England were genotyped with this subset. Habitat samples were divided among Fluidigm plates for genotyping to control any plate-associated scoring biases. To ensure repeatability, two randomly chosen individuals were run repeatedly on each of the Fluidigm chips required to complete this project. Loci that did not produce repeatable genotypes on our positive controls were removed from the data set, leaving a set of 55 very reliable loci.
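The repeatability filter described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' script: the input layout (one tuple per control genotype call), the function name, and the decision to treat a no-call as just another genotype string (so that it counts as a disagreement) are all hypothetical.

```python
from collections import defaultdict


def repeatable_loci(calls):
    """Return loci whose positive controls gave the same genotype on every chip.

    calls: iterable of (locus, control_id, genotype) tuples, one per
           control sample per chip (hypothetical input layout).
    """
    seen = defaultdict(set)  # (locus, control_id) -> genotypes observed
    loci = set()
    for locus, control_id, genotype in calls:
        loci.add(locus)
        seen[(locus, control_id)].add(genotype)
    # A locus fails if any single control was scored more than one way.
    failed = {locus for (locus, _), genos in seen.items() if len(genos) > 1}
    return loci - failed
```

Applied to the 96-locus panel, a filter of this kind would return the subset retained for analysis; here that subset contained 55 loci.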
Basic population statistics
Allele frequencies, Hardy-Weinberg equilibrium (HWE) tests, G’ST, and FST values were calculated using GenAlEx Version 6.503 (Peakall and Smouse 2012). Geographic distance was calculated by tracing the shoreline between collection sites in Google Earth. The potential isolation by distance relationship was analyzed using the Isolation By Distance IBD 1.52 program (Bohonak 2002). To better understand the effects of selection on the pattern of IBD, a second analysis was conducted using only the nominally non-coding loci.
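For readers less familiar with these statistics, the sketch below shows the textbook calculation of observed heterozygosity, expected heterozygosity, and a chi-square goodness-of-fit test of HWE for one biallelic SNP in one population. It is offered as an illustration only; the function name is hypothetical and no claim is made that it reproduces GenAlEx output exactly.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Return (Ho, He, chi2, p_value) from genotype counts at a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    ho = n_ab / n        # observed heterozygosity
    he = 2 * p * q       # expected heterozygosity under HWE
    return ho, he, chi2, p_value
```

A heterozygote deficit (Ho below He) inflates the chi-square statistic in the same way as an excess, so the sign of Ho minus He should be inspected alongside the P-value.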
Cluster analyses
Cluster analyses were conducted using both Structure Version 2.3.4 (Pritchard et al. 2000) and Adegenet’s DAPC 2.1.0 package (Jombart et al. 2010). These two packages use distinct models to detect clusters of genetically similar individuals. The Structure program builds genetic clusters by maximizing HWE within groups and disequilibrium between them. The method is also sensitive to linkage disequilibrium (LD) between loci. Because ongoing selection could disrupt HWE, and a correlation between genotypes at coding or non-coding loci could generate a signal similar to LD, a second cluster analysis was conducted using the DAPC unit of the ADEGENET R package Version 2.1.0 (Jombart et al. 2010). This method detects clusters by maximizing between-group genetic variation and minimizing within-group variation.
Both Structure and DAPC results are sensitive to starting conditions and analytical choices (Miller et al. 2020, Thia 2023). For the Structure analysis, the program was initialized using the 11 collection locations. We used an admixture model with a burn-in of 25,000 followed by 75000 MCMC repetitions after the burn-in period. Our Structure results were further analyzed using Structure Harvester (Earl and VonHoldt 2012) or StructureSelector (Li and Liu 2018) and the Evanno et al. (2005) criteria to determine the number of genetic clusters. The graphical output was modified for clarity using Pophelper 2.3.1 (Francis 2017). A follow up analysis was conducted using a reduced data set which excluded AHR-associated loci.
The DAPC analysis was performed interactively with the “find.clusters()” command, which makes no a priori assumptions about collection localities. During the early stages, we retained 40 principal components (PCs) explaining ~90% of the variation, which produced a Bayesian Information Criterion graph indicating that seven clusters were most likely. Based on the a-score criterion, 10 PCs were retained for discriminant analyses. As with the Structure analysis, a second model was created using a reduced data set that did not include the 11 AHR-associated loci.
Discriminant axes are synthetic variables representing correlated sets of input variables (in this case SNP genotypes). In ecological data sets, these linear combinations may reflect external environmental factors which are driving divergences among sets of loci (Jombart and Collins 2015). To test the explanatory power of each DA, we calculated the correlation between DA1 and geographical distance from our westernmost site and the correlation between [PCB] (when known) and DA2.
Bioinformatics
The potential effect of each SNP variant was evaluated. This project benefited from the availability of a partially annotated genome (Miller et al. 2019). Each SNP and its flanking region was BLASTed against the annotated F. heteroclitus genome (GenBank MU-UCD_Fhet_4.1 reference). Where reading frames and nominal functions could be determined from annotations or individual records, we recorded the amino acid coded for each SNP variant. When the reading frame was not annotated, particularly for our “anonymous” loci, possible open reading frames were inferred based on start codon availability. However, many of these may simply exist in non-coding regions.
Results
Locus locations and functions
Out of 192 candidate loci, 55 were variable and reproducible across positive controls; these were used in subsequent analyses. The Supplementary Data file shows a list of the successfully genotyped loci, chromosomal locations, and their nominal functions as listed in GenBank (when known). SNPs are located on 15 of F. heteroclitus’ estimated 24 chromosomes (Miller et al. 2019). The file also includes a list of sequences for all loci tested, regardless of whether they were amplified successfully or passed our quality control standards. Eleven loci are within coding regions associated with the AHR system. Seven are in coding regions for AHR and its paralogs, one is within the aryl hydrocarbon receptor nuclear translocator protein (ARNT), and three are within the transcription products CYP1A and CYP3A. Eight SNPs were within other previously known coding loci. Hypothesized functions (when available) for the remaining less well-characterized loci are also listed in Supplementary Data.
Genetic diversity and Hardy-Weinberg equilibrium
The overall heterozygosity at all 55 variable loci at all sites is 0.29, ranging from 0.26 at Slocum’s River to 0.33 at Jerusalem Point (Table 2). Samples from the three most polluted sites, NBH, Bch, and Win, have observed heterozygosities of 0.29, 0.28, and 0.27, respectively. T-test comparisons of average population Ho for each locus separately between the three most polluted sites and the others were not significant for all loci, AHR-associated loci, and non-AHR loci (Table 2).
Table 2.
Average observed heterozygosity (Ho) at each habitat, sorted by [PCB].
| Habitat | [PCB] ng/g | All loci (Ho) | All loci (group mean) | Non-AHR associated (Ho) | Non-AHR associated (group mean) | AHR associated (Ho) | AHR associated (group mean) |
|---|---|---|---|---|---|---|---|
| Mid | N/A | 0.30 | 0.29 | 0.30 | 0.29 | 0.30 | 0.30 |
| Hor | N/A | 0.28 | | 0.28 | | 0.28 | |
| Apo | N/A | 0.28 | | 0.28 | | 0.31 | |
| Jer | 0 | 0.33 | | 0.34 | | 0.27 | |
| Wan | 2 | 0.32 | | 0.33 | | 0.28 | |
| Slo | 7 | 0.26 | | 0.25 | | 0.31 | |
| Hak | 13 | 0.29 | | 0.28 | | 0.34 | |
| Mat | 27 | 0.28 | | 0.27 | | 0.31 | |
| Win | 634 | 0.27 | 0.28 | 0.26 | 0.27 | 0.29 | 0.34 |
| Bch | 3,762 | 0.28 | | 0.25 | | 0.37 | |
| NBH | 22,666 | 0.30 | | 0.29 | | 0.35 | |
| Overall | | 0.29 | P = .38 | 0.28 | P = .14 | 0.31 | P = .51 |
Individual habitat averages and a combined average for subsets including the most and least polluted habitats are shown. AHR-associated SNPs include variants in AHR and its paralogs, ARNT, CYP1A, and CYP3A. P-values from two-tailed t-tests are shown.
Most loci were in HWE within our sample sites, though some exhibited Hardy-Weinberg disequilibrium (HWD) within some populations (Supplementary Data) at P < .05. Almost all of these would be non-significant after a Bonferroni correction. If we adjust for 11 populations at each locus, the cutoff would be P < .005, whereas it would be P < .0009 if we adjust for 55 loci within a population. At the two most polluted habitats (Bch and NBH), AHR-associated loci are nearly all in HWE, with only AHR_1029 and CYP1A_2150 having unadjusted P-values between .01 and .05, each within only one of the two populations. Three “anonymous” loci (xTC17025_152, xTC20553_244, and x1172_59) are in HWD at all habitats with unadjusted P-values ranging from <.05 to <.001.
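The two Bonferroni cutoffs quoted above follow from dividing the nominal α of .05 by the number of tests in each family. A minimal sketch of the arithmetic (the function name is illustrative):

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance cutoff that holds the family-wise error at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# 11 populations tested at a single locus
per_locus_cutoff = bonferroni_threshold(0.05, 11)       # about 0.0045
# 55 loci tested within a single population
per_population_cutoff = bonferroni_threshold(0.05, 55)  # about 0.0009
```

Correcting across all 11 × 55 locus-by-population tests at once would be stricter still (.05/605); the text reports only the two cutoffs above.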
Population structure
An analysis of population structure using G’ST (Hedrick 2005) shows low but statistically significant differentiation between most adjacent populations (Table 3). The highest G’ST of 0.155 was observed between our westernmost site (Jer) and NBH. Comparisons between other distant sites were also high. Most pairwise comparisons are statistically distinguishable from zero at an unadjusted P < .05, with a few exceptions among the habitats in southern Massachusetts (Win + Apo, Win + Hak, Win + Mat, Mat + Apo, NBH + Bch). When all loci were analyzed together, an overall pattern of isolation by distance was observed (Fig. 1), and there is a significant relationship between G’ST values and geographic distance (Mantel Test r = 0.95, P < .001 for raw data—other transformation combinations are also statistically significant). When the known coding loci were not included in the data set, the Mantel Test r for the untransformed data was 0.96, P < .001.
Table 3.
G’ST matrix of all collection site comparisons (below diagonal) and associated P-values (above diagonal).
| | Jer | Wan | Mid | Hor | Slo | Apo | Hak | Bch | NBH | Win | Mat |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Jer | | 0.009 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| Wan | 0.009 | | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| Mid | 0.035 | 0.039 | | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| Hor | 0.086 | 0.085 | 0.066 | | 0.002 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| Slo | 0.122 | 0.109 | 0.091 | 0.017 | | 0.261 | 0.001 | 0.001 | 0.001 | 0.001 | 0.005 |
| Apo | 0.131 | 0.114 | 0.098 | 0.040 | 0.002 | | 0.001 | 0.001 | 0.001 | 0.096 | 0.071 |
| Hak | 0.135 | 0.119 | 0.110 | 0.048 | 0.032 | 0.015 | | 0.005 | 0.001 | 0.065 | 0.001 |
| Bch | 0.155 | 0.133 | 0.129 | 0.062 | 0.045 | 0.031 | 0.011 | | 0.240 | 0.001 | 0.001 |
| NBH | 0.156 | 0.137 | 0.142 | 0.078 | 0.066 | 0.049 | 0.029 | 0.002 | | 0.001 | 0.001 |
| Win | 0.145 | 0.134 | 0.113 | 0.053 | 0.017 | 0.005 | 0.006 | 0.024 | 0.047 | | 0.719 |
| Mat | 0.146 | 0.136 | 0.108 | 0.041 | 0.010 | 0.005 | 0.013 | 0.033 | 0.057 | −0.002 | |
Most comparisons are statistically significant at P < .05. Exceptions are highlighted in yellow. The P-values shown here are not adjusted for multiple comparisons.
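As background for the G’ST values in Table 3, the sketch below computes Nei's GST and Hedrick's (2005) standardized G’ST for a single biallelic locus from population allele frequencies. It is a simplified illustration: populations are weighted equally, no sample-size corrections are applied, and the function name is hypothetical, so the values will not necessarily match those produced by GenAlEx.

```python
def gst_and_hedrick(allele_freqs):
    """Return (GST, G'ST) for one biallelic locus.

    allele_freqs: frequency of one allele in each of k populations.
    """
    k = len(allele_freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    # Mean expected heterozygosity within populations.
    hs = sum(2 * p * (1 - p) for p in allele_freqs) / k
    # Expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / k
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return 0.0, 0.0  # monomorphic locus: no differentiation to measure
    gst = (ht - hs) / ht
    # Hedrick's standardization divides GST by its maximum given HS and k.
    g_prime = gst * (k - 1 + hs) / ((k - 1) * (1 - hs))
    return gst, g_prime
```

The standardization matters for comparisons across loci because the maximum attainable GST shrinks as within-population heterozygosity rises.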
Figure 1.

Isolation by distance scatterplot comparing G’ST values calculated using all loci to the geographic distance between sites. In this analysis, the Mantel test r = 0.95, P < .001. The result is similar when known coding loci were excluded from the data set.
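The Mantel test behind this figure can be illustrated with a short permutation sketch. This is not the IBD 1.52 program: the function names, the one-tailed test, the number of permutations, and the toy matrices (five sites on a line whose genetic distance is exactly proportional to geographic distance) are assumptions chosen for illustration only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(geo, gen, permutations=999, seed=0):
    """One-tailed Mantel test between two square distance matrices.

    Returns (r, p), where p is the share of site-label permutations
    (plus the observed arrangement) with a correlation at least as large.
    """
    n = len(geo)
    pairs = [(i, j) for i in range(n) for j in range(i)]  # lower triangle
    xs = [geo[i][j] for i, j in pairs]

    def r_for(order):
        ys = [gen[order[i]][order[j]] for i, j in pairs]
        return pearson(xs, ys)

    order = list(range(n))
    observed = r_for(order)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel sites in the genetic matrix
        if r_for(order) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example (not study data): five sites at these positions along a coast.
positions = [0, 1, 3, 6, 10]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]
r, p = mantel_test(geo, gen)
```

Permuting site labels, rather than individual matrix cells, preserves the dependence among pairwise distances that share a site, which is why an ordinary regression P-value is not used for this comparison.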
Variation and divergence among loci
Our SNPs are a mixture of coding and anonymous markers. Eleven of our validated SNPs are within loci associated with the AHR system and are of particular interest given the role of AHR in pollution tolerance (Reid et al. 2016, Whitehead et al. 2017). These include AHR and its paralogs, its transcription cofactor ARNT, and two of its several transcription products, CYP1A and CYP3A (Supplementary Data).
Allele and genotype frequencies varied across habitats. Allele frequencies for each locus at each habitat are shown in Supplementary Data. The strongest patterns of genetic differentiation were observed between loci associated with the AHR signaling pathway. Figure 2 summarizes genotype frequencies at four AHR-associated loci across all habitats surveyed.
Figure 2.

Multilocus SNP genotype frequencies at four coding loci associated with the AHR system. Collection sites are arranged on the X-axes from west to east. When known, log sediment [PCB] is shown above the first graph (Jer is set to 0). New Bedford Harbor and the nearby Winsegansett and Beach Street habitats are among the most contaminated sites. Genotype codes reflect multiple SNPs within coding loci associated with the AHR system. Pipes in the genotype keys separate diploid genotypes. Genotypes are in numerical order following the position in the coding sequence. AHR1 includes positions 161, 948, 1530, and 2289. Positions 161 and 2289 result in amino acid substitutions. AHR2 includes data from positions 792 and 1929 (both silent mutations). The single ARNT SNP is associated with an amino acid substitution, and CYP1A has variable sites at positions 2140 and PP2 (59121), which would result in amino acid changes.
Loci associated with adaptation to locally unique environments are expected to diverge from the overall geographic patterns among populations. To highlight this divergence from IBD, we created a heatmap for pairwise divergence (measured as single locus FST) between the westernmost (and unpolluted) Jer site and each of the other populations (Fig. 3). SNPs within the AHR1, ARNT, and CYP1A coding loci show the strongest pattern of divergence between Jer and the most polluted habitats at NBH and Bch. Original single locus FST values and p-values are available in Supplementary Data.
Figure 3.

Heat Map depicting genetic divergence at each locus as estimated with FST. Single locus FST values were calculated between our westernmost “clean” site (Jer) and each of the other habitats. Habitats are arranged from west to east. Bch, NBH, and Win are the three most polluted habitats. A pattern of isolation by distance is indicated by a steady increase in FST values from west to east, for example, the unmapped locus xTC15539_236. Deviations from this pattern are seen at several loci. Site abbreviations are defined in Table 1.
Population clustering
Structure
When using the Evanno et al. (2005) method in Structure Harvester (Earl and VonHoldt 2012), our Structure analysis found that the best model consisted of three clusters. The mean L(K) values and standard deviations for two, three, and four clusters are −24077.8 ± 17.5, −23529.5 ± 10.5, and −23604.78 ± 108.1, respectively. The DeltaK value is 78.42 for two clusters, 1.1 for three, and 0.46 for four. One cluster combines the three habitats to the west of Narragansett Bay, and a broader eastern cluster unites coastal samples from the east of Narragansett Bay. A third cluster consists of individuals from NBH and the nearby Bch habitat (Fig. 4). When the AHR-associated loci were removed from the Structure data set, the best model supported only two clusters, one on either side of Narragansett Bay (Supplementary Fig. S3).
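For orientation, the DeltaK statistic of Evanno et al. (2005) is the absolute second-order change in L(K) divided by the standard deviation of L(K) across replicate runs. The sketch below is a simplified form that works from per-K means and standard deviations; the function name is hypothetical and the input values are toy numbers chosen for easy arithmetic, not the values from this study.

```python
def evanno_delta_k(mean_lnp, sd_lnp):
    """Return {K: DeltaK} for every K that has both neighbors K-1 and K+1.

    mean_lnp: {K: mean L(K) across replicate runs}
    sd_lnp:   {K: standard deviation of L(K) across replicate runs}
    """
    delta = {}
    for k in sorted(mean_lnp):
        if k - 1 in mean_lnp and k + 1 in mean_lnp and sd_lnp[k] > 0:
            # |L''(K)| = |L(K+1) - 2 L(K) + L(K-1)|
            second_diff = abs(mean_lnp[k + 1] - 2 * mean_lnp[k] + mean_lnp[k - 1])
            delta[k] = second_diff / sd_lnp[k]
    return delta


# Toy inputs (illustrative only).
toy_mean = {1: -1000, 2: -900, 3: -890, 4: -885}
toy_sd = {1: 1, 2: 2, 3: 5, 4: 5}
toy_delta = evanno_delta_k(toy_mean, toy_sd)
```

Because the statistic needs L(K) at both K − 1 and K + 1, it is undefined for the smallest and largest K tested, which is one reason the mean L(K) values are usually inspected alongside it.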
Figure 4.

Structure results at 11 Fundulus heteroclitus collection sites in southern New England, USA. When known, dry-weight PCB sediment concentrations from Nacci et al. (2010) and Roark et al. (2005) are shown next to collection sites. Structure results for K = 3 are shown below the map along with site abbreviations. Pie charts represent pooled membership in inferred clusters for groups of collection localities as follows: west of Narragansett Bay, east of Narragansett Bay to Apponagansett (Central), New Bedford Harbor and Beach St., and finally habitats east of New Bedford Harbor. Collection site abbreviations, sample sizes, and GPS coordinates are listed in Table 1.
Adegenet/DAPC
The first two discriminant axes (DAs) explain 46.3% and 25.7% of the observed variation. The third DA explains only 12.5% of the variation. The best clustering model presents seven clusters that partially overlap when plotted against the first two DAs to form three groups (Fig. 5). Clusters 6 and 7 include most samples from west of Narragansett Bay. Clusters 2 and 3 contain 63% of the samples from NBH and 57% of the Bch samples. These clusters also contain a handful of samples from cleaner sites along the Massachusetts shore. Clusters 1, 4, and 5 contain most of the samples from Hor, Slo, Apo, Hak, Win, and Mat. The fraction of individuals from each collection site that fit into these clusters is shown in the inset of Fig. 5. The positions of samples in DA space without clustering are shown in Supplementary Fig. S2. When the AHR-associated loci were removed from the data set, a model with 11 clusters was best supported. As in the analysis of the complete data set, samples from west of Narragansett Bay form geographically reasonable clusters. The pattern is more complex for samples east of the bay, and the two clusters associated with the most polluted sites were not detected (Supplementary Fig. S3).
Figure 5.

The relative positions of each individual and inferred genetic clusters arrayed on the first (X) and second (Y) Discriminant Axes. The upper left inset shows the fraction of individuals from each habitat belonging to each of the inferred clusters in the DAPC model. Bold numbers indicate the cluster that has the highest membership for each habitat. The top quintile of loadings for each DA is shown in the lower left inset, sorted from most to least influential. Bold locus names indicate loci that heavily influence both DAs. Retained PCA eigenvalues (upper right) and DA eigenvalues (lower right) are also shown.
When the whole data set was analyzed, the arrangement of sample points along the first DA appears to be related to the east-west distribution of samples along the shoreline. Further, samples east of Narragansett Bay trended >0 for polluted sites on DA2 and <0 for individuals from less polluted sites. To quantify this observation, we calculated the correlation between DA1 and distance from the westernmost site and DA2 and measured [PCB]. Position along DA1 is strongly correlated with geography (Pearson’s r = 0.80, P < .0001). The second DA is weakly but significantly correlated with [PCB] (r = 0.25, P < .0001).
The top quintile of loadings for DA1 and DA2 are shown in Fig. 5. Four SNPs in the AHR-associated loci are influencing DA1, while 7 SNPs in AHR-associated loci are influencing DA2. Multiple SNPs are in the AHR1 locus and influence both DAs.
Amino acid changes
When possible, DNA sequences surrounding the SNPs were translated and amino acid substitutions resulting from the SNP variants were recorded (Supplementary Data). When potential functions of the more “anonymous” SNPs could be determined from GenBank data, these were noted. Variation at 27 of the loci we examined potentially caused changes in the amino acid sequence. Notably, some of our SNPs are associated with nonsense mutations in the AHR1 and putative Thioredoxin loci. We also found potential nonsense substitutions at two loci whose function (if any) is unknown: x259_62 and xTC15240_1242. We found several missense mutations at SNPs within the FABP, AHR, AHR2b, CYP1A, and LDH genes, in addition to potential missense mutations at 10 sequences whose function is unknown or not yet well documented.
Discussion
The F. heteroclitus system represents a unique opportunity to witness evolutionary processes in a species that has repeatedly and independently adapted to high levels of organic pollution in several urbanized environments (Whitehead et al. 2017). Consistent with our null expectation, we find a general correlation between geographic distance and genetic differentiation across the area surveyed, with evidence of gene flow between only some adjacent habitats. We also detect divergence in allele frequencies between clean and polluted sites at SNPs associated with the AHR system, which has been hypothesized to contribute to PCB tolerance (Clark et al. 2010, Oleksiak et al. 2011, Reitzel et al. 2014). We do not observe the deviations from Hardy-Weinberg equilibrium or the reduced genetic diversity expected from recent or ongoing selection within the polluted habitats, suggesting that the populations in the highly polluted sites have adapted to the new environmental conditions. Overall, we find apparent differences in allele and haplotype frequencies at several loci (Fig. 2) in polluted habitats, which underpin both patterns of genetic clustering (Figs 4 and 5) and estimates of divergence among populations (Fig. 3, Table 2). These include variants within AHR-associated loci (Figs 2 and 3). When FST values were calculated between the cleaner westernmost Jer habitat and each of the other habitats, many loci either followed the expected IBD pattern or varied irregularly (Fig. 3). However, SNPs in loci associated with the AHR system nearly all have high FST between Jer and the two most polluted habitats, Bch and NBH, consistent with previous studies documenting the role of these loci in adaptation to PCBs (Clark et al. 2010, Oleksiak et al. 2011).
Cluster analyses and population structure
Cluster analyses provide a valuable heuristic for describing regional evolutionary patterns. Jombart and Collins (2015) suggest inferred clusters are a simplified “caricature” of biological reality. As such, the simplified models produced by DAPC and Structure yield patterns that may help us understand more complex evolutionary processes and frame new hypotheses. Our Structure model describes three distinct clusters. One is west of Narragansett Bay and another is east of the bay along the Massachusetts coast. This second cluster is bisected by a third cluster containing samples from the two most polluted habitats, NBH and Bch (Fig. 4). Similarly, our clustering analysis using DAPC detects seven partly overlapping genetic clusters, which fall into three groups that echo those formed by the Structure analysis (Fig. 5). DAPC results reveal a more complex pattern of gene exchange and migration. Collection localities east of Narragansett Bay imperfectly map onto the seven DAPC clusters (Fig. 5 inset). For example, 63% of NBH samples belong to the clusters we associate with urbanization and high sediment PCB, with the remaining 37% scattered among all but one of the other clusters. Fish from less polluted sites have some membership in the two “urban clusters”, and their other members are scattered among the remaining “non-polluted” eastern clusters. Therefore, admixture and migration are common among sites on either side of the New Bedford estuary, an observation parallel to both the Structure results and the pairwise G’ST matrix (Table 2, Fig. 4).
Cluster analyses and population genetic analyses raise an important question: If gene flow occurs among the less polluted habitats in southern Massachusetts outside of the New Bedford estuary, then why are Bch and NBH so distinct from the other habitats? Our leading hypothesis is that pollution at these sites is driving this pattern. This is supported by differences in allele frequencies observed at SNPs associated with the AHR system (Figs 3 and 4), the loadings on DA2 (Fig. 5), and laboratory studies demonstrating a heritable tolerance of PCBs in fish from polluted sites (Nacci et al. 2010). However, pollution only explains a small fraction of the variance in DA2. The New Bedford Estuary is the most urbanized habitat in our study area, and we cannot dismiss other factors. Infrastructure around the estuary (including a hurricane barrier completed in 1966), boat traffic, and natural environmental factors might create a unique selective environment or set of barriers that limit gene flow between Bch + NBH and the other habitats. For example, the NBH habitat is very far up in the estuary and has an average salinity of <5 ppt (Howes et al. 2006). Nevertheless, it is unclear how much of a barrier salinity alone would be for these euryhaline fish since NBH samples cluster with the nearby Bch habitat, which has an average salinity above 25 ppt.
The highest levels of divergence (measured through pairwise FST) occur between clean and polluted sites at AHR-associated loci which are linked to PCB tolerance (Proestou et al. 2014, Reid et al. 2016) (Fig. 3 and see below). The DAPC model itself provides additional support for the role of pollution in maintaining population structure. Because DAPC maximizes differences between orthogonal sets of variables (loci), each axis ideally represents the genetic response to uncorrelated sets of ecological variables (Jombart et al. 2010). In our data set, an individual’s position on the first discriminant axis is strongly correlated with geography (r = 0.80, P < .0001), while individual positions on DA2 are weakly but significantly correlated with sediment PCB concentration (r = 0.25, P < .0001).
Genetic diversity and Hardy-Weinberg equilibrium
Surprisingly, SNP markers within AHR-associated coding regions are in HWE at most habitats, including the two most polluted sites (Supplementary Data). Further, we find no evidence of reduced genetic diversity at Bch or NBH. Ongoing selection should lead to Hardy-Weinberg Disequilibrium (HWD) within habitats at these loci. One hypothesis to explain the lack of HWD is that the pressures that operated in the mid-20th century when PCBs were first introduced have subsided as the population became adapted over several F. heteroclitus generations. The Aerovox Corporation started using PCBs to manufacture electronics around 1947 (Weaver 1982), 73 years before we sampled fish for this study. Because some F. heteroclitus breed in their first year, and most breed by their second (Abraham 1985), the populations sampled had been adapting for between 36 and 72 breeding cycles. Toxicology studies are consistent with this idea. Mortality rates for fish and their offspring from clean habitats are high when they are exposed to PCBs, yet descendants of NBH fish were shown to be very tolerant of PCB126 in a laboratory environment (Nacci et al. 2010), so there may be little remaining phenotypic variation in PCB tolerance at these habitats. In addition, PCB exposure has peaked and begun to decline due to ongoing dredging and capping in the New Bedford Harbor Superfund Site (United States Environmental Protection Agency 2024).
We find no evidence of lost genetic diversity at polluted sites, even at loci which are part of the AHR system. In our data set, the average Ho at all loci across all populations is 0.29. At NBH it is 0.30, and at Bch it is 0.28 (Table 2 and Supplementary Data). In NBH and Bch, the average Ho at the 11 AHR-system SNPs is 0.35 and 0.37, respectively, higher than the average of 0.30 at the other habitats. This increase in Ho is not statistically significant and is due to an artifact of allele distributions in SNPs which typically have a very common and a very rare allele. If the rare allele is favored but not fixed, then Ho will increase.
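The last point can be made concrete with a short Python sketch. For a biallelic SNP in Hardy-Weinberg equilibrium, the expected heterozygote frequency is 2p(1 - p), which rises as the rarer allele's frequency p increases toward 0.5. The function name and the frequencies below are illustrative assumptions, not values taken from the data set.

```python
def expected_heterozygosity(p):
    """Expected heterozygote frequency 2p(1 - p) for a biallelic locus in HWE."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * p * (1.0 - p)

# Hypothetical rare-allele frequencies, from very rare to intermediate
for p in (0.02, 0.10, 0.26, 0.44):
    h = expected_heterozygosity(p)
    print(f"rare allele frequency {p:.2f} -> expected heterozygosity {h:.3f}")
```

Under these assumptions, a locus whose rare allele has been favored but not fixed shows more heterozygotes than one where that allele remains very rare, which is the direction of the difference described above.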
In this context, we note that embryonic fish are more susceptible to the effects of DLCs (Ikalainen and Allen 1989, Nacci et al. 1999, Clark et al. 2010) than adults. So while adult mummichogs might occasionally immigrate into the New Bedford estuary, their offspring would be susceptible to the toxic sediments. Therefore, the contribution of migrants to the gene pool would be limited. Mark-recapture studies from other habitats indicate that migrants would be rare enough to be missed by our sampling (Skinner et al. 2005). Further, PCB concentrations form a gradient from the most polluted northern end of New Bedford Harbor to the relatively cleaner sediments of Buzzards Bay, so the most likely immigrants to the NBH and Bch habitats from the southern harbor would not be completely naive to DLCs; therefore, occasional immigrants would not necessarily lead to detectable HWD.
Studies of pesticide resistance and population genetic theory indicate that the selection for pre-existing variants is more supportive of rapid adaptation than de novo mutations owing to the relative rarity and lower initial frequency of the latter (Hawkins et al. 2019). Regional genome-scale comparisons of F. heteroclitus between several pairs of polluted and cleaner habitats suggest that the evolution of PCB resistance arose from selection on standing genetic variation rather than de novo mutations because unique genotypes are found at multiple pollution-tolerant habitats (Proestou et al. 2014, Reid et al. 2016). The reasoning is that if a single new mutation were responsible for PCB tolerance, that new variant would be present at every polluted site (Hermisson and Pennings 2005). Instead, Reid et al. (2016) found distinct genotypes at each of several polluted sites. Parsimony suggests that this is most consistent with rare variants that are already present being swept to a higher frequency as opposed to several uniquely adaptive mutations occurring. Our data are also consistent with selection on standing variation rather than de novo mutations. We see evidence of genotype frequency differences between clean and polluted habitats for several loci in the AHR pathway (Figs 2 and 3), and potentially adaptive variants are present (though rare) at clean sites (e.g. Fig. 2 ARNT and Supplementary Data).
Population variation in AHR-associated loci
Tolerance of PCBs is linked to the aryl hydrocarbon receptor (AHR) pathway (Clark et al. 2010, Reitzel et al. 2014, Reid et al. 2016). AHR and its paralogs are ligand-activated transcription factors (Hahn et al. 2017). One of their functions in vertebrates is to induce gene products that metabolize certain xenobiotics, including dioxin-like compounds (DLCs) such as PCBs. When activated, the AHR protein binds with ARNT and other proteins which together induce the transcription of detoxifying enzymes such as CYP1A. DLCs have a strong affinity for the AHR protein, leading to a prolonged activation of the system (Hahn et al. 2017). Tolerance to DLCs is associated with reduced induction of CYP1A and other transcripts (Oleksiak et al. 2011).
If DLCs lead to overexpression of downstream products in susceptible individuals, then variants that reduce the efficiency of AHR-associated loci may be adaptive in some environments.
Multiple loci within the AHR system are associated with adaptive tolerance in F. heteroclitus in and around NBH and other polluted habitats (Oleksiak et al. 2011, Reitzel et al. 2014, Whitehead et al. 2017). Patterns of divergence among habitats at individual loci in southern New England fish confirm that alleles at several unlinked loci are associated with adaptation to polluted sediments (Figs 2 and 3 and Supplementary Data). A strong pattern of divergence between the two most polluted sites (Bch and NBH) and Jer occurs at SNPs within AHR1 and its paralog AHR2 (both on Chromosome 7) (Miller et al. 2019), ARNT (aryl hydrocarbon receptor nuclear translocator) (Chromosome 3), CYP1A (Cytochrome P450 1A) (Chromosome 4), CYP3A (location unknown), NADH10 (Chromosome 16), and Thioredoxin (Chromosome 15) coding loci. The anonymous locus x1223_147 shows a similar pattern of divergence. This locus resides in a region of the genome with no currently identified function. Other loci show little evidence of systematic variation, or they reflect the overall pattern trending towards isolation by distance (for example, the unmapped “anonymous” locus xTC15539_236).
In our data, variants that are more common at Bch and NBH are sometimes associated with amino acid substitutions (Supplementary Data). For example, the biggest difference in allele frequencies between Bch + NBH and other sites was at the ARNT locus (Fig. 2). ARNT binds with the AHR/ligand complex and aids in transporting the complex to the nucleus, where the complex then serves as a transcription initiator, ramping up the machinery needed to detoxify xenobiotics (Powell et al. 1999). The common variant codon codes for the positively charged amino acid arginine, while the rare variant codes for polar serine (Supplementary Data). This variant reaches a frequency of 26% at Bch and 44% at NBH. However, the serine variant is not present in our sample from the third most polluted site, Win. One possible explanation is that sediment [PCB] must reach a certain threshold to influence F. heteroclitus survival. Another is that this particular SNP has no direct effect and is merely linked to another adaptive variant we did not survey. A third possibility is that this particular variant at Win is diluted due to gene flow. FST values between Win and Apo, Hak, and Mat are not significantly different from zero. Outside of Bch and NBH, the highest frequency of the serine variant is only 2% at Apo, though it is also found in low frequencies at two of the western sites (Fig. 2 and Supplementary Data). Another strong haplotype difference is seen in the PCB-binding receptor AHR1, which contains four SNPs. Two of the observed SNPs would lead to an amino acid change. At position 161, the most common amino acid is the hydrophobic ambivalent alanine, while the less common variant codes for glycine, a known secondary structure breaker (Imai and Mitaku 2005). Position 161 of AHR1 falls along an alpha helix (UniProt O57452·O57452_FUNHE) and the variant amino acid, glycine, would cause a “kink” in that structure, impacting the protein folding. This variant reaches a frequency of 38% in NBH. It is rare west of Narragansett Bay but has higher frequencies at sites near NBH.
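To give a sense of how allele frequency differences of this size translate into divergence, the Python sketch below computes a simple single-locus FST for two populations from the ARNT serine-variant frequencies quoted above (44% at NBH, 26% at Bch, 2% at Apo). It uses the textbook form FST = (HT - HS)/HT with equal population weights and no sample-size correction. This is an assumption made for illustration, not the estimator used for the published FST and G’ST values, and the function name is hypothetical.

```python
def fst_two_pops(p1, p2):
    """Single-locus FST = (HT - HS) / HT for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele at a biallelic locus.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if the two populations were pooled
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each population
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        return 0.0  # locus is monomorphic in both populations
    return (h_t - h_s) / h_t

# Serine-variant frequencies quoted in the text
nbh, bch, apo = 0.44, 0.26, 0.02

print(f"NBH vs Apo: FST = {fst_two_pops(nbh, apo):.3f}")
print(f"NBH vs Bch: FST = {fst_two_pops(nbh, bch):.3f}")
```

With these inputs, the NBH versus Apo comparison yields a much larger value than the NBH versus Bch comparison, in line with the pattern of high divergence between polluted and cleaner sites at AHR-associated loci.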
While the patterns described above are intriguing, additional studies are needed to determine whether these variants are directly adaptive or are incidental to selection acting on nearby sequence. Reitzel et al. (2014) note that multi-SNP patterns like those observed here might arise either because each observed SNP variant affects fitness directly or because selection acts on variants tightly linked to multiple SNP haplotypes. In the second case, the locally beneficial variant is swept to near fixation, even though the multiple linked SNP markers are not. Without additional sequencing and biochemical analyses, we cannot rule out either scenario. Yet it is intriguing that several SNPs in our data set produce altered gene products, and none of the SNP variants are fixed at Bch and NBH.
Conclusions
The interconnected F. heteroclitus populations in southern New England and elsewhere represent an ideal substrate for rapid adaptation to novel pollutants because they are large enough to contain high levels of allelic diversity, yet inter-habitat gene flow is low enough to allow for genetic divergence in response to selection. The large effective population sizes of mummichogs (Adams et al. 2006), their genetic connection to adjacent habitats, and ample time for recombination appear sufficient to maintain genetic diversity in the two most polluted habitats. Our large sample sizes detect allele frequency differences at loci associated with PCB tolerance along the southern New England coast. Alleles at these loci with higher frequencies at the two most polluted sites are present but rare at other sites. Samples from the two most polluted habitats are genetically distinct from nearby sites, a pattern most pronounced at loci associated with the AHR system. Our data suggest that adaptation to the urban environment in New Bedford Harbor has changed allele frequencies, leading to a new and distinct genetic equilibrium.
Supplementary Material
Supplementary data
Supplementary data are available at JUECOL online.
Acknowledgements
We would like to acknowledge the contributions of Denise Champlin, Grace Foltz, Kamila Guerra, Emma Harrington, and Sam Lomax, who helped with collections and early lab work. Madison Francoeur and Paige Olausen helped us screen candidate loci. Mark Hahn, Sibel Karchner, Dina Proestou, and Paolo Ruggeri provided advice in identifying candidate markers. We appreciate the efforts of Kris Monahan and Dalila Alves from the Providence College Office of Sponsored Research. This work would not have been possible without their tireless support of and enthusiasm for our research culture. We also appreciate the efforts of Julie Coccia, who kept our finances well organized, and Pamela Snodgrass-Belt, whose thoughtful systems streamlined the IACUC protocol process. Summer support for MTR was provided by a Providence College Walsh Fellowship. Grants from the Southeastern New England Educational and Charitable Foundation made this work possible. Our Providence College group greatly appreciates this keystone funding. We thank two anonymous reviewers and the Associate Editor, who made a series of detailed and helpful comments which improved the clarity of this article.
Funding
This work was supported by a Southeastern New England Educational and Charitable Foundation grant to JAM and a Providence College Walsh Fellowship to MTR.
Footnotes
Conflict of interest: None declared.
Data availability
Data are archived at Data Dryad doi: 10.5061/dryad.n8pk0p327.
References
- Abraham BJ. Species Profiles: Life Histories and Environmental Requirements of Coastal Fishes and Invertebrates (Mid-Atlantic): Mummichog and Striped Killifish. Department of the Interior, US Fish and Wildlife Service, Washington, DC, United States, 1985, 23.
- Adams SM, Lindmeier JB, Duvernell DD. Microsatellite analysis of the phylogeography, Pleistocene history and secondary contact hypotheses for the killifish, Fundulus heteroclitus. Mol Ecol 2006;15:1109–23.
- Bergen BJ, Nelson WG, Mackay J et al. Environmental monitoring of remedial dredging at the New Bedford Harbor, MA, Superfund site. Environ Monit Assess 2005;111:257–75.
- Bohonak AJ. IBD (isolation by distance): a program for analyses of isolation by distance. J Hered 2002;93:153–4.
- Brown BL, Chapman RW. Gene flow and mitochondrial DNA variation in the killifish, Fundulus heteroclitus. Evolution 1991;45:1147–61.
- Clark BW, Matson CW, Jung D et al. AHR2 mediates cardiac teratogenesis of polycyclic aromatic hydrocarbons and PCB-126 in Atlantic killifish (Fundulus heteroclitus). Aquat Toxicol 2010;99:232–40.
- Di Giulio RT, Clark BW. The Elizabeth River story: a case study in evolutionary toxicology. J Toxicol Environ Health B Crit Rev 2015;18:259–98.
- Earl DA, VonHoldt BM. Structure harvester: a website and program for visualizing structure output and implementing the Evanno method. Conservation Genet Resour 2012;4:359–61.
- Ehrlich MA, Wagner DN, Oleksiak MF et al. Polygenic selection within a single generation leads to subtle divergence among ecological niches. Genome Biol Evol 2021;13:evaa257.
- Elskus AA, Monosson E, McElroy AE et al. Altered CYP1A expression in Fundulus heteroclitus adults and larvae: a sign of pollutant resistance? Aquat Toxicol 1999;45:99–113.
- Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software structure: a simulation study. Mol Ecol 2005;14:2611–20.
- Fluidigm Corporation. SNP genotyping Users Guide: PN 68000098 M2. Fluidigm Corporation, South San Francisco, CA, United States. 2016.
- Fluidigm Corporation. D3 Assay Design Users Guide: PN 100 6812 Rev. 6. Fluidigm Corporation, South San Francisco, CA, United States. 2020.
- Francis RM. pophelper: an R package and web app to analyse and visualize population structure. Mol Ecol Resour 2017;17:27–32.
- Hahn ME, Karchner SI, Merson RR. Diversity as opportunity: Insights from 600 million years of AHR evolution. Curr Opin Toxicol 2017;2:58–71.
- Hawkins NJ, Bass C, Dixon A et al. The evolutionary origins of pesticide resistance. Biol Rev Camb Philos Soc 2019;94:135–55.
- Hedrick PW. A standardized genetic differentiation measure. Evolution 2005;59:1633–8.
- Hermisson J, Pennings PS. Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics 2005;169:2335–52.
- Ho KT, McKinney RA, Kuhn A et al. Identification of acute toxicants in New Bedford Harbor sediments. Environ Toxicol Chem 1997;16:551–8.
- Howes BL, Kelley SW, Ramsey JS et al. Linked Watershed-Embayment Model to Determine Critical Nitrogen Loading Thresholds for the Oyster Pond System, Falmouth, MA. Massachusetts estuaries project I.4. Boston, MA: SMAST/DEP Massachusetts Estuaries Project, Massachusetts Department of Environmental Protection. 2006.
- Ikalainen AJ, Allen DC. New Bedford Harbor superfund project. In: Contaminated Marine Sediments-Assessment and Remediation. Committee on Contaminated Marine Sediments, Marine Board, Commission on Engineering and Technical Systems, National Research Council. Washington, DC, United States: National Academy Press, 1989, 312–50.
- Imai K, Mitaku S. Mechanisms of secondary structure breakers in soluble proteins. Biophysics (Nagoya-Shi) 2005;1:55–65.
- Jombart T, Devillard S, Balloux F. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics 2010;11:1–15.
- Jombart T, Collins C. A tutorial for Discriminant Analysis of Principal Components (DAPC) Using adegenet 2.0.0. 2015. https://adegenet.r-forge.r-project.org/files/tutorial-dapc.pdf. Accessed July 2024.
- Kimura M, Weiss GH. The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics 1964;49:561–76.
- Lake JL, McKinney R, Lake CA et al. Comparisons of patterns of polychlorinated biphenyl congeners in water, sediment, and indigenous organisms from New Bedford Harbor, Massachusetts. Arch Environ Contam Toxicol 1995;29:207–20.
- Li YL, Liu JX. StructureSelector: a web based software to select and visualize the optimal number of clusters using multiple methods. Mol Ecol Resour 2018;18:176–7.
- Lotrich VA. Summer home range and movements of Fundulus heteroclitus (Pisces: Cyprinodontidae) in a tidal creek. Ecology 1975;56:191–8.
- McMillan AM, Bagley MJ, Jackson SA et al. Genetic diversity and structure of an estuarine fish (Fundulus heteroclitus) indigenous to sites associated with a highly contaminated urban harbor. Ecotoxicology 2006;15:539–48.
- Meyer JN, Nacci DE, Di Giulio RT. Cytochrome P4501A (CYP1A) in killifish (Fundulus heteroclitus): heritability of altered expression and relationship to survival in contaminated sediments. Toxicol Sci 2002;68:69–81.
- Miller JM, Cullingham CI, Peery RM. The influence of a priori grouping on inference of genetic clusters: simulation study and literature review of the DAPC method. Heredity (Edinb) 2020;125:269–80.
- Miller JT, Reid NM, Nacci DE et al. Developing a high-quality linkage map for the Atlantic killifish Fundulus heteroclitus. G3 (Bethesda) 2019;9:2851–62.
- Mulero-Navarro S, Fernandez-Salguero PM. New trends in aryl hydrocarbon receptor biology. Front Cell Dev Biol 2016;4:45.
- Nacci D, Coiro L, Champlin D et al. Adaptations of wild populations of the estuarine fish Fundulus heteroclitus to persistent environmental contaminants. Marine Biol 1999;134:9–17.
- Nacci DE, Champlin D, Jayaraman S. Adaptation of the estuarine fish Fundulus heteroclitus (Atlantic killifish) to polychlorinated biphenyls (PCBs). Estuaries Coasts 2010;33:853–64.
- Nacci D, Proestou D, Champlin D et al. Genetic basis for rapidly evolved tolerance in the wild: Adaptation to toxic pollutants by an estuarine fish species. Mol Ecol 2016;25:5467–82.
- Oleksiak MF, Karchner SI, Jenny MJ et al. Transcriptomic assessment of resistance to effects of an aryl hydrocarbon receptor (AHR) agonist in embryos of Atlantic killifish (Fundulus heteroclitus) from a marine Superfund site. BMC Genomics 2011;12:263–18.
- Peakall R, Smouse PE. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 2012;28:2537–9.
- Pesch CE, Voyer RA, Copeland J et al. Imprint of the Past: Ecological History of New Bedford Harbor. U.S. Environmental Protection Agency, Office of Research and Development, National Health and Environmental Effects Research Laboratory, Atlantic Ecology Division, Narragansett, RI, United States, 2011.
- Powell WH, Karchner SI, Bright R et al. Functional diversity of vertebrate ARNT proteins: identification of ARNT2 as the predominant form of ARNT in the marine teleost, Fundulus heteroclitus. Arch Biochem Biophys 1999;361:156–63.
- Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics 2000;155:945–59.
- Proestou DA, Flight P, Champlin D et al. Targeted approach to identify genetic loci associated with evolved dioxin tolerance in Atlantic killifish (Fundulus heteroclitus). BMC Evol Biol 2014;14:7.
- Qiagen Corporation. DNeasy Blood & Tissue Handbook. Germantown, MD, United States: Qiagen Corporation, 2020.
- Reid NM, Proestou DA, Clark BW et al. The genomic landscape of rapid repeated evolutionary adaptation to toxic pollution in wild fish. Science 2016;354:1305–8.
- Reitzel AM, Karchner SI, Franks DG et al. Genetic variation at aryl hydrocarbon receptor (AHR) loci in populations of Atlantic killifish (Fundulus heteroclitus) inhabiting polluted and reference habitats. BMC Evol Biol 2014;14:6–18.
- Rigassio-Smith A. Final East Zone 4 and East Zone 5 Remedial Action Report. Bourne, MA, United States: Jacobs Engineering, 2022.
- Roark SA, Nacci D, Coiro L et al. Population genetic structure of a nonmigratory estuarine fish (Fundulus heteroclitus) across a strong gradient of polychlorinated biphenyl contamination. Environ Toxicol Chem 2005;24:717–25.
- Skinner MA, Courtenay SC, Parker WR et al. Site fidelity of mummichogs (Fundulus heteroclitus) in an Atlantic Canadian estuary. Water Quality Research Journal 2005;40:288–98.
- Sweeney J, Deegan L, Garritt R. Population size and site fidelity of Fundulus heteroclitus in a macrotidal saltmarsh creek. Biol Bull 1998;195:238–9.
- Thia JA. Guidelines for standardizing the application of discriminant analysis of principal components to genotype data. Mol Ecol Resour 2023;23:523–38.
- United States Environmental Protection Agency. General Information About the New Bedford Harbor Cleanup. 2024. https://www.epa.gov/new-bedford-harbor/general-information-about-new-bedford-harbor-cleanup#sdah. Accessed July 2024.
- Waits ER, Martinson J, Rinner B et al. Genetic linkage map and comparative genome analysis for the Atlantic killifish (Fundulus heteroclitus). OJGen 2016;6:28–38.
- Weaver G. PCB Pollution in the New Bedford, Massachusetts Area: A Status Report. Boston, MA, United States: Massachusetts Coastal Zone Management, 1982.
- Weis JS. Tolerance to environmental contaminants in the mummichog, Fundulus heteroclitus. Hum Ecol Risk Assess 2002;8:933–53.
- Whitehead A, Clark BW, Reid NM et al. When evolution is the solution to pollution: key principles, and lessons from rapid repeated adaptation of killifish (Fundulus heteroclitus) populations. Evol Appl 2017;10:762–83.