Author manuscript; available in PMC: 2021 Dec 29.
Published in final edited form as: Dev Dyn. 2020 Oct 14;250(6):822–837. doi: 10.1002/dvdy.257

Large-scale variation in single nucleotide polymorphism density within the laboratory axolotl (Ambystoma mexicanum)

Nataliya Timoshevskaya 1, S Randal Voss 2,3,4, Caitlin N Labianca 5,6, Cassity R High 2,3, Jeramiah J Smith 1
PMCID: PMC8715502  NIHMSID: NIHMS1764885  PMID: 33001517

Abstract

Background:

Recent efforts to assemble and analyze the Ambystoma mexicanum genome have dramatically improved the potential to develop molecular tools and pursue genome-wide analyses of genetic variation.

Results:

To better resolve the distribution and origins of genetic variation within A mexicanum, we compared DNA sequence data for two laboratory A mexicanum and one A tigrinum to identify 702 million high confidence polymorphisms distributed across the 32 Gb genome. While the wild-caught A tigrinum was generally more polymorphic in a genome-wide sense, several multi-megabase regions were identified from A mexicanum genomes that were actually more polymorphic than A tigrinum. Analysis of polymorphism and repeat content reveals that these regions likely originated from the intentional hybridization of A mexicanum and A tigrinum that was used to introduce the albino mutation into laboratory stocks.

Conclusions:

Our findings show that axolotl genomes are variable with respect to introgressed DNA from a highly polymorphic species. It seems likely that other divergent regions will be discovered with additional sequencing of A mexicanum. This has practical implications for designing molecular probes and suggests a need to study A mexicanum phenotypic variation and genome evolution across the tiger salamander clade.

Keywords: axolotl, genome, hybrid, salamander, SNPs

1 ∣. INTRODUCTION

The axolotl (Ambystoma mexicanum) has been a model organism for more than 150 years. During this time, laboratory stocks were established to facilitate studies in experimental embryology, development, genome evolution, physiology, neurology, and most notably, tissue and whole organ regeneration.1 Descendants of A mexicanum that were first propagated in Europe were imported to the United States (US) in the 1930s to establish a population that continues to this day to provide living stocks in support of biological research. The well-documented pedigree of the US A mexicanum population (Ambystoma Genetic Stock Center, AGSC)2 indicates a complex history of introgression of both laboratory and wild-caught A mexicanum, and introgression of genomes from a related tiger salamander species (A tigrinum). Against this backdrop, a number of axolotl mutants were discovered and integrated into the population. Exactly how these events have shaped genome-wide DNA sequence variation of existing A mexicanum stocks is largely unknown.

The continuous improvement of A mexicanum genome resources over the last two decades has permitted increasingly powerful analyses of transcriptional responses, particularly during development and regeneration1,3-5 as well as genetic associations that inform genetic linkage mapping, QTL analysis and the identification of Mendelian mutations within the species' 32 Gb genome.6-10 Recently these efforts have led to the development of a chromosome-scale genome assembly for A mexicanum, which provides critical context for genetic studies and permits analyses that consolidate information across the genome.8 This study also revealed a number of large genomic intervals that contain substantially lower densities of fixed single nucleotide polymorphisms (SNPs) between A mexicanum and A tigrinum. One possible explanation for the deficiency of fixed variants within these intervals is that they demark regions of A tigrinum DNA introgression, masking differences between the two species. Indeed, one of these regions is associated with a known introgression event, wherein an albino A tigrinum was intentionally hybridized to A mexicanum through a heroic effort that, unlike other published hybrid crosses, required the use of cytonuclear transfer and inter-embryo grafting of primordial germ cells.11 Somewhat surprisingly, we found that this nearly 60-year-old introgression event is currently associated with a large footprint of A tigrinum DNA that spans approximately 7 cM (~0.13% of the 32 Gb genome), much longer than would be expected if the region were freely recombining in the lab strain.7 This study raised a clear and critical need to more precisely detail A mexicanum genetic variation, across the genome and among laboratory strains/individuals, to better ensure the design of molecular probes, and understand the biology and selection history of laboratory strains.

Here, we report results of a more in-depth study of A mexicanum genomic and genetic variation. We leveraged existing whole genome DNA sequence data for a wildtype female AGSC A mexicanum8,12 and a male white (d/d strain) A mexicanum,13 and compared these to higher coverage DNA sequence data from an A tigrinum male that was previously used to identify interspecific SNPs.8 We note that the d/d strain male came from a laboratory population that was established several decades ago from AGSC founders. These data, along with low coverage sequence data from 48 A mexicanum X A mexicanum/A tigrinum backcross hybrids, were used to identify polymorphisms segregating between and among individuals and to assess variation in copy number of sequences in these regions. These analyses identified seven large intervals, ranging from 48 to 243 Mb, that show an excess of polymorphisms within or between the two A mexicanum. These intervals cover more than 650 Mb and 632 annotated genes, accounting for ~2 to 3% of the axolotl genome. Analyses of repeat content revealed that these regions contain shared lineages of repetitive elements that have undergone recent expansions in A tigrinum, indicating that all of these regions likely trace to an intentional introgression into the laboratory strain of A mexicanum. As all but one of these regions were identified as being heterozygous in one of the two A mexicanum, the only individuals sequenced to date, it seems likely that other polymorphic regions will be found segregating within and among laboratory populations.

2 ∣. RESULTS

Alignment of whole genome shotgun sequencing data to the current draft assembly identified 702 million high confidence polymorphisms distributed across the 32 Gb genome, including 73 million polymorphisms within a d/d strain A mexicanum male, 143 million polymorphisms between the d/d strain male and wildtype AGSC A mexicanum female, and 486 million polymorphisms between the d/d strain male and a male A tigrinum. Individual polymorphisms and density tracks can be accessed and viewed using the UCSC interface14 at the SalSite15,16 genome browser (https://ambystoma.uky.edu/genomeresources). Based on patterns of fixation within the d/d strain male and between this individual and the other two sampled for this project, we estimate that ~12% of within-reference polymorphisms (8.8 million sites) likely trace to sequence errors in the existing assembly, the bulk of which appear to be missing bases: a common artifact in assemblies that are derived from older PacBio datasets.
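The tallies above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the rounded counts quoted in the text (in millions of sites); the dictionary keys and variable names are illustrative and are not taken from the study's pipeline.

```python
# Polymorphism counts in millions of sites, as quoted in the text.
polymorphisms = {
    "within_dd_male": 73,
    "dd_male_vs_agsc_female": 143,
    "dd_male_vs_tigrinum": 486,
}

# The three classes should sum to the reported genome-wide total.
total = sum(polymorphisms.values())

# ~12% of within-reference polymorphisms are attributed to assembly errors.
error_fraction = 0.12
likely_errors = polymorphisms["within_dd_male"] * error_fraction

print(f"Total polymorphisms: {total} million")
print(f"Likely assembly errors: {likely_errors:.1f} million")
```

With these rounded inputs, the three classes sum to the reported 702 million sites and the error estimate rounds to the reported 8.8 million.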

Examining the distributions of fixed and segregating polymorphisms within the two A mexicanum individuals revealed numerous regions of decreased polymorphism (particularly within the d/d strain male) as well as several large regions with higher than average densities of polymorphism (Figures 1 and 2). Relatively large patches of decreased polymorphism within the d/d strain male appear to reflect a recent history of inbreeding, likely due to the establishment and ongoing maintenance of the d/d strain. In the d/d strain male, 12 Gb show low heterozygosity (<1 SNP/kb), whereas 8.8 Gb are within this range in the wildtype AGSC female. These distributions also reveal seven large patches (>10 Mb) of excess polymorphism that are located on six chromosomes. These patches included three that are heterozygous in the d/d strain male (chromosomes 4, 5, and 9), two that are heterozygous in the wildtype AGSC female (chromosomes 4 and 7), a small region of locally-enriched homozygous polymorphism on chromosome 2Q, and a large interval on chromosome 1 that is highly differentiated between the two sampled A mexicanum. Within highly differentiated regions, the degree of polymorphism between heterozygous alleles or between the two A mexicanum individuals is similar to or greater than that observed between A mexicanum and A tigrinum (Figures 2 and 3). It seems possible that the presence of highly polymorphic regions could be explained by the retention of polymorphisms from the A mexicanum individuals that were used to found colony strains in the face of inbreeding, or polymorphisms that were introduced to the population during the introduction of the albino locus.11

FIGURE 1.

Examining variation in the density of polymorphisms identifies regions with an excess or dearth of polymorphisms. A, Homozygous polymorphisms between two sequenced A mexicanum individuals: a d/d strain male and a wildtype AGSC female. Note a region of excess polymorphism on chromosome 1. B, Heterozygous polymorphisms identified from low-coverage sequencing of the d/d strain male. Note regions of excess polymorphism on chromosomes 4, 5, and 9, and several stretches with decreased polymorphism on all chromosomes, presumably due to a recent history of inbreeding. C, Heterozygous polymorphisms identified for the wildtype AGSC female. Note regions of excess polymorphism on chromosomes 4 and 7, as well as a few small stretches with decreased polymorphism (eg, chr8). Scale bar: 1 Gb

FIGURE 2.

Histograms of SNP densities within sequenced individuals. Plots show densities of homozygous (red) and heterozygous (blue) SNPs identified in A, the d/d strain male, B, the AGSC female, C, the sampled A tigrinum, and D, three regions of high differentiation that were characterized in this study

FIGURE 3.

Patterns of polymorphism in a wild caught Ambystoma tigrinum provide perspective on those observed in laboratory axolotls. A, Fixed homozygous polymorphisms are higher on average between A tigrinum and A mexicanum than between the two sampled A mexicanum. B, On average this individual has higher levels of heterozygosity than either of the sampled A mexicanum. Notably, both homozygous and heterozygous SNPs show decreases in frequency overlapping regions of excess heterozygosity in the d/d A mexicanum at chr 4, 5, and 9. Careful examination of signal on chromosome 3 indicates that the decreased polymorphism rate is due to the presence of several large gaps in this region

Examination of polymorphisms assessed from A tigrinum shotgun sequence data reveals that this wild captured individual is significantly more polymorphic than either of the laboratory A mexicanum sampled here. Notably, there appear to be more polymorphic sites within this individual than there are fixed differences between A mexicanum and A tigrinum. This observation is generally consistent with the recent rapid divergence of the tiger salamander species complex, which includes A mexicanum, A tigrinum and at least 16 other phenotypically diverse named species.17 By extension these results indicate that many of the polymorphisms identified in this study (across all animals) are likely to be segregating in A tigrinum and other members of the species complex.

2.1 ∣. Homozygous polymorphisms on chromosome 1

Among the most striking regions of excess polymorphism is a ~150 Mb interval on the P arm of chromosome 1 (150.9-298.9 Mb) that contains 161 annotated genes (Figure 4). The two sequenced A mexicanum individuals differ by a total of 1.66 million fixed polymorphisms across this interval. This corresponds to an average difference of 11.2 SNPs/kb across this region, a higher frequency of polymorphism than is observed in 99.99% of 1 Mb intervals in interspecific comparisons between A mexicanum and A tigrinum (all but two). Notably though, this degree of differentiation is only slightly higher than the difference between A tigrinum alleles. Interestingly, neither the wildtype AGSC female nor d/d strain male shares more fixed polymorphisms with A tigrinum, although segregating polymorphisms in the d/d strain male are more likely to be present in the sampled A tigrinum (Figure 4).
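The average density quoted for this interval follows directly from the interval coordinates and the count of fixed differences. The sketch below reproduces that arithmetic; the function name `snps_per_kb` is a hypothetical helper introduced here for illustration.

```python
def snps_per_kb(n_snps, start_mb, end_mb):
    """Average SNP density (SNPs per kb) over an interval given in Mb."""
    length_kb = (end_mb - start_mb) * 1000.0
    return n_snps / length_kb

# Values quoted in the text for the chr1P interval (150.9-298.9 Mb).
density = snps_per_kb(1.66e6, 150.9, 298.9)
print(f"chr1P fixed-difference density: {density:.1f} SNPs/kb")
```

The interval spans 148 Mb, so 1.66 million fixed differences correspond to roughly 11.2 SNPs/kb, matching the figure given above.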

FIGURE 4.

Differences in polymorphism frequency and copy number over a 150 Mb interval on chromosome 1. The frequency of fixed homozygous polymorphisms between the d/d strain male and wildtype AGSC female is higher over this region in comparison to the rest of the genome. Heterozygous sites detected from the d/d strain male are more likely to match alleles found in A tigrinum. Notably, depth of sequence coverage analyses reveal that A tigrinum has substantially increased read depth over this region (due to the presence of additional repetitive element copies elsewhere in the A tigrinum genome). Increased copy number is also observed in pooled sequence data from 48 backcross hybrids between A mexicanum and A tigrinum. The top half of the chromosome (roughly the P arm) is shown here - larger chromosomes were split in half to circumvent length limitations when originally submitted to NCBI

To validate detected differences and estimate the frequency of d/d strain male vs wildtype AGSC female variants of this region, we designed a primer pair that spans a large insertion/deletion variant in the cold inducible RNA-binding protein (CIRBP) gene. This primer pair permits rapid diagnosis of both alleles (Figure 5). A survey of 162 A mexicanum sampled from the AGSC identified both variants, including several heterozygous individuals. These assays demonstrate that both alleles segregate at appreciable frequencies in the stock center, with genotype frequencies conforming to expectations under Hardy-Weinberg equilibrium.18,19 Based on this survey, the wildtype AGSC female variant occurs at a high frequency among AGSC axolotls (~82%). Two copies of the wildtype AGSC female variant also appear to be present in RNAseq data that were generated from a single axolotl that was more recently derived from the native population from Lake Xochimilco20 (Figure 6). These RNAseq analyses also indicate that a second nearby polymorphic interval may be segregating in other cohorts that have been used for RNAseq studies.
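A test of Hardy-Weinberg expectations of the kind referred to above can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The genotype counts used below are hypothetical: they were chosen only to sum to 162 and to give an AGSC-like allele frequency near 82%, and they are not the counts reported in Figure 5. The function name and the choice of a chi-square test are likewise assumptions, since the text does not state which test was applied.

```python
import math

def hardy_weinberg_test(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 degree of freedom) for HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value

# HYPOTHETICAL counts (AGSC/AGSC, AGSC/dd, dd/dd); not the study's data.
freq_agsc, chi2, p_value = hardy_weinberg_test(109, 48, 5)
print(f"AGSC-like allele frequency: {freq_agsc:.2f}")
print(f"chi-square = {chi2:.3f}, P = {p_value:.2f}")
```

A large P value in such a test means the observed genotype counts do not depart detectably from the proportions p², 2pq, and q² expected under random mating.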

FIGURE 5.

Identification of alternate haplotypes by PCR. A, Example of CIRBP amplicons showing CIRBPAGSC homozygotes (single band) and CIRBPAGSC/CIRBPdd heterozygotes (two bands). B, Genotype and allele frequencies within a sample of 162 individuals from the Ambystoma Genetic Stock Center

FIGURE 6.

Evidence for widespread presence of the colony-like variant of the chromosome 1 polymorphic region. Analysis of polymorphism frequencies from published transcriptome datasets identifies six published studies that contain only animals with the colony-like variant, including one (BioProject: PRJNA354434) that sampled animals from their native population. Three other studies contain animals that are homozygous for the colony-like variant and other animals that are either heterozygous or homozygous for the d/d-like variant. Samples that are homozygous for the colony-like genotypes or possessing d/d-like genotypes are separated into upper and lower panels (respectively) for these three projects. Notably, a second potential variant was observed in two of these studies (PRJNA312389 and PRJNA300706): marked by an asterisk

A particularly notable feature of the chr1P 150.9-298.9 region was revealed by analysis of read coverage. In A tigrinum, this region is represented by approximately four times more reads than the corresponding A mexicanum region and adjacent (presumably diploid) regions, and coverage is also increased across this interval in sequence data from 48 A mexicanum X A mexicanum/A tigrinum backcross hybrids21 (Figure 4). This observation was initially interpreted as evidence that A tigrinum had experienced one or more large-scale duplications over this region. However, more careful examination of read coverage revealed that the region contained several intervals at approximately modal coverage (ie, diploid) that are interleaved with intervals of exceedingly high coverage in A tigrinum. Given this pattern, we surmised that the chr1P polymorphic interval might contain high-identity repetitive elements that are present at high copy number in A tigrinum, but relatively lower copy number in A mexicanum. To test this idea, we extracted and re-assembled high coverage intervals to identify candidate repetitive elements that occur at high copy number within the region and within A tigrinum. These efforts identified three elements: (1) a 6498 bp Harbinger DNA transposon, (2) a 1528 bp Epsilon LTR retroelement, and (3) a 9353 bp Gypsy LTR retroelement. These consensus repeats were used to perform more focused coverage analyses using A mexicanum and A tigrinum shotgun data, and revealed that A tigrinum contains a large number of additional recently expanded elements, including: 25-30 thousand additional Harbinger elements, 7-10 thousand additional Epsilon elements, and 5-10 thousand additional Gypsy elements. These would be expected to account for 200 to 300 Mb of additional inserted genomic sequence in A tigrinum (Figure 7). 
Phylogenetic analyses of elements extracted from the full length of the A mexicanum reference genome reveal that individual elements from the 1P region cluster together and with elements found within smaller regions of high heterozygosity in the d/d strain male (chr 4, 5, and 9, Figure 8). Taken together, the observed differences in repeat copy number, shared evolutionary history of repeat families in defined regions of chromosomes 1, 4, 5, and 9, and high polymorphism rates within these regions strongly suggest that the d/d strain male carries large, introgressed A tigrinum segments over these four intervals.
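The estimate of 200 to 300 Mb of additional A tigrinum sequence can be approximated from the consensus lengths and the estimated numbers of additional copies of the three repeat families. This is a minimal sketch that assumes every additional copy is full length, which is a simplification introduced here rather than a statement from the analysis.

```python
# Consensus length (bp) and estimated range of additional copies in
# A tigrinum, as quoted in the text. Full-length copies are ASSUMED.
repeats = {
    "Harbinger": (6498, 25_000, 30_000),
    "Epsilon": (1528, 7_000, 10_000),
    "Gypsy": (9353, 5_000, 10_000),
}

low_mb = sum(length * lo for length, lo, hi in repeats.values()) / 1e6
high_mb = sum(length * hi for length, lo, hi in repeats.values()) / 1e6

print(f"Additional sequence: {low_mb:.0f} to {high_mb:.0f} Mb")
```

Under this assumption the three families contribute roughly 220 to 304 Mb, in line with the range given above, with Harbinger elements accounting for most of the total.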

FIGURE 7.

Estimated copy number of repetitive elements in A mexicanum and A tigrinum. Counts of repetitive elements are estimated across the length of each of three consensus repeats that were identified in the chr1P polymorphic region

FIGURE 8.

Young repeat families found in the chr1P polymorphic region and other polymorphic regions. A-C, Counts of full-length elements extracted from the A mexicanum assembly, with varying sequence identity with the chr1 consensus A, Harbinger, B, Epsilon, and C, Gypsy. D-F, Phylogenetic trees of D, 500 Harbinger elements, E, 1000 Epsilon elements and F, 400 Gypsy elements. Trees include all elements in polymorphic regions and additional randomly selected members from age classes that are closest to the chr1P consensus (rightmost peaks). Clades with members in the chr1P polymorphic region are highlighted in red

2.2 ∣. The albino introgression region (chromosome 7)

Our results further clarify previous analyses of the historical introgression of the albino mutation into laboratory stocks.11 Specifically, these analyses identify a large divergent region surrounding the tyrosinase locus, a mutant version of which underlies the albino phenotype,7 that spans approximately 243 Mb (Figure 9). This region is the largest interval identified in this study and includes 178 annotated genes. As this region is known to have been introgressed from A tigrinum, it provides an important perspective on patterns of polymorphism that can be confidently tied to a well-defined hybridization event. The wildtype AGSC female is known to be a heterozygous carrier of the albino mutation, and the d/d strain male appears to carry two wildtype A mexicanum alleles. As such, the d/d strain male sequence underlying this region would not be expected to carry A tigrinum polymorphisms or be enriched for the aforementioned repetitive elements in the A mexicanum assembly.

FIGURE 9.

Differences in polymorphism frequencies over a 243 megabase interval on chromosome 7. The frequency of heterozygous polymorphisms between the d/d strain male and AGSC individuals is higher over this region in comparison to the rest of the genome

As with the chr1P interval, the maintenance of this large interval of A tigrinum DNA over the intervening 57 years since the initial hybridization suggests that some factor has either prevented recombination or promoted the retention and co-inheritance of A tigrinum alleles in this region. To test this idea, we extracted recombination frequencies from a backcross mapping family A mexicanum X A mexicanum/A tigrinum that was previously used to perform QTL analyses of metamorphosis and sex, and more recently to generate scaffolding information necessary to achieve chromosome-scale contiguity of the A mexicanum reference genome.8,21,22 Curiously, this region appears to recombine at only slightly reduced frequencies. The albino interval recombined at a rate of 8.2 Mb/centimorgan (cM) vs 5.9 Mb/cM across the remainder of chromosome 7 (Figure 10). By comparison the chromosome 1 polymorphic interval recombined at a rate of 4.2 Mb/cM, with the remainder of the chromosome recombining at a rate of 5.0 Mb/cM (Figure 10). Based on data from hybrid crosses, these regions appear to undergo recombination, indicating that the persistence of large A tigrinum footprints is likely not explained by the presence of simple inversions between A tigrinum and A mexicanum (discussed in more detail below).
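Because these rates are expressed as physical distance per unit of map distance (Mb/cM), a larger value means less recombination per megabase. The sketch below compares each interval with the remainder of its chromosome using the values quoted above; the dictionary keys and variable names are illustrative.

```python
# (Mb/cM inside the interval, Mb/cM across the rest of the chromosome)
rates = {
    "chr7 albino interval": (8.2, 5.9),
    "chr1P interval": (4.2, 5.0),
}

# Ratio > 1: more Mb per cM inside the interval, ie, reduced recombination.
fold = {name: inside / outside for name, (inside, outside) in rates.items()}

for name, ratio in fold.items():
    trend = "reduced" if ratio > 1 else "not reduced"
    print(f"{name}: {ratio:.2f}x Mb/cM relative to flanking sequence ({trend})")
```

By this measure the albino interval shows about 1.4 times as many megabases per centimorgan as the rest of chromosome 7, whereas the chr1P interval recombines slightly more per megabase than the rest of chromosome 1.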

FIGURE 10.

Patterns of recombination across two chromosomes. A, chromosome 7 and B, chromosome 1. The relationship between map distance (centiMorgans) and assembly length is shown. The albino introgression region is highlighted in red in panel A and the chr1P polymorphic region is highlighted in red in panel B

2.3 ∣. Other regions of high heterozygosity

In addition to those discussed above, our analyses identified one additional (relatively small) region of fixed differences between the d/d strain male and wildtype AGSC female on the end of the Q arm of chr2 (50 Mb, 22 genes), as well as four other regions that showed an excess of heterozygous polymorphism and were generally similar to patterns seen for the albino introgression region (Figures 11 and 12). Each of these polymorphic intervals was identified in one, but not both individuals. Two of these highly polymorphic intervals were identified on the P arm of chr4: a 51 Mb interval containing 19 genes was identified as being heterozygous in the d/d strain male and a nearby 120 Mb interval containing 75 genes was identified as being heterozygous in the wildtype AGSC female (Figure 11). While it is unclear whether these two adjacent intervals represent fragments of the same introgression event, the densities of SNPs in the more proximal 51 Mb region seem to be consistent with introgression from A tigrinum. Specifically, the numbers of fixed and segregating differences between the d/d strain male and A tigrinum are lower over this region in comparison to adjacent regions.

FIGURE 11.

Differences in polymorphism frequency over a 120 Mb interval on chromosome 4 and a smaller 51 Mb interval on the same chromosome. The frequency of heterozygous polymorphisms between the d/d strain male and AGSC individual is higher over a 120 Mb region (yellow highlight) in comparison to the rest of the genome. By contrast, there are fewer differences between the d/d strain male and A tigrinum over a second interval (pink). The top half of the chromosome (roughly the P arm) is shown here

FIGURE 12.

Other regions with an excess of polymorphism. A-B, Regions on A, chr5 and B, chr9 have increased rates of heterozygosity in the d/d strain male. The bottom half of chr5 (roughly the Q arm) is shown here. C, A small region of chr2 with an increased rate of homozygous polymorphism between the d/d strain male and AGSC wildtype female. The bottom half of chr2 (roughly the Q arm) is shown here

Variable regions on chromosomes 5 (48 Mb, 12 genes) and 9 (100 Mb, 60 genes) show similar decreases in the frequency of A tigrinum polymorphisms in these regions of higher heterozygosity (Figure 11). The presence of young Harbinger and Epsilon repeats that appear to have undergone recent expansion in A tigrinum and share evolutionary history with copies in the chr1P polymorphic region suggests that the d/d strain male is heterozygous for relatively large A tigrinum introgressions over regions of chromosomes 4, 5, and 9. However, it is worth noting that this level of variation (in the absence of associated repeats) is within the range of heterozygosity that would be expected for A tigrinum, which leaves open the possibility that some intervals might represent variation that was present in the ancestral A mexicanum population and retained within laboratory populations.

3 ∣. DISCUSSION

The analyses presented here provide an in-depth characterization of polymorphisms and polymorphism frequencies across the 32 Gb genome of two individuals from the laboratory strain of A mexicanum. Sampling one individual from a related species that was used to introduce genetic material into laboratory stocks (including the albino mutation) provides critical perspective on patterns of polymorphism in modern laboratory stocks. Importantly, these findings suggest that other similarly divergent regions might be identified by sequencing other A mexicanum individuals from the laboratory and wild populations. These analyses also reveal the extent of polymorphism that is harbored by natural tiger salamander populations and the recent dynamics of repeat expansions in the tiger salamander lineage.

These findings suggest that there are likely to be several other highly polymorphic regions segregating in the AGSC and other laboratory populations across the globe that were founded from this same stock. Our analyses of the chr1P polymorphic region demonstrate that polymorphisms that are relatively rare in the population as a whole can rise to high frequency in individual lab subpopulations, presumably due to founder effects in establishing individual colonies or inbred strains. This has practical implications for the design of molecular probes by investigators that maintain independent lines and suggests a need to investigate A mexicanum phenotypic variation among laboratory populations.

An important caveat of this work is that our analyses rely on an imperfect reference assembly (something that is true for most genome assemblies) for an organism that contains a large number of repetitive elements, and that inferences of repeat diversity are hampered by difficulties in assembling the reference genome over these intervals and multiple mapping of the short reads that correspond to recently expanded repetitive elements. To account for these issues, we generated consensus repeat sequences from multiple elements and performed focal analyses of coverage following our initial observations. However, this is also an imperfect solution. In general, it can be taken as a certainty that future improvements in the accuracy of the A mexicanum reference assembly, as well as the accuracy/throughput of long read sequencing, will permit more precise analyses of repeat content evolution and the history of introgression and selection in A mexicanum.

To assess whether the presumptive A tigrinum introgression regions identified in this study are likely to contribute biologically meaningful variation, we annotated polymorphisms with respect to their predicted effect on gene function. These analyses revealed that gene bodies in these intervals were enriched for nonsynonymous substitutions, even when accounting for the higher density of polymorphism (Figure 13). Standardized rates of nonsynonymous substitution in introgression intervals are higher than the average rates of all chromosomes. Rates of polymorphism in 5 kb upstream regulatory regions are also higher in these intervals (Figure 13). The standardized rate of upstream regulatory polymorphism in the chr1P region is higher than the genome wide average, and rates of upstream regulatory polymorphism in other introgression regions are higher than the rates of all individual chromosomes. These analyses suggest that A tigrinum introgression intervals may have a disproportionate impact on biological variation in A mexicanum, although the precise estimates of this impact should be interpreted with caution. The presence of insertion/deletion artifacts in assemblies with older (in particular) long read chemistries can impact coding frame predictions and the assessment of nonsynonymous sites. However, it does not seem that variation in the indel error rate should be generally higher in introgression intervals. There also remains much to be learned with respect to the size and distribution of regulatory sequences in large salamander genomes. Ongoing studies of chromatin state, chromatin conformation and promoter recruitment should better resolve upstream regulatory regions across the genome and the regulatory impacts of transposable element insertion in these intervals.

FIGURE 13.

Assessment of potentially biologically relevant polymorphisms. A-C, Proportion of substitutions resulting in nonsynonymous amino acid changes across chromosomes/chromosome arms and within candidate introgression intervals. D-F, Proportion of substitutions resulting in changes within 5 kb upstream of annotated genes. Values are normalized by the density of coding bases, A-C or genes, D-F, annotated to each chromosome or region and are plotted separately for polymorphism classes: A,D, the chr1P region with an excess of homozygous differences; B,E, chr 4P and 7 regions with an excess of heterozygous polymorphism in the AGSC female; C,F, chr 4P, 5Q and 9 regions with an excess of heterozygous polymorphism in the d/d male

The maintenance of long stretches of A tigrinum DNA in the albino and chr1P polymorphic regions suggests that some factor(s) may have inhibited the recombinational breakup of these regions over the last 60 years. Segregation patterns in a hybrid mapping family suggest that these are not simply due to fixed inversions between A tigrinum and A mexicanum, as measurable recombination was observed across these intervals. It seems plausible that the A tigrinum individual that was used to introduce the albino mutation may have carried inversions that inhibit recombination, although the observed footprints might also be the product of more modest inhibition of recombination due to the accumulation of repetitive elements. Differences in repeat abundance and local structural differences might explain previously-reported banding differences between a large pair of homologous chromosomes in interspecific A tigrinum/A mexicanum hybrids.23 Additionally, it seems likely that selection for inheritance of linked polymorphisms might contribute to the maintenance of this large interval of A tigrinum DNA in laboratory populations, given that recombination is not completely inhibited over these intervals and that these intervals (particularly chr1P) are present in multiple individuals and laboratory populations.

The identification of clades of repetitive elements that are diagnostic of A tigrinum DNA, highly identical to one another and expanded in A tigrinum, implies that these repeats have expanded recently in the A tigrinum lineage. To test for contemporary activity of these elements we examined patterns of RNAseq read mapping across individual genomic intervals that trace their ancestry to A tigrinum. A survey of multiple RNAseq studies that are available through the A mexicanum genome browser reveals that several of these elements show evidence of contemporary transcriptional activity, suggesting the possibility that they are still actively moving within the genome (Figure 14). In the case of Harbinger, these observations add to a growing, but still very small list of taxa wherein there is evidence for recent or ongoing activity. Active Harbingers have been observed in coelacanth24 and Iberian ribbed newt,25 but active members are apparently absent from multiple other vertebrate lineages. As Harbingers and other repetitive elements often carry their own promoter and enhancer activity, these elements have the potential to contribute to regulatory differences between A mexicanum and A tigrinum, and among laboratory A mexicanum populations that carry varying fractions of the A tigrinum genome.

FIGURE 14.

Evidence for expression of repetitive elements from regeneration and gonadal RNAseq studies. Four genomic intervals are shown for each repeat family. Mapped RNAseq reads are color coded by study: orange,38 green,4 and blue.13

Our comparisons of fixed and segregating differences also have implications for studies of the evolution and diversification of species within the tiger salamander species complex.17 In the case of A tigrinum (the only fully sequenced wild-captured salamander to date), it is notable that the number of segregating polymorphisms within this individual is approximately double the number of polymorphisms that are inferred to be fixed between this A tigrinum and the two sampled A mexicanum. Given that this study sampled only a fraction of the alleles that are present in A mexicanum and A tigrinum populations, it seems likely that many of the fixed polymorphisms identified in this study are segregating in A mexicanum, A tigrinum and other Ambystoma species. Ultimately, sampling the genomes of additional A mexicanum, including those from contemporary and larger historical populations, as well as A tigrinum and other members of the recently radiated tiger salamander complex, will be invaluable in understanding the distribution, evolutionary origins, and biological consequences of genetic polymorphism within laboratory stocks of A mexicanum and among its evolutionarily and phenotypically diverse relatives.

4 ∣. EXPERIMENTAL PROCEDURES

4.1 ∣. Sequencing

In addition to a previously published 196 Gb whole genome shotgun sequencing dataset for A tigrinum (SRX5119016, SAMN10586087, 125 bp paired-end reads),8 DNA from the same individual was sequenced to increase depth of coverage. Illumina sequencing (NovaSeq 6000) was performed by the DNA Services Lab, University of Illinois at Urbana-Champaign (https://biotech.illinois.edu/htdna) and yielded an additional 123 Gb of sequence data in 150 bp paired-end reads (SRX910869, added to PRJNA509654: SAMN10586087). In total, these yielded about 309.5 Gb of sequence data in 2.3 billion reads, bringing the depth of coverage for A tigrinum to ~9.6X.
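The depth figure quoted above follows from dividing the total number of sequenced bases by the genome size. The following is a minimal sketch of that arithmetic, not the authors' calculation: it assumes that the ~32 Gb genome size reported for A mexicanum also approximates the A tigrinum genome, and the function name is hypothetical.

```python
def depth_of_coverage(total_bases_gb: float, genome_size_gb: float) -> float:
    """Expected average depth = total sequenced bases / genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return total_bases_gb / genome_size_gb


# ~309.5 Gb of reads is taken from the text; the 32 Gb genome size is an
# assumption borrowed from the A mexicanum estimate.
depth = depth_of_coverage(309.5, 32.0)
print(f"{depth:.1f}X")
```

With these inputs the estimate is about 9.7X, in line with the ~9.6X reported; the exact value depends on the genome size that is assumed.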

This study also used a ~20X coverage shotgun sequence dataset from a phenotypically wildtype female individual selected from the AGSC (SRX800915, SAMN03256322, 100 bp paired-end reads)8,12 and a previously published ~8X coverage shotgun sequence dataset from the male axolotl that was used to generate primary contigs for the A mexicanum reference assembly (“d/d male”: SRX3655578-SRX3655581, SAMN06554622, 250 bp paired-end reads).13 Finally, a set of previously reported reads from a collection of low coverage sequence data from 48 A mexicanum X A mexicanum/A tigrinum backcross hybrids (PRJNA477812, 124 bp paired-end reads)8,21 was used to calculate coverage and estimate copy number for a candidate introgressed region on chr1P.

4.2 ∣. Alignment and polymorphism calling

To avoid complications with handling >1 Gb scaffolds, reads were mapped to smaller individual scaffolds from the AmexG_v3.0.0 assembly (ASM291563v1)13 and lifted to the chromosome-anchored assembly (ASM291563v2).8 For all data sets, whole genome sequencing data were aligned to the reference using BWA-MEM with option -a.26 Secondary alignments were removed using Samtools view27 with option -F2308, and duplicates were removed with the bammarkduplicates2 tool from biobambam (v.0.0.189).28 Filtering resulted in the retention of 2.2 billion A tigrinum reads with a modal coverage depth of 5.5X; ~6 billion reads from the wildtype female AGSC A mexicanum with a modal coverage depth of ~20X; and 0.9 billion reads from the d/d strain A mexicanum with a modal coverage depth of ~7X.
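The -F2308 filter is a bitmask over SAM flag bits, and it may help to see which bits it sets. The sketch below decodes the mask using the bit values defined in the SAM specification (4 = unmapped, 256 = secondary, 2048 = supplementary); the function names are hypothetical, and the lookup table deliberately covers only the three bits relevant here.

```python
from typing import List

# SAM flag bits relevant to this filter, as defined in the SAM specification.
SAM_FLAGS = {
    0x4: "read unmapped",
    0x100: "secondary alignment",
    0x800: "supplementary alignment",
}


def decode_flag_mask(mask: int) -> List[str]:
    """Return the names of the known SAM flag bits that are set in `mask`."""
    return [name for bit, name in SAM_FLAGS.items() if mask & bit]


def is_excluded(read_flag: int, mask: int = 2308) -> bool:
    """Mimic an exclusion filter: drop a read if any masked bit is set."""
    return (read_flag & mask) != 0


print(decode_flag_mask(2308))
print(is_excluded(256), is_excluded(99))
```

Because 2308 = 4 + 256 + 2048, this mask excludes unmapped reads and supplementary alignments in addition to secondary alignments.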

Due to the large size and complex repetitive environment of the A mexicanum genome, variant discovery and genotyping focused on approximately single-copy intervals, as assessed by depth of sequence coverage. Valid coverage ranges for the three datasets were 2–20X for A tigrinum, 10–50X for the colony A mexicanum, and 4–24X for the d/d strain A mexicanum. Genotype calls were generated from mapped reads using Samtools mpileup v.1.8 with the options -B -A -x -Q 0 to avoid losing alignments positioned near polymorphic insertions or other structurally variant features,27 followed by bcftools call (v.1.9)29 to assign genotypes. After extraction of fixed polymorphic and heterozygous sites, polymorphisms were lifted to the chromosome-scale assembly ASM291563v2. Polymorphism densities were calculated using the makewindows and map functions of bedtools.30 Depth of coverage was computed from alignment tracks using the genomeCoverageBed function of bedtools. Normalized coverage tracks were generated over 1 kb intervals for visualization through the A mexicanum assembly hub (https://ambystoma.uky.edu/genome-resources/86-axolotl-genome-browser). For reanalysis of repetitive elements, alignments were filtered to a minimal alignment length of 90 bp.
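The coverage-range filter and the windowed density calculation can be illustrated with a small stand-alone sketch. This is not the bedtools workflow used in the study: the 1 Mb window size, the treatment of the coverage ranges as inclusive, the sample labels, and the toy sites are all assumptions made for illustration.

```python
from collections import Counter

# Valid single-copy coverage ranges quoted in the text (treated as inclusive).
VALID_DEPTH = {
    "A_tigrinum": (2, 20),
    "colony_A_mexicanum": (10, 50),
    "dd_A_mexicanum": (4, 24),
}


def in_valid_range(sample, depth):
    """True if `depth` falls inside the sample's single-copy coverage range."""
    low, high = VALID_DEPTH[sample]
    return low <= depth <= high


def density_per_window(sites, sample, window=1_000_000):
    """Count variant sites per fixed-size window, keeping only sites whose
    depth is in the sample's valid range. `sites` holds (position, depth)
    pairs with 0-based positions; keys of the result are window indices."""
    counts = Counter()
    for pos, depth in sites:
        if in_valid_range(sample, depth):
            counts[pos // window] += 1
    return dict(counts)


# Hypothetical sites: (position, depth of coverage at that position).
toy_sites = [(100, 5), (2_500, 40), (1_200_000, 12), (1_900_000, 3)]
print(density_per_window(toy_sites, "A_tigrinum"))
```

The same site list gives different densities for different samples, because each sample has its own valid depth range.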

The program snpEff (v4.3)31 was used to categorize polymorphisms with respect to their location and potential functional impact relative to annotated genes. Estimates of polymorphism rate were generated separately for three classes of polymorphism: (1) homozygous polymorphisms between the d/d male and AGSC female, (2) heterozygous polymorphisms within the AGSC female, and (3) heterozygous polymorphisms within the d/d male. Rates of nonsynonymous polymorphism were standardized to CDS density and rates of upstream polymorphism were standardized to gene density within each chromosome or interval in order to account for variation in gene density across chromosomes and chromosomal regions.
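The three polymorphism classes can be expressed as a simple rule over the two A mexicanum genotypes at a site. The sketch below assumes VCF-style diploid genotype strings (for example "0/1") and a hypothetical function name; treating sites with missing calls as unclassified is also an assumption.

```python
def alleles(gt):
    """Split a VCF-style genotype string such as '0/1' or '1|1' into alleles."""
    return gt.replace("|", "/").split("/")


def classify_site(dd_male_gt, agsc_female_gt):
    """Return the set of polymorphism classes (1, 2, 3) that a site falls into.

    1: homozygous difference between the d/d male and the AGSC female
    2: heterozygous within the AGSC female
    3: heterozygous within the d/d male
    """
    dd, agsc = alleles(dd_male_gt), alleles(agsc_female_gt)
    if "." in dd + agsc:  # missing call in either individual: leave unclassified
        return set()
    dd_het = len(set(dd)) > 1
    agsc_het = len(set(agsc)) > 1
    classes = set()
    if not dd_het and not agsc_het and dd[0] != agsc[0]:
        classes.add(1)
    if agsc_het:
        classes.add(2)
    if dd_het:
        classes.add(3)
    return classes


print(classify_site("0/0", "1/1"))
print(classify_site("0/1", "0/1"))
```

Note that a site can belong to classes 2 and 3 at once when both individuals are heterozygous, which is why the function returns a set rather than a single label.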

To identify repetitive elements that occur at high copy numbers within A tigrinum and the chr1P polymorphic region, we extracted 1 kb intervals with average depth of sequence coverage exceeding 1000 reads. These regions were aligned and manually curated using SeqMan V10.0.0 (Lasergene) to yield consensus sequences for repetitive elements that occurred in multiple positions within the chr1P region. Classification of repetitive elements was performed using the RepeatMasker Web Server (http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker)32 and BLAST+33 nucleotide alignments to annotated A mexicanum repeats.13 To estimate copy number of these consensus sequences, we used BWA-MEM with option -a to align reads from A mexicanum and A tigrinum to the repeat consensus sequence and filtered out secondary alignments with the SAMtools view flag -F2308 as well as alignments spanning fewer than 90 bases after soft clipping. Copy numbers were estimated from depth of coverage and normalized by a factor of 5.5 for A tigrinum, 20 for the AGSC female A mexicanum, and 7 for the d/d male. Generation of multiple sequence alignments and tree construction were performed using Clustal Omega using default parameters available through the ENSEMBL API.34,35 Plotting and annotation of trees was performed using FigTree V1.4.4 (https://github.com/rambaut/figtree).
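The copy number normalization described above amounts to dividing the depth of coverage over a repeat consensus by the sample's single-copy (modal) depth. A minimal sketch follows, using the normalization factors given in the text; the sample labels, function name, and the example depth of 1100 are hypothetical.

```python
# Normalization factors from the text (modal single-copy depth per sample).
NORMALIZATION = {
    "A_tigrinum": 5.5,
    "AGSC_female": 20.0,
    "dd_male": 7.0,
}


def estimate_copy_number(mean_depth, sample):
    """Estimate repeat copy number as depth over the consensus divided by
    the sample's single-copy depth."""
    return mean_depth / NORMALIZATION[sample]


# Hypothetical example: a consensus covered at 1100X in the A tigrinum reads.
copies = estimate_copy_number(1100.0, "A_tigrinum")
print(round(copies))
```

Normalizing by each sample's own modal depth makes the estimates comparable across individuals that were sequenced to different depths.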

4.3 ∣. PCR assays

Primers were designed to amplify across a diagnostic indel using Primer3.36 Sequences for these primers are (CIRBP_for: GGCCATAGTCCGCGCTCTATA, CIRBP_rev: ATGCAACAACTAACGCTGTAGAATA). Fragments were amplified using standard PCR conditions (150 ng DNA, 50 ng of each primer, 200 mM each dATP, dCTP, dGTP, dTTP; thermal cycling at 94°C for 4 minutes; 34 cycles of 94°C for 45 seconds, 60°C for 45 seconds, 72°C for 30 seconds; and 72°C for 7 minutes). The DNAs used in these reactions were extracted from tail clips using standard phenol/chloroform extraction.37

4.4 ∣. RNAseq mapping and genotyping

RNAseq reads (158 samples) from several previously published studies4,13,20,38-43 were mapped (bowtie2 V.2.3) to the complete set of annotated genomic gene models.13 Biallelic variants were called by HaplotypeCaller (GATK v.3.6)44 and their chromosomal positions located with CrossMap (v0.3.6).45 Numbers of fixed polymorphisms for each sample were computed for all 10 Mb intervals on chr1P.
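The per-sample tally over 10 Mb intervals can be sketched as a simple binning step. The example below is illustrative only: it assumes that fixed polymorphisms have already been identified and lifted to chr1P coordinates, and the sample names, positions, and function name are hypothetical.

```python
from collections import defaultdict

INTERVAL = 10_000_000  # 10 Mb, the interval size given in the text


def count_per_interval(calls, interval=INTERVAL):
    """Tally fixed polymorphisms per sample and per interval.

    `calls` is an iterable of (sample, position) pairs on a single
    chromosome arm, with 0-based positions. The result maps each sample to
    a dictionary of {interval_start: count}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sample, pos in calls:
        start = (pos // interval) * interval
        counts[sample][start] += 1
    return {sample: dict(bins) for sample, bins in counts.items()}


# Hypothetical calls for two samples.
toy_calls = [
    ("sample_1", 5),
    ("sample_1", 9_999_999),
    ("sample_1", 10_000_000),
    ("sample_2", 25_000_000),
]
print(count_per_interval(toy_calls))
```

Intervals with no calls are simply absent from the result, so downstream code should treat a missing interval as a count of zero.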

ACKNOWLEDGMENTS

This work was funded by grants from the National Institutes of Health (NIH) (R24OD010435) to S. Randal Voss and (5R35GM130349) to Jeramiah J. Smith, and from the Department of Defense (DOD) (W911NF1110475) to S. Randal Voss. Animals used in this study were provided by the Ambystoma Genetic Stock Center, which is currently funded by the NIH (P40OD019794) and was previously funded by the National Science Foundation (NSF) (DBI-0951484) to S. Randal Voss. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of NIH, DOD, or NSF. Partial computational support was provided by The University of Kentucky High Performance Computing complex.

REFERENCES

  • 1. Voss SR, Epperlein HH, Tanaka EM. Ambystoma mexicanum, the axolotl: a versatile amphibian model for regeneration, development, and evolution studies. Cold Spring Harb Protoc. 2009;(8). 10.1101/pdb.emo128.
  • 2. Voss SR, Woodcock MR, Zambrano L. A tale of two axolotls. Bioscience. 2015;65(12):1134–1140. 10.1093/biosci/biv153.
  • 3. Voss SR, Palumbo A, Nagarajan R, et al. Gene expression during the first 28 days of axolotl limb regeneration I: experimental design and global analysis of gene expression. Regeneration (Oxf). 2015;2(3):120–136. 10.1002/reg2.37.
  • 4. Dwaraka VB, Smith JJ, Woodcock MR, Voss SR. Comparative transcriptomics of limb regeneration: identification of conserved expression changes among three species of Ambystoma. Genomics. 2019;111(6):1216–1225. 10.1016/j.ygeno.2018.07.017.
  • 5. Voss SR, Ponomareva LV, Dwaraka VB, et al. HDAC regulates transcription at the outset of axolotl tail regeneration. Sci Rep. 2019;9(1):6751. 10.1038/s41598-019-43230-6.
  • 6. Voss SR, Shaffer HB. Adaptive evolution via a major gene effect: paedomorphosis in the Mexican axolotl. Proc Natl Acad Sci U S A. 1997;94(25):14185–14189. 10.1073/pnas.94.25.14185.
  • 7. Woodcock MR, Vaughn-Wolfe J, Elias A, et al. Identification of mutant genes and Introgressed Tiger salamander DNA in the laboratory axolotl, Ambystoma mexicanum. Sci Rep. 2017;7(1):6. 10.1038/s41598-017-00059-1.
  • 8. Smith JJ, Timoshevskaya N, Timoshevskiy VA, Keinath MC, Hardy D, Voss SR. A chromosome-scale assembly of the axolotl genome. Genome Res. 2019;29(2):317–324. 10.1101/gr.241901.118.
  • 9. Keinath MC, Timoshevskaya N, Timoshevskiy VA, Voss SR, Smith JJ. Miniscule differences between sex chromosomes in the giant genome of a salamander. Sci Rep. 2018;8(1):17882. 10.1038/s41598-018-36209-2.
  • 10. Page RB, Boley MA, Kump DK, Voss SR. Genomics of a metamorphic timing QTL: met1 maps to a unique genomic position and regulates morph and species-specific patterns of brain transcription. Genome Biol Evol. 2013;5(9):1716–1730. 10.1093/gbe/evt123.
  • 11. Humphrey RR. Albino axolotls from an albino tiger salamander through hybridization. J Hered. 1967;58(3):95–101. 10.1093/oxfordjournals.jhered.a107572.
  • 12. Keinath MC, Timoshevskiy VA, Timoshevskaya NY, Tsonis PA, Voss SR, Smith JJ. Initial characterization of the large genome of the salamander Ambystoma mexicanum using shotgun and laser capture chromosome sequencing. Sci Rep. 2015;5:16413. 10.1038/srep16413.
  • 13. Nowoshilow S, Schloissnig S, Fei JF, et al. The axolotl genome and the evolution of key tissue formation regulators. Nature. 2018;554(7690):50–55. 10.1038/nature25458.
  • 14. Kent WJ, Sugnet CW, Furey TS, et al. The human genome browser at UCSC. Genome Res. 2002;12(6):996–1006. 10.1101/gr.229102.
  • 15. Smith JJ, Putta S, Walker JA, et al. Sal-site: integrating new and existing ambystomatid salamander research and informational resources. BMC Genomics. 2005;6:181. 10.1186/1471-2164-6-181.
  • 16. Baddar NW, Woodcock MR, Khatri S, Kump DK, Voss SR. Sal-site: research resources for the Mexican axolotl. Methods Mol Biol. 2015;1290:321–336. 10.1007/978-1-4939-2495-0_25.
  • 17. Shaffer HB, McKnight ML. The polytypic species revisited: genetic differentiation and molecular Phylogenetics of the Tiger salamander Ambystoma tigrinum (Amphibia: Caudata) complex. Evolution. 1996;50(1):417–433. 10.1111/j.1558-5646.1996.tb04503.x.
  • 18. Hardy GH. Mendelian proportions in a mixed population. Science. 1908;28(706):49–50. 10.1126/science.28.706.49.
  • 19. Weinberg W. Über den Nachweis der Vererbung beim Menschen. Jahreshefte Des Vereins für vaterländische Naturkunde in Württemberg. 1908;64:368–382.
  • 20. Caballero-Perez J, Espinal-Centeno A, Falcon F, et al. Transcriptional landscapes of axolotl (Ambystoma mexicanum). Dev Biol. 2018;433(2):227–239. 10.1016/j.ydbio.2017.08.022.
  • 21. Voss SR, Smith JJ. Evolution of salamander life cycles: a major-effect quantitative trait locus contributes to discrete and continuous variation for metamorphic timing. Genetics. 2005;170(1):275–281.
  • 22. Smith JJ, Voss SR. Amphibian sex determination: segregation and linkage analysis using members of the tiger salamander species complex (Ambystoma mexicanum and A. t. tigrinum). Heredity (Edinb). 2009;102(6):542–548. 10.1038/hdy.2009.15.
  • 23. Cuny R, Malacinski GM. Banding differences between tiger salamander and axolotl chromosomes. Can J Genet Cytol. 1985;27(5):510–514. 10.1139/g85-076.
  • 24. Smith JJ, Sumiyama K, Amemiya CT. A living fossil in the genome of a living fossil: harbinger transposons in the coelacanth genome. Mol Biol Evol. 2012;29(3):985–993. 10.1093/molbev/msr267.
  • 25. Elewa A, Wang H, Talavera-Lopez C, et al. Reading and editing the Pleurodeles waltl genome reveals novel features of tetrapod regeneration. Nat Commun. 2017;8(1):2286. 10.1038/s41467-017-01964-9.
  • 26. Li H, Durbin R. Fast and accurate short read alignment with burrows-wheeler transform. Bioinformatics. 2009;25(14):1754–1760. 10.1093/bioinformatics/btp324.
  • 27. Li H, Handsaker B, Wysoker A, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–2079. 10.1093/bioinformatics/btp352.
  • 28. Tischler G, Leonard S. Biobambam: tools for read pair collation based algorithms on BAM files. Source Code Biol Med. 2014;9:13.
  • 29. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27(21):2987–2993. 10.1093/bioinformatics/btr509.
  • 30. Quinlan AR. BEDTools: the Swiss-Army tool for genome feature analysis. Curr Protoc Bioinformatics. 2014;47:11 12 1–34. 10.1002/0471250953.bi1112s47.
  • 31. Cingolani P, Platts A, Wang le L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6(2):80–92. 10.4161/fly.19695.
  • 32. Smit AFA, Hubley R, Green P. RepeatMasker. Unpublished data. 2020; open-4.0.9.
  • 33. Camacho C, Coulouris G, Avagyan V, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. 10.1186/1471-2105-10-421.
  • 34. Sievers F, Wilm A, Dineen D, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal omega. Mol Syst Biol. 2011;7:539. 10.1038/msb.2011.75.
  • 35. Madeira F, Park YM, Lee J, et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 2019;47(W1):W636–W641. 10.1093/nar/gkz268.
  • 36. Untergasser A, Cutcutache I, Koressaar T, et al. Primer3—new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115. 10.1093/nar/gks596.
  • 37. Sambrook J, Russell DW, Sambrook J. The Condensed Protocols from Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press; 2006:800.
  • 38. Bryant DM, Johnson K, DiTommaso T, et al. A tissue-mapped axolotl De novo Transcriptome enables identification of limb regeneration factors. Cell Rep. 2017;18(3):762–776. 10.1016/j.celrep.2016.12.063.
  • 39. Kerdivel G, Blugeon C, Fund C, Rigolet M, Sachs LM, Buisine N. Opposite T3 response of ACTG1-FOS subnetwork differentiate tailfin fate in Xenopus tadpole and post-hatching axolotl. Front Endocrinol (Lausanne). 2019;10:194. 10.3389/fendo.2019.00194.
  • 40. Jiang P, Nelson JD, Leng N, et al. Analysis of embryonic development in the unsequenced axolotl: waves of transcriptomic upheaval and stability. Dev Biol. 2017;426(2):143–154. 10.1016/j.ydbio.2016.05.024.
  • 41. Bryant DM, Sousounis K, Payzin-Dogru D, et al. Identification of regenerative roadblocks via repeat deployment of limb regeneration in axolotls. NPJ Regen Med. 2017;2:30. 10.1038/s41536-017-0034-z.
  • 42. Natarajan N, Abbas Y, Bryant DM, et al. Complement receptor C5aR1 plays an evolutionarily conserved role in successful cardiac regeneration. Circulation. 2018;137(20):2152–2165. 10.1161/CIRCULATIONAHA.117.030801.
  • 43. Wu CH, Tsai MH, Ho CC, Chen CY, Lee HS. De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration. BMC Genomics. 2013;14:434. 10.1186/1471-2164-14-434.
  • 44. McKenna A, Hanna M, Banks E, et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–1303. 10.1101/gr.107524.110.
  • 45. Zhao H, Sun Z, Wang J, Huang H, Kocher JP, Wang L. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics. 2014;30(7):1006–1007. 10.1093/bioinformatics/btt730.
