PLOS ONE. 2020 Dec 15;15(12):e0243273. doi: 10.1371/journal.pone.0243273

Antifreeze protein dispersion in eelpouts and related fishes reveals migration and climate alteration within the last 20 Ma

Rod S Hobbs 1, Jennifer R Hall 2, Laurie A Graham 3,*, Peter L Davies 3, Garth L Fletcher 1
Editor: Michael Schubert 4
PMCID: PMC7737890  PMID: 33320906

Abstract

Antifreeze proteins inhibit ice growth and are crucial for the survival of supercooled fish living in icy seawater. Of the four antifreeze protein types found in fishes, the globular type III from eelpouts is the one restricted to a single infraorder (Zoarcales), which is the only clade known to have antifreeze protein-producing species at both poles. Our analysis of over 60 unique antifreeze protein gene sequences from several Zoarcales species indicates this gene family arose around 18 Ma ago, in the Northern Hemisphere, supporting recent data suggesting that the Arctic Seas were ice-laden earlier than originally thought. The Antarctic was subject to widespread glaciation over 30 Ma ago and the Notothenioid fishes that produce an unrelated antifreeze glycoprotein extensively exploited the adjoining seas. We show that species from one Zoarcales family only encroached on this niche in the last few Ma, entering an environment already dominated by ice-resistant fishes, long after the onset of glaciation. As eelpouts are one of the dominant benthic fish groups of the deep ocean, they likely migrated from the north to Antarctica via the cold depths, losing all but the fully active isoform gene along the way. In contrast, northern species have retained both the fully active (QAE) and partially active (SP) isoforms for at least 15 Ma, which suggests that the combination of isoforms is functionally advantageous.

Introduction

Most marine teleosts are unable to inhabit ice-laden sea waters characteristic of polar and sub-polar oceans because the temperature of the water (−1.9°C) can be a full degree lower than the freezing point of their body fluids (−0.7 to −0.9°C) [1]. In contrast, such environmental conditions pose no risk to most invertebrates as their freezing points are usually the same as that of sea water [2]. Teleost fish that can survive and thrive in such environments do so by producing antifreeze proteins (AFPs) or glycoproteins (AFGPs) that bind to nascent ice crystals within their body fluids, thereby preventing their further growth that would ultimately result in death [3–7]. The difference between the melting point and non-equilibrium freezing point is defined as thermal hysteresis (TH) and is a measure of AFP activity [5].
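The temperature gap and the TH definition above can be expressed as simple arithmetic. The following is a minimal sketch; the function names are illustrative, and the melting and freezing points passed to `thermal_hysteresis` are hypothetical values rather than measurements from this study.

```python
def thermal_hysteresis(melting_point_c: float, freezing_point_c: float) -> float:
    """TH in degrees C: melting point minus non-equilibrium freezing point."""
    if freezing_point_c > melting_point_c:
        raise ValueError("freezing point cannot be above the melting point")
    return melting_point_c - freezing_point_c


def protection_gap(seawater_c: float, body_fluid_fp_c: float) -> float:
    """Degrees C by which seawater is colder than the body-fluid freezing point."""
    return max(0.0, body_fluid_fp_c - seawater_c)


SEAWATER_C = -1.9  # temperature of icy seawater quoted in the text

# Body-fluid freezing points quoted in the text: -0.7 to -0.9 degrees C.
for fp in (-0.7, -0.9):
    print(f"freezing point {fp:.1f} C: gap = {protection_gap(SEAWATER_C, fp):.1f} C")

# Hypothetical measurement: melting point -0.9 C, non-equilibrium freezing point -2.1 C.
print(f"TH = {thermal_hysteresis(-0.9, -2.1):.1f} C")
```

Under these assumptions, a fish would need a TH at least as large as the gap (roughly 1.0 to 1.2°C) for its body fluids to remain unfrozen in contact with ice at −1.9°C.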

To date, three non-homologous, physiologically functional groups of AFPs (types I-III), as well as the AFGPs, have been described in a variety of fish taxa [8–10]. Both type II and type III AFP are globular and non-repetitive, with the type II AFP gene having been horizontally transferred between three fish orders (Osmeriformes, Perciformes, and Clupeiformes) that diverged over 200 Ma ago [11, 12]. In contrast, both AFGP and type I AFP are repetitive proteins. AFGPs consist of variable numbers of glycosylated tripeptide repeats and arose by convergent evolution in Antarctic Notothenioids (Order Perciformes) and the unrelated northern cods (Order Gadiformes) from two different progenitor sequences [13–16]. The repetitive alanine-rich α-helical type I AFPs also arose by convergence, within four families from three orders: Pleuronectiformes, Labriformes and Perciformes [17]. All of the AFP-producing fish orders mentioned above diverged prior to the Eocene, which began with polar oceans that were ice-free during the Paleocene–Eocene thermal maximum at 55 Ma [18, 19]. Therefore, the impetus behind the evolution of these unique ice binding structures was the subsequent occurrence of sea ice following the onset of global cooling and glaciation [20]. These are clear examples of convergent protein evolution to a common function [21]. The occurrence of lateral gene transfer [12], and of convergence to highly similar repeat sequences [13, 17] adds a level of evolutionary complexity that is absent from most gene families.

In contrast to the other fish AFPs and AFGPs, type III AFP is restricted to a single taxonomic group, order Perciformes, infraorder Zoarcales (previously suborder Zoarcoidei), that diverged approximately 50 Ma ago [19] (Fig 1). This ~7 kDa protein arose from the C-terminal domain of sialic acid synthase (SAS) following a gene duplication [22–24]. The first type III AFPs were isolated from ocean pout (Zoarces americanus, family Zoarcidae) [25, 26]. They are found as mixtures of SP-Sephadex- (SP) and QAE-Sephadex- (QAE) binding isoforms that are only about 55% identical. Both of these isoform sets have the capacity to bind to ice, but only a subset of QAE isoforms are able to completely halt ice crystal growth [27, 28]. Interestingly, the SP isoforms can be made fully active in stopping ice growth if as little as 1% of the QAE isoform is included with them [28].

Fig 1. Detailed relationships among species of the infraorder Zoarcales discussed in this study.

Divergence times were obtained from the following studies: Gasterosteiformes and Zoarcales, from 21 loci from a broad range of teleost fishes [19], Bathymasteridae, from 10 nuclear genes [88] and within Zoarcales [42] or Anarhichadidae [89] using fewer loci. A double asterisk denotes that the fish from these studies was the same as the species in this study, whereas a single asterisk denotes that the fish was a different species within the same genus. The 95% posterior density intervals for the branch points are shown by grey bars. Species names are coloured in a rainbow pattern from red to purple from earliest to latest time of divergence of each family. The location of the Cryptacanthodidae varies between studies (denoted with a dotted line) [19, 46] and some Stichaeidae cluster with Pholidae [44–46]. Information on Lycodinae and radiated shanny was limited. Therefore, their relationship to the others was determined using the COX gene with Alaskan ronquil as the outgroup (S2 Fig), with divergence times estimated from the nearest calibrated node. The nodes at which there is the earliest evidence for the presence (+) or absence (-) of SP or QAE isoforms are indicated, as well as branches along which gene duplications have occurred (SP↑). The number of known Zoarcales species and their range (green background for northern, pink for southern) was determined from FishBase [72]. The average temperature of the deep ocean, relative to the present day, as well as the temporal extent of northern (N) and southern (S) glaciation, was adapted from Fig 1 of [90].

Fishes from the infraorder Zoarcales have a global distribution and are the only group of fishes that have been demonstrated to have AFP-producing species that have conquered the icy waters at both poles (Fig 1). Five families, four of which are restricted to Northern waters, were shown by early Southern blotting studies to possess numerous type III AFP genes: Zoarcidae [ocean pout (Z. americanus)]; Anarhichadidae [spotted wolffish (Anarhichas minor) and Atlantic wolffish (A. lupus)]; Pholidae [rock gunnel (Pholis gunnellus)]; Stichaeidae [radiated shanny (Ulvaria subbifurcata)] [29–31]; Cryptacanthodidae [wrymouth (Cryptacanthodes maculatus)] [32, unpublished data]. However, AFP sequences have only been obtained from species within Zoarcidae and Anarhichadidae [25–27, 29, 30, 33, 34]. Most of these species inhabit the shallow inshore waters of Newfoundland, where they are frequently exposed to subzero temperatures and ice during the winter months [35].

Type III AFPs have also been characterized from two species that reside in the frigid Southern Ocean around Antarctica. Both Lycodichthys dearborni and Pachycara brachycephalum are commonly called Antarctic eelpouts, so to differentiate them, we will refer to L. dearborni as Antarctic eelpout and P. brachycephalum by scientific name only. The AFP complement of the Antarctic eelpout has been studied through a combination of protein, cDNA and genomic DNA sequencing (yielding over 20 gene sequences). This species produces both monomers and tandemers consisting of two or more linked AFP domains [23, 36–40], whereas the northern species studied produce only monomers. Interestingly, there is no evidence of tandemers in the second Antarctic species examined, P. brachycephalum [39, 41].

Given the diverse evolutionary history of fish AFPs, we examined in this study several questions about type III AFPs. The first was to establish if this AFP has spread through the Zoarcales by direct descent as opposed to parallel evolution. By mining unique sequences from various databases and transcriptome studies, and through targeted sequencing projects on several species, we can confidently say that all known type III AFPs are related by descent. Secondly, we wanted to date the origin of the type III AFP, so we successfully expanded the search for AFP sequences in other Zoarcales families out to the Alaskan ronquil, Bathymaster caeruleofasciatus, which diverged around 18 Ma [42] (Fig 1). Finally, having established here that all type III AFPs are related by descent, we addressed the timing of the colonization of Antarctic waters by zoarcids. The most plausible explanation is one of recent migration of a founder species from the north that transitioned through the cold ocean depths linking the two poles.

Results

Insights into species relationships in the infraorder Zoarcales

A number of phylogenetic and taxonomic studies have elucidated the relationships amongst the infraorder Zoarcales and this information is summarized in Fig 1 [19, 43–46]. To confirm and bolster the connections within the tree of Zoarcales (particularly for the polyphyletic Stichaeidae, the radiated shanny, and Antarctic eelpouts), cytochrome oxidase (COX) gene sequences from these (or closely related) species were downloaded and an alignment was generated (S1 Fig). This was used to produce a phylogenetic tree (S2 Fig). All of the branch points in the DNA-based tree were consistent with those determined in the taxonomic studies (Fig 1). Key points from this new study are that: i) the Antarctic species were found to be more closely related to each other than to the Canadian eelpout; ii) the radiated shanny, like some other Stichaeidae, was found to be more closely related to the rock gunnel than to any of the other species; iii) the three species newly-examined in this study are found in two lineages that diverged ~18 Ma (Alaskan ronquil) and ~15 Ma (radiated shanny and rock gunnel). The two families from which AFP sequences were previously known, Anarhichadidae and Zoarcidae, only diverged from each other ~10 Ma.

Type III AFPs arose early in the infraorder Zoarcales

To trace the evolutionary history of type III AFP and date its origin, we set out to expand the number of known sequences from the three Zoarcales species mentioned above (Alaskan ronquil, radiated shanny and rock gunnel) that are found in two lineages that diverged early in this clade. Full-length cDNA sequences encoding both QAE and SP isoforms were obtained from rock gunnel and radiated shanny using primers based on known sequence (S1 Table) and a combination of RLM-RACE and RT-PCR. When the protein sequences of these two species are aligned (Fig 2, S3 Fig), the SP isoforms are 72% identical while the QAE isoforms are 79% identical. This identity drops to ~60% when SP and QAE isoforms are compared. The accession numbers and characteristics of these sequences are given in S2 Table. QAE and SP isoforms were originally categorized by their ability to bind to positively-charged quaternary aminoethyl (QAE) or negatively-charged sulfopropyl (SP) resins at neutral pH [26]. However, the rock gunnel-Q1 sequence has a predicted isoelectric point at pH 9.5, more like that of SP isoforms. Therefore, sequence similarity is a better way to categorize type III AFPs. When the cDNA sequences were aligned with those of other fish (S4 Fig) and used to generate a phylogenetic tree (Fig 3), the SP sequences clustered together in this phylogeny, as did the QAE sequences, consistent with the relatedness of these two species (Fig 1). However, the first exon, which encodes the signal peptide, was identical between the rock gunnel SP and QAE isoforms. Phylogenetic analysis of this short exon (S5 Fig), using the homologous region of SAS, which does not encode a signal peptide [23], to root the tree, shows that for all other sequences, QAE sequences cluster, as do SP sequences. In contrast, the rock gunnel and radiated shanny exons cluster outside of these two groups. This suggests that exon shuffling may have occurred in the common ancestor of these two species.
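Percent identity values such as those quoted above depend on how gapped columns are counted. The sketch below illustrates one common convention, in which columns containing a gap in either sequence are skipped. This convention is an assumption, as the exact method used here is not stated, and the two aligned strings are hypothetical toy sequences, not AFP sequences from this study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of which match.
seq_a = "NQASVVANQL"
seq_b = "NQESVVA-QL"
print(f"{percent_identity(seq_a, seq_b):.1f}% identical")
```

Using the alignment length or the shorter sequence length as the denominator would give slightly different values, so the convention should be stated whenever identities are compared between studies.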

Fig 2. Alignment of a representative selection of known type III AFP sequences.

Sequences are named using the common name for each species except for P. brachycephalum. Names are colored as in Fig 1 with an asterisk indicating the sequences were obtained by PCR in this study (S3 Table) and with a dagger for those assembled from the SRA database (S4 Table). They are numbered consecutively by species within the QAE (Q) or SP (S) group as they appear in S3 Fig. Residues conserved in all isoforms (sasA and B) are indicated with asterisks, variations characteristic of the QAE and SP groups are highlighted cyan and yellow respectively, with other variations in grey. Shared differences in the signal peptides of radiated shanny and rock gunnel are highlighted red. Black and red boxes show residues on the pyramidal- and prism-plane binding surfaces respectively [79]. Inward-facing residues (i) are indicated along the top. Residues unique to SAS are highlighted black with other differences within this group highlighted purple. Signal peptides are in lower case font. Dashes indicate gaps (internal) or incomplete sequence (at termini). The complete protein and nucleotide alignments are shown in S3 and S4 Figs, respectively.

Fig 3. Phylogenetic and functional comparison of type III AFPs.

A) A maximum-likelihood phylogenetic tree of the nucleotide sequences (S4 Fig) of the subset of type III AFP sequences shown in Fig 2. Cyan and yellow backing denotes QAE and SP isoforms respectively and bootstrap values (percent) are indicated at most nodes. The scale bar represents an average of 0.1 or 1 changes per site for solid and dashed lines respectively. B) and C) Representative structure of a QAE (PDB:4UR4) and SP isoform (PDB:4UR6) respectively [53] with the pyramidal and prism ice-binding surfaces colored orange and cyan respectively [78]. D) and E) Diagram of the fluorescent ice-plane affinity of a QAE and SP isoform respectively, adapted from previously published images [79]. F) and G) Ice crystals in the presence of 0.1 mM of a fully-active QAE isoform (M1.1, [91]) and 0.46 mM of an SP isoform (notched-fin eelpout-S5, S3 Fig [27]) respectively. Samples were cooled at a rate of 0.01°C/6 sec for one min then held for the indicated times at 0.1°C below the melting point. The scale bar represents 10 μm.

Northern blotting was performed using various tissues from both the rock gunnel and the radiated shanny (S6 Fig) and the transcripts were most abundant in the liver. Moderate expression was found in skin, gill and stomach, indicating that the liver-dominant expression originally observed in Zoarcidae [47] and Anarhichadidae [29] is also a feature that arose early in Zoarcales. A partial type III AFP sequence from an even more divergent species, the Alaskan ronquil (Fig 1), was amplified from genomic DNA in two overlapping fragments using semi-nested PCR (see Supplementary Methods). This single sequence encodes a QAE isoform, Alaskan ronquil-Q1 (Fig 2). The gene structure is consistent with other type III genes. The 174 bp intron (not shown) lies close to the expected location, but both splice junctions are shifted leftward by three bp (S4 Fig). This sequence is most similar to Atlantic wolffish-Q2. The identities between the protein sequences, as well as between the coding sequences and the single intron, were between 94 and 95%. The 3' UTR is up to 97% identical to other type III sequences. SP sequences were not recovered, but their existence cannot be ruled out as the DNA amount and quality from the museum specimen was low and/or the primers used may not match Alaskan ronquil sequences. The presence of this sequence in the Alaskan ronquil pushes back the origin of type III AFP to approximately 18 Ma.

Additional AFP variants in the Atlantic ocean pout indicate the large size of its gene family

Type III AFP sequences from the Atlantic ocean pout (Z. americanus), the species in which this AFP was first discovered, were previously obtained by Edman degradation of proteins or from cDNA and genomic sequences [25, 26, 34]. Here, cDNA sequences encoding eight type III AFPs from this fish were cloned using 3ʹ RACE (S3 Table). To ensure that the sequence differences were not due to PCR or sequencing errors, only sequences represented by multiple clones are reported. There were between five (ocean pout-S2) and 25 (ocean pout-Q1) clones of each. The eight sequences were deposited in GenBank under accession numbers KR872957-KR872964, but throughout the manuscript, we refer to them by their corresponding protein accession numbers, ALL26673-ALL26680.

When the new ocean pout sequences were compared to known conspecific nucleotide and protein sequences (S3 Table), only two out of the eight matched previously known nucleotide sequences. Of these two, a sequence obtained from muscle (ALL26680) matches one previously obtained from pancreas (ocean pout-Q7, Fig 2, [34]), and a cDNA obtained from gill (ALL26678) matches the genomic clone OP5 [26]. Three others (ocean pout-Q5, ocean pout-S2 and ocean pout-S4) match isoforms known only from protein sequencing (HPLC12, HPLC7 and HPLC1, respectively) [26]. There is evidence of post-translational modification of HPLC1 and HPLC7 as the last two residues encoded by ocean pout-S2 and ocean pout-S4 (Gly-Lys) are absent. The Gly residue is likely acted on by peptidylglycine α-amidating monooxygenase [48] following removal of the Lys, generating an amidated C-terminal Ala. The three remaining sequences, ocean pout-Q1, ocean pout-Q4 and ocean pout-Q6 are unique. Two of these new sequences are SP isoforms (ocean pout-S2 and ocean pout-S4), three are QAE isoforms (ocean pout-Q1, ocean pout-Q4 and ocean pout-Q5) and the sixth (ocean pout-Q6) diverged early within the QAE lineage (Figs 2 and 3, S3 Fig).

New variants from the transcriptomes of viviparous eelpout and P. brachycephalum strengthen the pattern of AFP relationships

Of the over 300,000 transcriptome sequence reads from the livers of eighteen viviparous eelpouts (Z. viviparus) from Scandinavian waters, some of which were from fish harvested in November [49], 0.7% or ~2000 encoded AFPs. Therefore, this collection likely encompasses all of the sequences expressed in liver, but there may be tissue-specific genes expressed in other organs. A total of 19 unique sequences (12 SP, 7 QAE) were unambiguously assembled from these reads. One exactly matched a previously known protein sequence (AGM97733), while two others differed from ABN42204 and ABN42205 at two and three residues, respectively. Once sequences with two or fewer a.a. differences were excluded, 5 QAE and 8 SP sequences remained, designated viviparous eelpout-Q1 to -Q5 and -S3 to -S10 (S3 Fig). The pairs of SRA reads that can be used to generate these sequences are indicated in S4 Table.
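The exclusion of sequences with two or fewer amino acid differences can be illustrated with a simple greedy filter. This is a sketch of one possible procedure, not necessarily the one used here: it assumes equal-length aligned sequences, keeps the first member of each group of near-identical sequences, and uses hypothetical names and toy sequences.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Number of aligned positions at which two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(res_a != res_b for res_a, res_b in zip(aligned_a, aligned_b))


def filter_near_duplicates(sequences: dict[str, str], min_diff: int = 3) -> list[str]:
    """Greedily keep names of sequences differing from every kept one by >= min_diff."""
    kept: list[str] = []
    for name, seq in sequences.items():
        if all(count_differences(seq, sequences[k]) >= min_diff for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy isoforms (not real AFP sequences).
toy = {
    "iso1": "MNAPVSLT",
    "iso2": "MNAPVSLA",  # 1 difference from iso1, so it is excluded
    "iso3": "MKAPISLA",  # 3 differences from iso1, so it is kept
}
print(filter_near_duplicates(toy))  # ['iso1', 'iso3']
```

A greedy filter depends on input order, so which member of a near-identical group is retained is arbitrary unless the sequences are sorted by some criterion first.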

The P. brachycephalum transcriptome sequence reads were generated from mRNA extracted from the hearts and livers of nine captured fish from Antarctica that were reared in tanks [50]. Only 165 reads, or 0.034% of over 480,000 reads obtained from normalized cDNA fractions, encoded AFPs. These assembled into groups encoding three distinct QAE isoforms (P. brachycephalum-Q1, -Q2, -Q4) with 75–89% identity, with P. brachycephalum-Q4 matching the previously-known cDNA sequence [34].
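The read fractions reported for the two transcriptomes can be checked with simple arithmetic. The sketch below uses the approximate totals given in the text ("over 300,000" and "over 480,000" reads) as if they were exact, which is an assumption. Because the P. brachycephalum reads came from normalized cDNA and from different tissues, the two percentages are not directly comparable as expression levels.

```python
def percent_of_reads(afp_reads: int, total_reads: int) -> float:
    """Percentage of transcriptome reads that encode AFPs."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * afp_reads / total_reads


# Approximate counts taken from the text; totals are lower bounds.
viviparous_eelpout = percent_of_reads(2_000, 300_000)
p_brachycephalum = percent_of_reads(165, 480_000)

print(f"viviparous eelpout: {viviparous_eelpout:.1f}%")  # 0.7%
print(f"P. brachycephalum: {p_brachycephalum:.3f}%")     # 0.034%
```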

Phylogenetic comparisons suggest extant type III AFP sequences arose only once

Type III AFP sequences with at least three amino acid differences (within a species) were aligned (Fig 2, S3 Fig) to trace the origin and relatedness of the different sequences both within and between species. Furthermore, the nucleotide alignment for the same sequences (S4 Fig) was considered the best choice for generating a phylogenetic tree for the following reasons. First, there are several informative silent-site mutations, in both the first and third codon positions. Second, some codons have two or three differences that are more informative than the single change represented by the amino acid. Third, the position of the gaps was easier to ascertain from the nucleotide alignment. Although the tree generated using the protein sequences was very similar (S7 Fig), the bootstrap values were lower and many more nodes were unresolved (polytomous). Nevertheless, the two AFPs known only from Edman degradation of purified protein (Canadian eelpout-Q1 and P. brachycephalum-Q3) clustered with the other sequences from the two Antarctic species.
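The first two reasons can be illustrated by comparing aligned coding sequences codon by codon. The sketch below counts nucleotide differences per codon and reports whether the encoded amino acid changes, using the standard genetic code. The two short sequences are hypothetical and were chosen to show silent changes at first and third codon positions.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_differences(cds_a: str, cds_b: str) -> list[tuple[int, bool]]:
    """For each aligned codon: (nucleotide differences, amino acid changed?)."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of three")
    results = []
    for i in range(0, len(cds_a), 3):
        codon_a, codon_b = cds_a[i:i + 3].upper(), cds_b[i:i + 3].upper()
        nt_diffs = sum(x != y for x, y in zip(codon_a, codon_b))
        aa_changed = CODON_TABLE[codon_a] != CODON_TABLE[codon_b]
        results.append((nt_diffs, aa_changed))
    return results


# Hypothetical example: both sequences encode Cys-Leu-Cys.
cds_a = "TGTTTATGT"
cds_b = "TGTCTGTGC"
print(codon_differences(cds_a, cds_b))  # [(0, False), (2, False), (1, False)]
```

In this toy case the protein alignment shows no differences, whereas the nucleotide alignment records three informative changes.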

The phylogenetic tree of a representative subset of the coding sequences (Fig 3) shows that most of the type III AFP sequences cluster into either the SP (yellow shading) or QAE (cyan shading) group. These two types were initially recognized in ocean pout [26]. The identity within the SP group ranges from 80–100%, with Atlantic wolffish-S1 and spotted wolffish-S1 being identical. Within the QAE group, identities are 73% or higher. Between the two groups, identities range from 66% to 89%. The similarity to the progenitor sequence (sasB) is lower, with identities ranging from 60 to 65%. The distance between the SAS cluster and the node at which the QAE and SP groups diverge is quite long, so it was shortened in Fig 3 (dashed line) for aesthetic reasons. This long branch as well as the near identity (96%) of the SAS-B C-terminal domains from wolf eel (Anarrhichthys ocellatus), a fish from the same family of northern fishes (Anarhichadidae) as the wolffish, and the Antarctic eelpout, indicate that all of the AFP sequences are far more similar to each other than they are to SAS. This shows that the AFP arose from SAS one time only.

Two sequences from ocean pout (ocean pout-6 and ocean pout-7) appeared to be intermediate in nature, containing residues typical of both the SP and QAE groups, indicated by yellow and cyan highlighting, respectively (Fig 2). As these variations are not contiguous, they are unlikely to have arisen by recombination or gene conversion. Rather, these alleles appear to have been duplicated soon after the QAE and SP gene lineages began to diverge, so they retain characteristics of both. There are a few other instances where residues typical of QAE sequences are found in SP sequences and vice versa, such as the Lys residue found at position 25 of some QAE sequences. Rather than being ancestral states, these appear to be reversions that occurred subsequent to the initial mutations within each group.

The two SAS sequences from the wolf eel were obtained from a recent genome assembly from 150 bp paired-end Illumina reads (GenBank assembly accession GCA_004355925.1). These genes resided on a 5.7 Mb scaffold. Unfortunately, the four scaffolds containing AFP sequences (NW_022287273, NW_022287277, NW_022287306, and NW_022287306) were only ~2 kb in length and all four encoded an identical SP isoform. Due to the fragmentary nature of these assemblies, this AFP sequence, which was 96% identical to spotted wolffish-S2, was not analyzed further.

SP sequences cluster along family lines

There are two main groups within the SP cluster (S8 Fig), in which sequences from Anarhichadidae (red dashed box) cluster separately from those within Zoarcinae (blue dashed box). The sequences from radiated shanny and rock gunnel also cluster. In contrast, the QAE sequences that are known show a much weaker association by family (Fig 4). This suggests that the common ancestor of these families may have possessed a larger number of QAE sequences than SP sequences. Alternatively, the SP genes may have undergone rounds of expansion and contraction more frequently than QAE genes.

Fig 4. Phylogenetic and geographical distribution of QAE isoforms.

A) Phylogenetic tree of the complete QAE subgroup from S4 Fig. Sequences are indicated as in Fig 2 and sequential tandemers from Antarctic eelpout are labelled alphabetically (e.g. Antarctic-eelpout-Q3a, Antarctic-eelpout-Q3b). Sequences from northern fish are on a green background while those from southern fish are on pink. The scale bar represents an average of 0.02 changes per site and bootstrap values (percent) are indicated at most nodes. B) A newly-discovered Pachycara sp., found during the 2016 NOAA Okeanos expedition at the Mariana Trench, 18°27ʹ N; 147°50ʹ E, in 1.5°C waters below 4000 m (https://service.ncddc.noaa.gov/rdn/oer-rov-cruises/ex1605l3/#tab-20). C) Diagram of a cold-water migration route through a cross-section of the oceans from the Arctic to the Antarctic. The deep water passage shown by the dotted arrows through the tropics avoids warmer surface waters (red).

Viviparous eelpout sequences reveal recent AFP gene amplification

The amino acid sequence identity between the SP isoforms of the viviparous eelpout is at least 78% and between QAE isoforms it is at least 75%. This drops to between 55% and 66% between the two groups. All but one sequence (viviparous eelpout-Q5) closely clusters with sequences from the notched-fin eelpout (Fig 4). They also group closely with some of the ocean pout sequences, a result which is expected given the close relationship between these three species (Fig 1). Within the SP group, the ten viviparous eelpout and four notched-fin SP isoforms cluster separately from the four isoforms known from ocean pouts, albeit with some low bootstrap values (S8 Fig). This suggests that their genes have undergone multiple rounds of gene duplication and unequal crossing over (gene amplification) within the last few million years, after the ocean pout lineage separated from that leading to the notched-fin eelpout and viviparous eelpout lineages (Fig 1, SP↑). We should not be surprised at the plasticity of this and other AFP gene loci because there are other documented examples where AFP gene copy number is highly variable between closely related species [20] and even between the same species in different geographical regions [26].

Antarctic AFP sequences cluster within a single QAE clade

Of interest in this study is the origin of type III AFP genes in the Antarctic zoarcids. Once highly similar sequences were eliminated, a total of 11 protein and 10 nucleotide sequences remained from the Antarctic species (S3 and S4 Figs). These cluster with high confidence into a single group, within the QAE clade (Fig 4). This pattern is consistent with AFP gene loss followed by reamplification from a single progenitor gene.

The encoded protein of one of the sequences recovered from the P. brachycephalum transcriptome exactly matched the known sequence (P. brachycephalum-4) [34, 41] but a few silent or non-coding variations were also detected (not shown). The other two sequences, P. brachycephalum-Q1 and P. brachycephalum-Q2, were more similar (78 and 91% identity respectively) to the second sequence determined by protein sequencing (P. brachycephalum-3), which was isolated from fish living in a different area of Antarctica, approximately 4300 km distant [41, 50]. One of these (P. brachycephalum-Q1) is unique in having a 3ʹ UTR that does not match that of any known isoforms. The C terminus is similar to notched-fin eelpout-Q1 (CLCI vs CLCA, Fig 2), but this appears coincidental as this segment is also non-homologous given the Leu codon differs at two positions and the second Cys codon differs at the wobble position.

Core and pyramidal ice-plane binding residues are well conserved

The protein alignments (Fig 2 and S3 Fig) show the variability at each position within the sequence. These differences have been mapped, using PyMol (1.7.6.3) [51], onto a stereo view of the highest resolution structure available (PDB 1UCS, [33]), which differs from Antarctic eelpout-Q3a at one position (S9 Fig). Structurally important residues, such as the fifteen core residues, are highly conserved with conservative substitutions at only four positions (5-V/I/A, 22-M/I/T, 40-L/I/M, 55-L/I, S3 Fig). Two residues in turns, Pro29 and Pro33, are also conserved (S9A and S9B Fig). These are also the most highly-conserved residues relative to SAS-B (Fig 2), with conservation at all but two positions (22-M/L, 40-I/M), which underscores their importance in maintaining the hydrophobic core and overall structure of this domain/protein.

Residues found on the pyramidal ice-binding surface (IBS) are largely conserved (6 out of 9, green) with only three variable positions (13-I/M, light green, 15-T/S, light green (in only one isoform), 9-Q/V/T/R, grey) whereas the prism IBS is more variable, with only one out of five residues conserved (18-T, red), and four that are variable (orange and pale orange residues) (S9 Fig). One of the Antarctic QAE AFPs, P. brachycephalum-Q4, has residues on the IBS that are typical of SP isoforms (18–20, TLV to TPA, S3 Fig) and these significantly alter the prism IBS (S10A and S10B Fig). Conversely, some SP isoforms such as viviparous eelpout-S7 (S3 Fig) have TLV instead of TPA. When the most extreme substitution at each variable position within the IBS of all of the isoforms is mapped onto an ocean pout QAE structure [52], those on the prism IBS appear to change the surface flatness significantly (S10A and S10C Fig), whereas those on the pyramidal IBS (9, Q to R and 13, I to M) do not.

Both SP and QAE isoforms are more hydrophobic than the progenitor

When the sequence of Antarctic eelpout SAS-B is mapped onto notched-fin eelpout QAE (PDB 4UR4) [53] using PyMol [51], the overall hydrophobicity of both the QAE (S10D Fig) and SP (PDB 4UR6) [53] (S10E Fig) isoforms relative to SAS (S10F Fig) is quite apparent. Here, a PyMol script that colors carbon atoms that lack hydrogen-bonding potential yellow and charges blue or red, was used [54]. The difference is most obvious on and flanking the IBS, as both the QAE and SP isoform are devoid of charged residues here. In contrast, SAS-B has three charged residues that protrude from this surface (K10, K13, D20) and three that are adjacent (K8, E47, D48), making this surface considerably more hydrophilic in SAS than in the AFPs.

The remaining surface residues of the AFPs are quite variable (S9 Fig, grey). There are many differences between QAE and SP isoforms (Fig 2, S3 Fig, yellow and cyan highlighting) or even within each group (Fig 2, S3 Fig, grey highlighting). However, there are ten positions where QAE and SP isoforms share differences relative to SAS (no cyan or yellow, SAS highlighted in black). The positions of two such residues, Thr54 and Asn46, are shown in S9 Fig. Their distance from the IBS would suggest they are not under selection, so these differences may have arisen by genetic drift following the duplication of the primordial AFP gene.

When the region encompassing the IBS is excluded, the AFPs are still more hydrophobic than the progenitor. Certain patches, such as residues 50–53 of QAE (PLGT) and SP (AKGQ), are significantly less charged relative to SAS (EEDD), but in some cases, such as position 23 of QAE, the reverse is true (S10D–S10F Fig). When the six charged residues near the IBS (above) and the last four residues of SAS-B are excluded, this domain still has an excess of six to seven charges over the AFPs. This is reflected in the relative percentage of solvent-exposed surface, calculated using PyMol [51], where it is lowest (57%) in the QAE isoform, slightly higher in the SP isoform (59%) and highest in SAS-B (64%) (S10D–S10F Fig). This is not a property of these particular sequences alone, as the value for the portion of the sequence that was resolved in the NMR structure of human SAS (PDB 1WVO) [55] is similar to SAS-B (64%) and the QAE structure shown in S10A Fig (PDB 1HG7) [52] is similar to the SP isoform in S10E Fig as it has the same number of charged residues.
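The difference in charge at residues 50–53 can be made explicit by counting charged side chains in each patch. This is a minimal sketch that assumes Asp and Glu carry a charge of −1 and Lys and Arg a charge of +1 at neutral pH, with His treated as uncharged; the function name is illustrative and the result is a simple tally, not a structure-based electrostatic calculation.

```python
# Assumed side-chain charges at neutral pH (His treated as uncharged).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def charge_summary(segment: str) -> tuple[int, int]:
    """Return (number of charged residues, net charge) for a peptide segment."""
    charges = [SIDE_CHAIN_CHARGE.get(residue, 0) for residue in segment.upper()]
    n_charged = sum(1 for charge in charges if charge != 0)
    return n_charged, sum(charges)


# Residues 50-53 as given in the text.
PATCHES = {"QAE": "PLGT", "SP": "AKGQ", "SAS": "EEDD"}

for name, segment in PATCHES.items():
    n_charged, net = charge_summary(segment)
    print(f"{name} ({segment}): {n_charged} charged, net charge {net:+d}")
```

With these assumptions the QAE patch carries no charged residues, the SP patch one, and the SAS patch four, all acidic.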

Discussion

Type III AFP, which is found in fishes from a single branch of the taxonomic tree, may have allowed the Zoarcales to diversify and spread during the last 18 Ma, during a time of global cooling (Fig 1). Like most other fish AFPs, these are also present in multi-gene families [56]. We add to the body of evidence that this gene family has undergone many changes during this period, in which gene losses, gene duplications and other mutational events have occurred. The impetus for these changes is likely related both to the changing climate and to the migration patterns of the various species examined.

The Paleocene and early Eocene, from 65 to 45 Ma, were much warmer than today, with an ocean that was ice-free [18]. The presence of cold-intolerant tropical vegetation along the Antarctic coastline during the early Eocene [57] and deep ocean temperatures up to 14°C higher than today [18] support this assertion. The Zoarcales diverged from the Gasterosteiformes (sticklebacks and relatives) around this time [19], so the absence of type III AFPs outside of Zoarcales is not surprising, given the common ancestor of these two groups would have had no need for an AFP.

Once the Earth began to cool and ice was again present in the oceans, fish that acquired AFPs would be able to exploit “freeze-risk ecozones” with their abundance of invertebrate prey [29, 58]. This scenario is considered the impetus for the evolution and diversification of the four known AF(G)P types in diverse fish taxa [20]. An excellent example of ecozone exploitation is provided by the Notothenioids, which became the dominant taxon in Antarctic waters after acquiring AFGPs [59, 60].

The situation in the Northern Hemisphere is more complex as species with all four functional AFP types are found here [20]. In addition, the climate history of the Arctic is not as well understood. While Antarctic glaciation began ~35 Ma, the Arctic remained warmer for far longer with widespread glaciation apparently only occurring within the last 3.5 Ma [61, 62]. However, proxy evidence such as diatom assemblages and ice-rafted debris suggests that ephemeral Arctic sea ice formed far earlier than previously thought, prior to 40 Ma [63, 64].

It is generally accepted that the infraorder Zoarcales originated in the Northern Pacific [65–68]. Therefore, our discovery of type III AFPs in two northern Zoarcales species that diverged over 10 Ma (radiated shanny, rock gunnel) and the Alaskan ronquil that diverged ~18.4 Ma (based on a time-calibrated phylogeny [42]) lends credence to the hypothesis that sea-ice was abundant in the North Pacific, well in advance of the opening of the Bering Strait [69–71] and many Ma before widespread glaciation occurred. It is likely that these groups remained in the Northern Pacific Ocean as nearly all species from the three families to which these fish belong (Bathymasteridae, Pholidae and Stichaeidae) still reside in this area [72]. It may be that the ancestors of the radiated shanny and rock gunnel migrated through the Bering Sea once the Bering Strait opened ~5 Ma [69–71], as the Bering Sea is thought to be the location where the Zoarcales underwent a major radiation [67]. The AFPs of these fish would have allowed them to survive in these icy northern waters. However, species from these three groups never crossed the equator [72]. As the surface waters in the tropical Pacific Ocean were only 1 to 2°C cooler during the last glacial maximum than they are today [73], these waters may have acted as a barrier to the spread of species adapted to survive in colder waters.

The SP and QAE types arose early within the Zoarcales lineage as they are found within all northern fishes for which multiple sequences were obtained [Zoarcidae [26], Anarhichadidae [29, 30], Pholidae and Stichaeidae (this study)]. The progenitor is under strong negative selection as the entire SAS sequence is highly conserved, as shown by the near-identity (96%) of SAS-B from a northern wolf eel (family Anarhichadidae) to that from Antarctic eelpout (family Zoarcidae). The AFPs are far more variable and show a high rate of mutation on surfaces that are not involved in binding to the pyramidal ice plane. Despite this, all of the AFP sequences, whether they belong to the QAE or SP groups, are far more similar to each other than they are to SAS. Additionally, type III AFP is only found in one infraorder (Zoarcales), whereas the three other AFP types that are known to have arisen by convergence or lateral gene transfer are found in fish from different orders [12, 14–17]. Taken together, these observations make it very unlikely that type III AFP arose more than once from SAS (convergence, specifically parallelism). Instead, a single AFP likely arose early during the diversification of infraorder Zoarcales and, following one or more duplication events, gave rise to the SP and QAE isoforms that were transmitted by vertical descent to various families within Zoarcales as they arose.

As fishes from all but the most recently evolved family within Zoarcales are restricted to northern waters (Fishbase, [72]), the type III AFP clearly arose in the Northern Hemisphere. The eelpouts (Zoarcidae) originated around 10 Ma and are one of only a few families of fish found at both poles [74]. How then did some Zoarcids move from the far north to the Antarctic Ocean? It seems unlikely that cold-water fishes would migrate into warmer tropical waters and then back into cold Antarctic waters, all while retaining AFPs. They would need to first adapt to warmer water, where they would face competition from extant warm-water species, and then adapt back to colder water on the way to Antarctica. This would have been the case even during the coldest periods of the ice ages, because although the extent of the warmer waters was reduced, both the tropical Atlantic and Pacific waters were only a couple of degrees cooler than they are today [73]. However, there is another possible route, in both the Pacific and Atlantic basins, that does not require incursion into warm waters. Cold-adapted species could move from pole to pole through deep, perpetually cold waters (Fig 4). Indeed, Zoarcids are the predominant species near hydrothermal vents where the temperature a short distance away is typically around 2°C, and Pachycara spp. have been found at depths exceeding 3 km in both the Atlantic and Pacific basins (Fig 4) [75, 76]. Interestingly, the majority of the Zoarcids that have been found near the equator are demersal (bottom dwelling) fish that have been recovered from depths of 500 m or more [72]. Unfortunately, the present-day distribution of the Antarctic eelpout and P. brachycephalum does not provide further clues as to their migration route as the Southern Ocean lacks land barriers.

Another possible scenario that could explain the presence of type III AFPs in southern Zoarcids is that the fish that migrated south did not have AFPs. Instead, they could have evolved anew from SAS once they encountered the icy Southern Ocean (parallelism). Work by Deng et al. (2010) in Antarctic eelpout showed that it was the SASb gene that was duplicated, with the 3ʹ exon encoding all but the signal peptide of the AFP [23]. The comparison between the C-terminal domain of SASa and SASb from wolf-eel, a northern fish within the same family as the spotted and Atlantic wolffishes, to the corresponding sequences from Antarctic eelpout, casts doubt on this hypothesis. If the AFPs arose independently from SAS progenitors in the two hemispheres (parallelism), the extant AFPs should show more similarity to their conspecific progenitors than to each other. However, the phylogenetic trees indicate the exact opposite (Fig 3). Additionally, there would be no reason for the Antarctic sequences to be most closely related to the single sequence known from Canadian eelpout, which is the closest relative to the Antarctic species in our study, with these two lineages having diverged less than 10 Ma (Fig 1). However, the Canadian eelpout sequence is 87% identical to Antarctic eelpout Q4 and clusters with the Antarctic sequences in the phylogenetic tree (S7 Fig). Furthermore, all of the AFPs clustered together near the end of a long branch (Fig 3), suggesting that the AFPs did not begin duplicating until the primordial gene had diverged significantly from SAS. Although it is plausible that the residues on the ice-binding surfaces could be similar due to convergence, the similarities between the variable surface residues away from these sites is strongly suggestive of homology. Therefore, the AFPs within the Antarctic species are clearly homologs of those found in northern fishes.

An alternative hypothesis to parallelism is that the population that migrated south, long after the onset of Antarctic glaciation, may have lost all but one or a few nearly-identical AFP genes during the journey as all of the Antarctic sequences form a single cluster within the QAE portion of the phylogeny. While it is difficult to prove the complete absence of SP isoforms, the fact that none were recovered from serum or the transcriptome of P. brachycephalum [41, 50] or following screening of EST and BAC libraries from Antarctic eelpout [39, 40] strongly suggests they are absent. Unfortunately, genome sequencing has not proven useful for characterizing this multi-copy gene family, as shown by the failure of the AFPs of wolf eel to properly assemble. Therefore, the sequencing of individual BACs likely provides a clearer picture of the gene complement in Antarctic eelpout [23, 39]. Conversely, it is possible that other species of southern Zoarcids did retain SP isoforms and that there may have been more than one species that migrated into southern waters. The plasticity of the AFP gene family, both with regards to gene number and gene organization, has been clearly demonstrated. For example, the number of AFP genes, even within a single species, can change dramatically as was seen with ocean pout living at different latitudes [26]. Additionally, the organization can differ as the AFP genes are found in tandem repeats in ocean pout [26], as inverted pairs within tandem repeats in Atlantic wolffish [30] and in tandem repeats containing some tandemers at a single locus in Antarctic eelpout [23, 39]. rDNA genes are found in similar arrays and they are known to undergo rapid changes in copy number (reviewed in [77]). It should be noted that there is no well-documented advantage of AFPs to fish other than to protect them from internal ice growth. Thus, there is no need for AFPs in cold deep water where ice crystals are absent. 
Therefore, the phylogenetic analysis supports the hypothesis that gene duplication/amplification of the remaining QAE type gene(s) in the newly arrived migrants would have allowed these fish to survive in the icy Antarctic waters.

Further evidence for the plasticity of the AFP gene family is provided by the additional sequences we obtained from ocean pout, and by analyzing the transcriptome of the viviparous eelpout [49]. A variety of divergent QAE isoforms have been retained in both of these eelpouts as the QAE isoforms of these two species form a number of clusters. However, the SP isoforms of viviparous eelpout cluster together with those of the notched-fin eelpout but separately from those of ocean pout. This suggests that gene losses and duplications may have occurred frequently, particularly within the SP group, within the last 10 Ma. Still, both types have been retained. Many isoforms (all SP and some QAE) have been shown to impart hexagonal shaping to ice without preventing growth, due to structural differences in the ice-binding site that prevent binding to the primary prism plane of ice [78, 79]. This has led some to suggest that these AFPs should be reclassified as "ice growth modifiers" [28, 80]. This limitation is overcome by the addition of minor quantities of a "fully active" QAE isoform [27, 28]. The isoforms appear to show synergism as the activity of the mixture is a little higher than that obtained with the QAE isoform alone [28]. Synergism has also been reported with AFGPs [81, 82] and beetle AFPs [83]. Therefore, the retention of multiple isoform types is likely advantageous for type III-producing fishes.

The two Antarctic species appear to have lost SP isoforms as mentioned above, as all of the isoforms from these two species cluster on a single branch within the QAE grouping of the gene phylogeny. The serum of P. brachycephalum isolated from McMurdo Sound was found to contain only two QAE isoforms [41] whereas transcriptome sequencing (over 480k sequences) from fishes isolated over 4000 km away in the South Shetland Islands [50] revealed three QAE isoforms, two of which were unique, but no SP isoforms. The isoform encoded by the most abundant transcript matched the predominant isoform isolated from the McMurdo fishes. Interestingly, this isoform has two mutations that convert the ice-binding surface from QAE-like (TLV) to SP-like (TPA). Although it was reported to be fully active [41], this is no longer a certainty as we now know that trace amounts of fully-active QAE isoforms can confer full activity to SP isoforms [27, 78]. Why then would fish retain isoforms that are not fully active? One observation is that the SP-like mutations to the ice binding surface appear to extend the footprint of the pyramidal-plane binding surface while slightly reducing the overall hydrophobicity of the protein. This may strengthen or speed up ice binding while improving solubility of the AFP. Additionally, the AFPs are substantially more hydrophobic than Antarctic eelpout SAS-B, a difference that was also noted when the NMR structure of the corresponding domain of human SAS was solved [55]. While this invariably relates to function, as hydrophobicity is a general property of the IBS of type III AFPs and loss of the hydrophobic residues flanking the IBS can be deleterious [84], it may be that a mixture of QAE and SP isoforms is more soluble at high concentrations than one isoform alone. For example, a notched-fin eelpout SP isoform expressed in bacteria has shown a propensity to dimerize [53]. 
Furthermore, it seems unlikely that the ice-binding site of the dominant Antarctic QAE isoform would have mutated to an SP-like ice-binding site if there was no selective advantage to having both types.

The lack of an SP isoform in the Antarctic eelpout also seems probable, as sequencing of 61 AFP-encoding expressed sequence tags (ESTs) and six BAC clones containing multiple AFP genes [39, 40] failed to turn up a single SP isoform. The various isoforms from this species have a number of mutations on the ice-binding surface, such as TLI or TLM instead of TLV, but more importantly, this is the only species known to produce tandemers (sequential AFP domains encoded in a single transcript) [37, 39, 40]. This again suggests that fully-active QAE monomers alone might not be as effective as a mixture of isoforms and reveals yet another mechanism by which the diversity of the type III AFP family is attained.

It is tempting to try to deduce function from the sequences of the various isoforms of type III AFP, but unfortunately, the effects of the various mutations on AFP activity are very difficult to predict. Mutations of SP isoform notched-fin eelpout-S1, introduced to mimic the ice-binding residues of a fully-active QAE isoform (TPA to TLV), almost restored wild type activity. However, the crystals still grew, albeit at a much slower rate [78]. This may be because other residues are also known to affect activity. For example, a reduction in the size of hydrophobic residues bordering the ice-binding surface of a QAE isoform reduced activity [84]. Additionally, it is known that some QAE isoforms are not fully active [27, 28]. Therefore, it may be that the activities of closely-related sequences could vary and the only reliable way to determine this would be to express and purify these isoforms.

There are over 400 known species of Zoarcales [66], including others that inhabit the ocean near Antarctica [67], so the sampling of 12 species catalogued in this study by us and others is by no means a comprehensive survey of all of the type III diversity that is present in this infraorder. There may well be northern species that lack SP isoforms or southern species that have retained them, although the two Antarctic species that have been characterized do appear to have lost their SP isoforms and duplicated/amplified their QAE isoforms. Lineage-specific amplifications of SP isoforms have also taken place in northern species. Additionally, we have shown that all of the known isoforms were derived from a single progenitor sequence and have further elucidated the variability of this gene family in this infraorder. The addition from this study of 24 sequences and their polymorphisms to the type III AFP family should also prove useful in modelling or mutagenesis studies aimed at further understanding the relationship between structure and function in type III AFPs, particularly as a number of these isoforms have mutations on or near the ice-binding surface that would be expected to alter their activity. In addition, we have shown that the type III AFP arose early in the Zoarcales lineage and this may have been one of the factors that allowed this group of fishes to migrate into waters across the globe and to diversify over the last ~20 Ma into the over 400 species that are extant today.

Materials and methods

Sample collection and preparation

The Alaskan ronquil tissue sample preserved in ethanol was obtained from The University of Washington Fish Collection (Catalog number UW 150179). Ethanol was removed from the tissue and DNA was purified as described in Supplementary Methods.

Radiated shannys, rock gunnels and Atlantic ocean pout were collected near Logy Bay, Newfoundland, by divers from the Field Services Unit, Ocean Sciences Centre, Memorial University of Newfoundland, and transported in seawater to the Ocean Sciences Centre. Tissues were obtained from fish maintained at the centre that were euthanized by an overdose of MS222 just prior to tissue extraction. RNA was extracted from fresh or flash-frozen tissues as described in Supplementary Methods. Experiments were approved and carried out in accordance with Animal Utilization Protocols issued by Memorial University of Newfoundland’s Animal Care Committee. All measures were taken to minimize pain and discomfort during animal experiments. Guidelines followed were those of the Canadian Council on Animal Care (CCAC).

Type III AFP cloning and sequencing

Sequences that were obtained by PCR in this study are indicated by an asterisk while those assembled from the SRA database are indicated by a dagger (S3 Fig). A type III AFP sequence from the Alaskan ronquil was obtained from semi-nested PCR. Full-length cDNAs for type III AFP (QAE and SP isoforms) from radiated shanny and rock gunnel were cloned using a commercial kit for RNA ligase-mediated rapid amplification of 5ʹ and 3ʹ cDNA ends (RLM-RACE) [GeneRacer Kit (Invitrogen/Life Technologies)]. Partial cDNAs for type III AFPs from Atlantic ocean pout were cloned using the 3ʹ RACE protocol only. The sequences of all primers used in gene and cDNA cloning are presented in S1 Table. Detailed cloning and sequencing methods are provided in Supplementary Methods.

Assembly of type III sequences from transcriptomes

Sequences from two transcriptomes, obtained from a northern viviparous eelpout (Z. viviparus, SRA SRX002161) and a southern Zoarcid (P. brachycephalum, SRA SRX118640), were screened for sequences encoding AFPs. Longer sequences from the viviparous eelpout were selected to reduce the number of sequences that would exactly match more than one variant. These were grouped into sets of identical or near identical sequences, where the only allowed differences were length variations within homopolymeric runs. Sequences were assembled using the CAP3 Sequence Assembly Program [85]. Following assembly into nineteen (viviparous eelpout) or three (P. brachycephalum) unique sequences, reads that overlapped more than one of these unique sequences were discarded and the assemblies were redone.
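The grouping rule described above (reads treated as equivalent when they differ only in homopolymer run lengths) can be sketched in Python as follows. This is a minimal illustration and not the procedure actually used in the study. It assumes the reads are already trimmed to the same region and strand, it treats a single base as a run of length one, and the function names and example reads are hypothetical.

```python
from collections import defaultdict
from itertools import groupby


def collapse_homopolymers(seq: str) -> str:
    """Reduce every run of identical bases to a single base (AAAT -> AT)."""
    return "".join(base for base, _run in groupby(seq.upper()))


def group_reads(reads):
    """Group reads whose sequences are identical once homopolymer runs are collapsed."""
    groups = defaultdict(list)
    for read in reads:
        groups[collapse_homopolymers(read)].append(read)
    return list(groups.values())


# Hypothetical reads for illustration; not taken from the SRA data sets.
reads = ["ATGAAACCG", "ATGAACCG", "ATGAAACCGG", "ATGTAACCG"]
read_groups = group_reads(reads)

for group in read_groups:
    print(group)
```

Collapsing runs before comparison makes the key insensitive to homopolymer length, which is the only difference allowed in the text. A real pipeline would likely also need to handle sequencing errors and partial overlaps, which this sketch ignores.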

Phylogenetic analysis of predicted type III AFP protein sequences and structure rendering

The assembled Sequence Read Archive (SRA) data from above, along with type III AFP protein and nucleotide sequences from the protein and nucleotide databases of NCBI were aligned using Clustal Omega [86]. Sequences from the same species that differed at two or fewer positions within the protein sequence (considering each AFP domain within the tandemers encoded in multiple BAC clones from Antarctic eelpout individually [23]) were considered redundant and were excluded. Minor adjustments were made to the protein alignment based on the nucleotide alignment. Phylogenetic trees were generated using nucleotide alignments and the SAS-B progenitor from Antarctic eelpout was used to root the tree containing both SP and QAE sequences [23] while a more divergent AFP sequence (ocean pout-7) was used to root the QAE tree. MEGA version 7 [87] was used to perform model tests prior to generating phylogenetic trees by the maximum likelihood statistical method using a moderate branch swap filter and all positions with 500 bootstrap replicates.
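The redundancy criterion above (sequences from the same species that differ at two or fewer aligned positions are excluded) can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes a greedy, order-dependent filter in which the first sequence encountered is kept, it counts gap characters as ordinary differences, and the function names and toy aligned sequences are hypothetical.

```python
def count_differences(a: str, b: str) -> int:
    """Count mismatched columns between two sequences from the same alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    return sum(x != y for x, y in zip(a, b))


def drop_redundant(aligned, max_diff=2):
    """Keep a sequence unless an already-kept sequence from the same species
    differs from it at max_diff or fewer positions (greedy, order-dependent)."""
    kept = []
    for species, name, seq in aligned:
        redundant = any(
            species == kept_species and count_differences(seq, kept_seq) <= max_diff
            for kept_species, _kept_name, kept_seq in kept
        )
        if not redundant:
            kept.append((species, name, seq))
    return kept


# Toy aligned fragments for illustration only; not real AFP sequences.
aligned = [
    ("ocean pout", "iso1", "NQASVVANQL"),
    ("ocean pout", "iso2", "NQASVVANQM"),   # 1 difference from iso1: dropped
    ("ocean pout", "iso3", "NQESLVATQM"),   # 4 differences from iso1: kept
    ("rock gunnel", "iso1", "NQASVVANQL"),  # different species: kept
]
kept_sequences = drop_redundant(aligned)

for species, name, _seq in kept_sequences:
    print(species, name)
```

Because the filter is greedy, the choice of which member of a near-identical pair is retained depends on input order; the text does not state how that choice was made in the study.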

Supporting information

S1 Fig. Alignment of cytochrome oxidase subunit I (COI) sequences from various members of the infraorder Zoarcales.

GenBank accession #s, sequentially top to bottom as in the alignment, are KC016052, KJ205263, KC517318, HQ712639, HQ713113, HQ713057, KJ205118, KC016016, JQ685890, KC015305, EU752057, EF427917. Since a COI sequence was not available for the Antarctic eelpout, Lycodichthys dearborni, the one for Lycodichthys antarcticus was used instead. Asterisks under the alignment indicate perfect conservation of a base.

(DOCX)

S2 Fig. Phylogenetic tree of Zoarcales species using the COI alignment from S1 Fig.

The maximum-likelihood tree was generated using the Kimura 2-parameter with invariant sites model and bootstrap values (%) are indicated at the nodes. The Alaskan ronquil was used as the outgroup and the scale bar represents an average of 0.02 changes per site. This tree was used to determine the placement of the radiated shanny, Antarctic eelpout and Pachycara brachycephalum relative to the other species in Fig 1.

(DOCX)

S3 Fig. Protein alignment of known type III AFPs that have more than two differences relative to other sequences from the same species.

Sequences are named as in Fig 2. Grey highlighting indicates sequences that were determined by Edman degradation. The tandemers of Antarctic eelpout are lettered sequentially (e.g. Antarctic eelpout-Q3b is the second AFP domain in Antarctic eelpout-Q3). Variable residues are highlighted or coloured according to the phylogenetic tree (Fig 3) with conserved residues typical of QAE or SP variants highlighted cyan and yellow, respectively. Mutations that arose somewhere within Antarctic species are highlighted pink and residues within SAS-B that were not conserved when the QAE or SP groups arose are highlighted black. Other differences between SAS sequences are highlighted purple. Differences that do not correlate with these aforementioned groupings are highlighted grey. Red highlighting indicates shared differences between the signal peptides of the sequences from radiated shanny and rock gunnel. The black boxes and red boxes show residues involved in binding to the pyramidal plane and prism plane respectively, as in Fig 2. The signal peptide is in lowercase font. Italics indicate linkers between tandemers, internal dashes indicate gaps, whereas leading or trailing dashes indicate that the sequence is incomplete at either terminus. Identity, high similarity and low similarity between all AFPs (incomplete sequences ignored) are indicated at the bottom with asterisks, colons and periods, respectively. Residues with an inward pointing sidechain are indicated by “i” at the top. Asterisks denote sequences obtained by PCR in this study (S3 Table) and daggers denote those assembled from the SRA database (S4 Table). Accession numbers are listed in S4 and S5 Tables.

(PDF)

S4 Fig. Nucleotide alignment of type III AFPs.

Sequences are named and variable nucleotides are highlighted as in Fig 2. Red highlighting indicates shared differences between the first exons of radiated shanny and rock gunnel and these exons were excluded prior to generating all but the exon 1 phylogenetic tree as they may have been homogenized by exon shuffling. The translation of notched-fin eelpout-Q1 has black boxes and red boxes showing residues involved in binding to the pyramidal plane and prism plane respectively, as in Fig 2. The signal peptide is in lowercase font. Internal dashes indicate gaps, whereas leading or trailing dashes indicate that the sequence is incomplete at the respective terminus. Intronic sequences are not shown for genomic clones but an arrow indicates where an intron is found. The 3′ splice junction of ocean pout-Q2 was originally predicted based on SP isoforms but has been adjusted by three bases (lower-case font) to match QAE-type cDNAs. The 3ʹ end of P. brachycephalum-Q1 (italics) and the linker sequences between Antarctic eelpout tandemers (not shown) were excluded from the phylogenetic analysis as they are not homologous to the other AFP sequences. These nucleotide sequences are unambiguously accessed through the protein accession numbers in S4 and S5 Tables as some of the nucleotide sequences encode multiple AFPs.

(DOCX)

S5 Fig. Maximum-likelihood phylogenetic tree of all non-identical exon 1 sequences from type III AFPs from S4 Fig generated using the Jukes-Cantor model with 5 gamma categories.

Cyan and yellow backing denotes QAE and SP isoforms, respectively, except for radiated shanny and rock gunnel sequences that cluster together on a separate branch. Bootstrap values (percent) are indicated at most nodes and the scale bar represents an average of 0.1 changes per site. Sequences are named as in Fig 2.

(TIF)

S6 Fig. Tissue distribution of rock gunnel and radiated shanny type III AFPs.

Northern blot analysis of total RNA from two rock gunnel individuals (A and B) and two radiated shanny individuals (C and D). The panels in each set show the hybridization signal to the AFP cDNA probe (top), the hybridization signal to the chicken β-tubulin cDNA probe (middle), or ethidium bromide staining of the 28S and 18S rRNA bands (bottom). The tissues are indicated as follows: L = liver, Sk = skin, G = gill, I = intestine, St = stomach, M = muscle, H = heart, Sp = spleen, K = kidney. RNA size marker positions are indicated on the left (bases) and total RNA from cunner skin was used as a negative control (n).

(TIF)

S7 Fig. Maximum-likelihood phylogenetic tree generated from protein sequences.

The amino acid sequences shown in Fig 2, along with two sequences determined solely by Edman degradation of purified proteins (Canadian eelpout-Q1 and P. brachycephalum-Q3, labelled with #) were used to generate a phylogenetic tree equivalent to Fig 3.

(TIF)

S8 Fig. Phylogenetic comparison of SP isoforms.

The nucleotide sequences of the SP subset of type III AFP sequences (S4 Fig) were used to generate a maximum-likelihood phylogenetic tree using a divergent isoform (ocean pout-Q6) as the outgroup. Bootstrap values (percent) are indicated at most nodes and the scale bar represents an average of 0.02 changes per site. Sequences are named as in Fig 2.

(TIF)

S9 Fig. Stereoscopic views of type III AFP showing the location of variable and conserved residues.

Residues that are absolutely, moderately or poorly conserved are colored, respectively, as follows: pyramidal ice-binding plane, dark green, light green, yellow; prism ice-binding plane, red, orange, pale orange; rest of protein, dark blue, light blue, grey. Front view (A), back view (B) and front surface view (C).

(TIF)

S10 Fig. Surface view showing the expected structural effect of ice-binding residue mutations and the hydrophobicity of different isoforms.

A) Wild-type QAE isoform, ocean pout-Q5 (HPLC12, PDB 1HG7) B) Ocean pout-Q5 with the introduction of the three ice-binding mutations found in P. brachycephalum-Q4 and C) A compilation of the most severe mutations at variable ice-binding residues (S3 Fig) introduced to ocean pout-Q5. Nitrogen is blue, oxygen is red, sulfur is yellow and carbon is pale orange (on the pyramidal-plane ice-binding surface), cyan (on the prism-plane ice-binding surface) or white (elsewhere). D) QAE isoform (PDB 4UR4) E) SP isoform (PDB 4UR6) F) Antarctic eelpout SAS-B residues mapped onto 4UR4 (excluding the last four residues of SAS-B). Atoms are colored by charge and hydrophobicity with red for charged oxygen, blue for charged nitrogen and yellow for carbon not bonded to nitrogen or oxygen. All other backbone and polar groups are colored white. Residues are numbered according to Fig 2, except residues 50–53 (PLGT, AKGQ and EEDD respectively). The percentage of the surface that is solvent accessible is indicated.

(TIF)

S1 Table. Sequences of oligonucleotides used in PCR studies.

*F is forward and R is reverse direction.

(DOCX)

S2 Table. Rock gunnel and radiated shanny type III AFP cDNA and predicted protein features (from ProtParam [92]).

1Excludes poly(A) tail. 2Includes STOP codon. 3Excludes STOP codon and poly(A) tail. 4Presumes cleavage of C-terminal Lys.

(DOCX)

S3 Table. Ocean pout sequences cloned in this study and their closest nucleotide and protein matches.

SP and QAE isoforms are highlighted yellow and cyan, respectively, with the QAE sequences that diverged early highlighted grey. Protein accession numbers are used for consistency between figures. Percent identity excludes gaps, and sequences known only from Edman degradation are underlined. Sequences denoted with an asterisk were not included in the alignments as they differed at two or fewer a.a. residues from isoforms that were included.

(DOCX)

S4 Table. Accession numbers for one or two sequences from the NCBI SRA database that will generate viviparous eelpout (SRA accession # SRX002161) and P. brachycephalum (SRA accession # SRX118640) isoforms as shown in Fig 2 and S3 Fig.

(DOCX)

S5 Table. GenBank protein accession numbers for sequences shown in Fig 2 and S3 and S4 Figs.

(DOCX)

S1 File. Supplement for materials and methods, results, and references.

(DOCX)

S1 Raw image

(PDF)

Data Availability

The DNA sequences are available from NCBI GenBank under accession numbers KR872957-KR872964. The corresponding protein accession numbers are ALL26673-ALL26680.

Funding Statement

This research was supported by the Canadian Institutes of Health Research (https://cihr-irsc.gc.ca/e/193.html)(Foundation Grant FRN 148422 to P.L.D.) and the Natural Sciences and Engineering Research Council (https://www.nserc-crsng.gc.ca/index_eng.asp) (Discovery Grant 6836-06 to G.L.F.). P.L.D. holds the Canada Research Chair in Protein Engineering (https://www.chairs-chaires.gc.ca/home-accueil-eng.aspx). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Scholander PF, van Dam L, Kanwisher JW, Hammel HT, Gordon MS. Supercooling and osmoregulation in arctic fish. J Cell Comp Physiol. 1957;49: 5–24. doi: 10.1002/jcp.1030490103
  • 2. Charmantier G, Charmantier-Daures M, Towle D. Osmotic and Ionic Regulation. In: Evans DH, editor. Osmotic and Ionic Regulation: Cells and Animals. CRC Press; 2008. doi: 10.1201/9780849380525
  • 3. Fletcher GL, Kao MH, Fourney RM. Antifreeze peptides confer freezing resistance to fish. Can J Zool. 1986;64: 1897–1901. doi: 10.1139/z86-284
  • 4. Bar Dolev M, Braslavsky I, Davies PL. Ice-Binding Proteins and Their Function. Annu Rev Biochem. 2016;85: 515–542. doi: 10.1146/annurev-biochem-060815-014546
  • 5. Raymond JA, DeVries AL. Adsorption inhibition as a mechanism of freezing resistance in polar fishes. Proc Natl Acad Sci U S A. 1977;74: 2589–2593. doi: 10.1073/pnas.74.6.2589
  • 6. Wilson PW. Explaining thermal hysteresis by the Kelvin effect. Cryo Letters. 1993;14: 31–36.
  • 7. Knight CA, Cheng CC, DeVries AL. Adsorption of alpha-helical antifreeze peptides on specific ice crystal surface planes. Biophys J. 1991;59: 409–418. doi: 10.1016/S0006-3495(91)82234-2
  • 8. Davies PL, Baardsnes J, Kuiper MJ, Walker VK. Structure and function of antifreeze proteins. Bowles DJ, Lillford PJ, Rees DA, Shanks IA, editors. Philos Trans R Soc London Ser B Biol Sci. 2002;357: 927–935. doi: 10.1098/rstb.2002.1081
  • 9. Ewart KV, Lin Q, Hew CL. Structure, function and evolution of antifreeze proteins. Cell Mol Life Sci C. 1999;55: 271–283. doi: 10.1007/s000180050289
  • 10. Fletcher GL, Hew CL, Davies PL. Antifreeze proteins of teleost fishes. Annu Rev Physiol. 2001;63: 359–90. doi: 10.1146/annurev.physiol.63.1.359
  • 11. Graham LA, Li J, Davidson WS, Davies PL. Smelt was the likely beneficiary of an antifreeze gene laterally transferred between fishes. BMC Evol Biol. 2012;12: 190. doi: 10.1186/1471-2148-12-190
  • 12. Graham LA, Lougheed SC, Ewart KV, Davies PL. Lateral Transfer of a Lectin-Like Antifreeze Protein Gene in Fishes. Isalan M, editor. PLoS One. 2008;3: e2616. doi: 10.1371/journal.pone.0002616
  • 13. Chen L, DeVries AL, Cheng C-HC. Convergent evolution of antifreeze glycoproteins in Antarctic notothenioid fish and Arctic cod. Proc Natl Acad Sci. 1997;94: 3817–3822. doi: 10.1073/pnas.94.8.3817
  • 14. Chen L, DeVries AL, Cheng C-HC. Evolution of antifreeze glycoprotein gene from a trypsinogen gene in Antarctic notothenioid fish. Proc Natl Acad Sci. 1997;94: 3811–3816. doi: 10.1073/pnas.94.8.3811
  • 15. Baalsrud HT, Tørresen OK, Solbakken MH, Salzburger W, Hanel R, Jakobsen KS, et al. De Novo Gene Evolution of Antifreeze Glycoproteins in Codfishes Revealed by Whole Genome Sequence Data. Mol Biol Evol. 2018;35: 593–606. doi: 10.1093/molbev/msx311
  • 16. Zhuang X, Yang C, Murphy KR, Cheng C-HC. Molecular mechanism and history of non-sense to sense evolution of antifreeze glycoprotein gene in northern gadids. Proc Natl Acad Sci. 2019;116: 4400–4405. doi: 10.1073/pnas.1817138116
  • 17. Graham LA, Hobbs RS, Fletcher GL, Davies PL. Helical Antifreeze Proteins Have Independently Evolved in Fishes on Four Occasions. Pastore A, editor. PLoS One. 2013;8: e81285. doi: 10.1371/journal.pone.0081285
  • 18. Zachos JC, Dickens GR, Zeebe RE. An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature. 2008;451: 279–283. doi: 10.1038/nature06588
  • 19. Betancur-R R, Broughton RE, Wiley EO, Carpenter K, López JA, Li C, et al. The Tree of Life and a New Classification of Bony Fishes. PLoS Curr. 2013. doi: 10.1371/currents.tol.53ba26640df0ccaee75bb165c8c26288
  • 20. Scott GK, Fletcher GL, Davies PL. Fish Antifreeze Proteins: Recent Gene Evolution. Can J Fish Aquat Sci. 1986;43: 1028–1034. doi: 10.1139/f86-128
  • 21. Doolittle RF. Convergent evolution: the need to be explicit. Trends Biochem Sci. 1994;19: 15–18. doi: 10.1016/0968-0004(94)90167-8
  • 22. Baardsnes J, Davies PL. Sialic acid synthase: the origin of fish type III antifreeze protein? Trends Biochem Sci. 2001;26: 468–469. doi: 10.1016/s0968-0004(01)01879-5
  • 23. Deng C, Cheng C-HC, Ye H, He X, Chen L. Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci U S A. 2010;107: 21593–8. doi: 10.1073/pnas.1007883107
  • 24. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409: 860–921. doi: 10.1038/35057062
  • 25. Li XM, Trinh KY, Hew CL, Buettner B, Baenziger J, Davies PL. Structure of an antifreeze polypeptide and its precursor from the ocean pout, Macrozoarces americanus. J Biol Chem. 1985;260: 12904–12909. Available: http://www.ncbi.nlm.nih.gov/pubmed/3840475
  • 26. Hew CL, Wang NC, Joshi S, Fletcher GL, Scott GK, Hayes PH, et al. Multiple genes provide the basis for antifreeze protein diversity and dosage in the ocean pout, Macrozoarces americanus. J Biol Chem. 1988;263: 12049–12055. Available: http://www.ncbi.nlm.nih.gov/pubmed/3403560
  • 27. Nishimiya Y, Sato R, Takamichi M, Miura A, Tsuda S. Co-operative effect of the isoforms of type III antifreeze protein expressed in Notched-fin eelpout, Zoarces elongatus Kner. FEBS J. 2005;272: 482–492. doi: 10.1111/j.1742-4658.2004.04490.x
  • 28. Takamichi M, Nishimiya Y, Miura A, Tsuda S. Fully active QAE isoform confers thermal hysteresis activity on a defective SP isoform of type III antifreeze protein. FEBS J. 2009;276: 1471–1479. doi: 10.1111/j.1742-4658.2009.06887.x
  • 29. Desjardins M, Graham LA, Davies PL, Fletcher GL. Antifreeze protein gene amplification facilitated niche exploitation and speciation in wolffish. FEBS J. 2012;279: 2215–2230. doi: 10.1111/j.1742-4658.2012.08605.x
  • 30. Scott GK, Hayes PH, Fletcher GL, Davies PL. Wolffish antifreeze protein genes are primarily organized as tandem repeats that each contain two genes in inverted orientation. Mol Cell Biol. 1988;8: 3670–3675. doi: 10.1128/mcb.8.9.3670
  • 31. Shears M, Kao MH, Scott GK, Davies PL, Fletcher GL. Distribution of type III antifreeze proteins in the Zoarcoidei. Mol Mar Biol Biotechnol. 1993;2: 104–111.
  • 32. Davies PL, Hew CL, Fletcher GL. Fish antifreeze proteins: physiology and evolutionary biology. Can J Zool. 1988;66: 2611–2617. doi: 10.1139/z88-385
  • 33. Ko TP, Robinson H, Gao YG, Cheng CHC, DeVries AL, Wang AHJ. The refined crystal structure of an eel pout type III antifreeze protein RD1 at 0.62-Å resolution reveals structural microheterogeneity of protein and solvation. Biophys J. 2003;84: 1228–1237. doi: 10.1016/S0006-3495(03)74938-8
  • 34. Cheng C-HC, Cziko PA, Evans CW. Nonhepatic origin of notothenioid antifreeze reveals pancreatic synthesis as common mechanism in polar fish freezing avoidance. Proc Natl Acad Sci. 2006;103: 10491–10496. doi: 10.1073/pnas.0603796103
  • 35. Scott WB, Scott MG. Atlantic fishes of Canada. Canadian bulletin of fisheries and aquatic sciences. 1988. p. 731.
  • 36. Schrag JD, Cheng C-HC, Panico M, Morris HR, DeVries AL. Primary and secondary structure of antifreeze peptides from arctic and antarctic zoarcid fishes. Biochim Biophys Acta - Protein Struct Mol Enzymol. 1987;915: 357–370. doi: 10.1016/0167-4838(87)90021-5
  • 37. Wang X, DeVries AL, Cheng C-HC. Genomic basis for antifreeze peptide heterogeneity and abundance in an Antarctic eel pout: Gene structures and organization. Mol Mar Biol Biotechnol. 1995;4: 135–147.
  • 38. Wang X, DeVries AL, Cheng CHC. Antifreeze peptide heterogeneity in an antarctic eel pout includes an unusually large major variant comprised of two 7 kDa type III AFPs linked in tandem. Biochim Biophys Acta (BBA)/Protein Struct Mol. 1995;1247: 163–172. doi: 10.1016/0167-4838(94)00205-U
  • 39. Zhang J, Deng C, Wang J, Chen L. Identification of a two-domain antifreeze protein gene in Antarctic eelpout Lycodichthys dearborni. Polar Biol. 2009;32: 35–40. doi: 10.1007/s00300-008-0499-8
  • 40. Kelley JL, Aagaard JE, MacCoss MJ, Swanson WJ. Functional diversification and evolution of antifreeze proteins in the antarctic fish Lycodichthys dearborni. J Mol Evol. 2010;71: 111–118. doi: 10.1007/s00239-010-9367-6
  • 41. Cheng C-HC, DeVries AL. Structures of antifreeze peptides from the antarctic eel pout, Austrolycicthys brachycephalus. Biochim Biophys Acta - Protein Struct Mol Enzymol. 1989;997: 55–64. doi: 10.1016/0167-4838(89)90135-0
  • 42. Radchenko OA. Timeline of the evolution of eelpouts from the suborder Zoarcoidei (Perciformes) based on DNA variability. J Ichthyol. 2016;56: 556–568. doi: 10.1134/S0032945216040123
  • 43. Betancur RR, Wiley EO, Arratia G, Acero A, Bailly N, Miya M, et al. Phylogenetic classification of bony fishes. BMC Evol Biol. 2017;17: 162. doi: 10.1186/s12862-017-0958-3
  • 44. Turanov SV, Kartavtsev YP, Lee YH, Jeong D. Molecular phylogenetic reconstruction and taxonomic investigation of eelpouts (Cottoidei: Zoarcales) based on Co-1 and Cyt-b mitochondrial genes. Mitochondrial DNA. 2016;14: 1–11. doi: 10.3109/24701394.2016.1155117
  • 45. Radchenko OA, Chereshnev IA, Petrovskaya AV, Antonenko DV. Genetic differentiation of species and taxonomic structure of the superfamily Stichaeoidea (Perciformes: Zoarcoidei). Russ J Mar Biol. 2014;40: 473–485. doi: 10.1134/s1063074014060194
  • 46. Radchenko OA. The system of the Suborder Zoarcoidei (Pisces, Perciformes) as inferred from molecular genetic data. Russ J Genet. 2015;51: 1096–1112. doi: 10.1134/s1022795415100130
  • 47. Gong Z, Fletcher GL, Hew CL. Tissue distribution of fish antifreeze protein mRNAs. Can J Zool. 2008;70: 810–814. doi: 10.1139/z92-114
  • 48. Kumar D, Mains RE, Eipper BA. From POMC and α-MSH to PAM, molecular oxygen, copper, and vitamin C. J Mol Endocrinol. 2016;56: T63–T76. doi: 10.1530/JME-15-0266
  • 49. Kristiansson E, Asker N, Förlin L, Larsson DGJ. Characterization of the Zoarces viviparus liver transcriptome using massively parallel pyrosequencing. BMC Genomics. 2009;10: 345. doi: 10.1186/1471-2164-10-345
  • 50. Windisch HS, Lucassen M, Frickenhaus S. Evolutionary force in confamiliar marine vertebrates of different temperature realms: Adaptive trends in zoarcid fish transcriptomes. BMC Genomics. 2012;13: 549. doi: 10.1186/1471-2164-13-549
  • 51. DeLano WL. Pymol: An open-source molecular graphics tool. CCP4 Newsl Protein Crystallogr. 2002;40: 82–92.
  • 52. Antson AA, Smith DJ, Roper DI, Lewis S, Caves LSD, Verma CS, et al. Understanding the mechanism of ice binding by type III antifreeze proteins. J Mol Biol. 2001;305: 875–889. doi: 10.1006/jmbi.2000.4336
  • 53. Wilkens C, Poulsen JCN, Ramløv H, Lo Leggio L. Purification, crystal structure determination and functional characterization of type III antifreeze proteins from the European eelpout Zoarces viviparus. Cryobiology. 2014;69: 163–168. doi: 10.1016/j.cryobiol.2014.07.003
  • 54. Hagemans D, van Belzen IAEM, Luengo TM, Rüdiger SGD. A script to highlight hydrophobicity and charge on protein surfaces. Front Mol Biosci. 2015;2: 56. doi: 10.3389/fmolb.2015.00056
  • 55. Hamada T, Ito Y, Abe T, Hayashi F, Güntert P, Inoue M, et al. Solution structure of the antifreeze-like domain of human sialic acid synthase. Protein Sci. 2006;15: 1010–1060. doi: 10.1110/ps.051700406
  • 56. Davies PL, Graham LA. Protein evolution revisited. Syst Biol Reprod Med. 2018;64: 403–416. doi: 10.1080/19396368.2018.1511764
  • 57. Pross J, Contreras L, Bijl PK, Greenwood DR, Bohaty SM, Schouten S, et al. Persistent near–tropical warmth on the antarctic continent during the early eocene epoch. Nature. 2012;488: 73–77. doi: 10.1038/nature11300
  • 58. Hobbs RS, Fletcher GL. Epithelial dominant expression of antifreeze proteins in cunner suggests recent entry into a high freeze-risk ecozone. Comp Biochem Physiol - A Mol Integr Physiol. 2013;164: 111–118. doi: 10.1016/j.cbpa.2012.10.017
  • 59. Matschiner M, Hanel R, Salzburger W. On the origin and trigger of the notothenioid adaptive radiation. PLoS One. 2011;6: e18911. doi: 10.1371/journal.pone.0018911
  • 60. Near TJ, Dornburg A, Kuhn KL, Eastman JT, Pennington JN, Patarnello T, et al. Ancient climate change, antifreeze, and the evolutionary diversification of Antarctic fishes. Proc Natl Acad Sci. 2012;109: 3434–3439. doi: 10.1073/pnas.1115169109
  • 61. Miller GH, Brigham-Grette J, Alley RB, Anderson L, Bauch HA, Douglas MSV, et al. Temperature and precipitation history of the Arctic. Quat Sci Rev. 2010;29: 1679–1715. doi: 10.1016/j.quascirev.2010.03.001
  • 62. Zachos J, Pagani M, Sloan L, Thomas E, Billups K. Trends, rhythms, and aberrations in global climate 65 Ma to present. Science. 2001;292: 686–693. doi: 10.1126/science.1059412
  • 63. Stickley CE, St John K, Koç N, Jordan RW, Passchier S, Pearce RB, et al. Evidence for middle Eocene Arctic sea ice from diatoms and ice-rafted debris. Nature. 2009;460: 376–379. doi: 10.1038/nature08163
  • 64. Tripati A, Darby D. Evidence for ephemeral middle Eocene to early Oligocene Greenland glacial ice and pan-Arctic sea ice. Nat Commun. 2018;9: 1038. doi: 10.1038/s41467-018-03180-5
  • 65. Nazarkin MV. Gunnels (Perciformes, Pholidae) from the Miocene of Sakhaline Island. Ichthyology. 2002;42: 279–288.
  • 66. Møller PR, Gravlund P. Phylogeny of the eelpout genus Lycodes (Pisces, Zoarcidae) as inferred from mitochondrial cytochrome b and 12s rDNA. Mol Phylogenet Evol. 2003;26: 369–388. doi: 10.1016/s1055-7903(02)00362-7
  • 67. Briggs JC. Marine centres of origin as evolutionary engines. J Biogeogr. 2003;30: 1–18. doi: 10.1046/j.1365-2699.2003.00810.x
  • 68. Briggs JC, Bowen BW. A realignment of marine biogeographic provinces with particular reference to fish distributions. J Biogeogr. 2012;39: 12–30. doi: 10.1111/j.1365-2699.2011.02613.x
  • 69. Marincovich L, Gladenkov AY. New evidence for the age of Bering Strait. Quat Sci Rev. 2001;20: 329–335. doi: 10.1016/S0277-3791(00)00113-X
  • 70. Marincovich L, Gladenkov AY. Evidence for an early opening of the Bering Strait. Nature. 1999;397: 149–151. doi: 10.1038/16446
  • 71. Gladenkov AY, Oleinik AE, Marincovich L, Barinov KB. A refined age for the earliest opening of Bering Strait. Palaeogeogr Palaeoclimatol Palaeoecol. 2002;183: 321–328. doi: 10.1016/S0031-0182(02)00249-3
  • 72. Froese R, Pauly D, editors. FishBase. World Wide Web electronic publication. [cited 10 Feb 2018]. www.fishbase.org
  • 73. Trend-Staid M, Prell WL. Sea surface temperature at the Last Glacial Maximum: A reconstruction using the modern analog technique. Paleoceanography. 2002;17: 1–18. doi: 10.1029/2000pa000506
  • 74. Donnelly J, Torres JJ, Sutton TT, Simoniello C. Fishes of the eastern Ross Sea, Antarctica. Polar Biol. 2004;27: 637–650. doi: 10.1007/s00300-004-0632-2
  • 75. Biscoito M, Almeida AJ. New Species of Pachycara Zugmayer (Pisces: Zoarcidae) from the Rainbow Hydrothermal Vent Field (Mid-Atlantic Ridge). Copeia. 2006. doi: 10.1643/ci-03-031r2
  • 76. Corbella C, Møller PR. Description of a new deep-sea eelpout Pachycara matallanasi sp. nov. from the Solomon Sea (western South Pacific Ocean). Mar Biol Res. 2015;11: 180–187. doi: 10.1080/17451000.2014.894245
  • 77. Salim D, Gerton JL. Ribosomal DNA instability and genome adaptability. Chromosom Res. 2019;27: 73–87. doi: 10.1007/s10577-018-9599-7
  • 78. Garnham CP, Natarajan A, Middleton AJ, Kuiper MJ, Braslavsky I, Davies PL. Compound ice-binding site of an antifreeze protein revealed by mutagenesis and fluorescent tagging. Biochemistry. 2010;49: 9063–9071. doi: 10.1021/bi100516e
  • 79. Garnham CP, Nishimiya Y, Tsuda S, Davies PL. Engineering a naturally inactive isoform of type III antifreeze protein into one that can stop the growth of ice. FEBS Lett. 2012;586: 3876–3881. doi: 10.1016/j.febslet.2012.09.017
  • 80. Harding MM, Ward LG, Haymet ADJ. Type I “antifreeze” proteins. Structure-activity studies and mechanisms of ice growth inhibition. Eur J Biochem. 1999;264: 653–665. doi: 10.1046/j.1432-1327.1999.00617.x
  • 81. Osuga DT, Feeney RE. Antifreeze glycoproteins from arctic fish. J Biol Chem. 1978;253: 5338–5343.
  • 82. Schrag JD, DeVries AL. The effect of freezing rate on the cooperativity of antifreeze glycopeptides. Comp Biochem Physiol - Part A Physiol. 1983;74: 381–385. doi: 10.1016/0300-9629(83)90619-9
  • 83. Wang L, Duman JG. Antifreeze proteins of the beetle Dendroides canadensis enhance one another’s activities. Biochemistry. 2005;44: 10305–10312. doi: 10.1021/bi050728y
  • 84. Baardsnes J, Davies PL. Contribution of hydrophobic residues to ice binding by fish type III antifreeze protein. Biochim Biophys Acta. 2002;1601: 49–54. doi: 10.1016/s1570-9639(02)00431-4
  • 85. Huang X, Madan A. CAP3: A DNA sequence assembly program. Genome Res. 1999;9: 868–877. doi: 10.1101/gr.9.9.868
  • 86. Sievers F, Higgins DG. Clustal omega, accurate alignment of very large numbers of sequences. Methods Mol Biol. 2014;1079: 105–116. doi: 10.1007/978-1-62703-646-7_6
  • 87. Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol. 2016;33: 1870–1874. doi: 10.1093/molbev/msw054
  • 88. Near TJ, Dornburg A, Eytan RI, Keck BP, Smith WL, Kuhn KL, et al. Phylogeny and tempo of diversification in the superradiation of spiny-rayed fishes. Proc Natl Acad Sci. 2013;110: 12738–12743. doi: 10.1073/pnas.1304661110
  • 89. McCusker MR, Bentzen P. Phylogeography of 3 North Atlantic wolffish species (Anarhichas spp.) with phylogenetic relationships within the family Anarhichadidae. J Hered. 2010;101: 591–601. doi: 10.1093/jhered/esq062
  • 90. Hansen J, Sato M, Russell G, Kharecha P. Climate sensitivity, sea level and atmospheric carbon dioxide. Philos Trans R Soc A Math Phys Eng Sci. 2013;371: 20120294. doi: 10.1098/rsta.2012.0294
  • 91. Chao H, DeLuca CI, Davies PL, Sykes BD, Sönnichsen FD. Structure‐function relationship in the globular type III antifreeze protein: Identification of a cluster of surface residues required for binding to ice. Protein Sci. 1994;3: 1760–1769. doi: 10.1002/pro.5560031016
  • 92. Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, et al. Protein Identification and Analysis Tools on the ExPASy Server. The Proteomics Protocols Handbook. 2005. pp. 571–607.

Decision Letter 0

Michael Schubert

4 Mar 2020

PONE-D-19-34926

Antifreeze protein dispersion in eelpouts and related fishes reveal migration and climate alteration within the last 20 Ma

PLOS ONE

Dear Dr. Davies,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that comprehensively addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Apr 18 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Michael Schubert

Academic Editor

PLOS ONE

PLOS ONE now requires that submissions reporting blots or gels include original, uncropped blot/gel image data as a supplement or in a public repository. This is in addition to complying with the image preparation guidelines described at https://journals.plos.org/plosone/s/figures#loc-blot-and-gel-reporting-requirements . These requirements apply both to the main figures and to cropped blot/gel images included in Supporting Information.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. PLOS ONE now requires that authors provide the original uncropped and unadjusted images underlying all blot or gel results reported in a submission’s figures or Supporting Information files. This policy and the journal’s other requirements for blot/gel reporting and figure preparation are described in detail at https://journals.plos.org/plosone/s/figures#loc-blot-and-gel-reporting-requirements and https://journals.plos.org/plosone/s/figures#loc-preparing-figures-from-image-files. When you submit your revised manuscript, please ensure that your figures adhere fully to these guidelines and provide the original underlying images for all blot or gel data reported in your submission. See the following link for instructions on providing the original image data: https://journals.plos.org/plosone/s/figures#loc-original-images-for-blots-and-gels.

In your cover letter, please note whether your blot/gel image data are in Supporting Information or posted at a public data repository, provide the repository URL if relevant, and provide specific details as to which raw blot/gel images, if any, are not available. Email us at plosone@plos.org if you have any questions.

3. Thank you for including your ethics statement:  "Tissues were obtained from fish maintained at the Ocean Sciences Centre, Memorial University of Newfoundland that were euthanized by an overdose of MS222 just prior to tissue extraction. Experiments were carried out in accordance with Animal Utilization Protocols issued by Memorial University of Newfoundland’s Animal Care Committee. All measures were taken to minimize pain and discomfort during animal experiments. Guidelines followed were those of the Canadian Council on Animal Care (CCAC)."

Please amend your current ethics statement to confirm that your named institutional review board or ethics committee specifically approved this study.

For additional information about PLOS ONE submissions requirements for ethics oversight of animal work, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-animal-research  

Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

4. Please upload a copy of Supporting Information Figure S4 which you refer to in your text on page 10.

5. Please upload a copy of Supporting Information Tables S1-S4, which you refer to in your text on pages 7, 9 and 21.

 

 


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Dear Dr. Hobbs et al.,

I have now read through the submitted ms PONE-D-19-34926 in detail – and I do have some concerns that I will specify in detail below:

First, I have some issues with the readability of the paper, i.e. which question was asked vs. how the results were presented. For example, I would have loved to see more readable figures that include the full common and/or scientific names. As presented now (from Figure 2 onwards), the authors have listed only the initials of the common names, so I needed to double-check the actual species names over and over again (a bit frustrating, and very confusing for people who are not that familiar with the group of fishes studied here).

And I am not sure whether the tissue needs to be specified in the figure. I think the different sequences could each be denoted by a number and linked to tissue, e.g. in a supplementary table (since the tissue from which a sequence was obtained does not seem to be a focus of this paper).

Further, it is not clear which data the authors have generated themselves vs. data/sequences that they have collected from databases (GenBank etc.). Maybe this could be elaborated at the very beginning of the Materials and Methods/Results section, thereby specifying the samples collected/investigated that were needed to fill the gaps and answer the specific questions in mind. This is somewhat touched upon in the intro, but could preferably be elaborated more in the Materials and Methods section.

In this regard I would also suggest some rewriting of the introduction, to get the message through to the reader. In the intro you start out with a general paragraph on the antifreeze proteins in fish, which is great. The second paragraph then moves into the details of AFP III, which is the main focus of the ms; this is also OK as is. The next (third) paragraph lists the other AFPs and AFGPs and how they have evolved, and a fourth paragraph covers the timing of the divergence of the different lineages that produce AFPs. I would suggest that these two paragraphs be combined into one that focuses mostly on the evolutionary aspects (convergent evolution and timing); it is a bit descriptive as is. This paragraph could preferably become the second paragraph of the intro, setting the stage for the question in mind, followed by the details about AFP type III and then the questions to be asked within this lineage. I would then also move the description of the early Southern blotting studies (lines 95-101) and the description of the Antarctic species (lines 104-112) up to the AFP type III paragraph (which I suggest should be the third paragraph); this can maybe be divided into two paragraphs. But then, at least, you could list the specific scientific questions in one go at the very end of the introduction. As is, you give some introduction/background and list a question, then move on to some more details/background information and a new question. I think it would be better to combine the background information and then list the questions in one go at the very end.

How the results are presented is also a bit messy. I would first have given the reader the results of the phylogeny of this infraorder (confirming Figure 1 as presented in the ms). Then I would have presented the phylogenetic gene trees of the AFP sequences together with the full-length cDNA sequences encoding both QAE and SP isoforms from rock gunnel and radiated shanny, as well as the additional sequences identified for P. brachycephalum and the Atlantic ocean pout (for the two latter, some of the detailed description could go into the Materials and Methods or the supplement, I think).

One last comment on the results section: I do not think the subtitles are optimal; they should be rewritten to better mirror the results presented.

Then to the inference from the data, where I think the conclusions about the timing of the evolution of the AFPs within this lineage, and that the QAE sequences originate by direct descent, are solid. However, I question whether the same can be said for the SPs. Here, we see a larger degree of homology within families (they cluster together), which is not the case for the QAE sequences (at least not to the same degree).

I would say that the higher divergence between species for the QAE sequences indicates strongly that they originate from the same ancestral QAE sequence, which has then evolved in the different lineages at different paces/in slightly different directions. But for SP the results could indicate that the SP sequences have evolved over and over again independently for the different families (from the SAS sequence specific for the different species and/or the common ancestor for the specific families). Think this cannot be ruled out – and maybe also plausible – since the SPs are only functional in combination with the QAE type (and could be looked upon as added value in addition to the QAE under certain circumstances I guess). Think this or higher degree of divergent selection could explain this (more than gene losses and duplications as the authors state). This could and should be further looked into by performing dN/dS analyses of the sequence data. Would also strongly suggest to do these analyses for the QAE sequences too.

And f. ex not sure if I truly agree on the statement re the QAE sequences on line 255-256: “This suggests that the common ancestor of these families may have possessed a larger number of QAE sequences than SP sequences.”

These speculations, as well as about more gene losses and duplications being responsible for the clustering of the SP sequences can only be inferred if they have had whole genome sequencing data. Same goes for the plausible loss of the SP in the Antarctica species. Full genome data set is needed to confirm a loss) so should be careful here I think.

Additionally, in this paper only two Antarctic species are investigated (it is listed that this lineage contains about 100 species so I guess some of them could have the SP type (i.e. not yet confirmed but then either retained if lost in some of them and/or evolved from the SAS sequence and/or the QAE sequence in the common ancestor of this lineage)).

And then what I really miss in this ms is the sequence data for SP for the Canadian eelpout, would have been beneficial to have that one to see how divergent this SP sequence is compared to the other ones. Any chance that the authors could get hold of that sequence information? This information would for sure enlighten the evolutionary path of the SP within this lineage.

My last comment is about the speculation how the eelpouts have migrated down south. Is there not a possibility that the eelpouts have moved northwards and southwards from different refugia? And diversified in different rounds after settlement in the different regions? Would have loved to see some more speculations here – and then not at least linked more to past paleoclimatic events during the Miocene (cooling ++ as well as different possible refugia during that period). Think this explanation is more likely to have found place.

In relation to this, they write that the eelpouts might have migrated via the cold depths and at the same time (on the way down) lost their SP due to lack of ice crystals at these depths. Not sure if the “red” color on Figure 4 is describing this in a good way as well as the migratory route (not that illustrative). Would suggest to modify this figure and also take into account the other possibilities re coming for different refugia f. ex.

One minor comment:

Would strongly suggest to use gene duplication not amplification.

Reviewer #2: Antifreeze proteins are fascinating examples of convergent evolution of function. Four different types of antifreeze proteins have evolved in fish species throughout the globe. This manuscript addresses three questions about the evolution of Type III antifreeze proteins in Zoarcales including whether the proteins arose by direct descent vs parallel evolution, when the antifreeze proteins (AFPs) arose, and the timing of the colonization of the Antarctic by zoarcids.

The finding of the AFP in the Alaskan ronquil is exciting and challenges the current accepted timeline for the evolution of AFPs. I agree with the manuscript that it also provides additional evidence for an earlier cooling of the Arctic Ocean.

The evidence for the evolution of QAE and SP isoforms is contradictory. While the phylogenetic tree places the appearance of the QAE at the Bathymasteridae (Figure 1), the tree of relationships in Figure 3 suggests that the SP isoform is ancestral and the QAE isoform is the derived state, especially with regards to the placement of the OPpan and the AEsasB. To simply state it: Figure 1 shows QAE arose first and SP is derived whereas Figure 3 shows the opposite (SP is ancestral and QAE is derived). The results in Figure 4 are also contradictory with the statements regarding the Antarctic having the highly derived sequence. Many of the branches have little to no support, which suggests the branches should be collapsed and not interpreted as evidence for the findings. Moreover, the authors propose that “likely lost all but one or a few nearly-identical AFP genes during the journey [from the Arctic to Antarctic]” and that “gene losses and duplications have occurred frequently, particularly within the SP group” so it is possible that SP were lost in other lineages and the SP form arose first.

The manuscript highlights a recent gene amplification in the Viviparous eelpout, however, there is no discussion of the L. dearborni expansion. There is data available from ~30 L. dearborni AFPs on GenBank through BAC sequencing (Deng et al 2010), which has the genomic context for the expansion.

The findings are generally confounded by the duplications and deletions that are presumably occurring very frequently.

For some of the sections, the authors provide alternative explanations. The alternative explanations are incredibly helpful in interpreting the results and I appreciate the inclusion.

The figure legends and text are confounded and it is hard to determine what is in the figure legend and what is main text.

Initials of a common name seems to be an unusual way to label sequences in the text and Figure 2 (sequence alignment).

The tissue of origin is included on some of the isoforms, does the tissue of origin matter? The liver is the most common place for transcription of AFPs but there seem to be tissue-specific transcripts. Are those relevant for the organismal performance? There are several similar smaller results in the manuscript where it is unclear the relevance to the larger findings.

None of the supplementary tables are available.

Figure S4 is not referred to in the main text and is not available in the supplementary materials.

Is there evidence for loss of the SP isoform?

Figure 4C is not informative.

There are some references that are missing (for example, on line 111).

Line 269 has some specific PLOS guidelines that should be removed.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Dec 15;15(12):e0243273. doi: 10.1371/journal.pone.0243273.r002

Author response to Decision Letter 0


21 Apr 2020

Dear Dr. Schubert

Thank you for coordinating the review of our manuscript and for inviting us to submit a revision that addresses the criticisms raised by the reviewers. We apologize for the lack of inclusion of the supplementary tables and the figure legends during the submission process. We have addressed our responses to the reviewers point by point as documented below.

Reviewer Comments:

Reviewer 1

I have now read through the submitted ms PONE-D-19-34926 in detail – and I do have some concerns that I will specify in detail below:

First, I have some issues with the readability of the paper, i.e. which question that was asked vs how the results were presented. F. ex. I would have loved to see more readable figures that includes the full common name and/or scientific names. As presented now (from Figure 2 and onwards) the authors have only listed the initials of the common names (then I needed to double check the actual species named over and over again (a bit frustrating and very confusing for people that are not that familiar with the group of fishes that have been studied here).

- We have taken this advice and have altered all the figures and tables to include the full common name, or a slight contraction, where necessary. In addition, we have removed the accession numbers from figure 2 and the supplementary figures and inserted them into the corresponding figure legends.

And: not sure if which tissue needs to be specified in the figure? Think the different sequences could be denoted a number – and linked to tissue f. ex. in a supp table (since which tissue the sequence is obtained from seems not to be in focus in this paper).

- This point was also raised by Reviewer 2, and as described below information about tissue origin has been removed from the figures. However, this information can still be accessed from Table S3.

Further, it is not clear which data the authors have generated themselves vs. data/sequences that they have collected from databases (GeneBank ++). Maybe this could have been elaborated in the very beginning of the Mat and Met/Results section. And by such specify the samples collected/investigated that were needed to fill the gaps/answered the specific questions in mind? This is somewhat touched upon in the intro – but could preferably be elaborated more in the Mat and Met section.

- We have now distinguished the data we generated ourselves from those that came from databases by using a double underline in sequence alignments and in phylogenetic trees to denote sequences we obtained via PCR, and a single underline for those newly assembled from the sequence read archive in GenBank. This is now stated in Materials and Methods at line 422 and in the legend to Fig 2. No underlining is used for sequences obtained from the protein or nucleotide databases of GenBank.

In this regard I would also suggest some re-writing of the introduction – to get the message through to the reader. In the intro you start out with a general paragraph re the antifreeze proteins in fish – which is great. Then in the second paragraph moving into the details of AFP III which is the main focus of the ms – which is also OK as is. The next (third) paragraph lists the other AFPs and AFGPs and how they have evolved and then a fourth paragraph about the timing of the divergence of the different lineages that produced AFP. I would suggest that these two paragraphs are combined into one – and maybe focus on the evolutionary aspects (convergent evolution and timing the most; a bit descriptive as is). And this paragraph could preferably have become the second paragraph of the intro – setting the stage of the question in mind. And then going into the detail about the AFP type III afterwards and then moving into the questions to be asked within this lineage. I would then also have moved upward the description about the early Southern blotting studies (line 95-101) + the description about the Antarctic species (104-112) to the AFP type III paragraph (which I suggest to be the third paragraph). Can maybe be divided into two paragraphs. But then at least, you could list the specific scientific questions in one go at the very end of the introduction. As is, you give some introduction/background and list the question, move on to some more details/background information and a new question. Think it would be better to combine the background information and then list the questions in one go at the very end.

- The introduction has been extensively edited and rearranged to follow these suggestions and address the reviewer’s concerns. These changes can be seen in the marked-up version of the manuscript.

How the results are presented is also a bit messy. Would have first given the reader the results of the phylogeny of this infra order (confirming the Figure 1 which is presented in the ms).

- The phylogeny results have been moved forward so the alignment and tree that support Figure 1 are now the first and second Supplementary Figures.

Then I would have presented the phylogenetic gene trees of the AFP sequences together with the full-length cDNA sequences encoding both QAE and SP isoforms from rock gunnel and radiated shanny as well as the additional sequences identified for the P. brachycephalum and the Atlantic ocean pout (and for the two latter some of the detailed description could go into the Mat and Met or supp I think).

- Done

And one last comment re the result section – do not think the subtitles are optimal should be re-written to mirror the results presented in a better way.

- We have revised many of the sub-titles to make them more descriptive of the results being presented in the following text.

Then to the inference of the data, where I think the conclusions about the timing of the evolution of the AFPs within this lineage and that QAE sequences originate from direct descent are solid. However, I do question if they could say the same for the SPs? Here, we do see a larger degree of homology within families (that they cluster together) which is not the case for the QAE sequences (to the same degree at least).

I would say that the higher divergence between species for the QAE sequences indicates strongly that they originate from the same ancestral QAE sequence, which has then evolved in the different lineages at different paces/in slightly different directions. But for SP the results could indicate that the SP sequences have evolved over and over again independently for the different families (from the SAS sequence specific for the different species and/or the common ancestor for the specific families). Think this cannot be ruled out – and maybe also plausible – since the SPs are only functional in combination with the QAE type (and could be looked upon as added value in addition to the QAE under certain circumstances I guess). Think this or higher degree of divergent selection could explain this (more than gene losses and duplications as the authors state). This could and should be further looked into by performing dN/dS analyses of the sequence data. Would also strongly suggest to do these analyses for the QAE sequences too.

- The sequence differences between the SAS C-terminal domain and SP-type AFP isoforms are considerable and much more extensive than between SP sequences. Therefore, it seems unlikely that such similar SP sequences would independently evolve over and over again. Nevertheless, as a theoretical possibility we have included this scenario at your suggestion.

- We have performed the suggested dN/dS analyses, which suggest the AFP sequences are under positive selection. We describe these results beginning on line 276 and show the data in Fig S10A.

And f. ex not sure if I truly agree on the statement re the QAE sequences on line 255-256: “This suggests that the common ancestor of these families may have possessed a larger number of QAE sequences than SP sequences.”

- We do realize that this statement is speculative, as these genes have likely undergone many rounds of expansion and contraction, but the greater variability within the QAE group does indicate that it is a possibility.

These speculations, as well as about more gene losses and duplications being responsible for the clustering of the SP sequences can only be inferred if they have had whole genome sequencing data. Same goes for the plausible loss of the SP in the Antarctica species. Full genome data set is needed to confirm a loss) so should be careful here I think.

- Your point is well taken. It is worth noting that genome sequencing is the easy part. Assembling and annotating the genome is incredibly difficult and very few have been done properly from start to finish. A good example where genome sequencing of an AFP-producing fish was not helpful is described in Zhuang X, Yang C, Fevolden SE, Cheng CH. Protein genes in repetitive sequence-antifreeze glycoproteins in Atlantic cod genome. BMC Genomics. 2012 13:293. doi: 10.1186/1471-2164-13-293. PMID: 22747999. There, the large number of AFGP genes of a northern cod were missed in the genome assembly due to their repetitive nature.

Additionally, in this paper only two Antarctic species are investigated (it is listed that this lineage contains about 100 species so I guess some of them could have the SP type (i.e. not yet confirmed but then either retained if lost in some of them and/or evolved from the SAS sequence and/or the QAE sequence in the common ancestor of this lineage)).

- If any species had an AFP that had convergently evolved (parallelism) from the SAS sequence, it should form a distinct clade within the phylogenetic tree in Figure 3. What we observe is a very long branch to SAS (dotted line) and a clustering of all known type III AFPs, including those from these two Antarctic species. This point is made in lines 206 to 209 of the Results and touched on again in the Discussion (lines 328 – 332).

- There are over 400 species, of which sequence information is available for only 12. We are confident about our model, but it could certainly be revisited when sequences are available from substantially more species.

And then what I really miss in this ms is the sequence data for SP for the Canadian eelpout, would have been beneficial to have that one to see how divergent this SP sequence is compared to the other ones. Any chance that the authors could get hold of that sequence information? This information would for sure enlighten the evolutionary path of the SP within this lineage.

- Unfortunately, the only sequence available from this fish is a QAE isoform and this was determined by Edman degradation of a purified protein in 1987 (Schrag JD, Cheng C-HC, Panico M, Morris HR, DeVries AL. Primary and secondary structure of antifreeze peptides from arctic and antarctic zoarcid fishes. Biochim Biophys Acta - Protein Struct Mol Enzymol. 1987;915: 357–370).

My last comment is about the speculation how the eelpouts have migrated down south. Is there not a possibility that the eelpouts have moved northwards and southwards from different refugia? And diversified in different rounds after settlement in the different regions? Would have loved to see some more speculations here – and then not at least linked more to past paleoclimatic events during the Miocene (cooling ++ as well as different possible refugia during that period). Think this explanation is more likely to have found place.

- We think it is unlikely for two reasons stated in the manuscript: 1) The AFPs are all clearly related and appear to have evolved once. 2) The Antarctic species arose from northern species long after ice was again present at the poles, so the fish had clearly left refugia and populated icy seas well before the lineage that includes the southern species arose.

In relation to this, they write that the eelpouts might have migrated via the cold depths and at the same time (on the way down) lost their SP due to lack of ice crystals at these depths. Not sure if the “red” color on Figure 4 is describing this in a good way as well as the migratory route (not that illustrative). Would suggest to modify this figure and also take into account the other possibilities re coming for different refugia f. ex.

- The red colour indicates the warmer surface waters near the equator, whereas the blue indicates cold water at all depths in the north and south and at depths throughout the tropics. This is now clearly stated in the legend to Fig. 4. We have added a dashed arrow to illustrate the most likely direction of migration. We have not added the other speculation to the figure to avoid confusing the reader but have detailed this in the text as described above.

One minor comment:

Would strongly suggest to use gene duplication not amplification.

- In our earlier work we have published several examples of AFP genes going from a presumed single copy to huge tandem arrays of 30 to over 150 gene copies. Although gene duplication was no doubt responsible for the initial expansion of gene copy number, unequal crossing over would be at work for the major expansion of the tandem arrays. We have now added this definition of gene amplification into the manuscript on lines 235 and 236.

“This suggests that their genes have undergone multiple rounds of gene duplication and unequal crossing over (gene amplification) within the last few million years, after the ocean pout lineage separated from that leading to the notched fin and viviparous lineages.”

Reviewer #2: Antifreeze proteins are fascinating examples of convergent evolution of function. Four different types of antifreeze proteins have evolved in fish species throughout the globe. This manuscript addresses three questions about the evolution of Type III antifreeze proteins in Zoarcales including whether the proteins arose by direct descent vs parallel evolution, when the antifreeze proteins (AFPs) arose, and the timing of the colonization of the Antarctic by zoarcids.

The finding of the AFP in the Alaskan ronquil is exciting and challenges the current accepted timeline for the evolution of AFPs. I agree with the manuscript that it also provides additional evidence for an earlier cooling of the Arctic Ocean.

The evidence for the evolution of QAE and SP isoforms is contradictory. While the phylogenetic tree places the appearance of the QAE at the Bathymasteridae (Figure 1), the tree of relationships in Figure 3 suggests that the SP isoform is ancestral and the QAE isoform is the derived state, especially with regards to the placement of the OPpan and the AEsasB. To simply state it: Figure 1 shows QAE arose first and SP is derived whereas Figure 3 shows the opposite (SP is ancestral and QAE is derived). The results in Figure 4 are also contradictory with the statements regarding the Antarctic having the highly derived sequence. Many of the branches have little to no support, which suggests the branches should be collapsed and not interpreted as evidence for the findings. Moreover, the authors propose that “likely lost all but one or a few nearly-identical AFP genes during the journey [from the Arctic to Antarctic]” and that “gene losses and duplications have occurred frequently, particularly within the SP group” so it is possible that SP were lost in other lineages and the SP form arose first.

- We argue that there is no evidence to support one isoform type preceding the other and have added some clarifications to address the concern of the reviewer. Our lack of recovery of an SP isoform from ronquil could have occurred due to non-matching primers or the poor quality of the museum specimen used. We have altered figure 1 to include “SP?” at the branchpoint leading to ronquil to indicate that it is still undetermined as to when this isoform arose. We would also argue that figure 3 does not indicate that SP arose first. The SP and QAE groups diverge at a single node. If the two “intermediate” sequences are excluded, the bootstrap value for this node is very well supported. If the QAE sequences arose from an SP sequence, we would expect them to form a subcluster within the SP tree, rather than being separate from it.

The manuscript highlights a recent gene amplification in the Viviparous eelpout, however, there is no discussion of the L. dearborni expansion. There is data available from ~30 L. dearborni AFPs on GenBank through BAC sequencing (Deng et al 2010), which has the genomic context for the expansion.

- The sequences from these BACs were indeed used in our manuscript, the accession numbers were given, and the manuscript was cited. We have added text in the Discussion (line 390) and in Materials and Methods (line 448) to make it clearer that all these BACs were examined, and we have cited the manuscript again there and on line 357. The reason that only seven L. dearborni sequences were used in the phylogenetic tree was that most of these sequences differed at two or fewer residues, so they would not have added anything of value to the phylogenetic analysis. We do state in Materials and Methods that this criterion was used when selecting sequences for further analysis but had not mentioned L. dearborni specifically as it was applied to sequences from all species.
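As an illustration only, the kind of redundancy filter described above (dropping sequences that differ from an already retained sequence at two or fewer residues) could be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used for the manuscript: the greedy first-come ordering, the function names, and the toy sequences are all hypothetical, and the sequences are assumed to be pre-aligned to equal length.

```python
def count_differences(a: str, b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def select_representatives(seqs: dict[str, str], max_diff: int = 2) -> list[str]:
    """Greedily keep a sequence only if it differs from every
    already-kept sequence at more than max_diff positions."""
    kept: list[str] = []
    for name, seq in seqs.items():
        if all(count_differences(seq, seqs[k]) > max_diff for k in kept):
            kept.append(name)
    return kept


# Toy, made-up sequences (NOT real AFP sequences), already aligned.
toy = {
    "seqA": "MNQASVVANQ",
    "seqB": "MNQASVVANK",  # 1 difference from seqA, so treated as redundant
    "seqC": "MNQESIVATK",  # 4 differences from seqA, so retained
}

representatives = select_representatives(toy)
print(representatives)
```

A greedy pass like this depends on input order; a clustering-based approach would be an alternative if order independence mattered.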

The findings are generally confounded by the duplications and deletions that are presumably occurring very frequently.

- On line 240, we reference the example of type III AFP gene copy variation in different populations of present-day eelpouts (see reference 26).

For some of the sections, the authors provide alternative explanations. The alternative explanations are incredibly helpful in interpreting the results and I appreciate the inclusion.

- We appreciate that comment.

The figure legends and text are confounded and it is hard to determine what is in the figure legend and what is main text.

- We have paid particular attention to removing results and discussion from figure legends and have placed there only the information needed to interpret the figure.

Initials of a common name seems to be an unusual way to label sequences in the text and Figure 2 (sequence alignment).

- As requested by Reviewer 1 we have changed the labeling to refer now to common species names.

The tissue of origin is included on some of the isoforms, does the tissue of origin matter? The liver is the most common place for transcription of AFPs but there seem to be tissue-specific transcripts. Are those relevant for the organismal performance? There are several similar smaller results in the manuscript where it is unclear the relevance to the larger findings.

- It is difficult to say if the tissue origin will be important at some point. This information has been removed from the labels and some discussion of tissue origin has been removed from Results. However, this information can still be accessed from Table S3.

None of the supplementary tables are available.

- Our apologies for this oversight. They are now provided.

Figure S4 is not referred to in the main text and is not available in the supplementary materials.

- We reference this figure on lines 303-304 of the manuscript. We do apologize that this figure was somehow not included in the draft pdf that the reviewers received.

Is there evidence for loss of the SP isoform?

- Only insofar as SP isoforms are not found in the Antarctic species but are present in the northern species.

Figure 4C is not informative.

- The description for this part of the figure was lacking in the legend. We have rectified this. It represents a cross-section of the oceans from north to south through which the eelpouts migrated in deep, cold water. Water temperature is indicated by a red (warm) – blue (cold) gradient. The likely direction of migration is indicated by a dotted arrow.

There are some references that are missing (for example, on line 111).

- We have checked the source of every nucleotide and protein sequence from ocean pout in the NCBI databases and the three references we cited account for all of them.

Line 269 has some specific PLOS guidelines that should be removed.

- Done

We trust that these changes will meet with your approval.

Sincerely,

Peter L. Davies, PhD, FRSC

Canada Research Chair in Protein Engineering

Laurie A. Graham, PhD

Decision Letter 1

Michael Schubert

17 Jun 2020

PONE-D-19-34926R1

Antifreeze protein dispersion in eelpouts and related fishes reveals migration and climate alteration within the last 20 Ma

PLOS ONE

Dear Dr. Davies,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that comprehensively addresses the points raised during the review process.

Please submit your revised manuscript by Aug 01 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Michael Schubert

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Dear Dr. Hobbs et al.,

I have now read through the revised version of the ms PONE-D-19-34926 in detail, and appreciate the fact that the authors have taken many of my comments into account, but I still have some issues that I think should be dealt with before acceptance for publication.

First, I think the authors did a great job on the introduction and results section, it reads so much better now. Also, happy to see that they performed the suggested additional analyses testing for positive and/or negative selection.

Re the figures I do see that they have tried to change accordingly to my suggestions – but still think they could be improved a bit further:

It is the shortening of some of the common names + underline + underscore that bothers me the most.

I strongly advise the authors to come up with a better naming system – where, for example, the number for the different sequences in the same cluster is the same (not following a number given by you and linked to the accession number): like “common name” followed by QAE1, “common name” followed by QAE2 and so on – which could be easily used in the text as well – and would be much more meaningful. The linking of the sequence number to the accession number should still be done – but after you have done the numbering of the clusters – hope you get what I mean here.

I see that this will take up some space but still better than as is with the shortening of the names + the underscore + the number (linked to the accession numbers).

It would be so much better if the sequences were properly classified into different clades/clusters.

Think this would be less confusing, like the short names + the number linked to the sequences which are used in the text (for example ronquil-1 and Ant_eelpout_4 +++).

Then I would also strongly suggest to use asterisk * or ** instead of underline (think this will look better).

Then to the inference and discussion of the results:

When I now re-read the discussion – I do see that it has become more coherent – but think it could still benefit from some rewriting. For example, it takes a few paragraphs before they start the discussion of their results for real. I would have started the different paragraphs with a highlight of the results – before going into the discussion – so that the reader gets the message/new findings up front.

For the dN/dS results – discussed in lines 324-337 – I think the authors could go a bit further in their interpretation of these findings – since they do find differences in positive selection for the QAE vs SP sequences. This is not highlighted in the discussion – which I think it should be: the higher degree of positive selection found for the QAE sequences could be linked to the higher variation observed as well as to the clustering patterns. It is stated somewhat in the discussion, but not in relation to the difference in dN/dS ratio observed between the QAE and SP sequences. I truly think this could be linked to the weaker association to families by the QAE sequences – as compared to the SP – which indeed show a clearer separation by family (i.e. which then could be linked to the lower degree of similar positive selection). See my comments below (also related to this issue).

Line 329: Can you please elaborate on your results and function of the signatures of selection that you find on the surfaces?

Do understand that they are not involved in the binding to the pyramidal ice plane which are found to be conserved. Are they found on the prism IBS? What is the function of the prism IBS? If not, where do you find most changes – and any idea what those changes could have of effect on the function?

Line 331-333:

I find this sentence a bit awkward and suggest to re-write:

“This together with the fact that the type III AFP is only found within this lineage, and is not a result of gene conversion and/or lateral gene transfer, indicates that …..”

In the first review I stated the following:

“I would say that the higher divergence between species for the QAE sequences indicates strongly that they originate from the same ancestral QAE sequence, and have then evolved in the different lineages at different paces/in slightly different directions. But for SP the results could indicate that the SP sequences have evolved over and over again independently for the different families (from the SAS sequence specific for the different species and/or the common ancestor for the specific families). Think this cannot be ruled out – and is maybe also plausible – since the SPs are only functional in combination with the QAE type (and could be looked upon as added value in addition to the QAE under certain circumstances I guess).”

You say that you have added this – but I cannot see that it has been included in more depth than already mentioned in lines 226-228 (which were already included in the ms in the first round).

Further I stated: “And f. ex not sure if I truly agree on the statement re the QAE sequences on line 255-256: “This suggests that the common ancestor of these families may have possessed a larger number of QAE sequences than SP sequences.””

How I see this now (when the dN/dS analyses have been performed) is that both groups (QAE and SP) have most likely undergone gene duplications – also the QAE (as you indirectly state by saying that they have a large number from the very beginning). It is most likely the timing and fate of the gene duplications for the different groups that are somewhat different, with more and stronger selection for the QAEs compared to the SPs, which again could imply that the QAE gene duplications are more likely to be maintained and also result in genes with more similar function over different families. This goes for SP too but most likely at a slower pace. Here we could have gene duplications that are more easily lost, and selection for similar function is maybe not that evident? In fact, the lower degree of positive selection indicates a lower degree of gene duplications (that are maintained), not a higher one as the authors state. The authors should look into those statements.

For instance, you write in line 371-372: “This suggests that gene losses and duplications have occurred frequently, particularly within the SP group, within the last 10 Ma.”

Can you really say this? Think we for sure agree that this is a result of gene duplication events -> but can we really infer from this that they happened more frequently?

Additionally, is it so that the duplications/precursors for the QAE arose in the ancestor of this lineage, while for the SP sequences the duplication event could have occurred later (in separate rounds, as I stated in my previous comment to the authors)? To infer this, it would actually have been nice to see the different variants found of both QAE and SP mapped onto the species tree, which would also be a nice addition as one of the main figures of the paper. And if made, inference of the timing can be made.

Think the authors should revisit the ms with this in mind and add comments re the different possibilities both in the results section as well as the discussion.

Then I state in my first review:

“These speculations, as well as about more gene losses and duplications being responsible for the clustering of the SP sequences can only be inferred if they have had whole genome sequencing data. Same goes for the plausible loss of the SP in the Antarctica species. Full genome data set is needed to confirm a loss so should be careful here I think.”

This was a statement to the authors that precaution should be made – since they do not have the full overview of the gene variant present or not – and that this should be stated in the paper (i.e. that full genome data-sets are needed to look into this in more detail).

And PS: fully aware that these genes can be hard to assemble – but they will be part of the raw reads and/or unassembled contigs (so most likely genome data will aid here too even if not fully put together).

Furthermore, I stated in my previous review:

“Additionally, in this paper only two Antarctic species are investigated (it is listed that this lineage contains about 100 species, so I guess some of them could have the SP type, i.e. not yet confirmed but then either retained if lost in some of them and/or evolved from the SAS sequence and/or the QAE sequence in the common ancestor of this lineage).”

This needs to be addressed I think (still not done I see) – that you have only looked into two species – and that the loss observed could potentially not be the case for all of these species.

Moreover, I also stated:

“And then what I really miss in this ms is the sequence data for SP for the Canadian eelpout; it would have been beneficial to have that one to see how divergent this SP sequence is compared to the other ones. Any chance that the authors could get hold of that sequence information? This information would for sure enlighten the evolutionary path of the SP within this lineage.”

I see that this is not obtained – but they do state that a QAE isoform is identified – can I ask why this is then not included in your analyses? This would also have been a nice addition to the paper for sure! And also, I would have appreciated it if the authors could have made the readers aware of the fact that SP is not obtained in the Canadian eelpout. Could it be that it is also lacking from this species -> lost in the common ancestor? Or do you know that it is present? If yes, how do you know this? In other words – some more elaboration around this – the findings vs. interpretations and limitations in their dataset is still needed.

And I do see that the authors did not agree with me re my hypothesis re a common refugia. How was the climate 5 mill years ago? All fish settled in the north they think?

Would still have loved to see some more speculations here – and then not least linked more to past paleoclimatic events during the Miocene (cooling ++ as well as different possible refugia during that period). Any reason why some should start migrating down south?

My last and final comment is that I miss a concluding remark; as it is, the discussion ends quite abruptly.

Some minor comments:

In line 350 you state:

“The two Antarctic eelpouts diverged from the Canadian eelpout more recently than 10 Ma (Fig 1).”

This is not correct I think: the Antarctic eelpouts and the Canadian eelpouts have a common ancestor -> they clearly did not diverge from the Canadian eelpouts. Hope the authors agree – and re-write.

In line 357 you state:

“The population that migrated south, ....”

This is pure speculation. How the migration occurred we do not know: was it one or more populations? Could it have been several events? And not least, the specimens that migrated could have diverged along the gradient down south. Why I say this is that you so far only have two of the Antarctic species included in this study.

The only thing that needs to be added is that “in the species investigated we find …..” so that the reader is aware of this.

In line 381 you state:

“The two Antarctic species did not retain SP isoforms as mentioned above, .....”

I do not understand “as mentioned above” – in the results section? It is not stated anywhere in the discussion as far as I can see. Think you should re-write.

Reviewer #2: The revised manuscript is much improved from the original submission.

There are a few outstanding / new issues with the manuscript but I think they are all straightforward to address.

The section on positive selection is incorrect. The added dN/dS analysis is incorrect / incomplete. There are several issues that I outline here. SNAP does not test for selection, so in the current implementation dN/dS > 1 cannot be distinguished from dN/dS = 1. To test for selection, the authors will need to use HyPhy, codeml or another similar program that actually uses different codon models to test for selection. The cumulative dN and dS rates (Fig 10A) are not informative about positive selection. It is notable that the signal peptide does not have nonsynonymous or synonymous substitutions, but this has been observed before. The pairwise comparisons with ten or more positions likely lead to an overestimate of dN/dS.
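
For illustration only, the kind of test the reviewer refers to can be sketched as a likelihood ratio test between two nested codon models, one that does not allow dN/dS > 1 and one that does. The sketch below assumes the two log-likelihoods have already been obtained from a program such as codeml or HyPhy, and that the models differ by two free parameters (a common setup for site-model pairs). The function name and the numbers are hypothetical and are not taken from the manuscript.

```python
import math
from typing import Tuple


def lrt_df2(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models differing by 2 parameters.

    lnl_null: log-likelihood of the model without a dN/dS > 1 class.
    lnl_alt:  log-likelihood of the model that allows dN/dS > 1.
    Returns the test statistic 2*(lnl_alt - lnl_null) and its p-value.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For a chi-square distribution with 2 degrees of freedom the
    # survival function has the closed form exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_df2(lnl_null=-1234.56, lnl_alt=-1228.10)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

A small p-value here would indicate that the model allowing dN/dS > 1 fits significantly better, which is the distinction that a pairwise dN/dS estimate alone cannot make.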

Results line 201+ The inclusion of only the subset of sequences that are at least three amino acids different makes some sense in the figure but the subsequent discussion in the results seems to be about all sequences so there is an inherent challenge in reading the paragraph in the results.

The relevance of the comment about no tandemers in P. brachycephalum in the introduction is unclear.

The end of the first paragraph of the results seems to contradict the introduction about what is known about the shanny and gunnel.

Line 118 three Zoarcales lineages are referred to but it’s not entirely clear which three are relevant here.

Figure 4 was an issue for both reviewers and I still do not feel it is particularly clear and it does not represent any alternative possibilities.

Figure 1, Fig S2 (and possibly others), the underlining of either genus or genus species is not clear to me. The statement in the figure legend does not clarify.

Line 138, only needs to reference Fig 2 for the ronquil-1 sequence.

Line 142, unclear why the percent similarity is a range when it is a single pairwise comparison.

Lines 150-152, the meaning of the sentence is not entirely clear.

Line 199, include the tree in the supplement.

Line 329, should be “substitution” not “mutation”; the dN/dS method analyzes substitutions, not mutations.

Line 389, issue with citation format.

The figure legends are helpful but now quite extensive and could be shortened for clarity and brevity.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Dec 15;15(12):e0243273. doi: 10.1371/journal.pone.0243273.r004

Author response to Decision Letter 1


17 Nov 2020

Dr. Michael Schubert

Academic Editor

PLOS ONE

Dear Dr. Schubert,

Once again, I would like to thank you for a second opportunity to revise our manuscript PONE-D-19-34926R1 and for giving us an adequate time frame in which to do this. We note that Reviewer #2 was satisfied with our previous responses, but we have still addressed their remaining queries and corrections. We have accommodated most of Reviewer #1’s criticisms but beg to differ on a few issues. All the changes we made are documented in point-by-point form in blue after the Reviewers’ queries.

We trust these changes will allow this study to be published.

With kind regards,

Peter L. Davies

Reviewers' comments:

Reviewer #1: Dear Dr. Hobbs et al.,

I have now read through the revised version of the ms PONE-D-19-34926 in detail, and appreciate the fact that the authors have taken many of my comments into account, but I still have some issues that I think should be dealt with before acceptance for publication.

First, I think the authors did a great job on the introduction and results section, it reads so much better now. Also, happy to see that they performed the suggested additional analyses testing for positive and/or negative selection.

Re the figures I do see that they have tried to change accordingly to my suggestions – but still think they could be improved a bit further:

It is the shortening of some of the common names + underline + underscore that bothers me the most.

I strongly advise the authors to come up with a better naming system – where, for example, the number for the different sequences in the same cluster is the same (not following a number given by you and linked to the accession number): like “common name” followed by QAE1, “common name” followed by QAE2 and so on – which could be easily used in the text as well – and would be much more meaningful. The linking of the sequence number to the accession number should still be done – but after you have done the numbering of the clusters – hope you get what I mean here.

I see that this will take up some space but still better than as is with the shortening of the names + the underscore + the number (linked to the accession numbers).

It would be so much better if the sequences were properly classified into different clades/clusters.

Think this would be less confusing, like the short names + the number linked to the sequences which are used in the text (for example ronquil-1 and Ant_eelpout_4 +++).

Then I would also strongly suggest to use asterisk * or ** instead of underline (think this will look better).

We have followed the reviewer’s suggestions for improving the nomenclature of species and sequences. Thus in Figure 1, underlines have been replaced by single and double asterisks to denote the source of the taxonomic determination. Dashes and underscores have been removed from Figure S1 and underlines from all names in all Figures. As requested, abbreviated common names have been replaced by full common names in Figures 2, 3, 4, S3, S4, S5, S7, S9 and Tables S3 and S4. The letter Q or S has been added to each name to differentiate QAE and SP isoforms respectively as this keeps the names slightly shorter. For each species, they have been numbered consecutively within each group as they appear in the tree. As the two main groups within the QAE group cluster by species groups (northern or southern), they were not further differentiated. The text of the main manuscript has been modified to match. For example, P. brachy-1 is now P. brachycephalum-Q1 and viv-eelpout-12 is now viviparous eelpout-S7.

Then to the inference and discussion of the results:

When I now re-read the discussion – I do see that it has become more coherent – but think it could still benefit from some rewriting. For example, it takes a few paragraphs before they start the discussion of their results for real. I would have started the different paragraphs with a highlight of the results – before going into the discussion – so that the reader gets the message/new findings up front.

A paragraph introducing the focus of the paper has been added to the beginning of the Discussion (lines 332-338).

For the dN/dS results – discussed in lines 324-337 – I think the authors could go a bit further in their interpretation of these findings – since they do find differences in positive selection for the QAE vs SP sequences. This is not highlighted in the discussion – which I think it should be: the higher degree of positive selection found for the QAE sequences could be linked to the higher variation observed as well as to the clustering patterns. It is stated somewhat in the discussion, but not in relation to the difference in dN/dS ratio observed between the QAE and SP sequences. I truly think this could be linked to the weaker association to families by the QAE sequences – as compared to the SP – which indeed show a clearer separation by family (i.e. which then could be linked to the lower degree of similar positive selection). See my comments below (also related to this issue).

The dN/dS analysis was added at the reviewer’s request. However, we realize this type of analysis is beyond our expertise. With Reviewer 2 finding fault with our dN/dS analysis, we have withdrawn this section and instead have expanded our analysis of the effects of mutations on the protein, as per the reviewer’s request (see paragraph below). We suggest either reviewer could consider performing a dN/dS analysis of the sequences herein and publish their analysis. We would be happy to provide any additional sequences needed.

Line 329: Can you please elaborate on your results and function of the signatures of selection that you find on the surfaces?

Do understand that they are not involved in the binding to the pyramidal ice plane which are found to be conserved. Are they found on the prism IBS? What is the function of the prism IBS? If not, where do you find most changes – and any idea what those changes could have of effect on the function?

We have expanded our interpretation of the mutations that are found on the surfaces of the AFPs by adding a section addressing the hydrophobicity of the AFPs relative to the progenitor (starting at line 303) and we have somewhat modified the section starting on line 282. Lines 465-480 of the Discussion have been modified as well. We have also expanded Figure S10 to include SAS and the shift in hydrophobicity. We also discuss the difficulty of assessing mutations and their effect on activity starting on line 489.

Line 331-333:

I find this sentence a bit awkward and suggest to re-write:

“This together with the fact that the type III AFP is only found within this lineage, and is not a result of gene conversion and/or lateral gene transfer, indicates that …..”

We could not find this statement in our most recent submission.

In the first review I stated the following:

“I would say that the higher divergence between species for the QAE sequences indicates strongly that they originate from the same ancestral QAE sequence, and have then evolved in the different lineages at different paces/in slightly different directions. But for SP the results could indicate that the SP sequences have evolved over and over again independently for the different families (from the SAS sequence specific for the different species and/or the common ancestor for the specific families). Think this cannot be ruled out – and is maybe also plausible – since the SPs are only functional in combination with the QAE type (and could be looked upon as added value in addition to the QAE under certain circumstances I guess).”

We agree that it is difficult to ascertain the evolutionary forces that are generating the divergence within the type III gene family. Whenever genes are present in a large gene family, it is challenging to try to tease apart the role of gene duplication/amplification and gene loss, as well as genetic drift and founder effects, from selection. However, from Figure 2 it is apparent that the SP sequences from all the species share similarities that are unique to this group and that are not found within the QAE sequences. Additionally, they cluster in Figure 3. Thus we see nothing to suggest SP sequences have evolved over and over again.

You say that you have added this – but I cannot see that it has been included in more depth than already mentioned in lines 226-228 (which were already included in the ms in the first round).

The changes we made were in the Discussion. These issues are discussed in the paragraphs starting on line 370, 403, 422 and 444, which have been further modified. We have also added an analysis including wolf eel SAS to the Results, within the paragraph starting on line 213.

Further I stated: “And f. ex not sure if I truly agree on the statement re the QAE sequences on line 255-256: “This suggests that the common ancestor of these families may have possessed a larger number of QAE sequences than SP sequences.””

This is our conjecture based on the greater diversity of the QAE isoforms, particularly in the ocean pout. The statement is doubly qualified with the use of “suggests” and “may have”. We cannot be certain what happened millions of years ago and is still going on, but have put forward what we think is the most plausible scenario. That there are other possible scenarios is not grounds to obstruct publication of this work. Again, these other ideas could be published in a follow-up paper or could be revisited once the technology improves to the point that we can obtain complete assemblies of AFP loci from multiple fish.

Please note: PLOS now offers accepted authors the opportunity to publish the peer review history of their manuscript alongside the final article. The peer review history package includes the complete editorial decision letter for each revision, with reviews, and our responses to reviewer comments. We would like to pick this option so that your alternative suggestions can be aired.

How I see this now (when the dN/dS analyses have been performed) is that both groups (QAE and SP) have most likely undergone gene duplications – also the QAE (as you indirectly state by saying that they have a large number from the very beginning). It is most likely the timing and fate of the gene duplications for the different groups that are somewhat different, with more and stronger selection for the QAEs compared to the SPs, which again could imply that the QAE gene duplications are more likely to be maintained and also result in genes with more similar function over different families. This goes for SP too but most likely at a slower pace. Here we could have gene duplications that are more easily lost, and selection for similar function is maybe not that evident? In fact, the lower degree of positive selection indicates a lower degree of gene duplications (that are maintained), not a higher one as the authors state. The authors should look into those statements.

For instance, you write in line 371-372: “This suggests that gene losses and duplications have occurred frequently, particularly within the SP group, within the last 10 Ma.”

Can you really say this? Think we for sure agree that this is a result of gene duplication events -> but can we really infer from this that they happened more frequently?

Additionally, is it so that the duplications/precursors for the QAE arose in the ancestor of this lineage, while for the SP sequences the duplication event could have occurred later (in separate rounds, as I stated in my previous comment to the authors)? To infer this, it would actually have been nice to see the different variants found of both QAE and SP mapped onto the species tree, which would also be a nice addition as one of the main figures of the paper. And if made, inference of the timing can be made.

Think the authors should revisit the ms with this in mind and add comments re the different possibilities both in the results section as well as the discussion.

The results within even a single species (notably Macrozoarces americanus for example) indicate that gene amplifications are ongoing and that there can be a large number of different isoforms. Also, due to the difficulties of assembling and characterizing multigene families, the isoform assemblages of most of the species examined herein are incomplete. Therefore, it would be impossible to ascertain the fate of individual gene copies within the species examined in this study. However, there are indications that gene losses and/or gene duplications/amplifications of either QAE or SP isoforms have occurred in specific lineages. To this end, we have mapped these events onto the species tree by adding SP↑ to two locations in Figure 1 to indicate branches where amplification of the SP complement has occurred. The possible locations where the SP and QAE arose or were lost were already present in this figure.

Then I state in my first review:

“These speculations, as well as about more gene losses and duplications being responsible for the clustering of the SP sequences can only be inferred if they have had whole genome sequencing data. Same goes for the plausible loss of the SP in the Antarctica species. Full genome data set is needed to confirm a loss so should be careful here I think.”

This was a statement to the authors that precaution should be made – since they do not have the full overview of the gene variant present or not – and that this should be stated in the paper (i.e. that full genome data-sets are needed to look into this in more detail).

And PS: fully aware that these genes can be hard to assemble – but they will be part of the raw reads and/or unassembled contigs (so most likely genome data will aid here too even if not fully put together).

Papers have been published outlining the difficulties with assembling multi-gene families, especially using short-read techniques where genome assemblies fail to generate AFP loci (see Zhuang X, Yang C, Fevolden SE, Cheng CH. Protein genes in repetitive sequence-antifreeze glycoproteins in Atlantic cod genome. BMC Genomics. 2012, 13:293, PMID: 22747999). For this reason, Zhang et al. opted to isolate BAC clones from one Antarctic species (Zhang, J., Deng, C., Wang, J. et al. Identification of a two-domain antifreeze protein gene in Antarctic eelpout Lycodichthys dearborni. 2009 Polar Biol 32:35). All of the type III sequences are similar enough to be detected using probes to either one, so the lack of an SP isoform in any of the BAC sequences would suggest that it is not present.

Additionally, the second Antarctic species was analysed at both the protein and transcriptome level, so if any SP genes are present, they do not appear to be expressed. Furthermore, the recent release of the Anarrhichthys ocellatus (wolf-eel) genome sequence demonstrates the problem in spades. The four putative AFP genes that were assembled are on scaffolds ~2 kb in length. However, both SAS genes were assembled and lie next to each other on a 5.8 Mb scaffold. These sequences have been added to both Figure 2 and Figure 3, as well as to supplementary figures, and they have been used to address concerns raised about the possibility that these sequences could have arisen anew from SAS.

Furthermore, I stated in my previous review:

“Additionally, in this paper only two Antarctic species are investigated (it is listed that this linage contain about 100 species (so I guess some of them could have the SP type (i.e. not yet confirmed but then either retained if lost in some of them and/or evolved from the SAS sequence and/or the QAE sequence in the common ancestor of this lineage)).”

This needs to be addressed I think (still not done I see) – that you have only looked into two species – and that the loss observed could potentially not be the case for all of these species.

For our analysis we have combed the database and have included as many sequences as we could find from other studies. This is admittedly a small sampling of the over 300 Zoarcid species, including those from northern waters, but ours is the first study to combine a dataset of this size. In order to achieve this, we would have to go to the Antarctic waters and collect dozens of species and sequence and assemble their genomes using a long-read platform such as PacBio, in order to bolster some reasonable speculation. While this would be ideal, it is not feasible.

Furthermore, the large number of sequences obtained by transcriptome sequencing of viviparous eelpout surely allows us to draw some inferences about the diversity of SP sequences present in this species as over 2000 reads were obtained. Liver is the dominant site of AFP production in all species examined, such as eelpouts and wolffish (line 144) and shanny and gunnel (Fig S6). SP sequences have also been obtained from the liver of ocean pout (see table S3 for example) yet these cluster with those obtained from other tissues and separate to those recovered from viviparous eelpout. If lineage-specific gene amplifications had not occurred, this would be unexpected.

Moreover, I also stated:

“And then what I really miss in this ms is the sequence data for SP for the Canadian eelpout, would have been beneficial to have that one to see how divergent this SP sequence is compared to the other ones. Any change that the authors could get hold of that sequence information? This information would for sure enlighten the evolutionary path of the SP within this lineage.”

This sequence was obtained by Edman degradation of a protein and was reported by Schrag et al. in 1987. We do not have access to this particular fish at this time. As a nucleotide sequence was not available, this sequence was not used in the generation of the main trees within this manuscript, but it was always shown in Fig S3. We have now added a supplementary tree (Fig S7) that shows that this sequence clusters with the sequences from the closely-related Antarctic fish.

We feel that the inclusion of this information is merited as this species is more closely related to the Antarctic species than any of the other species for which we have sequence data. Unfortunately, while we know that this species has more than one AFP isoform, given that the manuscript by Schrag et al. in 1987 shows three active protein peaks, we do not know anything substantive about them as only one was subjected to further analysis. We have included this information in the Discussion. While it would be nice to have additional sequence information from this species, we do not have access to this fish at this time and we do not know if it has SP isoforms.

I see that this is not obtained – but they do state that a QAE isoform is identified – can I ask why this is then not included in your analyses? This would also have been a nice add on to the paper for sure! And also, I would have appreciated it if the authors could have made the readers aware of the fact that SP is not obtained in the Canadian eelpout. Could it be that it is also lacking from this species -> lost in the common ancestor? Or do you know that it is present? If yes, how do you know this? In other words – some more elaboration around this – the findings vs. interpretations and limitations in their dataset is still needed.

See above.

And I do see that the authors did not agree with me re my hypothesis re a common refugia. How was the climate 5 mill years ago? All fish settled in the north they think?

Would still have loved to see some more speculations here – and then not at least linked more to past paleoclimatic events during the Miocene (cooling ++ as well as different possible refugia during that period). Any reason for why some should start migrating down south?

We have added more discussion of this topic. The Zoarcids were undergoing rapid speciation during the last ice age, perhaps because their AFP genes allowed them to thrive in a glacial epoch ocean? Given the extent of the ice sheets, their range was undoubtedly shifted southward, but it would be very difficult to determine the historical localization of so many species over the entire glacial period. However, we have expanded our discussion of previous work that addresses the distribution of early Zoarcids. That being said, during the peak of the last ice age, the tropical surface ocean waters were only a few degrees cooler than they are today, so they could still have acted as a barrier to the migration of fishes adapted to cold water. We did neglect to explicitly state this and have added a reference to address this. See lines 356-369 in the Discussion.

With regards to migration, our hypothesis is that the deep ocean environment around features such as hydrothermal vents is not dependent on latitude. Zoarcids are the dominant fish at such features. Therefore, they may have simply populated the entire length of the divergent plate boundaries that span the ocean floors. Unfortunately, this is merely a correlation and we cannot directly prove causation. We have added material starting around line 384 of the Discussion to bolster this hypothesis.

My last and final comment, is that I miss a concluding remark, as is the discussion is ending quite abruptly.

The final paragraph of the Discussion has been altered to provide the missing concluding remarks.

Some minor comments:

In line 350 you state:

“The two Antarctic eelpouts diverged from the Canadian eelpout more recently than 10 Ma (Fig 1).”

This is not correct I think: the Antarctic eelpouts and the Canadian eelpouts have a common ancestor -> they did clearly not diverge from the Canadian eelpouts. Hope the authors agree – and re-write.

This was sloppy wording on our part. We should have said: “The two Antarctic eelpouts diverged from the Canadian eelpout lineage more recently than 10 Ma (Fig 1).”

The section containing this statement has been rewritten as follows (starting on line 413).

…. Canadian eelpout, which is the closest relative to the Antarctic species in our study, with these two lineages having diverged less than 10 Ma (Fig 1).

In line 357 you state:

“The population that migrated south, ....”

This is pure speculation ….. how the migration occurred we do not know …. if it was one or more populations …... could it have been several events? And not at least, the specimens that migrated could have diverged along the gradient down south. Why I say this is that you so far only have two of the Antarctic species included in this study.

The only thing that needs to be added is that “in the species investigated we find …..” so that the reader is aware of this.

In line 431 of the Discussion we have noted that other species may have migrated south and may have retained SP isoforms and that there may have been multiple migrations to southern waters.

In line 381 you state:

“The two Antarctic species did not retain SP isoforms as mentioned above, .....”

Do not get as mention above – in the result section? Not stated anywhere in the discussion as I can see…. Think you should re-write.

The Discussion concerning the Antarctic species has been largely rewritten, so this statement (now on line 459), is referring to the paragraph beginning on line 422.

Reviewer #2: The revised manuscript is much improved from the original submission.

There are a few outstanding / new issues with the manuscript but I think they are all straightforward to address.

The section on positive selection is incorrect. The added dn/ds analysis is incorrect / incomplete. There are several issues that I outline here: SNAP does not test for selection, so in the current implementation dn/ds > 1 cannot be distinguished from dn/ds = 1. To test for selection, the authors will need to use hyphy or codeml or another similar program that actually uses different codon models to test for selection. The cumulative dn and ds rates (Fig 10A) is not informative about positive selection. It is notable that the signal peptide does not have nonsynonymous or synonymous substitutions, but this has been observed before. The pairwise comparisons with ten or more positions likely leads to an overestimate of dn/ds.

As indicated in our response to reviewer 1 we are out of our expertise here and have decided to leave this analysis for others to do in follow-up studies.
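For readers following this exchange, the distinction the reviewer draws can be illustrated with a likelihood-ratio test (LRT), the usual way programs such as codeml or HyPhy compare a codon model that allows a class of sites with dN/dS > 1 against a nested model that does not. The sketch below is not part of the manuscript or of either program: the function name, the log-likelihood values, and the choice of degrees of freedom are hypothetical, and the correct degrees of freedom depend on which pair of models is compared.

```python
import math


def lrt_p_value(lnl_null: float, lnl_alt: float, df: int) -> tuple[float, float]:
    """Return (statistic, p-value) for a likelihood-ratio test of nested models.

    The statistic is 2 * (lnL_alt - lnL_null), compared against a chi-square
    distribution. Only df = 1 and df = 2 are handled, because both have
    closed-form tail probabilities that need no external library.
    """
    if df not in (1, 2):
        raise ValueError("this sketch only handles df = 1 or df = 2")
    # A negative difference can only arise from optimisation noise; clamp to 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 d.f.
    else:
        p = math.exp(-stat / 2.0)             # chi-square tail, 2 d.f.
    return stat, p


# Hypothetical log-likelihoods from a null and a selection-permitting model.
statistic, p_value = lrt_p_value(lnl_null=-1000.0, lnl_alt=-996.0, df=2)
print(f"2*delta(lnL) = {statistic:.2f}, p = {p_value:.4f}")
```

A pairwise dN/dS ratio on its own provides no such null comparison, which is why an estimate above 1 cannot be separated from 1 without a model-based test of this kind.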

Results line 201+ The inclusion of only the subset of sequences that are at least three amino acids different makes some sense in the figure but the subsequent discussion in the results seems to be about all sequences so there is an inherent challenge in reading the paragraph in the results.

If we were to include all of the known sequences from each species, the phylogenetic trees would be very crowded. The exclusion of identical or near identical sequences does not change the gist of the paper. For example, the percentage ranges as given starting on line 215 go to 100% (between 80-100%) or are over 73% within the QAE or SP groups, respectively, whether or not the excluded sequences are included. Nor does the exclusion change the identity ranges between the different groups as the selection criterion meant that the more divergent sequences were retained in the analysis. Furthermore, within each section, the number of sequences recovered and the number retained for the phylogenetic analysis is clearly stated, as for example, in lines 186-192 as shown below.

“A total of 19 unique sequences (12 SP, 7 QAE) were unambiguously assembled from these reads. One exactly matched a previously known protein sequence (AGM97733), while two others differed from ABN42204 and ABN42205 at two and three residues, respectively. Once sequences with two or fewer a.a. differences were excluded, 5 QAE and 8 SP sequences remained, designated viviparous eelpout-Q1 to -Q5 and -S3 to -S10 (Fig S3).”
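The exclusion rule quoted above (drop sequences that differ from a retained sequence at two or fewer amino acid positions) can be sketched as a simple greedy filter. This is an illustration only, not the authors' procedure: the function names are hypothetical, the toy sequences are invented rather than real AFP isoforms, the sequences are assumed to be pre-aligned to equal length, and a greedy pass depends on input order.

```python
def count_differences(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def filter_near_identical(seqs: dict[str, str], max_diff: int = 2) -> list[str]:
    """Keep a sequence only if it differs from every sequence kept so far
    at more than max_diff positions (greedy, in input order)."""
    kept: list[str] = []
    for name, seq in seqs.items():
        if all(count_differences(seq, seqs[k]) > max_diff for k in kept):
            kept.append(name)
    return kept


# Invented toy sequences: iso-2 differs from iso-1 at one position,
# iso-3 differs from iso-1 at three positions.
toy = {
    "iso-1": "MNAPARAAAK",
    "iso-2": "MNAPARAAAR",
    "iso-3": "MNSPVRGAAK",
}
print(filter_near_identical(toy))  # ['iso-1', 'iso-3']
```

Because the most divergent sequences always pass such a filter, the identity ranges between groups are unaffected by it, which is the point made in the response above.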

The relevance of the comment about no tandemers in P. brachycephalum in the introduction is unclear.

To make the relevance clearer, we changed that section from …..

The AFP complement of the Antarctic eelpout has been studied through a combination of protein, cDNA and genomic DNA sequencing (yielding over 20 gene sequences) and it produces both monomers and tandemers consisting of two or more linked AFP domains [23,36–40]. In contrast, there is no evidence of tandemers in P. brachycephalum [39,41].

To (starting on line 86) ….

The AFP complement of the Antarctic eelpout has been studied through a combination of protein, cDNA and genomic DNA sequencing (yielding over 20 gene sequences). This species produces both monomers and tandemers consisting of two or more linked AFP domains [23,36–40], whereas the northern species studied produce only monomers. This is a further illustration of how changeable these AFP genes are. Interestingly, there is no evidence of tandemers in the second Antarctic species examined, P. brachycephalum [39,41].

The end of the first paragraph of the results seems to contradict the introduction about what is known about the shanny and gunnel.

This has been changed from ….

iii) the three species newly-examined in this study (Alaskan ronquil, radiated shanny and rock gunnel) diverged prior to the two families (Anarhichadidae and Zoarcidae) from which AFP sequences were previously known.

To (starting on line 114) ….

iii) the three species newly-examined in this study are found in two lineages that diverged ~18 Ma (Alaskan ronquil) and ~15 Ma (rock gunnel and radiated shanny). The two families from which AFP sequences were previously known, Anarhichadidae and Zoarcidae, only diverged from each other ~10 Ma.

Line 118 three Zoarcales lineages are referred to but it’s not entirely clear which three are relevant here.

The names of the three species were added in brackets and the statement was further clarified.

Figure 4 was an issue for both reviewers and I still do not feel it is particularly clear and it does not represent any alternative possibilities.

This thorough review process has forced us to think extensively about our hypothesis. If we thought there was a plausible alternative we would propose it. We previously added to the figure legend to make the story clearer. We have also added material here to the Discussion, starting at line 494, which describes our hypothesis in more detail. We have also added two mentions of this figure to the Discussion at relevant points to allow readers to further understand what is shown. Our hypothesis is really quite simple. A Zoarcid, which developed AFP genes in the North, migrated through the cold deep oceans to Antarctica. Over the time this journey took, AFP genes were lost because they were not under selective pressure. On arrival in ice-laden Antarctic waters, selection for freeze resistance happened again and AFP genes (surviving QAE) were amplified. We have shown this figure and legend to numerous naïve scientists and asked them to explain the figure. They have all given a consistent version along the lines indicated above. We are at a loss to see how it could be made any clearer in the absence of any specific suggestions from the reviewer.

Figure 1, Fig S2 (and possibly others), the underlining of either genus or genus species is not clear to me. The statement in the figure legend does not clarify.

Underlining denoted whether the matching genus or species was sampled in these studies.

This has been changed to ….

A double asterisk denotes that the fish from these studies was the same as the species in this study, whereas a single asterisk denotes that the fish was a different species within the same genus.

Line 138, only needs to reference Fig 2 for the ronquil-1 sequence.

Done (now line 147).

Line 142, unclear why the percent similarity is a range when it is a single pairwise comparison.

We have made this clearer by changing the passage from this …..

“This sequence is most similar to an Atlantic wolffish QAE isoform (Atl-wolffish-2) with identities between the protein, coding and intronic sequences of 94 to 95%. The 3' UTR is up to 97% identical to other type III sequences.”

To this ….. now on line 149-152.

“This sequence is most similar to Atlantic wolffish-Q2. The identities between the protein sequences, as well as between the coding sequences and the single intron, were between 94 and 95%. The 3' UTR is up to 97% identical to other type III sequences.”
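A single pairwise comparison yields one identity value per aligned region, so a range such as 94 to 95% reflects separate values for the protein, coding and intron alignments. A minimal sketch of such a calculation follows, assuming the gap-excluding convention noted for S3 Table; the function name and the toy sequences are hypothetical and are not taken from the manuscript.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences, ignoring any column
    in which either sequence has a gap character."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignments standing in for two regions of one sequence pair.
regions = {
    "coding": ("ACGTACGTAC", "ACGTACGTAA"),
    "intron": ("ACG-TTGCAA", "ACGATTGCAA"),
}
for name, (s1, s2) in regions.items():
    print(name, round(percent_identity(s1, s2), 1))
```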

Lines 150-152, the meaning of the sentence is not entirely clear.

This was changed from ….

“Type III AFP sequences from the Atlantic ocean pout (Z. americanus), the species in which this AFP was first discovered, have previously come from protein, cDNA and genomic sequences [25,26,34].”

To this ….. now on lines 159-161.

“Type III AFP sequences from the Atlantic ocean pout (Z. americanus), the species in which this AFP was first discovered, were previously obtained by Edman degradation of proteins or from cDNA and genomic sequences [25,26,34].”

It is important to establish this for the statement on line 171:

Three others (ocean pout-Q5, ocean pout-S2 and ocean pout-S4) match isoforms known only from protein sequencing (HPLC12, HPLC7 and HPLC1, respectively) [26].

Line 199, include the tree in the supplement.

This tree has been added (Fig S7) and in addition, the two AFPs known only from protein sequencing have been added in and their relationship to the other sequences has been described in more detail.

Line 329, should be substitution not “mutation” the dn/ds method analyzes substitutions not mutations.

This sentence has been removed.

Line 389, issue with citation format

Before additional references were added, this citation was to reference 27 and 72, which may have looked to be an inadvertent reversal. Nevertheless, these were the intended references.

The figure legends are helpful but now quite extensive and could be shortened for clarity and brevity.

Accession numbers for the AFPs have been removed from figure legends and placed in table S5. Numerous changes have been made to figure legends to make them more concise.

For example, in the Figure 2 legend,

Sequences are named using the common name for each species (or a contraction thereof, see Fig S3), except P. brachycephalum where the binomial name is used.

was changed to …

Sequences are named using the common name for each species, except for P. brachycephalum.

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Decision Letter 2

Michael Schubert

19 Nov 2020

Antifreeze protein dispersion in eelpouts and related fishes reveals migration and climate alteration within the last 20 Ma

PONE-D-19-34926R2

Dear Dr. Davies,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Michael Schubert

Academic Editor

PLOS ONE

Acceptance letter

Michael Schubert

26 Nov 2020

PONE-D-19-34926R2

Antifreeze protein dispersion in eelpouts and related fishes reveals migration and climate alteration within the last 20 Ma

Dear Dr. Davies:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Michael Schubert

Academic Editor

PLOS ONE

Associated Data


    Supplementary Materials

    S1 Fig. Alignment of cytochrome oxidase subunit I (COI) sequences from various members of the infraorder Zoarcales.

    GenBank accession #s, sequentially top to bottom as in the alignment, are KC016052, KJ205263, KC517318, HQ712639, HQ713113, HQ713057, KJ205118, KC016016, JQ685890, KC015305, EU752057, EF427917. Since a COI sequence was not available for the Antarctic eelpout, Lycodichthys dearborni, the one for Lycodichthys antarcticus was used instead. Asterisks under the alignment indicate perfect conservation of a base.

    (DOCX)

    S2 Fig. Phylogenetic tree of Zoarcales species using the COI alignment from S1 Fig.

    The maximum-likelihood tree was generated using the Kimura 2-parameter with invariant sites model and bootstrap values (%) are indicated at the nodes. The Alaskan ronquil was used as the outgroup and the scale bar represents an average of 0.02 changes per site. This tree was used to determine the placement of the radiated shanny, Antarctic eelpout and Pachycara brachycephalum relative to the other species in Fig 1.

    (DOCX)

    S3 Fig. Protein alignment of known type III AFPs that have more than two differences relative to other sequences from the same species.

    Sequences are named as in Fig 2. Grey highlighting indicates sequences that were determined by Edman degradation. The tandemers of Antarctic eelpout are lettered sequentially (e.g. Antarctic eelpout-Q3b is the second AFP domain in Antarctic eelpout-Q3). Variable residues are highlighted or coloured according to the phylogenetic tree (Fig 3) with conserved residues typical of QAE or SP variants highlighted cyan and yellow, respectively. Mutations that arose somewhere within Antarctic species are highlighted pink and residues within SAS-B that were not conserved when the QAE or SP groups arose are highlighted black. Other differences between SAS sequences are highlighted purple. Differences that do not correlate with these aforementioned groupings are highlighted grey. Red highlighting indicates shared differences between the signal peptides of the sequences from radiated shanny and rock gunnel. The black boxes and red boxes show residues involved in binding to the pyramidal plane and prism plane respectively, as in Fig 2. The signal peptide is in lowercase font. Italics indicate linkers between tandemers, internal dashes indicate gaps, whereas leading or trailing dashes indicate that the sequence is incomplete at either terminus. Identity, high similarity and low similarity between all AFPs (incomplete sequences ignored) is indicated at the bottom with asterisks, colons and periods, respectively. Residues with an inward pointing sidechain are indicated by “i” at the top. Asterisks denote sequences obtained by PCR in this study (S3 Table) and daggers denote those assembled from the SRA database (S4 Table). Accession numbers are listed in S4 and S5 Tables.

    (PDF)

    S4 Fig. Nucleotide alignment of type III AFPs.

    Sequences are named and variable nucleotides are highlighted as in Fig 2. Red highlighting indicates shared differences between the first exons of radiated shanny and rock gunnel and these exons were excluded prior to generating all but the exon 1 phylogenetic tree as they may have been homogenized by exon shuffling. The translation of notched-fin eelpout-Q1 has black boxes and red boxes showing residues involved in binding to the pyramidal plane and prism plane respectively, as in Fig 2. The signal peptide is in lowercase font. Internal dashes indicate gaps, whereas leading or trailing dashes indicate that the sequence is incomplete at the respective terminus. Intronic sequences are not shown for genomic clones but an arrow indicates where an intron is found. The 3′ splice junction of ocean pout-Q2 was originally predicted based on SP isoforms but has been adjusted by three bases (lower-case font) to match QAE-type cDNAs. The 3ʹ end of P. brachycephalum-Q1 (italics) and the linker sequences between Antarctic eelpout tandemers (not shown) were excluded from the phylogenetic analysis as they are not homologous to the other AFP sequences. These nucleotide sequences are unambiguously accessed through the protein accession numbers in S4 and S5 Tables as some of the nucleotide sequences encode multiple AFPs.

    (DOCX)

    S5 Fig. Maximum-likelihood phylogenetic tree of all non-identical exon 1 sequences from type III AFPs from S4 Fig generated using the Jukes-Cantor model with 5 gamma categories.

    Cyan and yellow backing denotes QAE and SP isoforms, respectively, except for radiated shanny and rock gunnel sequences that cluster together on a separate branch. Bootstrap values (percent) are indicated at most nodes and the scale bar represents an average of 0.1 changes per site. Sequences are named as in Fig 2.

    (TIF)

    S6 Fig. Tissue distribution of rock gunnel and radiated shanny type III AFPs.

    Northern blot analysis of total RNA from two rock gunnel individuals (A and B) and two radiated shanny individuals (C and D). The panels in each set show the hybridization signal to the AFP cDNA probe (top); the hybridization signal to the chicken β-tubulin cDNA probe (middle); or ethidium bromide staining of the 28S and 18S rRNA bands (bottom). The tissues are indicated as follows: L = liver, Sk = skin, G = gill, I = intestine, St = stomach, M = muscle, H = heart, Sp = spleen, K = kidney. RNA size marker positions are indicated on the left (bases) and total RNA from cunner skin was used as a negative control (n).

    (TIF)

    S7 Fig. Maximum-likelihood phylogenetic tree generated from protein sequences.

    The amino acid sequences shown in Fig 2, along with two sequences determined solely by Edman degradation of purified proteins (Canadian eelpout-Q1 and P. brachycephalum-Q3, labelled with #) were used to generate a phylogenetic tree equivalent to Fig 3.

    (TIF)

    S8 Fig. Phylogenetic comparison of SP isoforms.

    The nucleotide sequences of the SP subset of type III AFP sequences (S4 Fig) were used to generate a maximum-likelihood phylogenetic tree using a divergent isoform (ocean pout-Q6) as the outgroup. Bootstrap values (percent) are indicated at most nodes and the scale bar represents an average of 0.02 changes per site. Sequences are named as in Fig 2.

    (TIF)

    S9 Fig. Stereoscopic views of type III AFP showing the location of variable and conserved residues.

    Residues that are absolutely, moderately or poorly conserved are colored, respectively, as follows: pyramidal ice-binding plane, dark green, light green, yellow; prism ice-binding plane, red, orange, pale orange; rest of protein, dark blue, light blue, grey. Front view (A), back view (B) and front surface view (C).

    (TIF)

    S10 Fig. Surface view showing the expected structural effect of ice-binding residue mutations and the hydrophobicity of different isoforms.

    A) Wild-type QAE isoform, ocean pout-Q5 (HPLC12, PDB 1HG7) B) Ocean pout-Q5 with the introduction of the three ice-binding mutations found in P. brachycephalum-Q4 and C) A compilation of the most severe mutations at variable ice-binding residues (S3 Fig) introduced to ocean pout-Q5. Nitrogen is blue, oxygen is red, sulfur is yellow and carbon is pale orange (on the pyramidal-plane ice-binding surface), cyan (on the prism-plane ice-binding surface) or white (elsewhere). D) QAE isoform (PDB 4UR4) E) SP isoform (PDB 4UR6) F) Antarctic eelpout SAS-B residues mapped onto 4UR4 (excluding the last four residues of SAS-B). Atoms are colored by charge and hydrophobicity with red for charged oxygen, blue for charged nitrogen and yellow for carbon not bonded to nitrogen or oxygen. All other backbone and polar groups are colored white. Residues are numbered according to Fig 2, except residues 50–53 (PLGT, AKGQ and EEDD respectively). The percentage of the surface that is solvent accessible is indicated.

    (TIF)

    S1 Table. Sequences of oligonucleotides used in PCR studies.

    *F is forward and R is reverse direction.

    (DOCX)

    S2 Table. Rock gunnel and radiated shanny type III AFP cDNA and predicted protein features (from ProtParam [92]).

    ¹Excludes poly(A) tail. ²Includes STOP codon. ³Excludes STOP codon and poly(A) tail. ⁴Presumes cleavage of C-terminal Lys.

    (DOCX)

    S3 Table. Ocean pout sequences cloned in this study and their closest nucleotide and protein matches.

    SP and QAE isoforms are highlighted yellow and cyan, respectively, with the QAE sequences that diverged early highlighted grey. Protein accession numbers are used for consistency between figures. Percent identity excludes gaps, and sequences known only from Edman degradation are underlined. Sequences denoted with an asterisk were not included in the alignments as they differed at two or fewer a.a. residues from isoforms that were included.

    (DOCX)

    S4 Table. Accession numbers for one or two sequences from the NCBI SRA database that will generate viviparous eelpout (SRA accession # SRX002161) and P. brachycephalum (SRA accession # SRX118640) isoforms as shown in Fig 2 and S3 Fig.

    (DOCX)

    S5 Table. GenBank protein accession numbers for sequences shown in Fig 2 and S3 and S4 Figs.

    (DOCX)

    S1 File. Supplement for materials and methods, results, and references.

    (DOCX)

    S1 Raw image

    (PDF)

    Data Availability Statement

    The DNA sequences are available from NCBI GenBank under accession numbers KR872957-KR872964. The corresponding protein accession numbers are ALL26673-ALL26680.


    Articles from PLoS ONE are provided here courtesy of PLOS
