letter
. 2018 Jun 28;7:963. [Version 1] doi: 10.12688/f1000research.15386.1

Major histocompatibility complex (MHC) fragment numbers alone – in Atlantic cod and in general - do not represent functional variability

Johannes M Dijkstra 1,a, Unni Grimholt 2
PMCID: PMC6081975  PMID: 30135730

Abstract

This correspondence concerns a publication by Malmstrøm et al. in Nature Genetics in October 2016. Malmstrøm et al. made an important contribution to fish phylogeny research by using low-coverage genome sequencing for comparison of 66 teleost (modern bony) fish species, with 64 of those 66 belonging to the species-rich clade Neoteleostei, and with 27 of those 64 belonging to the order Gadiformes. For these 66 species, Malmstrøm et al. estimated numbers of genes belonging to the major histocompatibility complex (MHC) class I lineages U and Z and concluded that in teleost fish these combined numbers are positively associated with, and a driving factor of, the rates of establishment of new fish species (speciation rates). They also claimed that functional genes for the MHC class II system molecules MHC IIA, MHC IIB, CD4 and CD74 were lost in early Gadiformes. Our main criticisms are (1) that the authors did not provide sufficient evidence for presence or absence of intact functional MHC class I or MHC class II system genes, (2) that they did not discuss that an MHC subpopulation gene number alone is a very incomplete measure of MHC variance, and (3) that the MHC system is more likely to reduce speciation rates than to enhance them. We conclude that their new model of MHC class I evolution, reflected in their title “Evolution of the immune system influences speciation rates in teleost fish”, is unsubstantiated. In addition, we explain that their “pinpointing” of the functional loss of the MHC class II system and all the important MHC class II system genes to the onset of Gadiformes is preliminary, because they did not sufficiently investigate the species at the clade border.

Keywords: fish, MHC, Atlantic cod, evolution, speciation rate

Correspondence

Below, we explain the criticisms of the Malmstrøm et al. 1 article that are summarized in our abstract.

When was the MHC class II system lost in Gadiformes? The data as presented by Malmstrøm et al. 1 suggest a simultaneous loss of major histocompatibility complex (MHC) IIA, MHC IIB, CD4 and CD74 functions at the evolutionary onset of Gadiformes (see their Figure 2). However, within their datasets for gadiform fishes, sequence reads that represent these genes can readily be detected ( Table S1 and Supplementary File 1). These sequence read numbers are much lower than found for the non-gadiform fish, and they may be contaminations, but that should be appropriately tested. Meanwhile, for several non-gadiform fishes, including for S. chordatus which among the investigated fishes is the species closest related to Gadiformes, there are no full-length MHC IIA, MHC IIB, CD4 or CD74 gene sequences in the unitig and scaffold datasets presented by Malmstrøm et al. 1 ( Supplementary File 2 and Table S2). We agree with the conclusion by Malmstrøm et al. 1 that their data suggest that throughout Gadiformes there is no canonical MHC class II system. However, as for the evolutionary timings of the loss of an intact MHC class II system and of the losses of the individual MHC class II system genes, we find their study technically wanting and preliminary. The combination of (i) not finding intact full-length sequences for all important MHC class II system genes in species closely related to Gadiformes, while (ii) finding reads of these genes in gadiform fishes, prohibits what the authors call “pinpointing the loss of MHC II pathway genes to the common ancestor of Gadiformes”. At least for a few species at either side of the Gadiformes clade border, Malmstrøm et al. 1 should have substantiated their claims by addition of specific PCR plus sequencing analyses, which should confirm presence of full-length intact MHC class II genes in the non-gadiform fishes, and their absence in the gadiform fishes.

Discussion of the MHC class I counting strategy by Malmstrøm et al. 1 Whereas our criticisms of the MHC class II system analysis by Malmstrøm et al. 1 are about technical issues and the preliminary character of their conclusions, we more fundamentally disagree with their analyses and discussions of MHC class I. The authors assumed 1, as postulated by other researchers before them, that there can be a “copy number optimum” of MHC genes affected by a tradeoff between a higher number allowing the presentation of more pathogen antigens while also having a depletion effect on the T cell population. Regardless of the extent to which this mostly theoretical concept is true 2, the MHC counting strategy by Malmstrøm et al. 1 should be deemed incomplete and far too simplistic. For their number determination Malmstrøm et al. 1 solely relied on estimation of U plus Z lineage genomic α3 exon fragment numbers, despite that the typical “birth and death” mode of MHC evolution can produce many pseudogenes 3. The decision of the authors to only count U plus Z lineage gene fragments was based on their unsubstantiated perception that (neo-)teleost U and Z molecules “predominantly” bind peptide ligands 1. However, not all teleost U and Z molecules are expected to present peptides 4, 5, for example this is not expected for the majority of U lineage molecules in the neoteleost fish medaka 6 and the non-neoteleost fish rainbow trout 7; how this is in the majority of the species investigated by Malmstrøm et al. 1 remains to be determined. Furthermore, it should be realized that MHC class II and non-peptide-binding MHC class I molecules (like maybe teleost molecules of the MHC class I lineages L, P and S 4) also can contribute to T cell depletion e.g. 8. Peculiarly, while from their referencing it follows that Malmstrøm et al. 1 were aware of an MHC class II impact on T cell depletion, the authors did not look at MHC class II when investigating their optimum MHC number model. 
A more general shortcoming of the article 1 is the lack of awareness that the direct determiner of the levels of “antigen coverage” and T cell depletion is the variation between the relevant MHC molecules 2, rather than merely the MHC gene copy number. Table 1 (with detailed explanations in Supplementary File 3) summarizes that different teleost fish species can have very different levels of variation between MHC molecules 4, and that despite its many U lineage gene copies the extent of MHC variation in Atlantic cod can be considered as relatively limited. Previously, we showed that salmon, zebrafish and eel share variation in U lineage sequences, dating from probably more than 300 million years ago (MYA), whereas all U lineage variation found within the neoteleost fishes stickleback and Atlantic cod probably was established only after these two species separated around 150 MYA 4. Without experimental evidence, it cannot simply be assumed that “antigen coverage” and/or T cell depletion are highest in fishes with the highest counts of U plus Z α3 fragments, while neglecting levels of variance among the intact U and Z molecules and possible presences of other categories of MHC molecules. As a last critical comment we point out that, in stark contrast to the evolution of any other known MHC lineage, most deduced Z lineage molecules are characterized by a putative peptide binding groove which was almost perfectly conserved since >400 MYA 4; this questions the model by Malmstrøm et al. 1 that Z lineage evolution was driven by pathogen antigen variation, and is a further argument against the use of combined U+Z numbers for analysis of MHC evolution.

Table 1. Intra-species major histocompatibility complex (MHC) variation differs among teleost fishes.

This table shows the lowest percentages of amino acid sequence identities between membrane-distal domains (α1+α2 for MHC I, α1 for MHC IIA, β1 for MHC IIB) of same category MHC molecules found between reported sequences of the same species. In some species no genes for particular categories were found (black boxes), while in other instances only one seemingly intact gene sequence was found (1 sequence) or only pseudogenes were found (pseudogene). A more detailed explanation of this table is provided in Supplementary File 3.

Columns: MHC category; Zebrafish (Danio rerio); Salmon (Salmo salar); Medaka (Oryzias latipes); Fugu (Takifugu rubripes); Stickleback (Gasterosteus aculeatus); Tilapia (Oreochromis niloticus); Cod (Gadus morhua). The five species from Medaka to Cod are grouped under Neoteleostei.
MHC class I U classical 40% 47% 52% 75% 76% 52% 58%
U all 27% 38% 32% 29% 62% 27% 58%
Z 70% 84% 84% 1 sequence 1 sequence 78% 89%
L 27% 51% 1 sequence
P pseudogene 85% 1 sequence
S 99%
MHC class II DA IIA 34% 84% 48% 72% 64% 39%
DA IIB 34% 76% 56% 76% 57% 46%
DB IIA 23% 20% 20% 1 sequence 21%
DB IIB 31% 25% 26% 1 sequence 22%
DE IIA 99%
DE IIB 99%
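The quantity reported in Table 1, the lowest pairwise amino acid identity within a set of same-category sequences, can be sketched as follows. This is a minimal illustration, not the authors' procedure (which is described in Supplementary File 3): it assumes the sequences are already aligned to equal length, it skips columns where either sequence has a gap (an assumption), and the toy sequences and the names `percent_identity` and `lowest_identity` are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identical residues over aligned columns without a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no overlapping residues to compare")
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def lowest_identity(aligned):
    """Lowest pairwise identity among all sequences of one MHC category."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences (compare '1 sequence' in Table 1)")
    return min(
        percent_identity(aligned[p], aligned[q])
        for p, q in combinations(aligned, 2)
    )


# Hypothetical, already-aligned toy fragments (not real MHC sequences).
toy_alignment = {
    "allele_1": "GSHSLKYFYT",
    "allele_2": "GSHSLRYFYT",
    "allele_3": "GTHSMRYF-T",
}

print(f"lowest identity: {lowest_identity(toy_alignment):.1f}%")
```

A low value means that at least two molecules of that category differ strongly in their membrane-distal domains, which is the kind of variation that a gene or fragment count alone does not capture.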

Discussion of the model by Malmstrøm et al. 1 saying that U+Z numbers in teleost fish affect speciation rates and that the half-life for reaching the U+Z optimum number is 23 million years. Malmstrøm et al. 1 postulated their multiple-regime Ornstein-Uhlenbeck model with very slow progress towards optimum MHC numbers because it was the best fitting model among the few models that they tested. However, an even better fitting model would be that in each species the respective optimal U and Z gene organizations were achieved. Further criticism is that their calculation methods for optimum U+Z numbers and half-life periods incorporated calculations of U+Z gene multiplication speeds, which suffered from the fact that (like for their other considerations) Malmstrøm et al. considered all U and Z genes as identical mathematical units 1. For such speed calculations U and Z genes should have been studied separately, and it also should have been realized that whereas from some U or Z genes multiple new copies were generated, others were lost in accordance with the “MHC gene birth and death” model 3. Lastly, even if, regardless of the discussable calculations for speeds and optimum numbers, there is a positive association in neoteleost fish between speciation rates and U+Z α3 fragment numbers (see their Figure 3), then still their model which considers MHC genes as “speciation genes that promote rapid diversification” 1 would be implausible in regard to cause and effect. Namely, in most species, there is a strong evolutionary pressure to maintain old allelic variation within MHC genes (trans-species polymorphism 3, 4, 9), which, if anything, is likely to slow down speciation rates because it increases the required size of the founder population 9. 
If old allelic or haplotype variation can’t be maintained because of rapid speciation through small founder populations, it can be speculated that a species might benefit from an enhanced capacity for the creation of new MHC allelic and/or haplotype variation by duplications/deletions and recombination 10 between a high number of linked MHC gene copies. However, in that scenario it wouldn’t be the MHC organization which drives the speciation rate, as suggested by Malmstrøm et al. 1, but the other way around.
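The link between maintaining old allelic variation and founder population size can be made concrete with a simple sampling calculation. The sketch below is purely illustrative and rests on strong assumptions that are not taken from the cited work: all alleles are equally frequent in the source population, founders are drawn at random, and each diploid founder contributes two independent gene copies. The function names are hypothetical.

```python
from math import comb


def prob_all_alleles_sampled(k_alleles, n_copies):
    """Probability that n_copies random gene copies include all k equally frequent alleles.

    Uses inclusion-exclusion over the set of alleles that are missed.
    """
    if k_alleles < 1 or n_copies < 0:
        raise ValueError("need k_alleles >= 1 and n_copies >= 0")
    if n_copies < k_alleles:
        return 0.0  # fewer gene copies than alleles: some allele must be lost
    total = 0.0
    for j in range(k_alleles + 1):
        total += (-1) ** j * comb(k_alleles, j) * (1 - j / k_alleles) ** n_copies
    return min(1.0, max(0.0, total))  # guard against floating-point drift


def founders_needed(k_alleles, target=0.95):
    """Smallest number of diploid founders that retains all alleles with probability >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n_founders = 1
    while prob_all_alleles_sampled(k_alleles, 2 * n_founders) < target:
        n_founders += 1
    return n_founders


for k in (2, 5, 10, 20):
    print(f"{k:>2} alleles -> at least {founders_needed(k)} founders")
```

Under these assumptions the required number of founders grows with the number of allelic lineages that must be carried through the speciation event, which is the direction of the argument made above.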

Data availability

The data analyzed in this study are publicly available. Details are explained in Supplementary File 1, Supplementary File 2 and Supplementary File 3.

Funding Statement

The author(s) declared that no grants were involved in supporting this work.

[version 1; referees: 2 approved]

Supplementary material

Supplementary Table S1: Examples of sequence reads of major histocompatibility complex (MHC) class II system genes found in single read archive (SRA) datasets published by Malmstrøm et al. for Gadiformes and closely related fishes.

Supplementary File 1: List of sequence reads in SRA datasets of Gadiformes published by Malmstrøm et al. that match with major histocompatibility complex (MHC) class II system genes.

Supplementary File 2: Investigation of unitigs with (partial) major histocompatibility complex (MHC) class II system genes which are listed by Malmstrøm et al. in their Table S7 for the non-gadiform fishes S. chordatus, C. roseus, Z. faber, T. subterraneus, P. transmontana, and P. japonica.

Supplementary File 3: Detailed explanation of Table 1.

References

  • 1. Malmstrøm M, Matschiner M, Tørresen OK, et al.: Evolution of the immune system influences speciation rates in teleost fishes. Nat Genet. 2016;48(10):1204–10. 10.1038/ng.3645
  • 2. Borghans J, Keşmir C, de Boer RJ: MHC diversity in Individuals and Populations. In: Flower D, Timmis J, editors. In Silico Immunology. Springer, New York NY; 2007;177–195. 10.1007/978-0-387-39241-7_10
  • 3. Nei M, Rooney AP: Concerted and birth-and-death evolution of multigene families. Annu Rev Genet. 2005;39:121–52. 10.1146/annurev.genet.39.073003.112240
  • 4. Grimholt U, Tsukamoto K, Azuma T, et al.: A comprehensive analysis of teleost MHC class I sequences. BMC Evol Biol. 2015;15:32. 10.1186/s12862-015-0309-1
  • 5. Malmstrøm M, Jentoft S, Gregers TF, et al.: Unraveling the evolution of the Atlantic cod's (Gadus morhua L.) alternative immune strategy. PLoS One. 2013;8(9):e74004. 10.1371/journal.pone.0074004
  • 6. Nonaka MI, Aizawa K, Mitani H, et al.: Retained orthologous relationships of the MHC Class I genes during euteleost evolution. Mol Biol Evol. 2011;28(11):3099–112. 10.1093/molbev/msr139
  • 7. Miller KM, Li S, Ming TJ, et al.: The salmonid MHC class I: more ancient loci uncovered. Immunogenetics. 2006;58(7):571–89. 10.1007/s00251-006-0125-2
  • 8. Schümann J, Pittoni P, Tonti E, et al.: Targeted expression of human CD1d in transgenic mice reveals independent roles for thymocytes and thymic APCs in positive and negative selection of Valpha14i NKT cells. J Immunol. 2005;175(11):7303–10. 10.4049/jimmunol.175.11.7303
  • 9. Klein J, Sato A, Nikolaidis N: MHC, TSP, and the origin of species: from immunogenetics to evolutionary genetics. Annu Rev Genet. 2007;41:281–304. 10.1146/annurev.genet.41.110306.130137
  • 10. Doxiadis GG, de Groot N, Otting N, et al.: Haplotype diversity generated by ancient recombination-like events in the MHC of Indian rhesus macaques. Immunogenetics. 2013;65(8):569–84. 10.1007/s00251-013-0707-8
F1000Res. 2018 Aug 6. doi: 10.5256/f1000research.16766.r36540

Referee response for version 1

Jerzy K Kulski 1,2

The correspondence by Dijkstra & Grimholt 1  provides critical concerns about a publication by Malmstrøm et al in Nature Genetics in October 2016 2, concluding that their new model of MHC class I evolution, reflected in their title “Evolution of the immune system influences speciation rates in teleost fish”, is unsubstantiated. I concur with their three main criticisms “(1) that the authors did not provide sufficient evidence for presence or absence of intact functional MHC class I or MHC class II system genes, (2) that they did not discuss that an MHC subpopulation gene number alone is a very incomplete measure of MHC variance, and (3) that the MHC system is more likely to reduce speciation rates than to enhance them.”

All three critical points are well founded and stand-alone without too much need for further support. However, I have added the following 14-point commentary for Dijkstra & Grimholt 1, Malmstrøm et al (2016) 2 and others to consider and elaborate on if they would like to because they are important concerns in the field of MHC genomics, biological function and evolution.

1. According to Dijkstra & Grimholt 1 “the MHC counting strategy by Malmstrøm et al. 2 should be deemed incomplete and far too simplistic. For their number determination Malmstrøm et al. 2 solely relied on estimation of U plus Z lineage genomic α3 exon fragment numbers, despite that the typical “birth and death” mode of MHC evolution can produce many pseudogenes.” I agree that this is a major problem with the Malmstrøm et al (2016) 2 paper, one that also omits the other categories of the MHC I such as the L, S and P lineages that might contribute to a much larger number of MHC I-like genes. In this regard, it seems that Malmstrøm et al. 2 have taken only the genomic exon fragment numbers of the MHC I U and Z lineages to represent the entire immune system of their title.

2. Dijkstra & Grimholt 1 are also right to point out that low coverage sequencing by next generation sequencing (NGS) can result in the artifactual loss of genes, which in turn can lead to misleading or incorrect conclusions when counting gene copy numbers or looking for gene losses. The sequencing coverage of Malmstrøm et al (2016) 2 was 9 to 39x and they recovered only about 75% of the conserved eukaryotic genes. Therefore, in this situation, the de novo low coverage sequencing data should not have been used as evidence for the absence of genes from the genome without providing a properly organized high coverage map of genomic assemblies to show where the sequences are missing in the genomes. The reviewers and editors of the Malmstrøm et al (2016) 2 paper should have been aware of this basic problem of using low coverage NGS, particularly with respect to looking for a few needles in a haystack. For a better understanding of the advantages and disadvantages of MHC genotyping and haplotyping by NGS see the review by Shiina et al (2016) 3.

3. Malmstrøm et al (2016) 2 reported that there was no overall correlation between the combined MHC copy numbers and the HOX gene copy numbers that were used as a control. As already pointed out by Dijkstra & Grimholt 1, the U and Z genes should have been analysed separately. Nevertheless, it would have been interesting to see how these duplicated HOX development genes, which also have been implicated in driving speciation, compared with the properly classified duplicated MHC class I adaptive immune genes (separate Z, U, L, S and P lineages) at the classical and non-classical level in Fig 3 during speciation rate simulations 2. In addition, it appears from Fig S3 that Malmstrøm et al (2016) 2 might have missed an inverse relationship between MHC and HOX for MHC copy numbers up to 25 and a direct correlation between MHC and HOX for MHC copy numbers from 25 to 50.

4. It seems absurd to count up only the short sequences of α3 fragments from a low coverage sequence library and extrapolate the numbers counted in Fig 2b to reconstruct an artificial model for duplicated MHC gene copies influencing speciation or evolution without first knowing their categories (classical versus non-classical presentations), functions, overall structure and coding ability, transcriptional activity and genomic locations. Malmstrøm et al (2016) 2 provided no properly organized genomic assemblies or genomic gene maps and no information about the genomic distribution of the MHC I or II sequenced fragments or the duplication mechanisms involved. If they had done so, they might have added important information to better assess and place the threshold MHC I copy numbers and gene distributions into some sort of genomic perspective 4. More reliable models for the evolution of MHC class I genomic duplications might be achieved by providing duplication gene maps and the phylogenetic relationships of the duplicated gene sequences, showing likely duplication mechanisms, where and how these genes are located relative to each other, and how the genomic structures have changed in a comparative analysis between species. See Kulski et al. (2004) 5 for an example of one such duplication model. Mapping with phylogeny is a more informative approach than just constructing phylogenetic trees using one or more single exonic sequences from a limited number of species and then claiming that the changes in copy numbers influence the speciation rates of almost the entire number of extant fish species. Perhaps the Malmstrøm et al (2016) 2 low coverage sequence libraries could still be used effectively to reconstruct full-length gene structures and targeted genomic regions that harbour multiple copies of the MH(C) genes in a comparative analysis.

5. The multiple-optima Ornstein Uhlenbeck (OU) model 6

(a) According to Malmstrøm et al (2016) 2 the multiple-optima OU model vastly outperformed alternatives such as Brownian motion, white noise, single-peaked OU and early-burst models. This finding corroborated their hypothesis that MHC I copy number evolution is characterized by selection toward intermediate optima, resulting from a tradeoff between detection and elimination of pathogens. Presumably, the authors preferred the OU multiple-optima model to the other models because it supported rather than falsified their hypothesis. Of course, this highly artificial computing model did not detect a tradeoff between detection and elimination of pathogens; that is the authors' own biological hypothesis and bias. The OU model intrinsically sets optima (biases) according to its built-in algorithm 6, and this is one of the main objections to the use of this prediction model. The OU model artificially generates bias because its purpose is to find the trait optimum that stabilizes selection 6. The misguided conclusion by Malmstrøm et al (2016) 2 in using the OU model is that the trait optimum influences speciation.

(b) Interestingly, Malmstrøm et al (2016) 2  did not directly test the opposing hypothesis that the MHC I copy number evolution is not characterized by selection toward intermediate optima, and does not result from a tradeoff between detection and elimination of pathogens. Possibly, their best control in this regard was the simple Brownian model that did not work as well for them as the multiple-optima OU model that has extra parameters such as the addition of an overall optimum trait value to which all lineages are attracted. Other evolutionists often prefer the Brownian model for the reason that it is a simple, neutral model without the added bias of creating optimum trait value.

(c) The OU multiple-optima model is not a fool-proof algorithm, and a number of evolutionists believe that it can be an unreliable or misleading model. According to Cooper et al (2016) 7, although widely used, the properties of the OU model, and other direct extensions of the Brownian model, are poorly understood, leading to the potential for inappropriate use and misinterpretation of results. In particular, Cooper et al (2016) 7 used computer simulation studies to demonstrate that the single-peaked OU model error rates are unacceptably high when tree size is small (< 200 species tips), when likelihood ratio tests or the Akaike information criterion (AICc) are used to select the best model, and when measurement errors are introduced into the data. They also showed that when the alpha parameter of trait evolution was extremely small (<1) the OU model was indistinguishable from Brownian motion, that larger alpha values favoured OU prediction models, and that the largest values of alpha were indistinguishable from white noise, making the trait independent of phylogeny. The alpha values for the Malmstrøm et al (2016) 2 model selection analysis were markedly less than one (Supplementary Table 13), suggesting that they could have accepted the Brownian model over the OU model as the better model fit.
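The dependence on alpha described above follows from the standard single-optimum OU process, whose trait variance after time t is sigma^2/(2*alpha) * (1 - exp(-2*alpha*t)), compared with sigma^2*t under Brownian motion. The sketch below evaluates the ratio of the two and the commonly used phylogenetic half-life ln(2)/alpha. It is an illustration only: it assumes a tree rescaled to unit height, uses hypothetical alpha values rather than those fitted by Malmstrøm et al., and is not the code used in any of the cited studies.

```python
import math


def bm_variance(sigma2, t):
    """Trait variance accumulated under Brownian motion after time t."""
    return sigma2 * t


def ou_variance(alpha, sigma2, t):
    """Trait variance under a single-optimum OU process started from a fixed value."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 0:
        return bm_variance(sigma2, t)  # OU reduces exactly to Brownian motion
    return sigma2 / (2 * alpha) * -math.expm1(-2 * alpha * t)


def half_life(alpha):
    """Time for the expected trait value to move halfway to the optimum."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return math.log(2) / alpha


TREE_HEIGHT = 1.0  # assumption: tree rescaled to unit height
for alpha in (0.01, 1.0, 10.0, 100.0):  # hypothetical values
    ratio = ou_variance(alpha, 1.0, TREE_HEIGHT) / bm_variance(1.0, TREE_HEIGHT)
    print(f"alpha={alpha:>6}: OU/BM variance ratio={ratio:.3f}, "
          f"half-life={half_life(alpha):.3f} tree heights")
```

When alpha is small relative to the tree height the ratio stays close to 1, so the two models predict nearly the same trait distribution; when alpha is very large the half-life is a tiny fraction of the tree and almost no phylogenetic signal remains. Read the other way, a half-life of 23 million years would correspond to alpha = ln(2)/23 per million years under this definition.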

6. The BiSSE threshold model. 

Malmstrøm et al (2016) 2 carried out binary state speciation and extinction (BiSSE) analysis to estimate differences in diversification rate between lineages with high and low MHC I copy numbers. They found that diversification rates based on correlation estimates differed most when the threshold was placed between 20 and 25 copies (Fig. 3c). With a threshold in this range, the model with two separate speciation rates for lineages with high and low copy numbers was statistically better supported than a model with a single speciation rate parameter. On this basis, they concluded that, ‘These results suggest that the influence of MHC I genes on speciation rates is stronger in species that have already evolved at least 20 copies.’ In comparison, the number of MHC I gene copies in humans (excluding haplotype differences) is approximately 18 genes; 6 classical and non-classical MHC I genes, 5 CD1 genes, and 5 PHFZ genes (MICA, MICB, MR1, HFE, Zn-A2-GP, etc). Thus, in comparison to some fish species, humans are diversifying along very nicely as a ‘diversified’ species approaching the ‘magical’ threshold of between 20 and 25 MHC I copies.

It is noteworthy that Maddison et al (2007) 8 highlighted the following assumptions that need to be taken into account when using their BiSSE package. For the BiSSE model analysis none of the characters associated with speciation rates can be said to be causing or influencing evolution, even if Maddison et al (2007) 8 write, ‘the correct conclusion given a significant result using our method is that the character examined or a codistributed character appear to be controlling diversification rates.’ At best, the binary character state is an association, at worst a misleading one. Maddison et al (2007) 8 provided the following cautions and assumptions about the likelihood of the BiSSE (binary-state speciation and extinction) model:

(a) the transitions happen instantaneously over the time scales considered (i.e., ignore periods of time during which a species is polymorphic). 

(b) these events are independent of one another; in particular, the character state change does not, in and of itself, cause speciation (or vice versa).

(c) an accurate rooted phylogenetic tree with branch lengths is known (the "inferred tree") and the character state is known for each of the terminal taxa. 

(d) the tree is assumed complete: all extant species in the group have been found and included.  

(e) all terminal taxa are contemporaneous, and the tree is ultrametric (i.e., the total root-to-tip distance is the same for all tips). 
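Assumption (e) is simple to check before running an analysis of this kind. The following sketch tests whether all tips of a rooted tree lie at the same distance from the root. The tree representation (nested `(name, branch_length, children)` tuples), the function names and the toy trees are hypothetical choices made for this illustration; they do not correspond to the BiSSE package interface or to the tree used by Malmstrøm et al.

```python
def tip_depths(node, depth=0.0):
    """Return {tip name: root-to-tip distance} for a (name, length, children) tree."""
    name, length, children = node
    depth_here = depth + length
    if not children:
        return {name: depth_here}
    depths = {}
    for child in children:
        depths.update(tip_depths(child, depth_here))
    return depths


def is_ultrametric(tree, rel_tol=1e-9):
    """True if every tip is equally far from the root, within a relative tolerance."""
    depths = list(tip_depths(tree).values())
    return max(depths) - min(depths) <= rel_tol * max(depths)


# Hypothetical toy trees with branch lengths in arbitrary time units.
dated_tree = ("root", 0.0, [
    ("A", 2.0, []),
    ("n1", 1.0, [("B", 1.0, []), ("C", 1.0, [])]),
])
undated_tree = ("root", 0.0, [
    ("A", 2.0, []),
    ("n1", 1.0, [("B", 1.0, []), ("C", 1.5, [])]),
])

print(is_ultrametric(dated_tree))    # all tips at distance 2.0
print(is_ultrametric(undated_tree))  # tip C at distance 2.5
```

A time-calibrated tree of extant species should pass such a check; assumption (d), complete sampling of the group, cannot be verified from the tree alone and has to be judged against known species counts.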

I’m not convinced that Malmstrøm et al (2016) 2  considered or accepted these constraining assumptions when using BiSSE modeling.

7. Speciation and diversification rates. 

(a) In the study by Santini et al (2009) 9, the speciation rates within the Percomorpha clade were calculated to be at least ten times greater than in the Gadiformes order. Yet, according to Malmstrøm et al (2016) 2, there were fewer than 20 copies of the U genes for each of the 5 species in the Perciformes clade, compared to more than 20 copies and up to 100 copies of the U genes in 16 of 30 species in the Gadiformes order (Fig 2b, Malmstrøm et al 2016 2). The 5 species sampled from the Perciformes order grossly under-represent the ten thousand or more species found in that order 9. Moreover, in the Gadiformes order, there were closely related species (n=16) with > 20 U genes and different groups of closely related species (n=14) with < 20 U genes. Again, the Gadiformes order is grossly under-represented by the species that were sequenced. The Gadiformes comprises 10 families and more than 600 species 9, whereas Malmstrøm et al (2016) 2 sequenced only 27, i.e., 27 x 100/600 = 4.5% of the extant species, a percentage that is simply not good enough to support their extravagant conclusions. Is < 5% of the 600 species really representative of the Gadiformes? Malmstrøm et al (2016) 2 have to be more temperate with their conclusions when using such a small representative sample. There are clear inconsistencies with MHC I α3 fragment copy numbers in the Gadiformes order. The MHC I α3 fragment copy numbers are low (<20) for Moridae and M. occidentalis in Macrourinae, and for Phycinae, Lotinae, and three species in Gadinae. Five species of Gadinae have between 20 and 40 copies. On the other hand, Bregmacerotidae, Merlucciidae, Melanonidae, Muraenolepididae, and Trachyrincinae have MHC I α3 fragment copy numbers between 50 and 100. The threshold levels (20 to 25) are all over the place.
Moreover, the lineage, genomic block duplication and hitchhiking (linkage) effects on MHC gene duplications (8 to >100 copies) in the Gadiformes have not been taken into account in the analysis of speciation rates (Fig 3, Malmstrøm et al  2016 2), and, therefore, make the entire analysis unreliable.

(b) “Diversification rate analyses were calculated on the basis of the time-calibrated phylogeny and counts of species richness in each of the 37 mutually exclusive clades of teleost fishes" 2; mostly from the Gadiformes order. The MHC I speciation model of Malmstrøm et al (2016) 2 appears contradictory for the Perciformes (10,033 species), which have speciation rates 18 times greater than Gadiformes (555 species) 9 and yet have MHC I α3 fragment copy numbers at least two times lower than Gadiformes (Fig 2b, Malmstrøm et al 2016 2). Also, the Anabantiformes have 252 species, a speciation rate 40 times lower than Gadiformes 9, and yet their MHC I α3 fragment copy numbers are at least two times higher than in Perciformes.

(c) Considering that there are more than 29,000 species of teleost fishes 9, a highly limited analysis by Malmstrøm et al (2016) 2 using a sample group of roughly 0.2% of the available extant species cannot be considered to be statistically, taxonomically or biologically significant or sufficiently reliable to conclude that, “Evolution of the immune system influences speciation rates in teleost fish” 2. What does a species half-life of 25 million years mean in the context of 29,000 species of teleost fishes? If the multiple-regime OU model is wrong, highly biased or misinterpreted, then does it validate or support the overall hypothesis of Malmstrøm et al (2016) 2? Also, what does an optimal trait actually mean in the context of 29,000 species? If a suboptimal number of MHC I copies is detrimental to a species, then how have divergent species managed to survive for so long with a half-life of 25 million years of adaptation? Also, if, as Malmstrøm et al (2016) 2 say, ‘Such gene family expansions may promote biological diversification by introducing new raw genetic material, potentially resulting in sub- or neofunctionalization and thus novel immunological pathways.’, then which of the non-optimal (greater than or less than the threshold of 20-25 copies) MHC I genes are detrimental to the species? In this regard, there must be a gradation of functionally good and bad MHC I genes as their copy numbers approach the threshold (optimally good) and then deteriorate beyond it. Is this assumption of an MHC I copy number functional trait value as a quantitative marker of speciation at all testable?
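The sampling fractions discussed in point 7 can be recomputed directly from the counts quoted in this report and in the correspondence: 27 sequenced species out of more than 600 Gadiformes, and 66 sequenced species out of more than 29,000 teleosts. The sketch below treats those lower bounds as the totals, so the true percentages are somewhat smaller; the function name is hypothetical.

```python
def sampled_percent(sampled, total):
    """Percentage of a group's extant species that were sequenced."""
    if total <= 0 or sampled < 0 or sampled > total:
        raise ValueError("need 0 <= sampled <= total and total > 0")
    return 100.0 * sampled / total


# Counts as quoted in the text; totals are lower bounds ("more than").
gadiformes_pct = sampled_percent(27, 600)
teleost_pct = sampled_percent(66, 29_000)

print(f"Gadiformes sampled: {gadiformes_pct:.1f}%")
print(f"Teleosts sampled:   {teleost_pct:.2f}%")
```

With these inputs the Gadiformes fraction is 4.5% and the teleost fraction is about 0.23%.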

8. In their discussion, Malmstrøm et al (2016) 2 referenced the hypothesis of T cell depletion and hybrid fitness by Eizaguirre et al (2009) 10 and concluded that, "Our analyses identify this threshold at 20–25 MHC I copies, suggesting that the effect of T cell depletion on hybrid fitness becomes more pronounced in this range and that this might affect mate choice in species with copy numbers above this threshold, promoting inbreeding and reinforcement." Eizaguirre et al (2009) 10 suggested that, “Super-optimal individual MHC diversity should be a common disadvantage for species hybrids in vertebrates, resulting in elevated parasite loads.” In this regard, if high copy numbers of the MHC class I genes lead to hybridization and loss of the immune system as inferred by Eizaguirre et al (2009) 10, then this more than likely would lead to extinction of populations and species. Extinction would be the most extreme and bizarre form of immune system influence on speciation rates. Furthermore, it is extremely speculative for Malmstrøm et al (2016) 2 to say that copy numbers of the MHC class I genes above the threshold of 20 to 25 copies promote inbreeding and reinforcement, because there is no evidence for it. A more reasonable hypothesis is that high copy numbers of linked MHC class I genes, such as in rhesus macaque, or the mouse 11, or the cod 12, might help the species to adapt to microbial inhabitants in a greater variety of geographical environments, although the evidence for this is tenuous as well. Despite ongoing debates, the selective advantage of MHC diversity in host-pathogen coevolution might not be easily resolved (at the macroevolution level) because of the constant number of insults by large numbers of pathogens in the life-time of an individual organism, population or species and the arms race or Red Queen effect.
Studies on extant species will always discover an example of a pathogen associated with a polymorphic MHC gene that might favour selective advantage for host-pathogen coevolution, whereas the pathogen that caused the extinction of a species is rarely or never found. To conclude that the immune system (that is, different copy numbers of the class I MHC genes 2) influences speciation rates, it would have to be shown that the immunity gene products can commonly create reproductive barriers or genetic incompatibilities among populations that permit the maintenance of the genetic and phenotypic distinctiveness of these populations in geographical proximity 13; and this was not done 2.

9. Malmstrøm et al (2016) 2 did not provide any reliable evidence to support their speculation that evolution of the immune system influences speciation rates in teleost fish or that increasing MHC I diversity facilitates speciation 7. Instead, Malmstrøm et al (2016) 2 used their limited data and analyses using speculative models to jump to highly unsupported conclusions and quickly position the cart before the horse. Dijkstra & Grimholt 1 pointed out that the Malmstrøm et al (2016) 2 title “Evolution of the immune system influences speciation rates in teleost fish”, is unsubstantiated, and that their hypothesis seems to be “the wrong way around”. It should have been, “Speciation (rate) influences the evolution of the immune system in teleost fish.” Or, “Speciation rates are associated with diversity of MHC class I genes in teleost fishes”, which perhaps is too obvious and underwhelming. This is not simply the chicken or the egg causality dilemma; in fact, the change in title is better supported by the literature and the established theories of MHC genomic evolution in vertebrates 4. However, because it is less “sexy” and controversial than the original title, it might not have been so readily published.

10. A large number and variety of genome-wide gene duplications have been associated with speciation 13, that is, genomic gene duplications are not limited to class I MHC genes. If MHC I gene duplications effect or affect speciation, how do the other hundreds of gene duplications contribute to speciation rates? Also, do sequence variants or mutations in non-duplicated genes have any influence on speciation rates? It seems absurd to single out one group of gene duplications (e.g., MHC class I genes 2) as responsible for speciation and to ignore all the others as an inconvenience. For example, a relatively recent comparative genomic study revealed how genomes change with speciation by examining genomes from five cichlid fish species: an ancestral lineage from the Nile, and four species from the East African lakes Tanganyika, Malawi, and Victoria 14. Compared to the ancestral Nile lineage, the East African cichlid genomes had many alterations in regulatory elements, accelerated evolution of protein-coding elements in genes for pigmentation, an excess of gene duplications, and other distinct features that affect gene expression associated with transposable element insertions and novel microRNA. Each species also contains a reservoir of mutations different from the other species 14. Much of the diversity between the cichlid fish species evolved in a nonparallel manner, often rapidly, due to sexual selection and genetic conflicts between males and females or between different regions of the genome at a regulatory level 14 rather than by the slower and weaker forces of classical natural selection 13. If sexual selection and genetic conflict at the genomic regulatory level are the prime movers of speciation rate, it is difficult to conclude that the variable diversity of a few MHC gene copies is responsible for speciation as well as for the many other genomic changes associated with speciation.

11. Malmstrøm et al (2016) 2 informed us in the introduction section of their publication that "Our results highlight the plasticity of the vertebrate adaptive immune system and support the role of MHC genes as ‘speciation genes’, promoting rapid diversification in teleost fishes." MHC class I gene copy number variability occurs across many different species, families, orders and domains. Because there is such enormous variability in MHC class I gene copy number for hundreds 4 or possibly even thousands of different chordate species, it is not possible to conclude meaningfully that the expansion of MHC class I genes provides an undefined advantage of one species over another. For example, the great apes (humans, chimpanzees, gorillas and orangutans) have about six functional MHC class I genes, whereas the Old and New World monkeys often have up to 15 or more 4, 11. Is this evidence that the MHC class I genes influence the rate of speciation in primates? And if so, what does that really mean in the whole scheme of things? How do the species with low copy numbers of MHC class I genes survive so well over millions of years without the presence of another 90 to 100 copies of MHC class I genes? This question is often neglected, and yet it is important for a better understanding of the function and evolution of MHC genes between and within the vertebrate species.

12. Taxonomic and lineage markers.

Mutations, indels and duplications drive diversity and evolution. However, most mutated genes within species and their families do not create or influence speciation rates in the sense that Malmstrøm et al (2016) 2 use the term ‘speciation genes’. In comparative genomic analyses of sequence relationships between different species, most genomic sequences range between newly derived genes and the ultraconserved or essential core coding and noncoding genes, with varying amounts of sequence difference. Some genic and nongenic sequences such as the MHC genes and retrotransposons are highly polymorphic and therefore are useful taxonomic markers at the individual, population, species and broader lineage levels. The MHC gene sequences clearly are among these useful taxonomic or lineage markers, along with olfactory receptors, immunoglobulins, globins, HOX, TOLLs, KIRs, mitochondrial DNA, ribosomal RNA sequences and thousands of others that can be used comparatively in a phylogeny to examine the accuracy and reliability of current taxonomical rankings and sister lineages. However, because many thousands of coding and noncoding genes (or sequences) are variants (polymorphic) or vary in copy numbers, we do not immediately or easily imply that all or some of them are responsible for speciation without providing further concrete evidence. This kind of extrapolation without the burden of proof is absurd and wrong. Similarly, to say that the polymorphisms demonstrate natural selection, as if natural selection were a biological or molecular mechanism, is meaningless without showing experimentally how these polymorphisms benefit or disadvantage the organism over all the other different polymorphisms.

13. On the basis of either a priori or a posteriori reasoning, the immune system obviously affects the wellbeing of individuals and populations, but whether this can be extrapolated to speciation events and speciation rates remains highly dubious and most probably unlikely. It seems too farfetched to blame MHC class I genes with copy numbers over threshold levels for promoting inbreeding and reinforcement 2 because this in turn could create hybrid inviability or sterility resulting in postzygotic isolation. Although the population conditions in many models of rapid speciation do favour inbreeding and/or hybridization 13, none of the teleost species tested by Malmstrøm et al (2016) 2 were shown to be either inbreeding or in postzygotic isolation. The factors responsible for either prezygotic or postzygotic isolation are likely to be independent of the adaptive immune system, although zealots might argue otherwise. Hybridization between diverging lineages in postzygotic reproductive isolation can trigger genome instability. For most animals without an adaptive immune system and for plants without an MHC, speciation depends on the shrinkage, expansion and equilibrium (e.g., aneuploidization and dysploidy) of the genome and the containment and functionality of all the essential genomic information to develop an optimal balance between stability and plasticity within the organism in order to first survive and then propagate and expand itself as a new species 13. In those rare and ‘traumatic’ transitional situations, there is no need for particular ‘speciation’ genes such as variable copies of the class I MHC genes to influence speciation.
The rarely observed transition from population ‘trauma’ to a new speciation event depends on an array of totally different factors for creating postzygotic isolation events including interbreeding between semi-isolated populations and an elaboration on the existence of stress-induced changes in chromosomal and ploidy integrity both in hostile and non-hostile environments.

14. Finally, Malmstrøm et al (2016) 2 admirably sequenced 66 teleost species by a next generation sequencing method and identified an array of MHC I and MHC II exonic fragments for phylogenetic and speciation analysis using the multiple-regime OU model to predict the optimal MHC I copy number as an evolutionary trait optimum affecting speciation. However, the conclusions of the paper by Malmstrøm et al (2016) 2, especially for the MHC I gene copy numbers, are unreliable because they are based on far too many assumptions, speculations, contradictions, incomplete or missing data and unproven predictive models with little or no empirical evidence in support. Nevertheless, their simple, but controversial hypothesis is published, and now it is up to them and others to test its validity and "consider plausible alternative hypotheses in a firm hypothesis-testing framework in which alternative hypotheses make clear [and sensible] predictions of emerging patterns that can be unambiguously associated with particular models." 7

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

References

  • 1. Dijkstra J, Grimholt U: Major histocompatibility complex (MHC) fragment numbers alone – in Atlantic cod and in general - do not represent functional variability. F1000Research. 2018;7:963. doi: 10.12688/f1000research.15386.1
  • 2. Malmstrøm M, Matschiner M, Tørresen OK, Star B, Snipen LG, Hansen TF, Baalsrud HT, Nederbragt AJ, Hanel R, Salzburger W, Stenseth NC, Jakobsen KS, Jentoft S: Evolution of the immune system influences speciation rates in teleost fishes. Nat Genet. 2016;48(10):1204-10. doi: 10.1038/ng.3645
  • 3. Shiina T, Suzuki S, Kulski J: MHC Genotyping in Human and Nonhuman Species by PCR-based Next-Generation Sequencing. 2016. doi: 10.5772/61842
  • 4. Kulski JK, Shiina T, Anzai T, Kohara S, Inoko H: Comparative genomic analysis of the MHC: the evolution of class I duplication blocks, diversity and complexity from shark to man. Immunol Rev. 2002;190:95-122.
  • 5. Kulski JK, Anzai T, Shiina T, Inoko H: Rhesus macaque class I duplicon structures, organization, and evolution within the alpha block of the major histocompatibility complex. Mol Biol Evol. 2004;21(11):2079-91. doi: 10.1093/molbev/msh216
  • 6. Hansen TF: Stabilizing selection and the comparative analysis of adaptation. Evolution. 1997;51(5):1341-1351. doi: 10.1111/j.1558-5646.1997.tb01457.x
  • 7. Cooper N, Thomas GH, Venditti C, Meade A, Freckleton RP: A cautionary note on the use of Ornstein Uhlenbeck models in macroevolutionary studies. Biol J Linn Soc Lond. 2016;118(1):64-77. doi: 10.1111/bij.12701
  • 8. Maddison WP, Midford PE, Otto SP: Estimating a binary character's effect on speciation and extinction. Syst Biol. 2007;56(5):701-10. doi: 10.1080/10635150701607033
  • 9. Santini F, Harmon LJ, Carnevale G, Alfaro ME: Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes. BMC Evol Biol. 2009;9:194. doi: 10.1186/1471-2148-9-194
  • 10. Eizaguirre C, Lenz TL, Traulsen A, Milinski M: Speciation accelerated and stabilized by pleiotropic major histocompatibility complex immunogenes. Ecol Lett. 2009;12(1):5-12. doi: 10.1111/j.1461-0248.2008.01247.x
  • 11. Shiina T, Blancher A, Inoko H, Kulski JK: Comparative genomics of the human, macaque and mouse major histocompatibility complex. Immunology. 2017;150(2):127-138. doi: 10.1111/imm.12624
  • 12. Parham P: How the codfish changed its immune system. Nat Genet. 2016;48(10):1103-1104. doi: 10.1038/ng.3684
  • 13. Seehausen O, Butlin RK, Keller I, Wagner CE, Boughman JW, Hohenlohe PA, Peichel CL, Saetre GP, Bank C, Brännström A, Brelsford A, Clarkson CS, Eroukhmanoff F, Feder JL, Fischer MC, Foote AD, Franchini P, Jiggins CD, Jones FC, Lindholm AK, Lucek K, Maan ME, Marques DA, Martin SH, Matthews B, Meier JI, Möst M, Nachman MW, Nonaka E, Rennison DJ, Schwarzer J, Watson ET, Westram AM, Widmer A: Genomics and the origin of species. Nat Rev Genet. 2014;15(3):176-92. doi: 10.1038/nrg3644
  • 14. Brawand D, Wagner CE, Li YI, Malinsky M, Keller I, Fan S, Simakov O, Ng AY, Lim ZW, Bezault E, Turner-Maier J, Johnson J, Alcazar R, Noh HJ, Russell P, Aken B, Alföldi J, Amemiya C, Azzouzi N, Baroiller JF, Barloy-Hubler F, Berlin A, Bloomquist R, Carleton KL, Conte MA, D'Cotta H, Eshel O, Gaffney L, Galibert F, Gante HF, Gnerre S, Greuter L, Guyon R, Haddad NS, Haerty W, Harris RM, Hofmann HA, Hourlier T, Hulata G, Jaffe DB, Lara M, Lee AP, MacCallum I, Mwaiko S, Nikaido M, Nishihara H, Ozouf-Costaz C, Penman DJ, Przybylski D, Rakotomanga M, Renn SCP, Ribeiro FJ, Ron M, Salzburger W, Sanchez-Pulido L, Santos ME, Searle S, Sharpe T, Swofford R, Tan FJ, Williams L, Young S, Yin S, Okada N, Kocher TD, Miska EA, Lander ES, Venkatesh B, Fernald RD, Meyer A, Ponting CP, Streelman JT, Lindblad-Toh K, Seehausen O, Di Palma F: The genomic substrate for adaptive radiation in African cichlid fish. Nature. 2014;513(7518):375-381. doi: 10.1038/nature13726
F1000Res. 2018 Jul 27. doi: 10.5256/f1000research.16766.r36541

Referee response for version 1

Brian Dixon 1

The critique of Malmstrøm et al. presented here makes some very valid points that are well supported by the literature.

It has long been true in fish MHC research that the fact that a gene has not been reported to be present in a particular species does not mean that it is not present. Modern genomics techniques have presented better proof for this assertion, with the lack of an MHCII/CD4 pathway in gadids being the most prominent example, but even modern genomics techniques are not ironclad, 100% proof and should be checked very carefully before definitive statements are made. Thus the comments about verifying the presence or absence of specific genes in the numerous species by other means are valid.

Additionally, the treatment of all U and Z genes as identical units while ignoring the allelic diversity of each gene within those classes is indeed a serious flaw in the reasoning of Malmstrøm et al. There is significant variability in the diversity of U gene families, which will have differing effects on T cell selection that simply counting gene numbers will not address.

Dijkstra and Grimholt's critique should be carefully read and addressed.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2018 Jul 19. doi: 10.5256/f1000research.16766.r35833

Referee response for version 1

Anthony B Wilson 1,2

Dijkstra & Grimholt present a critical analysis of Malmstrøm et al.'s 2016 Nature Genetics article 1, which investigated the evolution of MHC I and II loci in gadiform fishes using a low coverage genomic screen of 66 species, inferring a link between adaptive immune evolution and speciation rates in this group. Dijkstra & Grimholt’s criticisms are wide ranging – I deal with each of their major areas of concern below:

I. MHC Class II loss in gadiform fishes. The authors highlight two serious flaws in the Malmstrøm analysis, demonstrating that the original dataset contains sequence reads of MHC II and associated loci in several species that were overlooked in the original analysis. Equally importantly, datasets for several of the outgroup taxa lack these genes, raising questions concerning the reliability of the underlying data. Malmstrøm et al.'s genomic screen is understandably low coverage given the taxonomic breadth of their survey, but I agree with Dijkstra & Grimholt that based on the existing evidence, one cannot confidently infer the timing of MHC II gene loss in this group.

II. MHC I allele counting strategy. Dijkstra & Grimholt challenge the allele counting strategy used by Malmstrøm et al, particularly their focus on U & Z loci (teleost fish have at least 5 different MHC I lineages 2), based on their assumption that these loci are chiefly involved in binding peptide ligands. While I agree that grouping U & Z loci together simplifies their known functional complexity (I was rather confused by this approach myself when reading the original paper), here I feel that Dijkstra & Grimholt could be more constructive in their criticism. At present, it's not entirely clear what type of analysis they feel would be most suitable. I would also suggest providing slightly more context on the study system to assist readers who may be unfamiliar with the original work.

While Dijkstra & Grimholt have elsewhere provided compelling evidence that Z loci may have a very different function, it's not clear whether they’re suggesting that Malmstrøm et al. should have focused solely on U loci, or whether it would have been more appropriate to include all MHC lineages in their analyses. Either way, I would have liked to see whether analyzing the data in the manner preferred by the authors would impact the conclusions of the original article.

I agree that experimental evidence would be necessary to conclusively demonstrate a link between allelic diversity and function, but given the taxonomic breadth of Malmstrøm et al.'s study, surely they wouldn’t expect experimental evidence for all species included in the original study. How much experimental evidence would they deem sufficient? At present, it's not clear whether they’re simply suggesting that Malmstrøm et al. should have been more circumspect in their conclusions, or whether they feel that the results of the analysis are entirely unreliable. Clarification of this point is essential.

III. Testing the relationship between MHC allelic diversity and speciation rates in gadiform fishes. The authors raise concerns about the modelling approach used by Malmstrøm et al., including their combined analysis of U and Z loci (see above), and their lack of a biologically realistic model of gene evolution, incorporating MHC gene gain and loss 3 – I agree with these criticisms. I do, however, take some issue with their contention that Malmstrøm et al.'s hypothesis is wholly invalid. While there is indeed strong evidence of trans-species MHC polymorphism in some well-studied vertebrate lineages, this does not invalidate an experimental test of an alternative hypothesis. If Dijkstra & Grimholt feel that Malmstrøm et al. have their hypothesis “the wrong way around”, are there any data/analyses that could convince them otherwise?

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

References

  • 1. Malmstrøm M, Matschiner M, Tørresen OK, Star B, Snipen LG, Hansen TF, Baalsrud HT, Nederbragt AJ, Hanel R, Salzburger W, Stenseth NC, Jakobsen KS, Jentoft S: Evolution of the immune system influences speciation rates in teleost fishes. Nat Genet. 2016;48(10):1204-10. doi: 10.1038/ng.3645
  • 2. Grimholt U, Tsukamoto K, Azuma T, Leong J, Koop BF, Dijkstra JM: A comprehensive analysis of teleost MHC class I sequences. BMC Evol Biol. 2015;15:32. doi: 10.1186/s12862-015-0309-1
  • 3. Nei M, Rooney AP: Concerted and birth-and-death evolution of multigene families. Annu Rev Genet. 2005;39:121-52. doi: 10.1146/annurev.genet.39.073003.112240

Associated Data

    Data Availability Statement

    The data analyzed in this study are publicly available. Details are explained in Supplementary File 1, Supplementary File 2 and Supplementary File 3.

