Abstract
Gene duplication can cause problems for a cell, but it can also confer benefits. Beneficial or not, gene duplicates may be retained in the genome for several reasons. For instance, if natural selection optimizes one function of a gene at the expense of another, gene duplication could resolve this conflict by separating the two functions into two genes. Here, we outline evolutionary incentives to keep a duplicated gene in the genome, focusing on divergence in expression and trade-off resolution as featured in a new paper published in this issue of PLOS Biology.
Gene duplication can lead to problems for a cell, but can also afford a range of opportunities. This Primer explores the evolutionary incentives to keep a duplicated gene in the genome, focusing on a recent demonstration that duplication can help resolve conflicts in gene regulation.
Genes have life cycles of their own. Most of the time, they are born from the duplication of other genes (Box 1) and may eventually die and become pseudogenes. During the period between birth and death, the sequence and regulatory elements of a new gene change through mutations. This dynamic gain and loss of genes and the associated changes to regulation and function contribute to phenotypic differences between species and among populations of the same species [1]. Numerous studies over the past 50 years have investigated the role of evolutionary forces such as natural selection and drift in shaping these life cycles, for example, by investigating the contribution of nucleotide and amino acid substitutions to the divergence of new genes. One category of molecular changes that appears to play a key role in the evolution of genes that originate from gene duplication (duplicates or paralogs) is regulatory changes, i.e., changes in the gene itself or elsewhere in the genome that determine when, where, and at what level a gene is transcribed and translated.
Box 1
Gene duplicates originate mainly by two mechanisms: small-scale duplication (SSD) and whole-genome duplication (WGD) [30]. In SSDs, only one or a few genes are duplicated, whereas in WGD, all genes are duplicated simultaneously. These two mechanisms have specific features that influence the retention of duplicates, which in turn influences the properties of genes that originated from either mechanism. One of the key differences is that SSD genes first originate in a single individual and must increase in frequency by drift or selection to be maintained. WGD would also occur in one individual, but it could potentially incite or co-occur with a speciation event [31], which would coincide with a population bottleneck and thus the fixation of all duplicates without the need for natural selection. However, WGDs have been associated with performance traits in plants, for instance [32], which means natural selection can also favor their fixation.
There is a major difference between SSDs and WGDs if we consider interactions among gene products, for instance, among proteins that form complexes. WGD will likely maintain the stoichiometric balance of the complexes, whereas the duplication of a single subunit through SSD would perturb the balance [33]. After a WGD, this principle predicts the preferential maintenance of genes whose products are dosage sensitive, because the loss of one copy would perturb the balance and lead to a fitness defect. The idea that WGD genes are more dosage-balance sensitive [34] is supported by observations that they have fewer copy-number variations in human populations [35] and are overrepresented among genes with copy-number variations that are pathogenic [36]. The properties of genes may therefore influence the probability that their duplicates are maintained after SSDs or WGDs, thereby determining the extent of novelty that can evolve from these mechanisms. In the case studied by Chapal and colleagues [25], the Msn duplicates show a fitness trade-off when expression is increased and are thus dosage sensitive, which suggests that their duplication may initially have been maintained specifically because it originated via a WGD.
Regulatory evolution and the maintenance of gene duplicates
The immediate effect of gene duplication is typically an increase in gene dosage [4] (Fig 1). Higher dosage, however, does not always translate into increased fitness [5]. This means that at this stage, natural selection could favor gene retention or loss, or, if the expression change is effectively neutral, the duplicate could evolve neutrally for extended periods of time (Box 1). If an increase in total expression is favored by internal or external conditions, a gene duplication could provide an immediate benefit. For example, genes coding for digestive enzymes, such as amylases that hydrolyze starch, vary in copy number among human populations. Copy number correlates with diet such that high-starch diets are associated with more copies, whereas low-starch diets are associated with fewer copies [6]. Diet is such a strong selective force that multiple copies have been maintained in many mammals [7]. Selection for higher dosage can sometimes lead to the maintenance of a large number of gene copies. An extreme example is the hundreds of duplicated copies of ribosomal RNA genes in certain microbial genomes; this high copy number is most likely adaptive because some life-history traits demand more protein synthesis machinery [8]. When paralogs are maintained because of dosage effects, the individual copies do not need to gain new functions.
Regulatory evolution could also favor the retention of a new gene by changing the tissue or timing of expression, a process called neofunctionalization (Fig 1). The duplicate’s newly gained expression pattern would favor its retention by contributing a new function to, for instance, a tissue. The gain of new expression specificity can also be accompanied by and facilitate the gain of new molecular functions at the protein level. Indeed, the change of cellular context for the protein can represent new opportunities for selective forces to act on the protein itself. The retinoic acid receptors (RARs) have evolved following this path in vertebrates. RARs are nuclear receptors that are bound by specific ligands and that activate the transcription of genes during key developmental steps. Three paralogs originated from the whole-genome duplications (WGDs) at the origin of vertebrates, two of which have evolved new ligand-binding specificities and expression patterns during development [9].
Although it is easy to conceive that natural selection may favor the maintenance of gene duplicates because of dosage effects or new regulatory programs, it may be less intuitive that gene duplicates could be maintained by degenerative mutations that lead to the “specialization” of each duplicate (subfunctionalization, Fig 1). The theory behind this mechanism was formally derived by Lynch, Force, and colleagues [10,11]. Briefly, the model showed that if a gene has multiple functions or tissues of expression, its duplication could be followed by the loss of different functions in each copy while still preserving all the ancestral functions. Because the two genes are now both required to perform the functions previously performed by the single progenitor, natural selection will act to maintain both copies. The power of this model is that it does not require the evolution of new and adaptive functions, which may be inaccessible for many genes and thus could not explain why their duplicates would be maintained. Regulatory subfunctionalization was recently hypothesized to occur at the level of alternative splicing and subcellular localization of the plastid ascorbate peroxidase in plants, a key detoxifying enzyme. Some plants have a single gene that produces two distinct proteins by alternative splicing that localize in different cell compartments; others have two independent genes, each producing a single protein that localizes to one compartment or the other [12].
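The logic of this model can be illustrated with a toy simulation. The sketch below is not the formal model of [10,11]. It assumes, for illustration only, that each degenerative mutation removes one subfunction from one copy, that a mutation removing the last remaining copy of a subfunction is eliminated by selection, and that mutations inactivating an entire copy are ignored. The function names and parameters are hypothetical.

```python
import random


def resolve_duplicate_pair(n_subfunctions, rng):
    """Return True if both copies end up required (subfunctionalization)."""
    # Each copy starts with every ancestral subfunction intact.
    copies = [set(range(n_subfunctions)), set(range(n_subfunctions))]
    # Continue until every subfunction is carried by exactly one copy.
    while copies[0] & copies[1]:
        hit_copy = rng.randrange(2)
        hit_function = rng.randrange(n_subfunctions)
        other = 1 - hit_copy
        # Assumption: a loss is tolerated only if the other copy still
        # provides the subfunction; otherwise selection removes the mutation.
        if hit_function in copies[hit_copy] and hit_function in copies[other]:
            copies[hit_copy].discard(hit_function)
    # Both copies are needed only if each retains at least one subfunction.
    return bool(copies[0]) and bool(copies[1])


def preservation_frequency(n_subfunctions, n_trials=5000, seed=0):
    """Fraction of simulated pairs in which both copies remain necessary."""
    rng = random.Random(seed)
    kept = sum(resolve_duplicate_pair(n_subfunctions, rng) for _ in range(n_trials))
    return kept / n_trials


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, preservation_frequency(k))
```

In this toy, each subfunction ends up in either copy with equal probability, so the fraction of pairs in which both copies remain necessary is expected to approach 1 − 2^(1−k) for k subfunctions. More ancestral subfunctions make preservation more likely, without any gain of new function.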
Recent work showed that subfunctionalization could also take place at the level of gene dosage (dosage subfunctionalization, Fig 1), without necessarily implying the loss of other molecular functions such as tissue or timing specificity of expression. Gout and Lynch [13] showed that natural selection to maintain the expression level of a gene could act on the total expression of a gene pair rather than on each copy individually. This allows the two genes to accumulate mutations that change their expression levels without being filtered by natural selection, as long as total expression is maintained. For instance, if one copy accumulates mutations that reduce expression, then the other copy could accumulate mutations that increase expression, all the while maintaining the total expression level. Eventually, one copy could be expressed at such a low level that its loss would be effectively neutral.
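A minimal sketch of this drift in relative expression is given below. It is a deliberately simplified random walk, not the population-genetic model of Gout and Lynch [13]: it assumes that total expression is held exactly constant, that each mutational step shifts a small, normally distributed share of expression from one copy to the other, and that a copy becomes effectively dispensable once its share falls below an arbitrary threshold. All names and parameter values are hypothetical.

```python
import random


def drift_expression_shares(threshold=0.05, step_sd=0.02,
                            max_steps=100_000, seed=0):
    """Random walk of the share of total expression contributed by copy A.

    Returns (steps, share_a, share_b); steps is None if neither copy
    dropped below the threshold within max_steps.
    """
    rng = random.Random(seed)
    share_a = 0.5  # both copies start with equal expression
    for step in range(1, max_steps + 1):
        # Assumption: compensatory changes keep the total fixed, so only
        # the split between the two copies drifts.
        share_a = min(1.0, max(0.0, share_a + rng.gauss(0.0, step_sd)))
        share_b = 1.0 - share_a
        if min(share_a, share_b) < threshold:
            return step, share_a, share_b
    return None, share_a, 1.0 - share_a


if __name__ == "__main__":
    steps, share_a, share_b = drift_expression_shares()
    print(f"steps={steps}, copy A share={share_a:.3f}, copy B share={share_b:.3f}")
```

Because only the total is constrained in this sketch, the split between the copies wanders freely until one copy contributes almost nothing, which is the point at which its loss would be expected to have little effect on total expression.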
Other dimensions of gene expression regulation
The examples mentioned thus far consider only a few dimensions of gene regulation (level, timing, localization). Yet gene expression systems are highly dynamic, and other features may contribute to the evolution and retention of gene duplicates. Two important expression features are (1) responsiveness, which refers to the magnitude of, and propensity for, changes in gene expression levels in response to intra- and extracellular signals, and (2) expression noise. Responsiveness has mostly been studied in single-celled organisms such as yeast, in which expression level has been studied in entire populations across hundreds of growth and stress conditions as well as at the single-cell level using various reporters. There are important differences in the sensitivity of genes to environmental changes and mutations: some genes rarely change expression levels, whereas others do so easily [14]. Interestingly, responsiveness also appears to correlate with divergence of gene expression levels among species [15]. More responsive genes show more differences in expression regulation between species. This observation suggests that responsiveness could be a gene property that favors divergence between species. However, this is not a universal trend, as responsiveness could also be selected against for the many genes that instead require stable dosage [16].
Expression noise appears to be strongly associated with responsiveness. Noise is linked to the architecture of the genes themselves and is manifested by expression differences among cells that are genetically identical. Noisiness is not always easily assessed because it requires cells to be examined individually. Although some studies have suggested that noise in gene expression could be advantageous [17], it is also likely to be deleterious because it prevents a large fraction of a population from attaining the optimal expression level at a given time [18]. Attesting to the importance of low noise, essential genes and (most importantly) those that reduce fitness when their dosage is reduced (haploinsufficient) tend to be less noisy than genes that do not show measurable effects on fitness upon deletion [19,20]. Furthermore, the study of the fitness consequences of noise and changes in average expression has revealed that noise could, in some cases, be as detrimental as changes in mean expression [21,22].
Nevertheless, gene expression noise is prevalent. This prevalence could be explained by the fact that for some gene classes and promoter architectures (those with a TATA box), responsiveness and noise seem to be intrinsically coupled [23]. Although natural selection may favor responsiveness, the inability of cells to reach a precise expression level comes as an unavoidable cost. This correlation between responsiveness and noise was detailed by Lehner [24], who also suggested that the trade-off could be alleviated by gene duplication because it would allow the system to maintain responsiveness while reducing noise. If the deviations of the two genes from the optimal expression level are not correlated, their average expression will be closer to the optimal level than the expression of an individual gene with the same average expression and noise level. Gene duplication in this case would allow for two responsive genes but with reduced absolute noise. The consideration of gene regulation at the single-cell level therefore allowed geneticists to uncover potential mechanisms for the maintenance of gene duplicates. However, a detailed example of the role these features of gene expression play in the maintenance of duplicates was yet to come.
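The averaging argument can be checked numerically. The sketch below assumes, purely for illustration, that expression in each cell is normally distributed around the same mean with the same standard deviation for every gene, and that the two duplicates fluctuate independently. The means, standard deviations, and function names are hypothetical and are not taken from [24] or [25].

```python
import random
import statistics


def simulate_noise(mean=100.0, sd=20.0, n_cells=20_000, seed=1):
    """Compare cell-to-cell spread for one gene vs. the average of two.

    Returns (sd_single, sd_pair_average), both estimated across cells.
    """
    rng = random.Random(seed)
    # One gene per cell.
    single = [rng.gauss(mean, sd) for _ in range(n_cells)]
    # Two independent genes per cell with the same mean and noise;
    # the quantity of interest is their average expression.
    pair_average = [
        (rng.gauss(mean, sd) + rng.gauss(mean, sd)) / 2
        for _ in range(n_cells)
    ]
    return statistics.pstdev(single), statistics.pstdev(pair_average)


if __name__ == "__main__":
    sd_single, sd_pair = simulate_noise()
    print(f"single gene SD: {sd_single:.2f}")
    print(f"two-gene average SD: {sd_pair:.2f}")
    print(f"ratio: {sd_pair / sd_single:.3f}")
```

Under these assumptions, the spread of the two-gene average is smaller than that of a single gene by a factor of about 1/√2, because independent deviations partly cancel. If the two copies fluctuated together, this benefit would shrink.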
Single-cell biology offers a new perspective on the role of regulatory evolution in the retention of gene duplicates
This issue of PLOS Biology features an elegant example [25] of a gene duplication that did not result in neofunctionalization or subfunctionalization as typically defined. The transcription factor Msn was duplicated during the WGD in the budding yeast Saccharomyces cerevisiae and has since diverged into Msn2 and Msn4. Previous studies of these two genes have differed in their conclusions as to the divergence of function between them. Despite suggestions that these two transcription factors may have diverged in function [27], Chapal and colleagues provide convincing evidence that they regulate the same target genes. That raises the question: How and why would yeast have kept these paralogs for the last 100 million years?
Chapal and colleagues bring forward compelling evidence that these two transcription factors are cooperating in the cell to minimize growth speed defects while maximizing stress responsiveness. The authors show that higher levels of Msn2 are detrimental to the growth of the cells but beneficial when cells are in stressful conditions. They propose this simple trade-off between growth speed and environmental responsiveness as the incentive for the retention of the two copies of Msn, even though they have the same target genes. In accordance with this, Chapal and colleagues [25] found that Msn2 has a low but steady expression with little noise, whereas Msn4 is environmentally responsive with a high level of noise. This allows for a regulatory dynamic that solves the conflict between a dynamic response, which comes with the trade-off of noisy expression, and a steady number of proteins in the cell during nonstress conditions (Fig 2). The expression of Msn2 does not change during the growth of a population, whereas Msn4 increases gradually along the growth curve and with it, the resistance to stress.
Interestingly, they compared the expression of the two paralogs with that of an ortholog from Kluyveromyces lactis, which diverged from S. cerevisiae before the WGD event (Box 1), and found that it had an expression profile intermediate between those of Msn2 and Msn4. The ortholog was induced throughout the growth curve, although at a lower level than Msn4, and its noisiness was intermediate between the two (Fig 2B). The authors suggest the following scenario: After the WGD, Msn2 gained a more stable expression when its transcription start site moved farther away from the open reading frame, to the boundary of a nucleosome-free region. Msn4, on the other hand, increased its dynamic range and noise by gaining new transcription factor binding sites.
The model proposed by this new study [25] does not, strictly speaking, describe a case of subfunctionalization, because the initial model by Lynch and colleagues [10,11] does not require that the division of labor occurs with a gain in fitness. The case documented here rather suggests that division of labor allows for a gain in fitness by resolving a trade-off, as has been proposed for other pairs of paralogs that may have conflicting protein functions [28,29]. It is unclear how frequent this form of adaptive subfunctionalization is, given that many more types of conflicts may exist between the different functions of a given gene and may not necessarily be resolvable by simple mutational events. All cases of putative subfunctionalization may need to be dissected in detail, as Chapal and colleagues did, to determine whether what appears to be a simple division of labor is in fact accompanied by an exquisite functional specialization.
Acknowledgments
We thank Angel Cisneros, Simon Aubé, Damien Biot-Pelletier, Anna Fijarczyk, David Bradley, and Laurence Hurst for comments.
Abbreviations
- RAR: retinoic acid receptor
- SSD: small-scale duplication
- WGD: whole-genome duplication
Funding Statement
This work was funded by a Human Frontier Science Program fellowship (LT000182/2019-L) to JH and a Canadian Institutes of Health Research (CIHR) Foundation grant (387697) to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Footnotes
Provenance: Commissioned; externally peer reviewed.
References
- 1. Chen S, Krinsky BH, Long M. New genes as drivers of phenotypic evolution. Nat Rev Genet. 2013;14: 645–660. doi:10.1038/nrg3521
- 2. Loehlin DW, Carroll SB. Expression of tandem gene duplicates is often greater than twofold. Proc Natl Acad Sci U S A. 2016;113: 5988–5992. doi:10.1073/pnas.1605886113
- 3. Dephoure N, Hwang S, O’Sullivan C, Dodgson SE, Gygi SP, Amon A, et al. Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast. Elife. 2014;3: e03023. doi:10.7554/eLife.03023
- 4. Ohno S. Other Mechanisms for Achieving Gene Duplication. In: Ohno S, editor. Evolution by Gene Duplication. Berlin, Heidelberg: Springer Berlin Heidelberg; 1970. pp. 107–110.
- 5. Moriya H, Makanae K, Watanabe K, Chino A, Shimizu-Yoshida Y. Robustness analysis of cellular systems using the genetic tug-of-war method. Mol Biosyst. 2012;8: 2513–2522. doi:10.1039/c2mb25100k
- 6. Perry GH, Dominy NJ, Claw KG, Lee AS, Fiegler H, Redon R, et al. Diet and the evolution of human amylase gene copy number variation. Nat Genet. 2007;39: 1256–1260. doi:10.1038/ng2123
- 7. Pajic P, Pavlidis P, Dean K, Neznanova L, Romano R-A, Garneau D, et al. Independent amylase gene copy number bursts correlate with dietary preferences in mammals. Elife. 2019;8. doi:10.7554/eLife.44628
- 8. Nelson JO, Watase GJ, Warsinger-Pepe N, Yamashita YM. Mechanisms of rDNA Copy Number Maintenance. Trends Genet. 2019. doi:10.1016/j.tig.2019.07.006
- 9. Escriva H, Bertrand S, Germain P, Robinson-Rechavi M, Umbhauer M, Cartry J, et al. Neofunctionalization in vertebrates: the example of retinoic acid receptors. PLoS Genet. 2006;2: e102. doi:10.1371/journal.pgen.0020102
- 10. Lynch M, Force A. The probability of duplicate gene preservation by subfunctionalization. Genetics. 2000;154: 459–473.
- 11. Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999;151: 1531–1545.
- 12. Qiu Y, Van Tay Y, Ruan Y, Adams KL. Divergence of duplicated genes by repeated partitioning of splice forms and subcellular localization. New Phytol. 2019. doi:10.1111/nph.16148
- 13. Gout J-F, Lynch M. Maintenance and Loss of Duplicated Genes by Dosage Subfunctionalization. Mol Biol Evol. 2015;32: 2141–2148. doi:10.1093/molbev/msv095
- 14. Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. Genetic properties influencing the evolvability of gene expression. Science. 2007;317: 118–121. doi:10.1126/science.1140247
- 15. Tirosh I, Weinberger A, Carmi M, Barkai N. A genetic signature of interspecies variations in gene expression. Nat Genet. 2006;38: 830–834. doi:10.1038/ng1819
- 16. Duveau F, Yuan DC, Metzger BPH, Hodgins-Davis A, Wittkopp PJ. Effects of mutation and selection on plasticity of a promoter activity in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 2017;114: E11218–E11227. doi:10.1073/pnas.1713960115
- 17. Zhang Z, Qian W, Zhang J. Positive selection for elevated gene expression noise in yeast. Mol Syst Biol. 2009;5: 299. doi:10.1038/msb.2009.58
- 18. Wang Z, Zhang J. Impact of gene expression noise on organismal fitness and the efficacy of natural selection. Proc Natl Acad Sci U S A. 2011;108: E67–76. doi:10.1073/pnas.1100059108
- 19. Fraser HB, Hirsh AE, Giaever G, Kumm J, Eisen MB. Noise minimization in eukaryotic gene expression. PLoS Biol. 2004;2: e137. doi:10.1371/journal.pbio.0020137
- 20. Batada NN, Hurst LD. Evolution of chromosome organization driven by selection for reduced gene expression noise. Nat Genet. 2007;39: 945–949. doi:10.1038/ng2071
- 21. Schmiedel JM, Carey LB, Lehner B. Empirical mean-noise fitness landscapes reveal the fitness impact of gene expression noise. Nat Commun. 2019;10: 3180. doi:10.1038/s41467-019-11116-w
- 22. Metzger BPH, Yuan DC, Gruber JD, Duveau F, Wittkopp PJ. Selection on noise constrains variation in a eukaryotic promoter. Nature. 2015;521: 344–347. doi:10.1038/nature14244
- 23. Blake WJ, Kærn M, Cantor CR, Collins JJ. Noise in eukaryotic gene expression. Nature. 2003;422: 633–637. doi:10.1038/nature01546
- 24. Lehner B. Conflict between noise and plasticity in yeast. PLoS Genet. 2010;6: e1001185. doi:10.1371/journal.pgen.1001185
- 25. Chapal M, Mintzer S, Brodsky S, Carmi M, Barkai N. Resolving noise-control conflict by gene duplication. PLoS Biol. 2019;17(11): e3000289. doi:10.1371/journal.pbio.3000289
- 26. Moriya H, Shimizu-Yoshida Y, Kitano H. In vivo robustness analysis of cell division cycle genes in Saccharomyces cerevisiae. PLoS Genet. 2006;2: e111. doi:10.1371/journal.pgen.0020111
- 27. Kuang Z, Pinglay S, Ji H, Boeke JD. Msn2/4 regulate expression of glycolytic enzymes and control transition from quiescence to growth. Elife. 2017;6. doi:10.7554/eLife.29938
- 28. Storz JF. Genome evolution: gene duplication and the resolution of adaptive conflict. Heredity. 2009;102: 99–100. doi:10.1038/hdy.2008.114
- 29. Piatigorsky J. The recruitment of crystallins: new functions precede gene duplication. Science. 1991;252: 1078–1079. doi:10.1126/science.252.5009.1078
- 30. Ohno S. Evolution by Gene Duplication. Berlin: Springer Berlin Heidelberg; 2014.
- 31. Vanneste K, Maere S, Van de Peer Y. Tangled up in two: a burst of genome duplications at the end of the Cretaceous and the consequences for plant evolution. Philos Trans R Soc Lond B Biol Sci. 2014;369. doi:10.1098/rstb.2013.0353
- 32. del Pozo JC, Ramirez-Parra E. Whole genome duplications in plants: an overview from Arabidopsis. J Exp Bot. 2015;66: 6991–7003. doi:10.1093/jxb/erv432
- 33. Papp B, Pál C, Hurst LD. Dosage sensitivity and the evolution of gene families in yeast. Nature. 2003;424: 194–197. doi:10.1038/nature01771
- 34. Qian W, Zhang J. Gene dosage and gene duplicability. Genetics. 2008;179: 2319–2324. doi:10.1534/genetics.108.090936
- 35. Makino T, McLysaght A. Ohnologs in the human genome are dosage balanced and frequently associated with disease. Proc Natl Acad Sci U S A. 2010;107: 9270–9274. doi:10.1073/pnas.0914697107
- 36. McLysaght A, Makino T, Grayton HM, Tropeano M, Mitchell KJ, Vassos E, et al. Ohnologs are overrepresented in pathogenic copy number mutations. Proc Natl Acad Sci U S A. 2014;111: 361–366. doi:10.1073/pnas.1309324111