Proceedings of the National Academy of Sciences of the United States of America. 2019 May 29;116(24):11866–11871. doi: 10.1073/pnas.1900437116

Why haploinsufficiency persists

Summer A Morrill a,b, Angelika Amon a,b,c,d,1
PMCID: PMC6575174  PMID: 31142641

Significance

For most genes, a single copy is enough to support normal growth and development of diploid organisms, but a small subset of genes known as haploinsufficient (HI) genes exhibit extreme sensitivity to decreased gene dosage. Given the relatively high frequency of gene-inactivating mutations over the lifespan of an organism, and cell-to-cell variability in gene expression, haploinsufficiency represents a significant barrier to organismal fitness. Why the expression of these genes has not been modulated over evolutionary time to eliminate their haploinsufficiency remains unexplained. We find that the limit of haploinsufficient genes on organismal fitness cannot be overcome by an increase in expression because haploinsufficient genes also confer a fitness disadvantage when encoded in extra copy, leaving these genes evolutionarily “stuck.”

Keywords: haploinsufficiency, gene dosage, dosage sensitivity

Abstract

Haploinsufficiency describes the decrease in organismal fitness observed when a single copy of a gene is deleted in diploids. We investigated the origin of haploinsufficiency by creating a comprehensive dosage sensitivity data set for genes under their native promoters. We demonstrate that the expression of haploinsufficient genes is limited by the toxicity of their overexpression. We further show that the fitness penalty associated with excess gene copy number is not the only determinant of haploinsufficiency. Haploinsufficient genes represent a unique subset of genes sensitive to copy number increases, as they are also limiting for important cellular processes when present in one copy instead of two. The selective pressure to decrease gene expression due to the toxicity of overexpression, combined with the pressure to increase expression due to their fitness-limiting nature, has made haploinsufficient genes extremely sensitive to changes in gene expression. As a consequence, haploinsufficient genes are dosage stabilized, showing much more narrow ranges in cell-to-cell variability of expression compared with other genes in the genome. We propose a dosage-stabilizing hypothesis of haploinsufficiency to explain its persistence over evolutionary time.


For nearly a century, scientists and mathematicians have worked to formulate a theory to explain the origin of haploinsufficiency. Why do these genes exhibit an abnormal phenotype upon deletion of one of their two homologous copies, when the majority of genes do not? Early theories considered haploinsufficiency to be an artifact of diploidy, a rare failure of the wild-type allele to maintain protective dominance (1). This idea was ultimately disproven by the observation that equivalent rates of haploinsufficiency are present in organisms that primarily exist in the haploid state (2). Later theories invoked a more physiological explanation, whereby the specific function of the gene dictates its sensitivity to changes in dosage (3). For example, genes encoding enzymes are sparse among haploinsufficient (HI) genes, but genes encoding proteins that perform structural and regulatory functions in the cell are enriched among them (4). More recent studies suggest that the context of gene function is also important. In particular, genes whose products function as members of macromolecular complexes or cellular signaling networks may be especially vulnerable to changes in gene dosage (5).

High-throughput screens, metadata analyses, and computational predictions have been applied to define which genes are haploinsufficient. In budding yeast, about 3% of the genome is considered haploinsufficient under maximal growth conditions, resulting in substantial defects in cellular proliferation when heterozygously deleted (6). In humans, ∼300 genes are known to be haploinsufficient, contributing to a wide range of human health issues including neurodevelopmental disorders and tumorigenesis when heterozygously deleted (7) although computational predictions estimate this number to be much higher (8, 9). Importantly, haploinsufficiency of many genes is conserved from yeast to humans (10) indicating that strong selective forces exist that prevent the up-regulation of their expression.

Two theories have been put forth to explain the cause of haploinsufficiency: the dosage balance hypothesis and the insufficient amounts hypothesis. The dosage balance hypothesis (Fig. 1A) states that growth defects caused by changes to gene dosage (either over- or underexpression) are due to stoichiometric imbalances of protein complexes interfering with cellular functions (11, 12). This hypothesis predicts that haploinsufficient genes also confer a growth defect when present in excess by as little as one copy. In other words, haploinsufficiency and sensitivity to increased gene dosage are mutually defined. This hypothesis elegantly explains why haploinsufficiency has persisted over evolutionary time: up-regulation of the gene is not possible because too much protein, like too little protein, disrupts protein complex stoichiometries and thereby interferes with cellular function. The “insufficient amounts” hypothesis (Fig. 1B) postulates that haploinsufficiency is the physiological result of reduced levels of protein product being insufficient to perform its cellular function (6). This hypothesis, unlike the dosage balance hypothesis, neither makes predictions about the effects of overexpressing haploinsufficient genes nor explains why haploinsufficiency has persisted over evolutionary time.

Fig. 1.


Models of haploinsufficiency. Theoretical plots relating gene dosage to the fitness of strains for haploinsufficient genes. HI = haploinsufficiency, SIC = sensitivity to increased copy number. (A) The dosage balance hypothesis. Strains exhibiting changes in HI gene dosage show decreased fitness for both underexpression and overexpression, due to altered stoichiometry of protein complexes or cellular pathways. (B) The insufficient amounts hypothesis: HI genes cause decreased fitness of cells as gene dosage decreases. HI gene products are limiting for growth.

In this study, we set out to experimentally test the dosage balance and insufficient amounts hypotheses of haploinsufficiency, and conclude that neither adequately explains the persistence of haploinsufficiency. We find that while all haploinsufficient genes confer a growth disadvantage when subtly overexpressed, the reverse is not true. Many genes exist, including genes encoding known protein complex members, that impair proliferation when subtly overexpressed but not when heterozygously deleted, arguing against the dosage balance hypothesis as a general explanation for the persistence of haploinsufficiency. Instead, our analyses of the growth defects of strains heterozygously deleted for haploinsufficient genes indicate that HI genes are limiting for cellular growth and proliferation when present in one copy instead of two. Based on these observations, we propose an expansion of the current hypotheses for haploinsufficiency. Our “dosage-stabilizing” hypothesis stipulates that haploinsufficiency persists in organisms over evolutionary time because a balance must be struck between a gene product being limiting for a biological process and the toxicity of its overproduction.

Results

Haploinsufficient Genes Are Sensitive to Increased Copy Number.

The dosage balance hypothesis of haploinsufficiency predicts that HI genes are toxic when subtly overexpressed; that is, they should also be sensitive to increased copy number (SIC; Fig. 1A). The budding yeast, Saccharomyces cerevisiae, is an ideal system to explore this prediction because several tools exist to generate comprehensive dosage-altered libraries of genes that are haploinsufficient or are toxic when overexpressed. The heterozygous deletion collection (13)—where one copy of each of the ∼6000 yeast genes has been systematically deleted in diploid strains—allowed us to study haploinsufficiency at genome-wide resolution. To study sensitivity to increased copy number, we utilized the previously constructed MoBY-CEN plasmid library, which is composed of centromeric plasmids that express nearly all yeast genes (4981/5915 confirmed ORFs) from their endogenous promoters (14). For the purposes of this study we consider genes introduced via MoBY-CEN vectors as present in “single-extra copy,” though of course the copy number of CEN vectors can vary, depending on ploidy and the type of selection (15).

We generated a high confidence data set of haploinsufficient genes in yeast. Deutschbauer et al. (2005) used the heterozygous deletion collection to identify 184 genes that are haploinsufficient under maximal growth conditions—in YEP medium containing 2% glucose at 30 °C (6). Of these, we chose 100 highly haploinsufficient genes to pursue further (henceforth top_HI) based on the following criteria: (i) accuracy of the gene deletion in the heterozygous deletion collection, (ii) a confirmed growth defect (>5%) in heterozygous knockout strains, and (iii) presence of the gene in the MoBY-CEN library of plasmids. We note that the growth defect we measured was in excellent agreement with previously defined fitness values (SI Appendix, Fig. S1B), except for a small number of strains that harbor deletions of ribosomal subunits, which are known to cause genomic alterations (16). The complete list of top_HI genes is shown in SI Appendix, Table S2, with excluded genes in SI Appendix, Table S3. The majority of genes that exhibit severe haploinsufficiency encode ribosomal proteins, and proteins required for transcription and translation as well as proteostasis (Fig. 2A).

Fig. 2.


Toxicity of haploinsufficient gene copy number increase. (A) Functional categories that define highly haploinsufficient genes (top_HI). (B) Doubling time of strains containing dosage-altered top_HI genes, with measurements made at 30 °C in YPD. 2N-1 are diploids with heterozygous deletions of HI genes (control: wild-type diploid) and 1N+1 are haploids with a single extra copy of HI genes, contained on a CEN plasmid (control: wild-type haploid plus empty vector). (C) The relationship between 2N-1 and 1N+1 relative growth rates. The black dots identify genes that are disproportionately toxic when in excess, which are thought to be correlation outliers due to high variability as shown in D. (D) A plot comparing doubling time and variability in growth measurements for 1N+1 cells. Note that data points highlighted in black are the same as black data points in C. (E) Degradation of HI gene products when encoded in single extra copy (32). The fraction of excess protein degraded is derived from the relative protein expression in 1N+1 compared with WT cells (1 − log2 ratio), where a value of 0 represents no degradation and 1 represents complete degradation (two-tailed t test with Welch correction P < 0.0001). The percent of proteins in each category considered degraded (>0.4) is indicated below each bar. Central line = median. Plot whiskers 10–90th percentile. (F and G) Growth of strains containing top_HI genes on a CEN plasmid, treated with 100 μM MG132 (F) or 25 μM radicicol (G), alongside untreated controls (DMSO) in YPD at 30 °C. Green bars represent the doubling time of strains harboring an empty vector control. The number of strains that are more sensitive to drug treatment than expected is indicated below each condition.

Having created a high-confidence haploinsufficient gene set, we then compared growth rates of strains heterozygous for haploinsufficient genes with growth rates of strains harboring an extra copy of the top_HI genes. Remarkably, most top_HI genes interfered with proliferation when expressed on a CEN plasmid, with 85/100 strains exhibiting a statistically significant growth defect under maximal growth conditions (Fig. 2B and SI Appendix, Fig. S1A, Dunn’s multiple comparison test P < 0.05, Dataset S1). Of the 15 strains that did not meet statistical significance, nine strains displayed highly variable doubling times (identified as ‡ in SI Appendix, Fig. S1A). Six strains only showed a very slight increase in doubling time (identified as † in SI Appendix, Fig. S1A), despite evidence that the genes were expressed and that their coding sequences harbored no mutations. We propose that the genes present in excess in these six strains cause toxicity in situations other than maximal growth that yeast cells encounter as part of their natural life cycle. Growth defects were also observed in diploid cells expressing top_HI CEN plasmids, though the effect was smaller, indicating that increased ploidy buffers against phenotypes caused by copy number alteration (SI Appendix, Fig. S1C).

Comparison of the growth defect of strains heterozygously deleted for a haploinsufficient gene (henceforth 2N-1) with the growth defect of strains harboring an extra copy of the same gene (henceforth 1N+1) showed that the magnitude of the growth defect was proportional, with the phenotype of 2N-1 strains being generally more severe (Fig. 2C Spearman correlation P = 0.0008). This is in agreement with a previous study which found that out of ∼100 genes analyzed, the majority of genes whose fitness was affected by both overexpression and underexpression had more severe consequences for decreased expression than increased expression (17). We note that for several genes the growth of 2N-1 and 1N+1 strains was not particularly correlated (black data points in Fig. 2C). This is most likely due to significant variability in the doubling time measurements for these 1N+1 strains (black data points in Fig. 2D). Our data further suggest that this variability is a consequence of copy number variation between different strain isolates (SI Appendix, Fig. S1D), which disproportionately affected the doubling times of strains containing genes that are most sensitive to increased copy number.

Why Are Haploinsufficient Genes Toxic upon Dosage Increase?

Our results lead to the conclusion that, under maximal growth conditions, most genes that confer a significant growth defect in diploids when deleted in single copy also cause a growth disadvantage when in extra copy. Why are haploinsufficient genes toxic when overproduced? We can envision two nonmutually exclusive possibilities: (i) increased levels of the gene could interfere with a specific cellular function, or (ii) production and potential subsequent degradation of the excess gene product could be costly.

A well-known example for gene-specific toxicity (possibility i) in budding yeast is the β-tubulin encoding gene TUB2. Expression of a single extra copy of the gene leads to severe growth defects, as does deletion of the two α-tubulin encoding genes TUB1 and TUB3 in diploids (18, 19). Is there evidence that production and degradation of HI gene products is generally costly (possibility ii)? Looking at median expression, the RNAs and proteins produced from the 100 most haploinsufficient genes are 11-fold more abundant than those for the rest of the genome (SI Appendix, Fig. S2 A and B). This preponderance of highly expressed genes among top_HI genes is driven by genes encoding ribosomal proteins (SI Appendix, Fig. S2 A and B). While producing large amounts of excess protein is known to place a burden on the cell’s transcription and translation machinery (20), more recent studies suggest that it is the demand for protein degradation that most impacts cellular growth when additional copies of highly expressed genes are introduced into cells (21). We found that ∼65% of HI genes produce proteins known to be degraded by the proteasome when encoded in single copy excess, compared with 26% of non-HI genes (Fig. 2E, Student’s t test P < 0.0001). The enrichment for proteasomal targets is again driven by ribosomal proteins (Fig. 2E). This observation raises the possibility that degradation of excess HI proteins could be costly to cells. Consistent with this idea, we found that 98/100 haploid strains bearing an extra copy of top_HI genes exhibited increased sensitivity to the proteasome inhibitor MG132 (Fig. 2F). Similarly, 96/100 strains were more sensitive to the Hsp90 inhibitor radicicol (Fig. 2G). When we excluded ribosomal proteins, the majority of strains were still sensitive to these proteotoxicity-inducing agents (Fig. 2 F and G). These observations indicate that the toxicity of HI gene overexpression is in part due to excess proteins placing a burden on the cell’s protein homeostasis machinery.
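The degradation measure behind Fig. 2E can be sketched in a few lines of Python. This is only an illustration, assuming that protein abundances for a 1N+1 strain and its wild-type control are available as positive numbers. The function names and any input values are hypothetical; only the formula (1 − log2 ratio) and the 0.4 cutoff are taken from the figure legend.

```python
import math

# Cutoff above which a protein is counted as degraded (from the Fig. 2E legend).
DEGRADED_CUTOFF = 0.4


def fraction_degraded(abundance_extra_copy: float, abundance_wt: float) -> float:
    """Return 1 - log2(abundance in 1N+1 / abundance in WT).

    A value of 0 means the extra gene copy doubled the protein level
    (no degradation of the excess); a value of 1 means the protein level
    is unchanged (the excess was completely degraded).
    """
    if abundance_extra_copy <= 0 or abundance_wt <= 0:
        raise ValueError("abundances must be positive")
    return 1.0 - math.log2(abundance_extra_copy / abundance_wt)


def is_degraded(abundance_extra_copy: float, abundance_wt: float) -> bool:
    """Apply the 0.4 cutoff to decide whether the excess protein counts as degraded."""
    return fraction_degraded(abundance_extra_copy, abundance_wt) > DEGRADED_CUTOFF
```

Under this definition a protein whose level rises only 1.4-fold with a doubled gene copy number would be scored as degraded, whereas one that rises 1.9-fold would not.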

Most Dosage-Sensitive Genes Are Not Haploinsufficient.

Our observation that top_HI genes are toxic when subtly overexpressed supports the dosage balance hypothesis of haploinsufficiency. An additional prediction of this theory is that haploinsufficiency and sensitivity to increased copy number are mutually defined, at least for members of protein complexes. To test this prediction, we needed to create a genome-wide data set defining the genes that confer a fitness disadvantage when present in single extra copy. While previous studies had characterized the sensitivity of genes to high level overexpression (22, 23) or tested their copy number limit (21), none had defined the genes that confer a fitness defect when expressed in single extra copy under conditions of maximal growth. Again utilizing the MoBY-CEN collection of yeast plasmids, we generated two independent transformant pools of haploid strains, where each strain contained a single extra copy of a gene for all genes in the genome. We competed pools in liquid culture and monitored plasmid representation by sequencing plasmid-specific tags every 8 h for a period of 48 h (∼30 generations). In following each tag’s abundance over time, we were able to extract a linear slope for 2646 strains (from 4981 plasmids), of which 1588 passed criteria for reproducibility and were assigned a fitness score (SI Appendix, Fig. S3 and Table S1). We converted each slope into a relative fitness value where 1 represents a zero slope with neutral fitness, and values <1 or >1 have negative or positive slopes, respectively. The distribution of fitness values showed approximately equal numbers of strains increasing and decreasing in the population (Fig. 3A and Dataset S2). Genes detected in both transformant pools showed good correlation between relative fitness values (Fig. 3B, Pearson correlation P < 0.0001). A gene was considered to have a negative impact on fitness, and defined as SIC, when its relative fitness was <1 and at least 1 SD below the population average. Using this criterion, we defined 251 genes to be SIC (FDR = 0.072, SI Appendix, Fig. S3C). Conversely, there were 247 genes which fell 1 SD above the mean and had a relative fitness >1. We note that strains whose abundance increases in the population likely do not have a growth advantage, based on a comparison with known reference points (SI Appendix, Fig. S3D). Rather, these strains increase in abundance because of the relative decrease of other strains in the culture.
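The scoring logic described above (fit a slope to each tag's abundance over time, convert it to a relative fitness, then apply the 1 SD cutoff) can be sketched as follows. This is a minimal illustration and not the pipeline used in the study. The text does not give the exact slope-to-fitness conversion, so `relative_fitness` simply assumes fitness = 1 + slope, which satisfies the stated property that a zero slope maps to 1. All function names are hypothetical, and the abundance values passed in are assumed to be already normalized (for example, log-transformed tag fractions).

```python
import statistics


def linear_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    mean_t = statistics.fmean(times)
    mean_v = statistics.fmean(values)
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def relative_fitness(slope):
    # ASSUMPTION: the source only states that a zero slope corresponds to a
    # relative fitness of 1; the additive form used here is illustrative.
    return 1.0 + slope


def call_sic(fitness_by_gene):
    """Return genes with fitness < 1 that lie at least 1 SD below the mean."""
    values = list(fitness_by_gene.values())
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD; the source does not specify which
    return sorted(
        gene
        for gene, fitness in fitness_by_gene.items()
        if fitness < 1.0 and fitness <= mean - sd
    )
```

Because the cutoff is defined relative to the population mean, the same function applied with the inequality reversed would recover the strains that rise in abundance, which the text attributes to the relative decline of other strains rather than to a true growth advantage.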

Fig. 3.


Identification of genes that are sensitive to increased copy number (SIC). (A) The relative fitness distribution for 1588 strains across two independent pools (n = 6). Each strain contains a single extra copy of a gene on a centromeric vector. Strains were competed for ∼30 generations in YPD at 30 °C. Bins = 0.005. (B) Reproducibility between the two independent pools of transformants. (C) The percentage of genes encoding protein complex members that are haploinsufficient among top SIC genes—defined as those genes which show decreased fitness of greater than 1 SD in pooled competition. For comparison, the percentage of genes that are SIC among known haploinsufficient genes (from Fig. 2B) is shown on the Right.

Under identical growth conditions, Deutschbauer et al. (2005) found 3% of the yeast genome to be haploinsufficient, defined as heterozygous gene deletions that cause a decrease in fitness greater than 1 SD below the population mean (6). By the same definition, we found 251/1588 genes to confer a fitness defect when present in single extra copy, which when extrapolated to the entire yeast genome suggests that SIC genes may represent up to 15% of the genome. This observation indicates that genes are much more likely to be sensitive to increased copy number than gene copy loss, though more comprehensive studies in diploid yeast will be needed to make an absolute comparison. Strikingly, while 85% of haploinsufficient genes were SIC, only 10% of the genes identified as highly SIC were haploinsufficient (n = 26). When we restricted our analysis to members of protein complexes found in our competition data set, we observed a similar result: 15% of SIC genes (22/89) are HI, while 83% of HI genes (33/40) were identified as SIC by doubling time measurements. This is true even when we only considered genes whose products are known protein complex members (Fig. 3C). We conclude that while haploinsufficient genes are toxic when present in excess, the converse is not true—many genes which are found to be highly SIC are not haploinsufficient. This observation also leads to the conclusion that dosage imbalance among protein complex members alone cannot explain haploinsufficiency.
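The asymmetry between the two gene classes follows directly from the counts quoted above, and can be checked with simple arithmetic. The snippet below is only a worked check of those proportions; the variable names are illustrative and the counts are the ones reported in the text.

```python
def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# SIC genes among the strains that received a fitness score.
sic_in_screen = percent(251, 1588)   # about 15.8%

# SIC genes that are also haploinsufficient (n = 26).
hi_among_sic = percent(26, 251)      # about 10.4%

# top_HI genes with a significant growth defect in single extra copy.
sic_among_hi = percent(85, 100)      # 85%
```

Read side by side, the last two values show why the two properties are not mutually defined: most HI genes are SIC, but only about one in ten SIC genes is HI.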

Dosage Imbalance Does Not Fully Explain Haploinsufficiency.

The finding that haploinsufficiency and sensitivity to increased copy number are not mutually defined prompted us to test additional predictions of the dosage balance model.

The dosage imbalance hypothesis predicts that deletion of a HI gene in a haploid cell results in the same phenotype as deleting one copy in a diploid cell because both strains experience the same degree of stoichiometric imbalance. This is not what we observe. Deletion of the nonessential top_HI genes in haploid cells causes an increase in doubling time of 11–100% compared with wild-type cells (Fig. 4A and Dataset S1). Doubling time increased only 5–29% in heterozygously deleted diploid cells (Fig. 2B and Dataset S1). Importantly, the growth defect of the haploid strain lacking the HI gene was invariably more severe than the growth defect of the corresponding heterozygously deleted diploid strain (Fig. 4A).

Fig. 4.


Testing the dosage balance and insufficient amount hypotheses. (A) Doubling time of haploid strains containing deletions in nonessential top_HI genes, with measurements made at 30 °C in YPD. 2N-1 diploids are included for comparison (paired t test, P < 0.0001). Connecting lines show corresponding genes. (B) Doubling time of strains carrying heterozygous deletions for genes encoding members of the eIF2 complex, in combination and in isolation. Measurements were made at 30 °C in YPD. Significance compared with WT is shown above each bar; select other comparisons are made with brackets (ANOVA with multiple comparisons, ****P < 0.0001 Bonferroni correction). (C) Volume of diploid cycling cells with heterozygous deletions for all top_HI genes, and a subset of HI genes encoding ribosomal proteins (two-tailed t test with Welch correction **P = 0.0076). (D) A fivefold serial dilution of strains containing confirmed heterozygous deletions for CCT complex members, plated on benomyl (15 μg/mL) and YPD control. (E) Haploinsufficient profiles of the heterozygous deletion collection for latrunculin treatment (0.9 μM) or benomyl treatment (27 μM). Data are from Hoepfner et al. (2014) (33). Genome position of a heterozygous deletion is plotted against normalized drug sensitivity score, where a negative score represents impaired proliferation in the presence of the drug. Purple dots are subunits of the CCT chaperone complex.

Another prediction of the dosage balance hypothesis is that heterozygously deleting all subunits of a protein complex should alleviate the haploinsufficiency caused by deleting individual complex members. We tested this prediction for the eIF2 complex. eIF2 is an initiation factor for translation with three obligate subunits, two of which (SUI2 and SUI3) are known to be haploinsufficient. If stoichiometry were the cause of the genes’ haploinsufficiency, a strain heterozygously deleted for all three subunit genes should not exhibit a growth defect. This is not what we observed. The dosage-balanced strain for eIF2 exhibited a strong growth defect (13 min increase in doubling time), on par with that of strains harboring single eIF2 gene deletions (9–18 min) (Fig. 4B). While the triple deletion strain grew significantly slower than the wild-type strain, the phenotype was not as severe as that of strains with single deletions of the SUI2 and SUI3 genes. Based on the relative differences in doubling time, we estimate that protein stoichiometry imbalance can explain only ∼33% of the haploinsufficient growth defect for this protein complex.

Finally, our results show that subtle overexpression of top_HI genes causes increased sensitivity to proteotoxic agents such as radicicol (Fig. 2G). According to the dosage balance hypothesis, this proteotoxicity should arise from stoichiometric imbalance of functional complexes (24) and should thus also occur in strains heterozygously deleted for HI genes. This is not the case. Strains containing heterozygous deletions of HI genes are not, on average, more sensitive to radicicol, although a subtle increase in sensitivity to radicicol might have been missed in this bulk measurement (SI Appendix, Fig. S4A). Taken together, our data indicate that stoichiometric imbalance is not the sole cause of haploinsufficiency.

Haploinsufficient Genes Are Rate Limiting for Cellular Fitness.

The conclusion that the dosage imbalance hypothesis alone cannot explain haploinsufficiency prompted us to test elements of the insufficient amounts hypothesis. We hypothesized that HI genes are limiting for organismal function, such as cell growth or proliferation when heterozygously deleted. To test the growth-limiting nature of haploinsufficiency, we examined the importance of HI genes for key processes in cell growth and proliferation. Ribosomal proteins are particularly enriched among HI genes as a class (6), and it is well established that ribosomes are rate limiting for cell growth (25). Consistent with previous reports (26), we found that deletion of one copy of ribosomal genes and of other haploinsufficient genes required for translation caused a reduction in mass accumulation in diploid strains, as judged by a smaller average cell size (Fig. 4C and Dataset S1). Strains carrying heterozygous deletions in HI genes not involved in protein synthesis were the same size or larger than control strains (Fig. 4C and Dataset S1). Importantly, introducing an extra copy of these ribosomal genes did not lead to a decrease in cell size (SI Appendix, Fig. S4B), arguing that stoichiometric imbalances among ribosome subunits are not responsible for the small-cell size phenotype observed in diploid cells heterozygously deleted for these ribosomal protein genes. Instead, as small size is a characteristic of reduced protein synthesis (25), our data indicate that insufficient amounts of ribosomal proteins result in the growth defect of strains heterozygously deleted for their genes.

Is there evidence that other haploinsufficient genes are rate limiting in processes critical for cell proliferation? The top_HI genes include 5/8 subunits of the CCT chaperone complex known to fold actin and tubulin. Strains harboring heterozygous deletions in these genes are sensitive to the actin and microtubule assembly inhibitors latrunculin A and benomyl, respectively (Fig. 4 D and E), suggesting CCT subunits are indeed rate limiting for folding these cytoskeleton constituents. We note that strains with excess CCT subunits are not sensitive to benomyl (SI Appendix, Fig. S4C), again arguing against stoichiometric imbalances and for insufficient amounts of protein as the source of this benomyl sensitivity. We predict that the remaining top_HI genes are also limiting for processes important for growth or proliferation under maximal growth conditions.

Haploinsufficient Genes Have a Narrow Expression Range.

Taken together, our results lead to the conclusion that two properties determine whether a gene is haploinsufficient in a specific environment or growth condition: (i) the gene product is rate limiting for maximal organismal fitness and at the same time (ii) the gene confers a fitness disadvantage when in excess. We propose that the fitness penalty when in excess prevents up-regulation of the gene to counteract haploinsufficiency, and the rate-limiting nature of the gene causes the observed fitness defect in heterozygotes. This model of dual, counteracting selective pressures over evolutionary time makes a very strong prediction: the expression range of HI genes, in particular its variation between cells in a population, should be narrow relative to that of other genes in the genome. Using a previously published set of single-cell gene expression data (27), we observe that the cell-to-cell variability in gene expression is significantly more narrow among HI genes compared with other genes in the genome (Fig. 5A). While low variation can be driven by high expression (the example in Fig. 5A shows the cell-to-cell variability in the expression of 100 highly expressed, nonhaploinsufficient genes), we still observed narrow ranges of expression among more lowly expressed HI genes (Fig. 5A and SI Appendix, Fig. S4D). We conclude that haploinsufficient genes are narrowly expressed irrespective of their expression level. This conclusion is consistent with recent data showing that variability in gene expression across a population of cells is decreased for genes where small changes in expression had a large impact on fitness (17).
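The variability measure used in this comparison, the squared coefficient of variation (CV²), is the variance of expression across cells divided by the square of the mean. A minimal sketch is given below, assuming per-cell expression values (for example, fluorescence intensities) are available as a list of numbers. The function name is hypothetical, and the use of the population variance is an assumption, since the estimator used in the cited data set (27) is not described here.

```python
import statistics


def cv_squared(expression_values):
    """Squared coefficient of variation: variance / mean**2.

    ASSUMPTION: the population variance is used; a sample variance
    would differ only slightly for large numbers of cells.
    """
    if len(expression_values) < 2:
        raise ValueError("need at least two cells")
    mean = statistics.fmean(expression_values)
    if mean == 0:
        raise ValueError("mean expression must be nonzero")
    return statistics.pvariance(expression_values) / mean ** 2
```

Because the variance is scaled by the squared mean, CV² allows genes with different expression levels to be compared, although highly expressed genes still tend to show lower values, which is why the comparison with highly expressed non-HI genes is informative.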

Fig. 5.


The gene expression range of haploinsufficient genes is narrow. (A) Cell-to-cell variability (CV²) in gene expression for HI genes compared with genes that are highly expressed but are not haploinsufficient (high expression not HI), genes that are haploinsufficient but not highly expressed (HI not high expression) and all non-HI genes in the genome (genome) based on FACS fluorescent measurements of promoter-YFP fusions for 1000 genes (27) (****P < 0.0001, **P = 0.0058, *P = 0.0197). Central line = median. Plot whiskers span 10–90th percentile. (B) The dosage-stabilizing hypothesis: HI genes cause a decrease in fitness when underexpressed and overexpressed. This narrowed fitness distribution is driven by the evolutionary pressure to both increase and decrease expression.

Discussion

While changes to gene dosage can lead to imbalances in protein complex stoichiometry that adversely affect cellular fitness (11, 12), several lines of evidence indicate that haploinsufficiency cannot be explained by fitness decrease due to stoichiometric imbalance alone. First, haploinsufficiency and overexpression toxicity are not mutually defined, even among genes encoding protein complex subunits. Second, deletion of haploinsufficient genes in a haploid strain is more detrimental than in a diploid strain even though the number of uncomplexed protein subunits is the same in both cell types. Third, overexpression of individual members of a complex leads to a different phenotype than their underexpression, as is showcased by ribosomal proteins and CCT subunits. Finally, at least in the case of eIF2, heterozygous deletion of all complex member genes does not alleviate the growth defect caused by deletion of individual subunit genes.

Our data support a hybrid model for haploinsufficiency, which we call the dosage-stabilizing hypothesis (Fig. 5B). It builds upon and incorporates core principles from both the insufficient amounts hypothesis and the dosage balance hypothesis. The dosage-stabilizing hypothesis posits that HI gene products are limiting for fitness when underexpressed (insufficient amount hypothesis), and toxic when overexpressed, most likely due to adverse effects on protein homeostasis and imbalances in protein complex stoichiometry (dosage balance hypothesis). In other words, haploinsufficient genes are evolutionarily “stuck,” unable to increase or decrease expression over time to accommodate fluctuations in gene dosage because a fitness penalty is associated with both downregulating and upregulating HI gene expression. Thus, haploinsufficient genes are “living on the edge,” representing a unique class of genes that must carefully balance their expression, ensuring that the cost of overproduction does not outweigh the potential benefit of maximizing growth.

It is worth noting that haploinsufficiency is extremely context dependent. Deutschbauer et al. (2005) showed that there is little overlap between genes that are haploinsufficient under maximal growth conditions and those that are limiting when cells are grown in minimal medium (6). Applying more stringent pressures on cell growth through glucose-, ammonium-, or phosphate-limited continuous culture revealed a set of haploinsufficient genes that is highly conserved across nutrient-limiting conditions, yet distinct from those defined under maximal growth conditions (28). Interestingly, under these severe growth restriction conditions, the frequency of haploinsufficient genes appears to increase (12–20%) (28) and could be as high as 76%, at least among essential yeast genes, based on single-cell morphological phenotyping (29). Together, these studies indicate that, for most if not all genes, conditions exist under which they are haploinsufficient. We speculate that this could account for why all genes are maintained in two copies over evolutionary time, even in organisms that propagate by a predominantly nonsexual lifestyle.

Given the persistence of haploinsufficient genes, how has cellular physiology been driven by, and perhaps adapted to, their presence? Given the enrichment of transcription and translation factors among HI genes, which are by nature growth limiting, our data lead us to wonder whether haploinsufficient genes might play a role in setting the division rate of cells. In other words, the expression levels of haploinsufficient genes may be partially responsible for budding yeast cells having a doubling time of 90 min in YEPD medium at 30 °C. Changing the expression of any individual gene will have little effect on the length of the cell division cycle, but increasing the expression of haploinsufficient genes coordinately will, we predict, produce yeast cells with shorter doubling times. Additionally, if increasing gene expression of individual genes is not a solution to the problem of haploinsufficiency, do organisms have another way to escape it evolutionarily? Previous work (6, 30) has suggested that duplicated genes, where two copies of the gene now split the overall expression level, may be better buffered against gene expression fluctuation and loss, providing organisms a way out of haploinsufficiency. Future studies should seek to address these questions.

Methods

Strains harboring heterozygous deletions of haploinsufficient genes were obtained from the BY4743 yeast knockout collection; haploid deletion strains are BY4741 (MATa) (13). Centromeric (CEN) plasmids were obtained from the Molecular barcoded ORF collection of yeast genes (MoBY) (14). Doubling times were measured in 96-well format in rich YEP medium + 2% glucose at 30 °C, taking OD600nm measurements every 15 min for 24 h. MoBY-CEN transformant pools (∼30× genome coverage) were competed in 0.5 L YPD and collected every 8 h (∼5 generations) for 48 h. The relative contribution of strains was captured by amplification of plasmid-specific tag sequences as previously described (31), using primers with indices for multiplex sequencing on the Illumina NextSeq platform. For DNA copy number, DNA was isolated from individual transformants in stationary phase by zymolyase digestion and phenol extraction, followed by digestion with RNase A. qPCR of DNA was carried out on a Roche LightCycler 480 II using TaKaRa mix for real-time PCR: SYBR Premix Ex Taq. Cell volume was determined using a Multisizer 3 Coulter Counter, counting 10⁵ particles per strain at a threshold diameter of 2 μm. Detailed experimental procedures are provided in SI Appendix.
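
As an illustration of how a doubling time can be derived from OD600 readings taken every 15 min, the following minimal Python sketch fits a straight line to log2(OD) against time within an exponential-phase window. The fitting procedure, the function name `doubling_time_minutes`, and the window bounds `od_min` and `od_max` are assumptions made for this example and are not taken from the procedures described here; the analysis actually used is given in SI Appendix.

```python
import math


def doubling_time_minutes(od_readings, interval_min=15.0, od_min=0.05, od_max=0.5):
    """Estimate doubling time (min) from evenly spaced OD600 readings.

    Assumption for this sketch: growth is exponential while OD lies in
    [od_min, od_max]; both bounds are illustrative, not values from the paper.
    """
    # Keep only readings inside the assumed exponential window, as (time, log2 OD).
    points = [
        (i * interval_min, math.log2(od))
        for i, od in enumerate(od_readings)
        if od_min <= od <= od_max
    ]
    if len(points) < 3:
        raise ValueError("need at least 3 readings in the exponential window")

    # Ordinary least-squares slope of log2(OD) versus time.
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx  # doublings per minute

    if slope <= 0:
        raise ValueError("no net growth detected in the selected window")
    return 1.0 / slope  # minutes per doubling


if __name__ == "__main__":
    # Synthetic culture that doubles every 90 min, read every 15 min.
    synthetic = [0.02 * 2 ** (i * 15 / 90) for i in range(40)]
    print(f"{doubling_time_minutes(synthetic):.1f} min")
```

Restricting the fit to a window of OD values is one common way to exclude lag phase and saturation; in practice the bounds would need to be chosen for the plate reader and medium in use.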

Supplementary Material

Supplementary File
pnas.1900437116.sd01.xlsx (64.4KB, xlsx)

Acknowledgments

We thank Andrew Murray, Gene-Wei Li, and members of the A.A. laboratory for suggestions and critical reading of this manuscript. This work was supported by NIH grants CA206157 and GM118066 to A.A., who is an investigator of the Howard Hughes Medical Institute and the Glenn Foundation for Medical Research. On behalf of S.A.M., this material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant 1122374. Sequencing work at the MIT BioMicro Center was supported in part by the Koch Institute Support (core) Grant P30-CA14051 from the National Cancer Institute.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1900437116/-/DCSupplemental.

References

1. Fisher R. A., The possible modification of the response of the wild type to recurrent mutations. Am. Nat. 62, 115–126 (1928).
2. Orr H. A., A test of Fisher’s theory of dominance. Proc. Natl. Acad. Sci. U.S.A. 88, 11413–11415 (1991).
3. Wright S., Physiological and evolutionary theories of dominance. Am. Nat. 68, 24–53 (1934).
4. Kondrashov F. A., Koonin E. V., A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. Trends Genet. 20, 287–290 (2004).
5. Veitia R. A., Potier M. C., Gene dosage imbalances: Action, reaction, and models. Trends Biochem. Sci. 40, 309–317 (2015).
6. Deutschbauer A. M., et al., Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169, 1915–1925 (2005).
7. Dang V. T., Kassahn K. S., Marcos A. E., Ragan M. A., Identification of human haploinsufficient genes and their genomic proximity to segmental duplications. Eur. J. Hum. Genet. 16, 1350–1357 (2008).
8. Huang N., Lee I., Marcotte E. M., Hurles M. E., Characterising and predicting haploinsufficiency in the human genome. PLoS Genet. 6, e1001154 (2010).
9. Steinberg J., Honti F., Meader S., Webber C., Haploinsufficiency predictions without study bias. Nucleic Acids Res. 43, e101 (2015).
10. de Clare M., Pir P., Oliver S. G., Haploinsufficiency and the sex chromosomes from yeasts to humans. BMC Biol. 9, 15 (2011).
11. Papp B., Pál C., Hurst L. D., Dosage sensitivity and the evolution of gene families in yeast. Nature 424, 194–197 (2003).
12. Veitia R. A., Exploring the etiology of haploinsufficiency. BioEssays 24, 175–184 (2002).
13. Giaever G., et al., Functional profiling of the Saccharomyces cerevisiae genome. Nature 418, 387–391 (2002).
14. Ho C. H., et al., A molecular barcoded yeast ORF library enables mode-of-action analysis of bioactive compounds. Nat. Biotechnol. 27, 369–377 (2009).
15. Karim A. S., Curran K. A., Alper H. S., Characterization of plasmid burden and copy number in Saccharomyces cerevisiae for optimization of metabolic engineering applications. FEMS Yeast Res. 13, 107–116 (2013).
16. Hughes T. R., et al., Widespread aneuploidy revealed by DNA microarray expression profiling. Nat. Genet. 25, 333–337 (2000).
17. Keren L., et al., Massively parallel interrogation of the effects of gene expression levels on fitness. Cell 166, 1282–1294.e18 (2016).
18. Burke D., Gasdaska P., Hartwell L., Dominant effects of tubulin overexpression in Saccharomyces cerevisiae. Mol. Cell. Biol. 9, 1049–1059 (1989).
19. Katz W., Weinstein B., Solomon F., Regulation of tubulin levels and microtubule assembly in Saccharomyces cerevisiae: Consequences of altered tubulin gene copy number. Mol. Cell. Biol. 10, 5286–5294 (1990).
20. Wagner A., Energy constraints on the evolution of gene expression. Mol. Biol. Evol. 22, 1365–1374 (2005).
21. Makanae K., Kintaka R., Makino T., Kitano H., Moriya H., Identification of dosage-sensitive genes in Saccharomyces cerevisiae using the genetic tug-of-war method. Genome Res. 23, 300–311 (2013).
22. Gelperin D. M., et al., Biochemical and genetic analysis of the yeast proteome with a movable ORF collection. Genes Dev. 19, 2816–2826 (2005).
23. Sopko R., et al., Mapping pathways and phenotypes by systematic gene overexpression. Mol. Cell 21, 319–330 (2006).
24. Veitia R. A., Birchler J. A., Models of buffering of dosage imbalances in protein complexes. Biol. Direct 10, 42 (2015).
25. Jorgensen P., Tyers M., How cells coordinate growth and division. Curr. Biol. 14, R1014–R1027 (2004).
26. Jorgensen P., Nishikawa J. L., Breitkreutz B.-J., Tyers M., Systematic identification of pathways that couple cell growth and division in yeast. Science 297, 395–400 (2002).
27. Keren L., et al., Noise in gene expression is coupled to growth rate. Genome Res. 25, 1893–1902 (2015).
28. Delneri D., et al., Identification and characterization of high-flux-control genes of yeast through competition analyses in continuous cultures. Nat. Genet. 40, 113–117 (2008).
29. Ohnuki S., Ohya Y., High-dimensional single-cell phenotyping reveals extensive haploinsufficiency. PLoS Biol. 16, e2005130 (2018).
30. Guan Y., Dunham M. J., Troyanskaya O. G., Functional analysis of gene duplications in Saccharomyces cerevisiae. Genetics 175, 933–943 (2007).
31. Payen C., et al., High-throughput identification of adaptive mutations in experimentally evolved yeast populations. PLoS Genet. 12, e1006339 (2016).
32. Dephoure N., et al., Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast. eLife 3, e03023 (2014).
33. Hoepfner D., et al., High-resolution chemical dissection of a model eukaryote reveals targets, pathways and gene functions. Microbiol. Res. 169, 107–120 (2014).

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
