Author manuscript; available in PMC: 2017 May 25.
Published in final edited form as: Cell Syst. 2016 May 12;2(5):347–354. doi: 10.1016/j.cels.2016.03.009

The genomic landscape of position effects on protein expression level and noise in yeast

Xiaoshu Chen 1, Jianzhi Zhang 1,*
PMCID: PMC4882239  NIHMSID: NIHMS774665  PMID: 27185547

SUMMARY

Position effect, the influence of the chromosomal location of a gene on its activity, is a fundamental property of the genome. By placing a green fluorescent protein gene cassette at 482 different locations across all chromosomes in budding yeast, we quantified the position effects on protein expression level and noise at the genomic scale. The position effects are significant, altering the mean protein expression level by up to 15 times and expression noise by up to 20 times. DNA replication timing, three-dimensional chromosomal conformation, and several histone modifications are major covariates of position effects. Essential genes are enriched in genomic regions with inherently low expression noise, supporting the hypothesis that chromosomal clustering of essential genes results from selection against their expressional stochasticity. Position effects exhibit significant interactions with promoters. Together, our results suggest that position effects have shaped the evolution of chromosome organization and should inform future genome engineering efforts.


INTRODUCTION

Position effect, a term coined over ninety years ago (Sturtevant, 1925), refers to the influence of the chromosomal location of a gene on its activity. Position effect is most famously known in a mutant fruit fly (Drosophila melanogaster) whose eyes exhibit a mottled appearance of white and red sectors, in contrast to solid red in the wild-type, due to the translocation of the X-linked white gene that controls the production of normal red eye pigment to a region near heterochromatin. Position effect is implicated in a number of genetic diseases (Milot et al., 1996; Kleinjan and van Heyningen, 1998), impacts the success of genetic engineering (Wilson et al., 1990; Milot et al., 1996), and has been hypothesized to underlie eukaryotic genome organization (Hurst et al., 2004; Gierman et al., 2007; Michalak, 2008). Although heterochromatin-induced position effect such as that of white has been extensively investigated (Henikoff, 1990; Weiler and Wakimoto, 1995; Ottaviani et al., 2008; Elgin and Reuter, 2013), the landscape of position effect across entire chromosomes as well as the underlying mechanisms are poorly understood.

Two groups recently studied variations in the mRNA concentration of a transgene integrated at multiple genomic positions in yeast and mouse cells, respectively, and reported relatively small position effects in yeast (Chen et al., 2013) but large effects in mouse (Akhtar et al., 2013). However, because variation in mRNA concentration appears to diminish at the protein level (Khan et al., 2013; Artieri and Fraser, 2014; McManus et al., 2014), it is important to examine the functionally more relevant position effect on protein concentration. Gierman et al. were the first to address this question systematically, but their study investigated relatively few positions and cannot be easily scaled up (Gierman et al., 2007). Very recently, Dey et al. also addressed this question by examining the position effects on mRNA and protein concentrations at 227 genomic positions in a human cell line, but because they did not determine the genomic positions where their reporter was placed, they could not study the mechanisms underlying the observed position effects (Dey et al., 2015).

Another aspect of position effect that is of significant interest is its influence on gene expression noise, which refers to the variation in mRNA or protein concentration among isogenic individuals under the same environment (Kaern et al., 2005). It was discovered in yeast that essential genes, which cause lethality when deleted, tend to be chromosomally clustered (Pal and Hurst, 2003). To explain the origin of this nonrandom genome organization, Batada and Hurst hypothesized that genomic regions with inherently low expression noise are sinks of essential genes (Batada and Hurst, 2007), because of the fitness advantage of reducing the expression noise of essential genes (Batada and Hurst, 2007; Lehner, 2008; Wang and Zhang, 2011). They explored this idea using the expression noise of yeast endogenous genes (Newman et al., 2006). However, in Newman et al.’s study, the expression of each endogenous gene was driven by its endogenous promoter. This prevented Batada and Hurst from isolating the individual contributions of chromosomal position and promoter to a gene’s noise level. Although Dey et al. also measured the position effect on mRNA and protein expression noise (Dey et al., 2015), the lack of information on the genomic positions examined renders their dataset unsuitable for testing Batada and Hurst’s hypothesis or studying the mechanisms underlying the position effect on expression noise.

In this study, we built a library of strains that allowed us to measure the position effects on protein expression level and noise at a large number of genomic loci in the budding yeast Saccharomyces cerevisiae. The position effects are significant, altering the mean protein expression level by up to 15 times and expression noise by up to 20 times. We explored contributors to the position effects, including DNA replication timing, chromatin state, and chromosome conformation, and used our data to test Batada and Hurst’s hypothesis. Our analysis demonstrates that essential genes do cluster in regions of low noise and supports the hypothesis that position effect shapes the evolution of genome organization.

RESULTS

Experimental design

Protein expression level and noise can be influenced by many factors such as the genomic position, promoter, and sequence of the gene. To specifically study the position effect, we placed the same promoter driving the expression of the same protein-coding gene at different genomic positions in a panel of yeast strains. To construct the panel, we took advantage of the yeast heterozygous gene deletion collection. Each strain in the collection has one allele of an endogenous gene replaced with the kanMX module consisting of kanR and its promoter (Winzeler et al., 1999), allowing the same reporter gene cassette to be placed at many different genomic positions via homologous recombination using the same set of primers. Specifically, we systematically replaced in the heterozygous deletion strains the kanMX module with a GFP cassette that comprises a green fluorescent protein (GFP) gene with a promoter and the marker gene URA3 (Fig. S1a–c; Table S1). The loss of one allele of an endogenous gene in each of the constructed strains (referred to as GFP strains) presumably has minimal impacts on the position effects concerned (see below), because the vast majority of yeast genes are haplosufficient (Deutschbauer et al., 2005). The GFP strains were grown to mid-log phase in Yeast Extract-Peptone-Dextrose (YPD) medium, and GFP fluorescence intensities of ~10,000 isogenic cells were measured by fluorescence-activated cell sorting (FACS) in a 96-well plate format. After controlling for cell size and shape, the GFP intensity for each cell in each well was divided by the mean GFP intensity of the corresponding well in control plates containing the reference strain, in which the RPL5 promoter-driven GFP (pRPL5-GFP) was placed at the endogenous RPL5 locus, to eliminate systematic bias and to standardize the measurements.
The average GFP intensity (μ), standard deviation (σ), and coefficient of variation (CV, = σ/μ) were then calculated for each GFP strain and averaged among at least five biological replicates. We presented both σ and CV as measures of expression noise, because they have different scaling relations with μ and can reveal different causes of expression noise (Kaern et al., 2005).
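The summary statistics described above can be illustrated with a minimal sketch. It assumes a single reference mean per strain (the study normalizes well by well against control plates) and uses the population standard deviation; the function names and the toy intensities are hypothetical and are not taken from the study.

```python
from statistics import fmean, pstdev

def summarize_replicate(intensities, reference_mean):
    """Return (mu, sigma, cv) for one replicate population of cells."""
    # Normalize each cell's GFP intensity by the reference strain's mean.
    normalized = [x / reference_mean for x in intensities]
    mu = fmean(normalized)
    sigma = pstdev(normalized)  # population SD; this choice is an assumption
    return mu, sigma, sigma / mu

def summarize_strain(replicates, reference_mean):
    """Average mu, sigma, and CV over biological replicates."""
    stats = [summarize_replicate(r, reference_mean) for r in replicates]
    mus, sigmas, cvs = zip(*stats)
    return fmean(mus), fmean(sigmas), fmean(cvs)

# Toy data: two replicates of three cells each (real runs have ~10,000 cells).
replicates = [[8.0, 10.0, 12.0], [9.0, 10.0, 11.0]]
mu, sigma, cv = summarize_strain(replicates, reference_mean=10.0)
```

Because CV divides σ by μ within each replicate before averaging, the averaged CV is not generally equal to the averaged σ divided by the averaged μ; they coincide in this toy case only because both replicates have μ = 1.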

Promoter effects, position effects, and promoter-position interaction

To examine if position effect is promoter-specific, we first chose four different promoters to construct four different GFP cassettes and placed each of them at the same six randomly picked loci (Fig. 1a). These promoters vary in a number of properties, including the strength and noise when driving their respective endogenous genes at native genomic loci and the presence/absence of a TATA box (Fig. S1d). We found that, regardless of where the GFP cassette is placed, the GFP strains with the strongest promoter always have the highest expression mean and standard deviation (Fig. 1b, c) and lowest coefficient of variation (Fig. 1d). This demonstrates that the promoter effect is stronger than the position effect on mean expression (μ, P < 1×10^−99, analysis of variance), standard deviation (σ, P < 1×10^−99), and coefficient of variation (CV, P < 3×10^−37). Nevertheless, position effect is significant on expression mean (P < 2×10^−7) and standard deviation (P < 9×10^−8), although promoter-position interaction also exists for expression mean (P < 2×10^−7) and standard deviation (P < 3×10^−8). Furthermore, the mean between-promoter correlations in expression mean, standard deviation, and coefficient of variation across positions are all significantly greater than expected by chance (P ≤ 0.05, randomization test; Fig. 1e–g). This analysis demonstrates that even using just one promoter can provide useful information about position effects on expression level and noise.
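The randomization test on between-promoter correlations can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it uses Pearson's correlation (the text does not name the coefficient), shuffles locus labels independently for each promoter, and counts randomized values that equal or exceed the observed one. All function names and the toy table are hypothetical.

```python
import random
from itertools import combinations
from statistics import fmean

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_pairwise_correlation(table):
    """table holds one list of per-locus values for each promoter."""
    return fmean(pearson(a, b) for a, b in combinations(table, 2))

def relabel_test(table, n_random=1000, seed=0):
    """Return the observed mean correlation and a one-sided P-value."""
    rng = random.Random(seed)
    observed = mean_pairwise_correlation(table)
    exceed = 0
    for _ in range(n_random):
        # Randomly relabel the loci separately within each promoter.
        shuffled = [rng.sample(row, len(row)) for row in table]
        if mean_pairwise_correlation(shuffled) >= observed:
            exceed += 1
    return observed, exceed / n_random

# Toy data: four promoters (rows) measured at the same six loci (columns).
table = [
    [1.0, 1.2, 0.9, 1.5, 1.1, 0.8],
    [0.5, 0.62, 0.44, 0.8, 0.52, 0.41],
    [2.0, 2.3, 1.9, 3.1, 2.2, 1.5],
    [0.2, 0.26, 0.17, 0.31, 0.23, 0.15],
]
observed, p_value = relabel_test(table)
```

With only six loci there are 720 possible orderings per promoter, so sampling 1000 random relabelings, as in Fig. 1e–g, is a practical alternative to full enumeration across all promoters.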

Figure 1. Position effects on GFP level and noise examined using four different promoters at the same set of six genomic loci.


See also Fig. S1 and Table S1.

(a) Schematic drawing of strain construction and protein expression quantification. Diploid yeast cells are shown. On each pair of homologous chromosomes, the green rectangle represents the GFP cassette, the grey rectangle represents the endogenous gene, and the black dots indicate the centromeres. The GFP level of each cell is measured by FACS, and the mean (μ) and standard deviation (σ) among isogenic cells are computed.

(b)–(d) Position effects on μ (b), σ (c), and coefficient of variation (CV) (d). Error bars show one standard error, estimated from at least five replicate populations of cells. All values are ratios to the corresponding values of the reference strain in which the pRPL5-GFP cassette is placed at the native RPL5 locus.

(e)–(g) Frequency distributions of the mean between-promoter correlation in μ (e), σ (f), and CV (g) in 1000 sets of data generated by randomly relabeling the six loci. An arrow indicates the corresponding value observed in the original data. P-value is the probability that the mean correlation from a randomized dataset exceeds the observed value.

Position effects across chromosome 1

We chose to use pRPL5 in our main experiment for two reasons. First, it is a strong promoter, ranked at the top 2.5 percentile among >2000 yeast promoters examined (Newman et al., 2006); estimation of GFP concentration is more accurate when its concentration is higher. Second, based on the expression noise of the endogenous RPL5, pRPL5 is close to the median noise level in yeast, ranked at the top 61 percentile (Newman et al., 2006). This suggests that pRPL5 may provide a more representative picture of position effect on expression noise than when a particularly noisy or quiet promoter is used. We constructed 63 strains by respectively placing the pRPL5-GFP cassette at 63 loci on chromosome 1 (Fig. 2a). Sixteen of them exhibited mean expression values significantly different from the average across the 63 strains (Q < 0.05, two-tailed t test followed by FDR correction of multiple testing; Fig. 2b). Similarly, we found 18 and 13 strains that significantly differ from the average standard deviation and coefficient of variation of the 63 strains, respectively (Fig. 2c, d). These findings reveal the presence of significant position effects on protein expression level and noise in yeast chromosome 1.
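The multiple-testing step can be illustrated with a short sketch. The text states only that an FDR correction followed the t tests; the Benjamini-Hochberg procedure used here is an assumption, and the P-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted Q-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that Q-values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical per-strain P-values from two-tailed t tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
```

In this toy case two nominally significant P-values (0.039 and 0.041) no longer pass the Q < 0.05 threshold after correction.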

Figure 2. Position effects on GFP level and noise across 63 positions along chromosome 1.


(a) Schematic drawing of the strains used for examining position effects. In each strain, the pRPL5-GFP cassette replaces one allele of an endogenous gene.

(b)–(d) The μ (b), σ (c), and CV (d) of GFP levels in the 63 GFP strains. Error bars show one standard error, estimated from at least five replicate populations of cells. Red dots show significant deviations from the average of the 63 dots, which is indicated by the horizontal dashed line. The centromere and telomere regions are shaded grey.

(e) Schematic drawing of the HO-replaced strains used for examining the impact of heterozygous gene deletion on the study of position effects. The pRPL5-GFP cassette replaces one allele of the HO pseudogene in each of the 63 heterozygous gene deletion strains.

(f)–(h) The μ (f), σ (g), and CV (h) of the GFP levels in the 63 HO-replaced strains. Error bars show one standard error, estimated from at least five replicate populations of cells. Red dots show significant deviations from the average of the 63 dots, which is indicated by the horizontal dashed line. The centromere and telomere regions are shaded grey. In (b)–(d) and (f)–(h), all values are ratios to the corresponding values of the reference strain in which the pRPL5-GFP cassette is placed at the native RPL5 locus.

(i) Comparison between the μ in mRNA levels and protein levels of GFP across the 63 GFP strains in (a). Values for the X- and Y-axis are ratios to the average mRNA and protein levels of the 63 strains, respectively. Error bars show one standard error estimated from biological replicates. Spearman’s correlation coefficient and P-value are shown.

Nevertheless, the above interpretation is based on the assumption that the loss of one allele of different endogenous genes in the 63 strains had either no effect or the same effect on the quantities measured. To verify this assumption, we replaced one allele of the HO pseudogene with the pRPL5-GFP cassette in the same 63 heterozygous gene deletion strains (Fig. 2e). Indeed, no strain showed significantly different mean or standard deviation of GFP expression from their respective averages across the 63 HO-replaced strains (Fig. 2f, g), validating our assumption. Four strains exhibited significantly higher coefficient of variation in GFP level than the average coefficient of variation of the 63 strains (Fig. 2h). But the coefficient of variation for each of the four corresponding GFP strains is not significantly higher than the average (Fig. 2d), suggesting that the loss of one allele at each of these loci did not lead to overestimation of position effect on expression noise. Furthermore, there is no significant correlation in mean, standard deviation, or coefficient of variation between the 63 GFP strains and corresponding HO-replaced strains (P > 0.26). Hence, the loss of one allele of an endogenous gene in each GFP strain has effectively no impact on our position effect analysis.

As mentioned, position effect on protein expression could be smaller than that on mRNA expression. To test this hypothesis, we used quantitative reverse transcription polymerase chain reaction (qRT-PCR) to measure the GFP mRNA concentration in each of the 63 GFP strains. As expected (Ghaemmaghami et al., 2003), the mRNA and protein concentrations across the 63 strains are moderately correlated (ρ = 0.34, P < 5×10^−3; Fig. 2i). The coefficient of variation among replicates is significantly smaller for GFP protein concentrations than for mRNA concentrations (P < 7×10^−5, Mann-Whitney U test), indicating more precise measurements of the former than the latter. The coefficient of variation across the 63 strains is significantly smaller for mean GFP protein concentration than mRNA concentration (P < 3×10^−17, two-tailed F-test). This difference is not fully explainable by the disparity in measurement error between protein and mRNA levels, because the same pattern is observed (P < 2×10^−9) even when pseudo protein levels are generated based on the among-replicate coefficient of variation of mRNA levels. Thus, the position effect on protein concentration is indeed smaller than that on mRNA concentration, potentially due to some compensatory mechanisms at translational or posttranslational levels (Khan et al., 2013; Artieri and Fraser, 2014; McManus et al., 2014). Because fluorescence-based protein level measurement is more precise than qRT-PCR-based mRNA level measurement and because position effect on protein concentration is different from and biologically more relevant than that on mRNA concentration, it is valuable to examine protein-level position effects.

Position effects across the genome

In addition to the 63 loci on chromosome 1, we placed the pRPL5-GFP cassette at selected loci on the other 15 chromosomes, creating a total of 482 GFP strains that are suitable for analysis. Biological replicates of GFP measurements were highly correlated (Fig. S2a–c). The coefficient of variation of mean expression level across all examined loci is 0.11. The largest mean expression level is 15 times the smallest (Data S1). There are 109 loci exhibiting significantly different mean expression levels when compared with the average of all 482 loci (Q < 0.05, two-tailed t test followed by FDR correction for multiple testing; Fig. 3a). A comparison of the average variance of mean expression level among replicates of each strain (0.01) with the variance among all mean expression level measurements of all strains (0.11) indicates that position effect accounts for 90% of the observed variation in mean expression level measurements (P < 1×10^−99, ANOVA). This observation stands in contrast to a previous yeast study where position effect accounts for only 35% of the observed variation (Chen et al., 2013), most likely because qRT-PCR-based mRNA level measurement is less precise than fluorescence-based protein level measurement (Fig. 2i).
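The 90% figure can be checked with simple arithmetic. The sketch below assumes that the fraction attributed to position is one minus the ratio of the within-strain variance to the total variance; this is a simplified reading of the ANOVA comparison, and the function name is hypothetical.

```python
def fraction_explained_by_position(within_strain_var, total_var):
    """Share of total variance not attributable to replicate-to-replicate noise."""
    return 1.0 - within_strain_var / total_var

# Rounded variances quoted in the text: 0.01 within strains, 0.11 overall.
fraction = fraction_explained_by_position(0.01, 0.11)
```

With the rounded inputs the result is about 0.91, in line with the roughly 90% reported.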

Figure 3. Position effects on GFP expression level and noise across 482 loci on all chromosomes, and covariates of the position effects.


See also Fig. S2 and Data S1.

(a)–(c) The μ (a), σ (b) and CV (c) of GFP levels in the 482 strains ordered by the quantity concerned. Red dots show significant deviations from the average of the 482 strains.

(d) Correlations among mRNA expressions driven by native promoters (native mRNA), protein expressions driven by native promoters (native protein), and GFP expressions driven by pRPL5 (GFP) at the same set of loci. Spearman’s correlation coefficients and P-values are shown. Solid lines indicate statistical significance (P < 0.05) whereas dashed lines indicate insignificance.

(e) Mean relative μ, σ, and CV of GFP expression when the GFP cassette is placed at centromeres (yellow), telomeres (blue), or other chromosomal regions (red). Error bars indicate one standard error. Twenty-three and 11 examined loci are in centromeres and telomeres, respectively. P-values are from one-tailed t tests.

(f) Correlation between replication timing and mean GFP level.

(g) Correlation between replication timing and GFP expression noise.

(h) Correlation between H3K36me3 intensity and mean GFP level. Histone modification intensity is averaged for the 250-bp window starting at 100 bp downstream of the start codon of the endogenous gene. In (f)–(h), circles show medians, and error bars extend to the 10th and 90th percentiles of the data. Horizontal dashed line indicates the median for the middle bin. Spearman’s correlation coefficients and P-values are shown. In (a)–(c) and (f)–(h), all values are ratios to the corresponding values of the reference strain in which the pRPL5-GFP cassette is placed at the native RPL5 locus.

The coefficient of variation of expression noise σ across all examined loci is 0.27, and the highest σ is 13 times the lowest (Data S1). There are 176 loci showing significantly different σ values from the average of all 482 loci (Fig. 3b). In total, 88% of variation in σ measurements is explained by position effect (P < 1×10^−99). The coefficient of variation of expression noise CV across all examined loci is 0.62, and the largest CV is 20 times the smallest (Data S1). There are 43 loci showing significantly different CV values from the average (Fig. 3c). In total, 86% of observed variation in CV measurements is explainable by position effect (P < 1×10^−99).

Newman et al. measured endogenous protein expression level and noise in a collection of yeast strains where each gene is expressed as a carboxy-terminal GFP fusion protein from its endogenous promoter and native chromosomal position (Newman et al., 2006). As expected, these GFP levels are positively correlated with the mRNA concentrations of the tagged genes (ρ = 0.59, P < 1×10^−99; Fig. 3d). However, no significant correlation exists between the GFP level (μ) expressed from our GFP cassette placed at a locus and either the mRNA or protein concentration of the endogenous gene at the locus (Fig. 3d). The same is true for the correlation in standard deviation (σ) or coefficient of variation (CV) between Newman et al.’s and our GFP levels (Fig. 3d). These results confirm that the variations in expression level and noise among yeast endogenous proteins (i.e., Newman et al.’s data) are uninformative in the study of position effect such as what was hypothesized by Batada and Hurst. They also demonstrate that the GFP expression in our experiment is not influenced by the endogenous promoter at the locus where the GFP cassette is placed.

Factors associated with position effects

We examined a number of factors potentially impacting the position effects on expression level and noise. We observed reduced expression mean (P = 0.03, one-tailed t test; Fig. 3e) and standard deviation (P < 0.01) in centromeres when compared with the rest of the genome. The mean GFP transcript level, however, is not significantly different between positions near the centromere (< 1 kb) and the rest of positions examined on chromosome 1 (P = 0.18), as was reported previously (Chen et al., 2013). Both mean and standard deviation of the GFP level are moderately lower in telomeres than the rest of the genome, but the difference is not significant, likely due to the small number of telomeric positions examined (Fig. 3e). Because genomic regions subject to early DNA replication in a cell cycle have higher copy numbers per cell than late-replication regions, we predict that mean expression is higher in early than late replication regions. This is indeed the case (ρ = −0.11, P = 0.01; Fig. 3f). Having more gene copies should reduce expression noise (Wang and Zhang, 2011), a prediction supported by lower coefficient of variation in early than late replication regions (ρ = 0.13, P < 0.01; Fig. 3g). A previous study showed that a transgene inherits the chromatin landscape of its genomic position and reported that the mRNA level of the transgene is negatively impacted by H3K36me3 modification in the promoter (Chen et al., 2013). This impact of chromatin state on position effect is also evident for mean GFP concentration (ρ = −0.24, P < 6×10^−6; Fig. 3h; Fig. S2d). Several other histone modifications such as H3K4me1 also correlate significantly with mean GFP expression and noise (nominal P < 0.05) (Fig. S2d–f).

Essential genes are enriched in genomic regions with low expression noise

As mentioned in the Introduction, to explain the origin of the chromosomal clustering of essential genes in yeast, Batada and Hurst (BH) hypothesized that genomic regions with inherently low expression noise are characterized by low nucleosome occupancies and are sinks of essential genes (Batada and Hurst, 2007). We used our GFP noise data to test the BH hypothesis directly.

To this end, we first estimated, for each GFP strain, the distance of its noise (CV) to the median noise (CV) of the strains with comparable mean expression (μ) (Data S1). This distance, named DM, is a measure of mean-controlled noise suitable for among-locus comparison (Newman et al., 2006; Batada and Hurst, 2007). Following Batada and Hurst, we divided the yeast genome into overlapping windows of nine consecutive genes with step size equal to one. We examined the variance in the number of essential genes per window. If essential gene clustering is common, this variance is expected to be high, with some windows having many essential genes and others lacking any. Indeed, the observed variance exceeds that in each of 1,000 randomly shuffled genomes (P < 1×10^−3, randomization test; Fig. 4a). Similar results were obtained for other window sizes examined (3, 5, 7, …, and 41), confirming significant clustering of essential genes (Fig. 4a).
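The window-based clustering test can be sketched as below. This is a simplified illustration: it treats the genome as a single linear list of genes, ignoring chromosome boundaries, and shuffles essentiality labels uniformly, whereas the exact shuffling scheme used in the study is described in its supplementary procedures. The function names and the toy genome are hypothetical.

```python
import random
from statistics import pvariance

def window_counts(flags, size=9):
    """Count essential genes in every overlapping window (step size of one)."""
    return [sum(flags[i:i + size]) for i in range(len(flags) - size + 1)]

def clustering_test(flags, size=9, n_random=1000, seed=0):
    """Return the observed among-window variance and a randomization P-value."""
    rng = random.Random(seed)
    observed = pvariance(window_counts(flags, size))
    higher = 0
    for _ in range(n_random):
        # Shuffle which gene positions carry the "essential" label.
        shuffled = rng.sample(flags, len(flags))
        if pvariance(window_counts(shuffled, size)) >= observed:
            higher += 1
    return observed, higher / n_random

# Toy genome: 20 essential genes (1) in one block among 100 genes.
flags = [1] * 20 + [0] * 80
observed_var, p_value = clustering_test(flags)
```

A returned P-value of 0 from 1,000 shuffles corresponds to reporting P < 1×10^−3, since no shuffled genome matched the observed variance.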

Figure 4. Essential genes are clustered at genomic regions with inherently low expression noise.


See also Figs. S3–S4 and Table S2.

(a) Chromosomal clustering of essential genes, inherent protein expression noise (DM), nucleosome occupancy, and three histone modifications. Presented is the fraction of times when a randomized genome exhibits a higher among-window variance in the mean within-window value of the property concerned than that observed in the actual genome. The genome is randomized by shuffling the positions where the relevant data are available. Window sizes examined are 3, 5, 7, …, and 41 genes. Invisible dots all have a Y-axis value of 0. The dashed horizontal line shows the fraction of 0.05.

(b) Essential genes tend to be located at positions with inherently low noise, when compared with nonessential genes. P-value is from Mann-Whitney U test. Dots show medians and error bars extend to the 10th and 90th percentiles of the data.

(c) Correlation between essential gene density (number of essential genes in a nine-gene window) and the mean inherent expression noise (DM) of the window. Error bars indicate one standard error.

(d) Rank correlations between mean histone modification intensity in a window and mean DM of the window for three histone modifications.

(e) Rank correlations between mean histone modification intensity in a window and essential gene density of the window for three histone modifications.

(f) Reduced dissimilarities in essential gene density, three histone modifications, and expression noise (DM) between two inter-chromosomal windows when their central genes interact in the 3D chromosome architecture, compared to those when their central genes do not interact. P-values are based on two-tailed t test. Error bars indicate standard error. For all relevant panels, histone modification intensity is averaged for the 250-bp window starting 250 bp upstream of the start codon of the endogenous gene.

In support of the BH hypothesis, positions occupied by essential genes have inherently lower noise (DM) than those occupied by nonessential genes (P < 4×10^−3, Mann-Whitney U test; Fig. 4b). Because of this trend and the chromosomal clustering of positions with similar DM (Fig. 4a), essential genes are expected to be clustered at low-DM regions. We defined window DM as mean available DM values of the nine positions in a window, and indeed observed a significant negative correlation between essential gene density (number of essential genes per window) and window DM (ρ = −0.93, P < 7×10^−3; Fig. 4c). This result is robust to different window sizes considered (Table S2).
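Because DM was measured at only a subset of gene positions, window DM averages whatever values are available in each window. A minimal sketch of that calculation follows; representing unmeasured positions as None and returning None for windows without any measurement are assumptions made for illustration, and the DM values are invented.

```python
from statistics import fmean

def window_dm(dm_values, size=9):
    """Mean of the available DM values in each overlapping window."""
    means = []
    for i in range(len(dm_values) - size + 1):
        available = [v for v in dm_values[i:i + size] if v is not None]
        # A window with no measured position has no defined window DM.
        means.append(fmean(available) if available else None)
    return means

# Toy chromosome of ten gene positions; None marks positions without a GFP strain.
dm = [0.2, None, -0.1, None, None, 0.3, None, -0.4, None, 0.1]
means = window_dm(dm, size=9)
```

Windows with a defined DM could then be grouped by their number of essential genes to examine the relationship shown in Fig. 4c.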

Nonetheless, contrary to the BH hypothesis, nucleosome occupancy is not positively correlated with DM in either of two chromatin datasets (Lee et al., 2004; Pokholok et al., 2005) analyzed (ρ = −0.01, P = 0.84, Fig. S3a; ρ = −0.10, P = 0.03, Fig. S3b), despite significant chromosomal clustering of nucleosome occupancy (Fig. 4a) and a negative correlation between the mean nucleosome occupancy of a window and essential gene density in at least one of the datasets (Fig. S3c, d). Instead, we found window DM to be significantly positively associated with H3K4me1 and negatively associated with H3K4me3 and H3K79me3 modifications of the window (Fig. 4d; Fig. S4a–d). The finding on H3K4me3 echoes the recent report in human and mouse that H3K4me3 breadth negatively impacts expression noise quantified by single-cell mRNA sequencing (Benayoun et al., 2014). Concordantly, essential gene density is negatively associated with H3K4me1 and positively associated with H3K4me3 and H3K79me3 modifications (Fig. 4e; Fig. S4e). These observations suggest the scenario that the presence and absence of specific histone modifications create a chromatin state of low noise, which attracts essential genes.

Three-dimensional chromosomal conformation and position effects

Given that epigenetic markers reflect and possibly outline physical domains, which are fundamental units of chromosome three-dimensional (3D) folding (Sexton et al., 2012), we hypothesize that loci physically interacting in the 3D chromosome architecture tend to have similar position effects, analogous to adjacency on a linear chromosome (Fig. 4a). We compared inter-chromosomal pairs of nine-gene windows whose central loci interact in 3D and those whose central loci do not interact in 3D, based on a yeast 3D chromosome conformation model (Duan et al., 2010). We found that interacting windows have significantly lower dissimilarity than noninteracting windows in essential gene density (P < 1×10^−99; Fig. 4f), H3K4me1, H3K4me3, and H3K79me3 modifications (P < 1×10^−99; Fig. 4f), and window DM (P < 8×10^−3; Fig. 4f), revealing previously unrecognized associations that 3D chromosome conformation has with position effect and genome organization.

DISCUSSION

The present study describes the genomic landscape of position effects on protein expression level and noise in yeast. We found that relocating a gene from one genomic position to another can alter the mean protein expression level by up to 15 times and expression noise by up to 20 times. This observation suggests that, in genetic engineering, where the high expression of a foreign gene in a host cell is often critical, choosing the right genomic position to place the foreign gene should be an important consideration. We identified several factors such as DNA replication timing and histone modifications that are associated with position effects. Position effects on expression noise are similar not only between loci adjacent on the linear chromosome but also those adjacent in the 3D chromosome conformation. We found evidence for the role of noise reduction in the formation of essential gene clusters and identified multiple histone modifications as potential determinants of the noise level of chromatin domains, supporting the hypothesis that position effect shapes the evolution of genome organization.

We also observed that the magnitude of position effect on expression level we detected in yeast is much smaller than that in a previous report in mouse cells (Akhtar et al., 2013). This discrepancy may be due to both study design and biological differences between the two systems. There are two important experimental design differences between our study and the earlier work in mouse. First, we measured protein concentration while the previous study measured mRNA concentration. Our observations demonstrate that position effect on protein expression is significantly smaller than that on mRNA expression for the same positions examined. Second, the transgene is inserted at existing gene loci in our study, whereas in the previous study it is inserted at both gene loci and intergenic regions. It is probable that many intergenic regions, especially gene deserts, have strong negative position effects on gene expression. It is notable that yeast and mammals have key differences in genome organization, with the latter containing substantially larger fractions of intergenic regions than the former. Thus, when intergenic regions are considered, the landscape of position effect is expected to differ between yeast and mammals. These differences imply potentially large deleterious consequences of human mutations that translocate a gene from its native genomic position to another (Milot et al., 1996; Kleinjan and van Heyningen, 1998). It is important to note that our conclusions are largely based on gene expressions driven by one promoter. As shown in Fig. 1, although position effect independent of promoter effect exists, promoter-position interactions abound. In the future, it will be valuable to expand our study to multiple promoters to uncover promoter-dependent and promoter-independent patterns and mechanisms of position effects.

EXPERIMENTAL PROCEDURES

Strain construction

Strains were constructed on the genetic background of S. cerevisiae strain BY4743. Details of the experimental procedure are provided in Supplementary Experimental Procedures. All primer sequences are supplied in Table S1.

Fluorescence-activated cell sorting (FACS)

The GFP protein concentration of each cell was estimated using FACS following a recent study (Duveau et al., 2014), with experimental details provided in Supplementary Experimental Procedures.
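
As an illustration only, the following sketch shows one way per-cell fluorescence readings could be summarized into an expression level and an expression noise value for each strain, and how a fold range across genomic positions could then be computed. It assumes that noise is summarized as the coefficient of variation (standard deviation divided by mean), which is a common convention but is not stated in this section; the actual definitions used are in Supplementary Experimental Procedures. The function names, locus labels, and numbers are hypothetical.

```python
from statistics import mean, pstdev


def summarize_strain(fluorescence):
    """Return (mean level, noise) for one strain's per-cell fluorescence values.

    Noise is taken here as the coefficient of variation (SD / mean).
    This is an assumption made for illustration, not the paper's stated definition.
    """
    level = mean(fluorescence)
    noise = pstdev(fluorescence) / level
    return level, noise


def fold_range(values):
    """Ratio of the largest to the smallest value across positions."""
    return max(values) / min(values)


# Hypothetical per-cell measurements for the same cassette at two loci.
strains = {
    "locus_A": [100.0, 110.0, 90.0, 105.0, 95.0],
    "locus_B": [400.0, 600.0, 200.0, 500.0, 300.0],
}

summary = {name: summarize_strain(cells) for name, cells in strains.items()}
level_fold = fold_range([level for level, _ in summary.values()])
noise_fold = fold_range([noise for _, noise in summary.values()])

print(f"fold range in mean level: {level_fold:.2f}")
print(f"fold range in noise (CV): {noise_fold:.2f}")
```

With real data, each list would hold the fluorescence values of the cells measured for one strain, and the fold ranges would be taken across all positions examined.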

Quantitative RT-PCR

Quantitative RT-PCR was used to measure the GFP transcript concentration, with detailed experimental procedures provided in Supplementary Experimental Procedures.

Replication timing, histone modification, and nucleosome occupancy data

We analyzed DNA replication timing data (Koren et al., 2010), high-resolution genome-wide ChIP-chip data for histone acetylation and methylation (Pokholok et al., 2005), and ChIP-chip data of Myc-tagged histone H4 (Lee et al., 2004). Detailed analysis protocols are provided in Supplementary Experimental Procedures.
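
To make the idea of a covariate analysis concrete, the sketch below computes a Spearman rank correlation between a genomic covariate at each insertion site (for example, replication timing) and a measured position effect (for example, expression noise). The choice of Spearman's correlation is an assumption made for illustration; this section does not specify the statistic, and the actual protocols are in Supplementary Experimental Procedures. All values are invented and do not reflect the direction or strength of any reported association.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values at five insertion sites (not data from this study).
replication_time = [10.0, 20.0, 30.0, 40.0, 50.0]
expression_noise = [0.10, 0.12, 0.11, 0.20, 0.30]

rho = spearman(replication_time, expression_noise)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based statistic is used in this sketch because it does not assume a linear relationship between the covariate and the position effect.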

Loci interacting in 3D chromosome architecture

We used the haploid yeast 3D chromosome architecture inferred by chromosome conformation capture-on-chip (4C) coupled with massively parallel sequencing (Duan et al., 2010). Analysis protocols are provided in Supplementary Experimental Procedures.

Statistical analyses

The statistical analyses conducted in this study are detailed in Supplementary Experimental Procedures.

Supplementary Material

1
2

Highlights.

  • Protein expression level and noise vary by the genomic location of the gene.

  • Position effects correlate with replication timing and chromosomal conformation.

  • Essential genes are enriched in quiet genomic regions with certain epigenetic markers.

  • Position effects on expression level and noise vary with the promoter used.

Acknowledgments

We thank Wenfeng Qian and Jian-Rong Yang for stimulating discussions and valuable comments. This work was supported in part by U.S. National Institutes of Health grant R01GM103232 to J.Z.

Footnotes

AUTHOR CONTRIBUTIONS

J.Z. conceived the project, secured funding, and supervised the project. X.C. performed experiments and analyzed data. X.C. and J.Z. designed experiments and wrote the manuscript.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Akhtar W, de Jong J, Pindyurin AV, Pagie L, Meuleman W, de Ridder J, Berns A, Wessels LF, van Lohuizen M, van Steensel B. Chromatin position effects assayed by thousands of reporters integrated in parallel. Cell. 2013;154:914–927. doi: 10.1016/j.cell.2013.07.018.
  2. Artieri CG, Fraser HB. Evolution at two levels of gene expression in yeast. Genome Res. 2014;24:411–421. doi: 10.1101/gr.165522.113.
  3. Batada NN, Hurst LD. Evolution of chromosome organization driven by selection for reduced gene expression noise. Nat Genet. 2007;39:945–949. doi: 10.1038/ng2071.
  4. Benayoun BA, Pollina EA, Ucar D, Mahmoudi S, Karra K, Wong ED, Devarajan K, Daugherty AC, Kundaje AB, Mancini E, et al. H3K4me3 breadth is linked to cell identity and transcriptional consistency. Cell. 2014;158:673–688. doi: 10.1016/j.cell.2014.06.027.
  5. Chen M, Licon K, Otsuka R, Pillus L, Ideker T. Decoupling epigenetic and genetic effects through systematic analysis of gene position. Cell Rep. 2013;3:128–137. doi: 10.1016/j.celrep.2012.12.003.
  6. Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 2005;169:1915–1925. doi: 10.1534/genetics.104.036871.
  7. Dey SS, Foley JE, Limsirichai P, Schaffer DV, Arkin AP. Orthogonal control of expression mean and variance by epigenetic features at different genomic loci. Mol Syst Biol. 2015;11:806. doi: 10.15252/msb.20145704.
  8. Duan Z, Andronescu M, Schutz K, McIlwain S, Kim YJ, Lee C, Shendure J, Fields S, Blau CA, Noble WS. A three-dimensional model of the yeast genome. Nature. 2010;465:363–367. doi: 10.1038/nature08973.
  9. Duveau F, Metzger BP, Gruber JD, Mack K, Sood N, Brooks TE, Wittkopp PJ. Mapping small effect mutations in Saccharomyces cerevisiae: impacts of experimental design and mutational properties. G3 (Bethesda). 2014;4:1205–1216. doi: 10.1534/g3.114.011783.
  10. Elgin SC, Reuter G. Position-effect variegation, heterochromatin formation, and gene silencing in Drosophila. Cold Spring Harb Perspect Biol. 2013;5:a017780. doi: 10.1101/cshperspect.a017780.
  11. Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, O’Shea EK, Weissman JS. Global analysis of protein expression in yeast. Nature. 2003;425:737–741. doi: 10.1038/nature02046.
  12. Gierman HJ, Indemans MH, Koster J, Goetze S, Seppen J, Geerts D, van Driel R, Versteeg R. Domain-wide regulation of gene expression in the human genome. Genome Res. 2007;17:1286–1295. doi: 10.1101/gr.6276007.
  13. Henikoff S. Position-effect variegation after 60 years. Trends Genet. 1990;6:422–426. doi: 10.1016/0168-9525(90)90304-o.
  14. Hurst LD, Pal C, Lercher MJ. The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet. 2004;5:299–310. doi: 10.1038/nrg1319.
  15. Kaern M, Elston TC, Blake WJ, Collins JJ. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet. 2005;6:451–464. doi: 10.1038/nrg1615.
  16. Khan Z, Ford MJ, Cusanovich DA, Mitrano A, Pritchard JK, Gilad Y. Primate transcript and protein expression levels evolve under compensatory selection pressures. Science. 2013;342:1100–1104. doi: 10.1126/science.1242379.
  17. Kleinjan DJ, van Heyningen V. Position effect in human genetic disease. Hum Mol Genet. 1998;7:1611–1618. doi: 10.1093/hmg/7.10.1611.
  18. Koren A, Soifer I, Barkai N. MRC1-dependent scaling of the budding yeast DNA replication timing program. Genome Res. 2010;20:781–790. doi: 10.1101/gr.102764.109.
  19. Lee CK, Shibata Y, Rao B, Strahl BD, Lieb JD. Evidence for nucleosome depletion at active regulatory regions genome-wide. Nat Genet. 2004;36:900–905. doi: 10.1038/ng1400.
  20. Lehner B. Selection to minimise noise in living systems and its implications for the evolution of gene expression. Mol Syst Biol. 2008;4:170. doi: 10.1038/msb.2008.11.
  21. McManus CJ, May GE, Spealman P, Shteyman A. Ribosome profiling reveals post-transcriptional buffering of divergent gene expression in yeast. Genome Res. 2014;24:422–430. doi: 10.1101/gr.164996.113.
  22. Michalak P. Coexpression, coregulation, and cofunctionality of neighboring genes in eukaryotic genomes. Genomics. 2008;91:243–248. doi: 10.1016/j.ygeno.2007.11.002.
  23. Milot E, Fraser P, Grosveld F. Position effects and genetic disease. Trends Genet. 1996;12:123–126. doi: 10.1016/0168-9525(96)30019-x.
  24. Newman JR, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, Weissman JS. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441:840–846. doi: 10.1038/nature04785.
  25. Ottaviani A, Gilson E, Magdinier F. Telomeric position effect: from the yeast paradigm to human pathologies? Biochimie. 2008;90:93–107. doi: 10.1016/j.biochi.2007.07.022.
  26. Pal C, Hurst LD. Evidence for co-evolution of gene order and recombination rate. Nat Genet. 2003;33:392–395. doi: 10.1038/ng1111.
  27. Pokholok DK, Harbison CT, Levine S, Cole M, Hannett NM, Lee TI, Bell GW, Walker K, Rolfe PA, Herbolsheimer E, et al. Genome-wide map of nucleosome acetylation and methylation in yeast. Cell. 2005;122:517–527. doi: 10.1016/j.cell.2005.06.026.
  28. Sexton T, Yaffe E, Kenigsberg E, Bantignies F, Leblanc B, Hoichman M, Parrinello H, Tanay A, Cavalli G. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell. 2012;148:458–472. doi: 10.1016/j.cell.2012.01.010.
  29. Sturtevant AH. The effects of unequal crossing over at the Bar locus in Drosophila. Genetics. 1925;10:117–147. doi: 10.1093/genetics/10.2.117.
  30. Wang Z, Zhang J. Impact of gene expression noise on organismal fitness and the efficacy of natural selection. Proc Natl Acad Sci U S A. 2011;108:E67–76. doi: 10.1073/pnas.1100059108.
  31. Weiler KS, Wakimoto BT. Heterochromatin and gene expression in Drosophila. Annu Rev Genet. 1995;29:577–605. doi: 10.1146/annurev.ge.29.120195.003045.
  32. Wilson C, Bellen HJ, Gehring WJ. Position effects on eukaryotic gene expression. Annu Rev Cell Biol. 1990;6:679–714. doi: 10.1146/annurev.cb.06.110190.003335.
  33. Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H, et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999;285:901–906. doi: 10.1126/science.285.5429.901.
