Widespread epistasis shapes RNA Polymerase II active site function and evolution

Bingbing Duan; Chenxi Qiu; Sing-Hoi Sze; Craig Kaplan

doi:10.1101/2023.02.27.530048

This is a preprint.

It has not yet been peer reviewed by a journal.


[Preprint]. 2023 Feb 28:2023.02.27.530048. [Version 1] doi: 10.1101/2023.02.27.530048


PMCID: PMC10002619 PMID: 36909581

Abstract

Structurally conserved multi-subunit RNA Polymerases (msRNAPs) are responsible for cellular genome transcription in all kingdoms of life. At the heart of these RNA polymerases is an ultra-conserved active site domain, the trigger loop (TL). The TL participates in substrate selection, catalysis, and translocation during transcription elongation by switching between an open, catalytic-disfavoring state and a closed, catalytic-favoring state in the active site and is therefore central to the msRNAP nucleotide addition cycle. Previous studies have observed diverse genetic interactions between eukaryotic RNA polymerase II (Pol II) TL residues, supporting the idea that the TL’s function is shaped by functional interactions of residues within the TL and between the TL and its proximal domains. The nature and conservation of these intra-TL and inter-TL residue interaction networks, as well as how they control msRNAP function, remain to be determined. To identify the residue interactions that shape TL function and evolution, we have dissected the Pol II intra-TL and TL-Pol II residue interactions by deep mutational scanning in Saccharomyces cerevisiae Pol II. Through analysis of over 15000 alleles, representing single mutants, a subset of double mutants, and evolutionarily observed TL haplotypes, we identify interaction networks within the TL and between the TL and other Pol II domains. Substituting residues creates allele-specific networks and propagates epistatic effects across the Pol II active site. Our studies provide a powerful system to understand the plasticity of RNA polymerase mechanism and evolution.

INTRODUCTION

Transcription from cellular genomes is carried out by multi-subunit RNA polymerases (msRNAPs)[1]. All msRNAPs share conserved structure and function[2, 3]. Bacteria and Archaea use a single type of msRNAP to transcribe all genomic RNAs [4–6], while Eukaryotes have at least three msRNAPs (Pol I, II, and III), each of which is specialized to a gene subset and has a varying subunit number [7–10]. RNA synthesis by msRNAPs is an iterative process involving nucleotide selection, catalysis and polymerase translocation (nucleotide addition cycle, NAC)[11–15]. A conserved active site accomplishes the NAC using strikingly conserved, conformationally flexible domains termed the bridge helix (BH) and the trigger loop (TL)[5, 12, 16–19].

The BH was an initial point of focus for function in the NAC as it was detected as a straight helix in most bacterial and all archaeal and eukaryotic msRNAP structures [7, 8, 12, 13, 20], but in a kinked conformation in Thermus thermophilus (Bacteria) RNAP structures[5, 21]. Potential flexibility has been supported by molecular dynamics simulations and has been proposed to promote msRNAP translocation[16, 21–23]. The BH conformational changes have been proposed to be influenced by or coupled to its adjacent domain, the TL[16]. The TL is especially important as nearly all catalytic cycle events are associated with its flexible and mobile nature, as has been observed in structural and biophysical/biochemical studies[12–14, 19, 24–27]. Among the observed conformations, two major states, a catalytic disfavoring “open” and a catalytic favoring “closed” conformation, ensure proper catalysis and translocation[12, 13, 28]. During each NAC, the TL nucleotide interaction region (NIR) facilitates the discrimination of correct NTPs over dNTPs or non-matched NTPs. Matched substrate binding induces or captures a TL conformational change from the open to the closed state[11, 24, 29, 30]. The closure of the TL promotes phosphodiester bond formation [24, 31]. Pyrophosphate release accompanies TL opening. This transition is proposed to support polymerase translocation to the next position downstream on the template DNA, allowing for the subsequent NAC[32–34] (Fig. 1A). Moreover, msRNAP pausing and backtracking have been associated with additional TL conformations[35–38]. TL functions derive from its conformational dynamics, which are likely balanced by residue interactions within the TL and its proximal domains[39–43].

Figure 1. — A. The Pol II active site is embedded in the center of a 12-subunit complex. Pol II functions are supported by distinct TL conformational states. Open TL (PDB: 5C4X)[28] and closed TL (PDB: 2E2H)[12] conformations are shown in the upper right panel. GOF mutations have been identified in the TL and its proximal domains (lower right panel), suggesting TL mobility and function may be impacted by adjacent residues. B. Examples of inter-residue genetic interactions. WT residues are shown in grey circles with numbers indicating residue position in Rpb1. Mutant substitutions are shown in colored circles, with color representing mutant class. Colored lines between mutant substitutions represent types of genetic interactions. C. Overview of experimental approach. We synthesized 10 libraries of TL variants represented by colored stars. Libraries were co-transformed with gapped *RPB1* WT or mutated *rpb1* plasmid into WT or mutated yeast strains, allowing for the formation of a full-length *RPB1* gene with mutations by homologous recombination. Leu⁺ and 5FOA-resistant transformants (selecting for plasmids that complement essential functions of *RPB1*) were scraped and re-plated onto different media for phenotyping. DNA was extracted from yeast from all conditions, and the TL region was amplified and subjected to Illumina sequencing. Read counts for variants on SC-Leu+5FOA were used to determine growth fitness. Double mutant relationships were computed using growth fitness. Mutant conditional growth fitnesses were calculated using allele frequencies under selective growth conditions and subjected to two logistic regression models for classification/prediction of catalytic defects. Classification allowed epistatic interactions to be deduced from double mutant growth fitness.

Mutations identified in the TL affected every step of transcription, emphasizing the significance of TL residues in maintaining TL dynamics and function[13, 17, 27, 44–48]. Additionally, the TL is embedded in the conserved active site by other domains such as the BH, alpha-21, 46 and 47 helices, Rpb2 and Rpb9 subunits[12, 28]. Between the TL and its proximal domains, residue interactions have been observed. For example, these include the five-helix bundle composed of the BH, two TL helices, α46 and α47 packed by hydrophobic residues at the bundle core[28, 48] and the interaction between Rpb2 760–772, BH 824–827 and TL residues (all numbering relative to S. cerevisiae Pol II subunits), which have been speculated to form a network to stabilize the backtracked Pol II[12, 36, 37, 49] (Fig. 1A). Understanding how “connected” the TL is to the rest of the polymerase will reveal the networks that integrate its dynamics with the rest of the enzyme and pathways for how msRNAP activity might be controlled.

Intramolecular interactions in the active sites of msRNAPs control catalytic activity and underpin transcriptional fidelity. Catalytic activity and transcription fidelity can be altered by active site mutations within the TL and domains close to the TL in many msRNAPs[17, 21, 22, 27, 44–46]. These mutant phenotypes suggest that TL conformational dynamics and function are finely balanced and could be sensitive to allosteric effects from proximal domains[50–53]. For instance, mutations in the TL NIR impair interactions between the TL and substrates, resulting in hypoactive catalysis and reduced elongation rate in vitro (Loss of function, LOF)[17, 27, 39, 45, 46, 54]. Mutations in the TL hinge region and C-terminal portion appear to disrupt interactions stabilizing the inactive state of the TL (open state) and shift the TL towards the active state (closed state), leading to hyperactive catalysis and increased elongation rate but impaired transcription fidelity (Gain of function, GOF)[17, 28, 37]. GOF and LOF mutations have also been found in TL-proximal domains, including the BH GOF T834P and LOF T834A, funnel helix α21 GOF S713P, and Rpb2 GOF Y769F[17, 51, 52] (Fig. 1A). The fact that these GOF or LOF mutations at TL-proximal domains share similar phenotypes with TL GOF or LOF mutations suggests TL-proximal domains likely participate in transcription by interacting with the TL. However, whether they alter TL function beyond putatively altering the balance of conformational states observed in the WT enzyme and what TL residues they communicate through to ensure proper transcription remain unclear.

Physical intramolecular interactions between amino acids, the fundamental unit of proteins, define protein function and evolvability[55–58]. Interaction networks that run through proteins can be detected by statistical coupling analysis of residues across evolutionary assessments of protein function in large scale[59–65]. Dependence of mutant phenotypes on the identities of other amino acids (epistasis) contributes to protein evolvability by providing evolutionary windows in which some intolerable mutations may be tolerated (allowed to fix) after changes elsewhere alter the phenotype of previously intolerable mutations[66–68]. Recent studies with deep mutational scanning experiments and statistical coupling analysis have shown that perturbing residue interactions with mutations can alter the original protein function, allostery, and shape possibilities of future substitutions, leading to an interpretation that even conserved residues of homologous proteins are subject to distinct epistatic constraints[69–76]. This means that residues conserved over evolution between homologs that are presumed to have the same functions do not operate in the same environment and mutations in analogous positions between homologs may be unpredictable. Distinct phenotypes for conserved residues have been observed in a number of systems including between Pol I and Pol II in yeast[74, 75, 77]. Additionally, the yeast Pol I TL domain was incompatible when introduced into the Pol II context while the Pol III TL was compatible even though over 70% of residues in the three yeast TLs are identical[78]. The results strongly imply even functions of ultra-conserved residues and domains are shaped by individually evolved enzymatic contexts (higher order epistasis), and that enzyme-specific mechanisms developed with the divergence of specialized functions or species-specific controls on the activity of the ultra-conserved TL. 
Understanding the evolution of conserved msRNAPs requires an evaluation of how comparable the higher order epistasis is in closely related msRNAPs.

Functional interactions between residues and the requirements of individual mutants for their phenotypes (other residues that are required to support or restrict mutant phenotypes) can be revealed by genetic interactions of double mutants[17, 46, 79–81]. Previous studies on a small subset of site-directed substitutions from our lab have identified distinct types of Pol II double mutant interactions including suppression, enhancement, epistasis, and sign-epistasis [17, 46, 80]. Suppression was common between LOF and GOF mutants, as expected if each mutant is individually acting in the double mutant and therefore opposing effects on activity are balanced. Similarly, synthetic sickness and lethality were commonly observed between mutants of the same class, consistent with the combination of mutants with partial loss of TL function having greater defects when combined. However, we have also observed lack of enhancement between mutants of similar classes (epistasis), suggesting single mutants might be functioning at the same step, and also rarely sign-epistasis, where a mutant phenotype may be dependent on the identity of a residue at another position. For example, the GOF TL substitution Rpb1 F1084I was unexpectedly lethal with the LOF TL substitution Rpb1 H1085Y (instead of the predicted mutual suppression for independently acting mutants)[46, 80]. This was interpreted as F1084I requiring H1085 for its GOF characteristics and becoming a LOF mutant in the presence of H1085Y. Determining how representative these interactions are, and the nature of interactions across the Pol II active site, requires a more systematic analysis to fully describe and understand the networks that control Pol II activity and the requirements for each mutant phenotype.

Deducing complex residue interaction networks on a large scale is challenging. To accomplish this for Pol II, we have established genetic phenotypes predictive of biochemical defects and coupled this with a yeast Pol II TL deep mutational scanning system[17, 46, 80, 82]. Here we extend this system to a wide range of double and multiple mutants within the S. cerevisiae Pol II TL and between the TL and adjacent domains. By analysis of 11818 alleles including single mutants and a curated subset of double mutants, we have identified intricate intra- and inter-TL residue interactions that strongly impact TL function. Additionally, the examination of 3373 haplotypes including evolutionarily observed TL alleles and co-evolved residues revealed that TL function is heavily dependent on the msRNAP context (epistasis between the TL and the rest of Pol II). These results suggest that despite being highly conserved, the epistasis within msRNAP contexts functions through derived residues and potentially reshapes functions of conserved residues. Finally, statistical coupling analyses reveal the pathways within Rpb1 that appear to converge on the TL and may modulate active site activity upon factor binding. Our analyses indicate TL function and evolution are dominated by widespread epistasis.

RESULTS

Systematic detection of residue interactions inside and outside of the Pol II TL by deep mutational scanning.

To detect residue interactions in the TL and between the TL and Pol II that shape TL function and evolution, we designed and synthesized 15174 variants representing all possible S. cerevisiae Pol II TL single mutants, a subset of targeted double mutants, evolutionary haplotypes and potential intermediates in ten synthesized libraries (details in Supplemental Table 4). This approach follows our prior analysis of the TL[17] with slight alterations (see Methods and Fig. S1A). Libraries were then transformed into yeast and screened under different conditions to quantify the growth defects of variants through our high-throughput screening platform (Fig. 1C and Fig. S1A). The growth phenotype of a mutant in each selective condition is represented as a fitness score, which is calculated as the log₂ of the mutant allele frequency shift under selective growth relative to a control condition, divided by the change in WT under the same conditions. Biological replicates indicate high reproducibility (Fig. S1B). Individual libraries were min/max normalized[83] to account for scaling differences between libraries (Fig. S1C), and identical mutants present among libraries indicate high correlation of fitness determinations in each library (Fig. S1D–E). Furthermore, we used the median fitness value across three biological replicates as the estimate of mutant fitness in each condition.
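The fitness score described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name, argument names, and the allele frequencies are hypothetical, and all frequencies are assumed to be non-zero (handling of zero counts and library min/max normalization are not shown).

```python
import math
from statistics import median


def fitness_score(mut_selected, mut_control, wt_selected, wt_control):
    """log2 of the mutant frequency shift divided by the WT frequency shift.

    All four arguments are allele frequencies (assumed > 0):
    selected = frequency under the selective condition,
    control  = frequency under the control condition.
    """
    mutant_shift = mut_selected / mut_control
    wt_shift = wt_selected / wt_control
    return math.log2(mutant_shift / wt_shift)


# Hypothetical frequencies for one mutant in three biological replicates:
# (mut_selected, mut_control, wt_selected, wt_control)
replicates = [
    (0.001, 0.004, 0.02, 0.01),
    (0.002, 0.004, 0.02, 0.01),
    (0.0005, 0.004, 0.02, 0.01),
]

scores = [fitness_score(*rep) for rep in replicates]

# The median across replicates serves as the estimate of mutant fitness.
mutant_fitness = median(scores)
print(scores, mutant_fitness)
```

A negative score indicates that the mutant was depleted relative to WT under the condition tested; a score of zero indicates WT-like behavior.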

To examine double mutant effects relative to single mutants, and to estimate epistasis, we assume that independence of mutant effects would result in log additive defect. This means that predicted double mutant fitness defects should be the combination of both single mutant defects as is standardly assumed[68, 79, 81, 84]. Deviation from log additive fitness defects represents potential epistasis between single mutants (either less than expected, i.e. epistasis or genetic suppression or more than expected, i.e. synthetic sickness or lethality). We wished to distinguish specific epistatic interactions from activity-dependent suppression or synthetic interactions with mutant catalytic defects, because of previously observed mutual suppression between some LOF and GOF TL alleles. For the purposes of our analysis, we defined an interaction as epistatic when we observed positive deviation in double mutant fitness versus expected when mutants of same class were combined (GOF+GOF, LOF+LOF). We defined sign epistasis for situations where we observed negative interaction for combinations between the classes (GOF+LOF), where we would expect suppression if mutants were functioning independently (Fig. S2).
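The log-additive expectation and the interaction definitions above can be sketched as a short function. This is an illustrative sketch only: the function names and the `tolerance` used to decide that observed fitness approximately equals expected fitness are hypothetical and are not the thresholds used in the study.

```python
def expected_double_fitness(fitness_a, fitness_b):
    """Log-additive model: independent defects add on the log2 scale."""
    return fitness_a + fitness_b


def deviation_score(observed, fitness_a, fitness_b):
    """Observed double mutant fitness minus the log-additive expectation."""
    return observed - expected_double_fitness(fitness_a, fitness_b)


def interaction_type(class_a, class_b, deviation, tolerance=1.0):
    """Label a double mutant using the definitions in the text.

    class_a / class_b: "GOF", "LOF", or another label (e.g. "unclassified").
    tolerance: hypothetical cutoff for calling a deviation non-additive.
    """
    if abs(deviation) <= tolerance:
        return "additive"
    positive = deviation > 0
    classes = {class_a, class_b}
    if classes == {"GOF", "LOF"}:
        # Opposing classes: suppression is expected for independent mutants.
        return "activity-additive suppression" if positive else "sign epistasis"
    if classes in ({"GOF"}, {"LOF"}):
        # Same class: enhanced defects are expected for independent mutants.
        return "epistasis" if positive else "synthetic sick or lethal"
    return "positive" if positive else "negative"


# Hypothetical example: two single mutants and their double mutant.
dev = deviation_score(observed=-1.0, fitness_a=-3.0, fitness_b=-2.5)
print(dev, interaction_type("GOF", "LOF", dev))
```

With these hypothetical values the double mutant grows better than the log-additive expectation, so a GOF/LOF pair would be labeled as activity-additive suppression, whereas the same positive deviation for a GOF/GOF pair would be labeled as epistasis.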

Because of the complexity of having two classes of active site mutant (GOF and LOF) that each confer fitness defects, we wanted to accurately classify mutants to enable our epistasis analysis. We have previously demonstrated that mutant growth profiles across a select set of growth conditions are highly predictive of in vitro measured catalytic effects. GOF mutants are typically sensitive to Mn²⁺, the drug Mycophenolic Acid (MPA), and can confer the Spt⁻ phenotype (Lys⁺ growth in our strain). In contrast, LOF mutants are MPA-resistant, Spt⁺ (Lys⁻), and show a Gal-resistant phenotype on the gal10Δ56 transcription reporter. Using high-throughput phenotyping of our libraries on relevant media, we predicted mutant classes with logistic regression models. We trained two logistic regression models based on 65 mutants with measured in vitro catalytic defects and their conditional growth fitnesses to distinguish between GOF or LOF classes. Both models worked well in classifying GOF or LOF mutants (Fig. S3A). These two models were applied to all viable mutants (fitness score > −6.5 for control growth condition) and classified the mutants into three groups, GOF, LOF and those that did not belong to either one of the two groups (“unclassified”). To confirm the classification results, we applied t-SNE projection and k-means clustering for all measured mutants in all growth conditions to examine correlation with multiple logistic regression models. As shown in Figure S3B, we observed separated GOF and LOF clusters consistent with logistic regression classifications. Using all phenotypic data indicated that GOF and LOF mutants fell into more than one cluster apiece, suggesting more fine-grained separation using additional phenotypes and potential underlying distinctions within classes (Fig. S3C).
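The way two binary logistic regression models can yield three groups may be sketched as below. Everything numerical here is an assumption made for illustration: the coefficients, the 0.5 cutoff, the feature names, and the rule that a mutant called by both models or by neither is left unclassified are hypothetical and are not the fitted models from this study. Only the viability filter (control fitness > −6.5) is taken from the text.

```python
import math

LETHAL_THRESHOLD = -6.5  # control-condition fitness cutoff stated in the text


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, features):
    """Logistic model: sigmoid of a weighted sum of conditional fitnesses."""
    z = bias + sum(weights[name] * features[name] for name in weights)
    return sigmoid(z)


# Hypothetical (weights, bias) pairs; features are conditional fitness scores.
GOF_MODEL = ({"Mn": -1.0, "MPA": -1.0, "Lys": 1.0, "Gal": 0.0}, -1.0)
LOF_MODEL = ({"Mn": 0.0, "MPA": 1.0, "Lys": -0.5, "Gal": 1.0}, -1.0)


def classify(features, control_fitness, cutoff=0.5):
    """Return 'lethal', 'GOF', 'LOF', or 'unclassified' for one mutant."""
    if control_fitness <= LETHAL_THRESHOLD:
        return "lethal"  # the models are applied to viable mutants only
    is_gof = predict_probability(*GOF_MODEL, features) >= cutoff
    is_lof = predict_probability(*LOF_MODEL, features) >= cutoff
    if is_gof and not is_lof:
        return "GOF"
    if is_lof and not is_gof:
        return "LOF"
    return "unclassified"  # assumed rule for both-or-neither calls


# Hypothetical phenotype profiles (negative = sensitive, positive = resistant).
gof_like = {"Mn": -3.0, "MPA": -2.0, "Lys": 2.0, "Gal": 0.0}
lof_like = {"Mn": 0.0, "MPA": 2.0, "Lys": -1.0, "Gal": 3.0}
wt_like = {"Mn": 0.0, "MPA": 0.0, "Lys": 0.0, "Gal": 0.0}

for profile in (gof_like, lof_like, wt_like):
    print(classify(profile, control_fitness=-1.0))
```

The hypothetical signs of the coefficients follow the phenotypes described above: sensitivity to Mn²⁺ and MPA together with Lys⁺ growth pushes a mutant toward GOF, while MPA resistance and Gal resistance push it toward LOF.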

TL residues are embedded in complex interaction networks.

To uncover the TL-internal residue interaction networks, we selected 2–4 different amino acid substitutions with diverse phenotypes (GOF, LOF, lethal, or unclassified mutants) for each TL residue and combined them with the selected substitutions at all other TL positions. This curated set of 3790 double mutants represents potential interactions between any two TL residues (Fig. 2A). We compared the observed fitness of these double mutants with that expected by the additive model in an xy-plot. About half of the combinations (1776/3790) matched the additive model (observed fitness ≈ expected fitness), while the rest showed positive (observed fitness > expected fitness, n=612) or negative (observed fitness < expected fitness, n=1402) interactions (Fig. 2B, Fig. S2). From these positive or negative interactions, we distinguished the ratio of specific epistasis relative to activity-additive interactions in combinations between or within the classes (GOF/LOF, GOF/GOF, or LOF/LOF). We observed 43% positive interactions (activity-additive suppression) and 41% negative interactions (sign epistasis) in all GOF/LOF combinations. Rate of positive interactions between predicted GOF and LOF mutants (suppression) was much higher than within class combinations (LOF/LOF or GOF/GOF) as expected. Activity-additive synthetic sick or lethal interactions were much more common than epistasis in combinations within the same class. We observed ~2% positive (epistasis) and 95% negative (activity-additive synthetic sick or lethal) interactions in GOF/GOF combinations, and 6% positive (epistasis) and 84% negative (synthetic sick or lethal interactions) interactions in LOF/LOF combinations (Fig. 2C). Interactions were distributed throughout the TL and covered every TL residue, supporting connectivity across the TL. 
Observed epistasis was concentrated within the C-terminal TL helix and adjacent regions (Fig. 2D), genetically supporting a proposed model that TL C-terminal residues collaboratively stabilize the TL open state.
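As a quick arithmetic check on the counts reported above for the curated double mutant set, the three categories can be converted to fractions of the 3790 measured combinations. The variable names are illustrative; the counts are those given in the text.

```python
# Counts of double mutants in each category, as reported in the text.
counts = {"additive": 1776, "positive": 612, "negative": 1402}

total = sum(counts.values())

# Percentage of all measured double mutants in each category.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)        # 3790
print(percentages)  # {'additive': 46.9, 'positive': 16.1, 'negative': 37.0}
```

The additive fraction of roughly 47% corresponds to the "about half" of combinations that matched the additive model, and negative interactions outnumber positive ones by more than two to one.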

Figure 2. — A. Design of the pairwise double mutant library. We curated 2–4 substitutions for each TL residue (in total 90 substitutions, n(GOF) = 18, n(LOF) = 30, n(Unclassified) = 19, n(Lethal) = 23), and combined them with each other to generate double mutants. 3910 double mutants representing combinations between any two TL residues were measured and 3790 of them passed the reproducibility filter. WT TL residues are shown as transparent circles with positions labeled. Phenotype classes of single substitutions are shown as colored circles (GOF in green, LOF in blue) while unclassified mutants are in grey and lethal mutants are in black. B. An XY-plot of observed double mutant growth fitness measured in our experiment (Y-axis) and expected fitness from the addition of two constituent single mutants’ fitnesses (X-axis). N(Positive) = 612. N(Negative) = 1402. N(Additive) = 1776. N(Sum) = 3790. The lethal threshold (−6.5) is labeled with dashed lines on the X and Y axes. C. Percent of interactions observed from each combination group. N(LOF/LOF) = 412. N(GOF/GOF) = 156. N(GOF/LOF) = 534. D. Epistasis and sign epistasis, positive, negative, and synthetic lethal interactions are shown in network format. E. The intra-TL functional interaction landscape, with interactions represented by double mutant fitnesses, is shown in a heatmap. Annotations at the top and right indicate the 90 curated single mutants and their predicted phenotypic classes from multiple logistic regression modeling. The upper part of the heatmap shows single mutant growth fitness profiling across multiple phenotypes ordered by groups predicted with logistic regression models. The lower part of the heatmap shows double mutant growth fitness where the color at the intersection of X and Y coordinates indicates fitness of the double mutant. F. The intra-TL functional interaction landscape with interactions represented by deviation of observed fitness from predicted fitness (deviation score) in the heatmap.

Display of data in heat maps visually demonstrates a number of key conclusions (Fig. 2E–F). For example, most lethal mutants could be suppressed by at least one predicted GOF mutant, suggesting that most lethal mutants likely have reduced activity (LOF) below a viable threshold, as might be predicted from the greater probability that any individual mutant would be a LOF and not a GOF. However, two lethal mutations could be suppressed by most LOF mutations or specific other lethal mutants, but not GOF mutants, implying that their lethality resulted from being GOF (A1076 mutations) (Fig. 2F). Unclassified single mutants mostly did not show widespread interaction with GOF, LOF, or lethal classes. However, a few unclassified mutants showed suppression in combination with GOF mutants, suggesting potential atypical LOF not detected by phenotypic analysis, or potential sign epistasis (Fig. 2F).

Allele-specific interactions suggest unique properties of individual mutants with similar phenotypes.

TL conformational dynamics and function are balanced by residue interactions within the TL (TL-internal interactions) and between the TL and TL-proximal domains (TL-external interactions). The properties of GOF and LOF mutants outside of the TL appear similar to those inside but how they behave upon TL perturbation is not known. We analyzed the scope and nature of TL-internal and TL-external interactions by exploring interaction space of 12 previously studied GOF and LOF mutants (8 within the TL and 4 outside) each combined with all possible single TL mutants (Fig. 3A). These 12 mutants function as probes for the genetic interaction space of the TL and how it might be altered in allele-specific fashion by perturbation of the “probe” mutation. TL-external mutants showed similar scale of widespread interactions with TL substitutions as when TL-internal mutants were used as probes. For these example TL-external substitutions, we conclude their impact on Pol II function is of similar magnitude and connection as substitutions within the flexible and mobile TL.

Figure 3. — A. Design of the targeted double mutant libraries. Twelve “probe” mutations (eight within the TL and four in TL-proximal domains) were combined with all possible substitutions at each TL residue to generate 7280 double mutants. 7276 mutants passed the reproducibility filter and were used for interaction analyses. B. The percentage of functional interactions observed for each probe mutant with viable GOF or LOF TL substitutions. C. Pol II-TL functional interaction landscape with interactions represented by deviation of observed fitness from predicted fitness (deviation score). The upper part of the heatmap shows all Pol II TL single mutant growth fitness profiling across a number of phenotypes ordered by hierarchical clustering with Euclidean distance. The lower part of the heatmap shows double mutant interactions, where the color at the intersection of X and Y coordinates indicates the deviation score of the double mutant.

We further compared the similarity of interaction networks for substitutions with superficially similar biochemical and phenotypic defects. These analyses were designed to detect if changes to TL function might reflect simple alteration to TL dynamics, or additional alteration to folding trajectories or conformations. In the former case, mostly additive interactions might be predicted due to TL operating in the same fashion in double mutants versus single mutants, with phenotypes deriving from differences in kinetics or distributions of existing states. In the latter case where a mutation alters TL folding trajectories or changes TL conformations, it might be predicted that individual mutants that are superficially similar will show allele-specific genetic interactions reflecting epistatic changes to TL function. A subset of probe substitutions showed widespread activity-additive suppression between GOF/LOF mutations and expected activity-additive synthetic lethality between same classes of substitution (LOF/LOF or GOF/GOF). However, allele-specific epistasis and sign epistasis were also observed and were much higher for some mutants than others (Fig. 3C and Fig. S5–S8). 127/620 TL substitutions showed unique interactions with specific probe mutants; for example, some lethal substitutions could only be suppressed by Y769F, a GOF TL-proximal probe mutant in Rpb2 (Fig. S8A). Moreover, two TL-adjacent GOF target mutants, Rpb1 S713P (funnel α-helix 21) and the BH allele Rpb1 T834P displayed greatly distinct interaction networks. Rpb1 S713P exhibited widespread suppression of LOF TL substitutions (96 instances) consistent with generic enhancement of activity but preservation of TL function. In contrast, Rpb1 T834P exhibited much lower suppression ability (33 instances). In addition to much lower ability to suppress, T834P showed a much greater amount of sign epistasis than Rpb1 S713P (102 instances to 38 instances) (Fig. S6–S7). 
These results are consistent with a model that perturbation to the BH structure is coupled to extensive changes to TL functional space and that T834P function as a GOF mutant requires most TL residues to be WT.

A similar distinction as above but between two internal TL GOF substitutions, Rpb1 E1103G and Rpb1 F1084I, was also apparent. Rpb1 E1103G showed widespread suppression of LOF TL substitutions (184 instances), consistent with site-directed mutagenesis studies[46]. These results suggest E1103G primarily may alter TL dynamics consistent with biochemical data that it promotes TL closure[27] and that it allows TL mutants primarily to maintain their effects. In contrast, Rpb1 F1084I showed more limited suppression of LOF alleles (43 instances) and showed much more widespread synthetic lethality. These results indicate F1084I has a much greater requirement for WT residues at many TL positions to maintain its GOF characteristics. When TL function is additionally perturbed, F1084I appears to switch from a GOF to a LOF (Fig. S6–S7). These results imply that individual probe mutants distinctly reshape the Pol II active site, though they might share catalytic and phenotypic defects as single mutants.

An even more striking example of this phenomenon can be observed by comparison of the interaction networks of two LOF substitutions at the exact same position, the ultra-conserved H1085 residue. This histidine contacts incoming NTP substrates[5, 12], is the target for the Pol II inhibitor α-amanitin[45], and promotes catalysis. Initial structural data and molecular dynamics simulations were interpreted as H1085 potentially functioning as a general acid for Pol II catalysis[85–88]. Our discovery that H1085L was especially well-tolerated[17], and subsequent experiments from the Landick lab[89, 90], have led to their proposal that the TL histidine functions as a positional catalyst and a similarly sized leucine supports catalysis with relatively mild effects on biochemistry and growth. If the H1085Y and H1085L substitutions are acting on a continuum of positional catalyst activity, we might predict their interaction networks would be similar and only be distinguished by magnitude of interactions, but not identity or type of interactions. In contrast to this prediction, distinct interaction patterns were observed (Fig. 3C, Fig. S5, S6B, S7). Most GOF mutants were able to suppress H1085Y but not H1085L. Instead, H1085L showed synthetic lethality with most GOF mutants (putative sign epistasis). For example, almost all substitutions at E1103 showed sign epistasis with H1085L but not H1085Y (Fig. S6B, S7). The distinction between H1085L and H1085Y is evident in the PCA plot of probe mutants (Fig. 4A). The partially unique nature of each probe mutant is also evident in the PCA plot (Fig. 4A). Altogether, distinguishable interaction networks of probe mutants, despite their similarity in catalytic and growth defects, even within the same residue, suggest that each mutant has the ability to propagate effects across the Pol II active site. To some extent, each Pol II mutant creates a new enzyme.

Figure 4. — A. Principal component analysis (PCA) of deviation scores across double mutant interactions for 12 probe mutants (see Methods). B. Specific epistatic interactions observed between A1076 and L1101, and M1079 and G1097 are shown as heatmaps. The X-axes of both heatmaps show 20 substitutions ordered by chemical properties of amino acids, and the color of each substitution represents its phenotypic class predicted by multiple logistic regression models. GOF substitutions are in green, LOF in blue, unclassified in gray, and lethal (fitness < −6.5) in black. The top and right parts of the heatmap are single mutant growth fitness measured in the SC-Leu+5FOA condition. The bottom left is the deviation score of double mutants. C. The epistatic interactions we identified between A1076 and L1101, together with M1079 and G1097, are shown on the five-helix bundle of the Pol II active site (PDB: 5C4X)[28].

Several allele-specific interactions were observed. Some of the strongest epistatic interactions were between A1076 substitutions and L1101S, which differed from all other GOF target mutants (Fig. 4B, Fig. S8B), suggesting tight coupling between A1076 and L1101 for Pol II function. These two hydrophobic residues, together with other hydrophobic residues in TL-proximal helices, form a five-helix bundle in the Pol II active site, likely stabilizing the open TL conformation. Consistent with this, another pair of adjacent residues, M1079 and G1097, also showed allele-specific epistasis (Fig. 4B, Fig. S8B).

The epistasis we identified in combinations within the same class (GOF/GOF or LOF/LOF) might also be sign epistasis (GOF suppressing GOF or LOF suppressing LOF due to a switch in residue class). We distinguished regular epistasis (lack of additivity) from sign epistasis suppression by checking conditional phenotypes predictive of biochemical defects. We reasoned that epistatic interactions would exhibit double mutant conditional phenotypes similar to single mutants while sign epistasis suppression would also exhibit suppression of conditional phenotypes. Therefore, we examined double mutants with our logistic regression models for determining phenotypic class. The majority of double mutants within each class showing positive epistasis (GOF/GOF or LOF/LOF) maintained single mutant classification. 6/10 GOF/GOF doubles showing positive epistasis were classified as GOF while 30/38 LOF/LOF doubles were classified as LOF, suggesting classic epistasis (Fig. S9A). In three cases of GOF/GOF combinations, all between L1101S and A1076 substitutions, the resulting double mutants were unclassified, consistent with nearly WT behavior. Here, each constituent single mutant conferred a GOF phenotype but the double mutants show mutual suppression. This suggests tight coupling between 1101 and 1076 (see Discussion).

We also observed allele-specific interactions for predicted lethal mutants. Our threshold for lethality is likely higher than that in actuality, and very slow growing mutants may fall below our lethal threshold while still having enough data on conditional fitness assessment for logistic regression to predict mutant class. For 21 ultra sick/lethal TL substitutions predicted as GOF themselves, we observed suppression when combined with other GOF mutants (Fig. S9B/C). Lethal substitutions of A1076 could be suppressed by LOF probe mutants and the GOF probe L1101S, consistent with specific combinations between 1076 and 1101 showing sign-epistasis suppression or mutual suppression. F1084R is a predicted lethal GOF but can be suppressed specifically by GOF probe Y769F. F1084 and Y769 are close to each other specifically when the TL is in the closed, substrate bound state. Additionally, 5 ultra-sick/lethal substitutions predicted as LOF could be suppressed by a LOF allele (Fig. S9B). As an example, S1091G could be suppressed by almost all curated GOF mutants, yet it was also specifically suppressed by the LOF V1094D (Fig. S9C). S1091G and V1094D appear to compensate for each other in a specific fashion. We suggest that these are the types of interactions that will allow the TL and adjacent residues to evolve and differentiate while maintaining essential functions. We note that strong epistasis is much more prevalent in the Pol II system than in other proteins where it has been quantified[58, 64, 91–93] (Fig. S9D). We attribute this difference to the much higher rate of suppressive interactions due to Pol II mutants having opposing effects on catalysis.

TL evolution is shaped by epistasis between TL and its enzymatic context.

We previously found that identical mutations in a residue conserved between the Pol I and Pol II TLs yielded different biochemical phenotypes[78, 94]. Furthermore, the yeast Pol I TL was incompatible within the yeast Pol II enzyme, implying that TL sequences could be coevolving and have enzyme-specific coupling (though the Pol III TL was well-tolerated in Pol II) [78, 94]. To determine the generality and scope of TL-Pol incompatibility, we designed a library containing evolutionary TL variants from bacterial, archaeal, and eukaryotic msRNAPs and determined their compatibility in the yeast Pol II context (Fig. 5A). TL alleles of eukaryotic Pols were more compatible than those from Archaea and Bacteria and Pol II alleles were the most compatible (Fig. 5B and Fig. S10A) consistent with evolutionary distance. The total number of TL substitutions in haplotypes were slightly negatively correlated with growth fitness in the Pol II background for Archaeal, Pol I, II and III sequences. We did not observe a correlation between number of TL substitutions relative to Pol II for Bacterial TL sequences, likely because these sequences were almost entirely incompatible (Fig. S10C). Conservation of TL sequence and function was high enough that some archaeal sequences could provide viability to yeast Pol II, yet at the same time a number of Pol II TLs from other species were defective if not lethal. These results suggest widespread coevolution of TL sequence outside ultra-conserved positions is greatly shaping TL function (see Discussion).

Figure 5. A. Schematic for the TL evolutionary haplotypes library. We selected 662 TL haplotypes representing TL alleles from bacterial, archaeal and the three conserved eukaryotic msRNAPs. These TL alleles were co-transformed with gapped *RPB1* (TLΔ) into yeast and were phenotyped under different conditions. B. Fitness of evolutionarily observed TL haplotypes in the yeast Pol II background. The Pol II WT TL fitness (0) is shown as a dotted line. C. A comparison of the highest deviation score for each lethal TL single substitution that was present in any evolutionary TL haplotype (n=3732 TL sequences) from bacterial, archaeal or eukaryotic Pols versus those that have not been observed in any species. 9 substitutions that are lethal when present in yeast as a single substitution were found in an MSA of 542 archaeal TL sequences. 17 were found in an MSA of 1403 bacterial TLs, 5 were found in 749 Pol I TLs, 7 were found in 499 Pol II TLs, and 5 were found in 539 Pol III TLs. Evolutionarily observed lethal substitutions were compared to those absent from our TL MSA. The percentage of non-suppressible lethal single mutants for each group is labeled at the bottom of the plot. Statistical comparison was done with the Mann-Whitney test and the P-values are shown.

We wanted to ask if lethal single substitutions in the yeast Pol II TL behaved differently if they were present in the msRNAP evolutionary tree or not. We reasoned that evolutionarily observed lethal substitutions might be closer to functional than non-evolutionarily observed and would therefore be more likely to be suppressible by Pol II GOF alleles. To compare suppressibility between evolutionarily observed and unobserved substitutions lethal to Pol II, we extracted the highest positive deviation scores among all double mutants containing each lethal substitution. Maximum deviation scores for Pol II lethal substitutions present in TLs of existing msRNAPs were higher than for lethal substitutions that were absent, indicating the Pol II lethal mutants present in existing msRNAPs on average maintain a greater functionality and/or are suppressible by single changes (Fig. 5C and Fig. S10B). The TL has been estimated as providing 500–1000 fold enhancement on catalytic activity[95–97], while we estimate only ~10-fold effects are tolerated for yeast viability[45]. We conclude that lethal mutants observed as functional residues in other species are more likely to be close to the viability threshold as might result from a series of small steps to allow them to function.
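The extraction step described above, taking the highest deviation score among all double mutants that contain a given lethal substitution, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name, the dictionary layout, and the placeholder substitution names and scores are all hypothetical.

```python
def max_deviation_per_lethal(double_scores, lethal_singles):
    """Return the highest deviation score observed for each lethal substitution.

    double_scores: dict mapping (substitution_a, substitution_b) -> deviation score.
    lethal_singles: iterable of substitutions that are lethal as single mutants.
    Lethal substitutions that appear in no double mutant are omitted.
    """
    lethal = set(lethal_singles)
    best = {}
    for (sub_a, sub_b), score in double_scores.items():
        for sub in (sub_a, sub_b):
            if sub in lethal:
                best[sub] = max(score, best.get(sub, float("-inf")))
    return best


# Toy values for illustration only (not measured data).
toy_scores = {
    ("lethal_1", "gof_1"): 2.4,
    ("lethal_1", "gof_2"): 0.3,
    ("lethal_2", "gof_1"): -0.1,
    ("lethal_2", "lof_1"): 0.0,
}
best = max_deviation_per_lethal(toy_scores, ["lethal_1", "lethal_2", "lethal_3"])
print(best)
```

The per-substitution maxima for the evolutionarily observed and unobserved groups would then be the two samples compared by the Mann-Whitney test mentioned in the Figure 5C legend.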

Rpb1 coevolutionary residue networks identified by Statistical Coupling Analysis (SCA).

Our analyses suggest that even a highly conserved domain such as the Pol II TL can be sensitive to the identity of adjacent residues and that changing networks of interactions shape the Pol II active site across evolution. We employed SCA to identify coevolving residue networks in Rpb1 to ask about pathways that might converge on the TL. SCA “Sector” analysis is especially useful for identifying subgroups of coupled residues that might form allosteric communication networks[60, 63, 65]. We extracted 410 yeast Pol II Rpb1 sequences from the recently published msRNAP large subunit multiple sequence alignment (MSA) from the Landick lab[90] and performed SCA (see Methods)[60]. We identified 40 coevolving sectors within Rpb1, and every single TL residue was found within one of only eight sectors. The eight sectors that contained TL residues are shown on the Rpb1 structure (Fig. 6). TL residues within the TL NIR were coupled with most BH residues and the alanine-glycine linker (Rpb1 1087–1088). Six of eight Rpb1 sectors containing TL residues also contained at least one BH residue, supporting the possibility of high functional coupling between these two domains. Coupling is not limited to residues that are close to each other; distal residues can be in the same cluster. For example, the greatest distance between a TL residue and another Rpb1 residue in the same sector is ~55 Å. Interestingly, the residue pair 1076–1101, for which we observed extensive epistasis, are the sole TL residues within a very large cluster containing >150 residues across Rpb1. Our epistasis studies indicate multiple allele-specific interactions between 1076 and 1101 of exactly the type that might be reflected by evolutionary coupling between them. The hydrophobic TL pocket is an attractive linchpin for potential communication to the TL from throughout Pol II, and multiple sectors converge on this domain.

Figure 6. The eight coevolution sectors containing any TL residues that were identified by statistical coupling analysis are shown on the yeast Pol II Rpb1 structure (PDB:5C4X)[28]. The TL is labeled in magenta and the BH is labeled in cyan. The TL and BH residues in each sector are labeled at the bottom of each sector. The total number of residues within each sector is also shown. The details of the statistical coupling analysis are in Methods.

DISCUSSION

Functional networks within Pol II revealed.

How individual mutants alter a protein’s function is not necessarily straightforward at the mechanistic level. Amino acid substitutions both remove the functionality of the WT residue and replace that functionality with something different. By altering the local environment within a protein or potentially propagating effects to distant locations through allosteric changes, each substitution can potentially be quite different. These differences may not be apparent, as phenotypic outputs may lack the granularity to distinguish different biophysical behaviors that result in similar outputs. For Pol II mutants, even high-resolution phenotypic analysis, such as gene expression profiling or genetic interaction profiling between Pol II mutants and deletions in other yeast genes[52], suggests that LOF and GOF mutants represent a continuum of defects that match enzymatic activity in vitro. Therefore, these profiles also appear dependent on the output of Pol II activity defects and cannot distinguish potential differences in underlying mechanism.

Through systematic detection of functional interactions within the Pol II active site, we have identified functional relationships between amino acids across the TL and between TL substitutions and others. In the absence of double mutant analyses able to detect epistasis, it would not be possible to differentiate similar alleles from one another, for example L1101S from E1103G, two GOF alleles very close to each other in the Pol II structure. Here, we find that their distinct interactions support the idea that substitutions at 1101 and 1103 target distinct residue networks. 1101 functions in the five-helix bundle hydrophobic pocket, while 1103 shows statistical coupling with a number of TL external residues that together support interactions that promote the open TL conformation. We also observed connections between TL C-terminal residues that suggest a limit to how disruptions to structure there can alter Pol II activity. Helix-disrupting LOF proline substitutions in at least two TL positions showed epistasis with multiple substitutions in the back of the TL (1094–1098), suggesting that their functions require TL C-terminal helix structure and that in the absence of that structure (proline disruption) effects are no longer additive.

The strongest interactions observed were between two pairs of hydrophobic residues, A1076 and L1101, and M1079 and G1097. Each of these contributes to the structure of a hydrophobic pocket that bundles two TL proximal helices with the BH and two others in a five-helix bundle. Supporting the dependence of these residues on each other for maintaining function, identity at these positions over evolution also shows coupling. Interestingly, A1076 and L1101 were uniquely coupled, among TL residues, with a great number of other positions in Rpb1 (Fig. 6).

Elongation factors bind Pol II and alter its activity, but the mechanisms by which they do so are not known[98, 99]. We observe a high level of genetic interaction between residues outside the TL and residues within it, including allele-specific reshaping of TL mutant space upon single substitution outside the TL. The fact that minor mutational changes outside the TL can apparently functionally perturb the TL would be consistent with the idea that minor alterations to Pol II structure upon elongation factor binding could easily propagate into the active site via the TL or the BH. As an example, human Rtf1 has been observed to project a domain into the Pol II structure adjacent to the BH (in yeast, this region is occupied instead by Rpb2[100]). These contacts have been proposed to alter Pol II activity. We would propose that the paths for such alteration of activity would follow the coupling sectors we have observed by SCA.

How different individual substitutions are under the surface is critical for understanding plasticity in protein mechanisms and how they might be altered by evolutionary change. A key open question in nucleic acid polymerase mechanisms is the path taken by protons in the reaction (for example, deprotonation of the synthesized strand 3′-OH and protonation of the pyrophosphate leaving group) (e.g. [85, 86, 88, 90, 101, 102]). For msRNAPs, the association of a nearly universally conserved histidine with the incoming NTP led to the proposal that this residue might donate a proton during the reaction[12, 87, 101]. Some substitutions at this position can provide minimal essential function (e.g. tyrosine, arginine), while others are only moderately defective (glutamine). Surprisingly, we found that H1085L was very well tolerated for growth[17], and the Landick lab has proposed that this substitution supports catalysis through positional but not chemical effects[89, 90]. Our studies here were quite surprising in that they indicated that L1085 Pol II has unique behavior when perturbed by all possible TL substitutions and is entirely distinct from H1085Y (where we have direct observations of all possible intra-TL doubles) or H1085A or H1085Q (curated doubles). These residue-specific behaviors suggest that each substitution may have different properties, and compatibility with function may not necessarily represent similar function under the surface.

Evolutionary change over time can alter protein function but it can also alter protein functional plasticity. Recent work from the Thornton lab elegantly demonstrates that phenotypes of substitutions to residues conserved over hundreds of millions of years can change over evolutionary time and can do so unpredictably and transiently during evolution[70]. msRNAPs have structures and functions conserved over billions of years, and deep within their active sites is a mobile domain, the TL, that has large functional constraints on its sequence. The TL sequence must be able to fold into multiple states and maintain recognition of the same substrates across evolutionary space, and it shows high identity even between distantly related species. Here we show that the TL, and likely the entire Pol II active site, exhibits a great amount of plasticity through non-conserved positions that are essential for compatibility of the TL and surrounding domains. Our results illustrating widespread epistasis and allele-specific effects of single and double mutants predict that comparative analyses among Pol I, II, and III will reveal widespread and enzyme-specific mechanisms due to higher order epistasis shaping function of conserved residues.

METHODS

Design and Synthesis of TL mutant libraries.

We updated and extended the fitness dataset of Qiu et al[17] using a similar methodology, but with adjusted conditions and a second-generation mutant library strategy, in order to generate a complete Pol II TL mutation-phenotype map and examine genetic interactions. Mutants were constructed by synthesis with Agilent and screened for phenotypes previously established as informative for Pol II mutant biochemical defects. Programmed oligonucleotide library pools included all 620 single TL residue substitutions and deletions for Rpb1 amino acids 1076–1106 (Library 1), 3914 pairwise double substitutions (Library 2), 4800 targeted double substitutions (Library 6), and 3373 multiple substitutions (Library 3–5), along with the WT S. cerevisiae Pol II TL allele at a level of ~15% of the total variants, enabling precise quantification (see Supplemental Table 4). Each synthesized region contained a mutated or WT Pol II TL sequence and two flanking regions at the 5′ and 3′ ends of the TL-encoding sequence. These flanking regions also contained designed “PCR handle” (20 bp) sequences, allowing distinct subsets of oligos to be amplified from synthesized pools using selected primers for PCR, and additional flanking WT Pol II sequences allowing further extension of homology arms by PCR “sewing” (details are in Supplemental Method 2 and 3).
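As a quick consistency check on the Library 1 numbers, the short Python sketch below reproduces the count of 620 under one plausible reading: each of the 31 positions from 1076 to 1106 carries 19 amino acid substitutions plus one single-residue deletion. That breakdown is an assumption made for illustration and is not stated explicitly in the text. The WT share is computed from the 111 WT alleles reported for this pool in the phenotyping section.

```python
# Rpb1 residues covered by Library 1 (range taken from the text).
TL_START, TL_END = 1076, 1106
n_positions = TL_END - TL_START + 1           # inclusive range

# Assumed breakdown per position: 19 non-WT amino acids plus 1 deletion.
SUBSTITUTIONS_PER_POSITION = 19
DELETIONS_PER_POSITION = 1

n_single_variants = n_positions * (SUBSTITUTIONS_PER_POSITION + DELETIONS_PER_POSITION)

# WT alleles reported for the Library 1 pool.
N_WT_ALLELES = 111
wt_fraction = N_WT_ALLELES / (n_single_variants + N_WT_ALLELES)

print(n_positions, n_single_variants, round(wt_fraction, 3))
```

Under these assumptions the pool holds 620 mutant alleles, and the WT alleles make up roughly 15% of the 731 programmed sequences, in line with the stated WT level.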

Introduction of Libraries into yeast and phenotyping.

Synthesized mutant pools were transformed into yeast (CKY283) along with an RPB1-encoding plasmid where the TL-encoding sequence was replaced with an MluI restriction site for linearization as described in Qiu et al[17]. This strategy allows construction of rpb1 mutant libraries by gap repair between library fragments and the linearized vector. Briefly, the synthesized oligo pools were amplified by limited cycles of emulsion PCR to limit template switching. Extension of flanking homology arms of ~200 bp were added by PCR sewing. Amplified TL sequences with extended flanking regions were co-transformed with linearized pRS315-derived CEN LEU2 plasmid (pCK892) into CKY283, allowing gap repair via homologous flanking regions. To detect potential residue-residue interactions between the TL and TL-proximal domains including the Rpb1 Bridge Helix (BH), Funnel Helix alpha-21 and Rpb2, the Pol II TL single mutant pool (Library 1, 620 mutant alleles and 111 WT alleles) was co-transformed individually with gapped plasmids encoding an additional rpb1 allele (Rpb1 BH T834P, T834A, or Funnel Helix alpha-21 S713P) into CKY283 respectively, or with the gapped WT RPB1 plasmid into a strain with the genomic mutation, rpb2 Y769F. These co-transformations created double mutants between the TL and TL-proximal mutants. The WT allele in single mutant pool represented the single probe mutant due to substitutions outside the TL on the plasmid or in the strain background. To distinguish between a fully WT TL and a WT TL representing the TL of a mutant allele elsewhere, a WT Pol II TL allele with a silent mutant at T1083 (WT codon ACC was replaced with ACT) was co-transformed with plasmid containing gapped WT RPB1 in a WT strain in parallel. 15% of the transformants with silent mutation were mixed with transformants of double mutants. The silent mutation allowed us to distinguish the WT and the single mutants. Each transformation was done in three biological replicates. 
After transformation, Leu⁺ colonies were collected from SC-Leu plates by scraping into sterile water and replated on SC-Leu+5FOA to select for cells having lost the RPB1 URA3 plasmid. 5-FOA-resistant colonies were scraped into sterile water from SC-Leu+5FOA and replated on SC-Leu, SC-Leu + 20mg/ml MPA (Fisher Scientific), SC-Leu + 15 mM Mn (Sigma), YPRaf, YPRafGal, SC-Lys, and SC-Leu + 3% Formamide (JT Baker) for phenotyping. Details of the cell numbers plated on each plate and the screening time of each plate are in the phenotyping details table (Supplemental Table 3). Details of the high efficiency transformation protocol are in Supplemental Method 1.

Generation of libraries for quantification by amplicon sequencing.

Genomic DNA of each screened library was extracted using the Yeastar genomic DNA kit according to the manufacturer’s instructions (Zymo Research). To ensure adequate DNA for sequencing, the TL regions of all libraries were amplified with PCR cycle numbers that were verified to be in the linear range by qPCR to minimize disturbance of allele distributions, and under emulsion PCR conditions (EURx Micellula DNA Emulsion & Purification (ePCR) PCR kit) to limit template switching. Details are in Supplemental Method 2 and 3. To multiplex samples, we employed a dual indexing strategy wherein 10 initial barcodes for differentiating 10 mutant libraries were added during the initial amplification using 10 pairs of custom primers. In a second amplification, 28 primers containing 28 NEB indices were used to add a second index for distinguishing conditions and replicates (NEBNext Multiplex Oligos for Illumina) (see Supplemental Table 2). As a result, sample-specific barcodes were present for each set of variants. The indexed, pooled samples were sequenced by single-end sequencing on an Illumina Next-Seq (150 nt reads). On average, over 11 million reads were obtained for individual samples, with high reproducibility across two rounds of sequencing.

Data cleaning and fitness calculation and normalization.

Reads of mutants were sorted into the appropriate libraries and conditions by detecting the corresponding indices after sequencing. Read counts were estimated by a codon-based alignment algorithm to distinguish reads that exactly matched the designated codons of mutants[103]. To clean the data, mutants whose read counts had a coefficient of variation greater than 0.5 in the control condition (SC-Leu) were excluded from the analysis. The mutant read count was increased by 1 to calculate the allele frequency under different conditions. To measure and compare the phenotypes of all mutants, a mutant phenotypic score (fitness) was calculated as the allele frequency change of a mutant under selective conditions relative to the unselective condition, compared with the corresponding frequency change of WT. The formula for calculating fitness is shown below.

Fitness (mut) = \log [f^{mut, sele} / f^{mut, unsele}] - \log [f^{WT, sele} / f^{WT, unsele}]
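The data cleaning and fitness steps can be sketched in Python as below. This is an illustrative reading rather than the authors' implementation (their code is in the GitHub repository cited in Methods). The function names are hypothetical, the coefficient of variation is assumed to be taken across replicate read counts using the sample standard deviation, and the base of the logarithm, which the text does not specify, is left as a parameter.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(replicate_counts):
    """Sample standard deviation divided by the mean of replicate read counts."""
    m = mean(replicate_counts)
    return stdev(replicate_counts) / m if m > 0 else float("inf")


def passes_cv_filter(control_counts, max_cv=0.5):
    """Keep a mutant unless its control-condition CV exceeds the cutoff."""
    return coefficient_of_variation(control_counts) <= max_cv


def allele_frequency(mutant_count, total_reads):
    """Allele frequency after adding a pseudocount of 1 to the mutant read count."""
    return (mutant_count + 1) / total_reads


def fitness(f_mut_sel, f_mut_unsel, f_wt_sel, f_wt_unsel, log=math.log):
    """Log frequency change of the mutant minus the log frequency change of WT.

    The log base is an assumption; pass log=math.log2 or math.log10 if needed.
    """
    return log(f_mut_sel / f_mut_unsel) - log(f_wt_sel / f_wt_unsel)


# Toy numbers for illustration only: the mutant halves in frequency, WT is unchanged.
example = fitness(0.01, 0.02, 0.10, 0.10, log=math.log2)
print(example)
```

A mutant that behaves like WT gets a fitness of 0 under this definition, and negative values indicate depletion relative to WT under selection.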

We applied min-max normalization to bring the median growth fitness of mutants measured in the ten libraries to the same level for direct comparison (formula is shown below). In each library, we divided mutants into several groups based on their allele counts in the control condition. Mutants with read count differences of less than 10 were placed in one group. The WT growth fitness was set as the maximum value and the minimum fitness in each group was set as the minimum. Min-max normalization was used to bring the growth fitness of the various groups inside each library into the same range. Additionally, we used min-max normalization to level the mutant fitness across all ten libraries, with WT fitness as the maximum and the minimal fitness in each library as the minimum. As a result, mutant growth fitness was scaled to one range and could be used to determine genetic interactions.

X^{'} = \frac{X - X_{min}}{X_{max} - X_{min}}
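A minimal Python sketch of the min-max formula is given below, with the maximum and minimum passed in explicitly so that WT fitness can serve as the maximum, as described above. The function name and the toy values are hypothetical, and the grouping by control read counts is not reproduced here.

```python
def min_max_normalize(values, v_min, v_max):
    """Apply X' = (X - X_min) / (X_max - X_min) to each value."""
    if v_max == v_min:
        raise ValueError("X_max equals X_min; normalization is undefined")
    return [(v - v_min) / (v_max - v_min) for v in values]


# Toy group for illustration: WT fitness (0.0) is the maximum,
# the lowest fitness in the group is the minimum.
group_fitness = [-6.0, -3.0, 0.0]
scaled = min_max_normalize(group_fitness, v_min=min(group_fitness), v_max=0.0)
print(scaled)
```

Because WT rather than the observed maximum defines the upper bound, a mutant that is fitter than WT would scale to a value above 1 in this sketch.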

Determination of functional interactions.

The genetic interactions between single substitutions were determined by comparing the multiple-substitution mutant normalized median fitness to the log additive of the single substitution normalized median fitness. The simplified formula is as follows:

Deviation score (M_{1} M_{2} M_{3}) = Fitness (M_{1} M_{2} M_{3}) - [Fitness (M_{1}) + Fitness (M_{2}) + Fitness (M_{3})]

−1 < Deviation score < 1: the interaction among the constituent single mutants is additive and the mutants are acting independently.
Deviation score ≥ 1: the interaction is non-additive and positive, including suppression and epistatic interactions.
Deviation score ≤ −1: the interaction is non-additive and negative, including synthetic sick, synthetic lethal, and sign epistasis interactions.
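The deviation score and the three cases above can be expressed in a few lines of Python. This is a sketch for illustration, with hypothetical function names and toy fitness values; the lethal-threshold handling is not included here.

```python
def deviation_score(observed_fitness, single_fitnesses):
    """Observed multi-substitution fitness minus the sum of single-mutant fitnesses."""
    return observed_fitness - sum(single_fitnesses)


def classify_interaction(score, cutoff=1.0):
    """Map a deviation score onto the additive / positive / negative classes."""
    if score >= cutoff:
        return "positive"
    if score <= -cutoff:
        return "negative"
    return "additive"


# Toy values for illustration: two sick singles whose double grows better than expected.
score = deviation_score(-1.0, [-2.0, -1.5])
print(score, classify_interaction(score))
```

In this toy case the additive expectation is −3.5, so an observed fitness of −1.0 gives a deviation score of 2.5 and the pair is called a positive interaction.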

Any mutation with fitness smaller than the lethal threshold (−6.50) was classified as an ultra-sick/lethal mutant and its fitness was normalized to −6.50 for calculation of the deviation score. Synthetic sickness and synthetic lethality were distinguished by whether a double mutant is viable or lethal (viable meaning that fitness is greater than or equal to the lethal threshold of −6.5) when the two constituent mutations are viable. Synthetic lethality can be further classified into two types. First, additive synthetic lethality was assigned when the expected double mutant fitness calculated by the additive model was lethal (expected fitness = −6.5) and the observed double mutant fitness was also lethal (fitness = −6.5); in this case the deviation score = 0. Second, beyond-additive synthetic lethality was assigned when the expected double mutant was viable (expected fitness > −6.5) while the observed double mutant fitness was lethal (fitness = −6.5); in this case the deviation score < 0. To separate these two situations in our figures, we labeled additive synthetic lethality in black and beyond-additive synthetic lethality in purple.
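The distinction between the two types of synthetic lethality can be sketched as follows. The code is one reading of the description above, not the authors' implementation: fitness values at or below the threshold are treated as lethal (the text is ambiguous at exact equality), the additive expectation is floored at −6.5 in the same way as observed fitness, and the function names and toy values are hypothetical.

```python
LETHAL_THRESHOLD = -6.5


def floor_fitness(fitness):
    """Set ultra-sick/lethal fitness values to the lethal threshold."""
    return max(fitness, LETHAL_THRESHOLD)


def synthetic_lethality_type(observed, single_a, single_b):
    """Classify a double mutant built from two viable single mutants.

    Returns a (label, deviation_score) tuple, with the deviation score
    computed on floored observed and floored expected fitness.
    """
    obs = floor_fitness(observed)
    expected = floor_fitness(floor_fitness(single_a) + floor_fitness(single_b))
    deviation = obs - expected
    if obs > LETHAL_THRESHOLD:
        return "viable", deviation
    if expected == LETHAL_THRESHOLD:
        return "additive synthetic lethal", deviation
    return "beyond-additive synthetic lethal", deviation


# Toy values for illustration only.
case_additive = synthetic_lethality_type(-7.2, -4.0, -3.5)
case_beyond = synthetic_lethality_type(-8.0, -1.0, -2.0)
print(case_additive, case_beyond)
```

In the first toy case the two singles already sum to below the threshold, so lethality of the double is what the additive model predicts. In the second, the additive expectation is viable and the lethal double therefore carries a negative deviation score.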

Details of the formulas are in Supplemental Method 4. The code for calculating deviation scores and generating figures is available on GitHub (https://github.com/Kaplan-Lab-Pitt/TLs_Screening.git).

Mutant classification using two multiple logistic regression models.

We trained two multiple logistic regression models to distinguish GOF and LOF mutants using the phenotypic fitness on SC-Leu+MPA, SC-Lys, and YPRafGal conditions of 65 single mutants, including 25 previously identified GOF mutants, 33 LOF mutants, one WT, and six that were not GOF or LOF mutants. Intercept, main effects, and two-way interactions were involved in defining both models. 0.75 was used as the cutoff threshold for both the GOF and LOF models.

Model for predicting the probability of a mutant being a GOF:

y = \frac{1}{1 + e^{(1.816 + 2.542 \cdot f_{MPA} - 1.942 \cdot f_{Lys} + 0.06566 \cdot f_{Gal} - 0.5297 \cdot f_{MPA} \cdot f_{Lys} - 0.08373 \cdot f_{MPA} \cdot f_{Gal} + 0.02556 \cdot f_{Lys} \cdot f_{Gal})}}

Model for predicting the probability of a mutant being LOF:

y = \frac{1}{1 + e^{(1.916 - 1.392 \cdot f_{MPA} - 1.328 \cdot f_{Lys} - 0.8353 \cdot f_{Gal} - 0.01112 \cdot f_{MPA} \cdot f_{Lys} - 0.2992 \cdot f_{MPA} \cdot f_{Gal} + 0.8823 \cdot f_{Lys} \cdot f_{Gal})}}

Both models were accurate, with the area under the ROC curve close to one (Fig. S3A). The details are provided in Supplemental Table 5.
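The two models can be evaluated directly from the coefficients given above. The Python sketch below keeps the sign convention exactly as the equations are written, y = 1 / (1 + e^(linear term)). The function names, the handling of a mutant that passes both cutoffs, the use of "greater than or equal to" at the 0.75 cutoff, and the toy fitness inputs are assumptions for illustration.

```python
import math

# Coefficients copied from the two model equations.
GOF_COEF = {"intercept": 1.816, "mpa": 2.542, "lys": -1.942, "gal": 0.06566,
            "mpa_lys": -0.5297, "mpa_gal": -0.08373, "lys_gal": 0.02556}
LOF_COEF = {"intercept": 1.916, "mpa": -1.392, "lys": -1.328, "gal": -0.8353,
            "mpa_lys": -0.01112, "mpa_gal": -0.2992, "lys_gal": 0.8823}
CUTOFF = 0.75


def model_probability(coef, f_mpa, f_lys, f_gal):
    """Evaluate y = 1 / (1 + exp(linear term)) for one mutant."""
    z = (coef["intercept"]
         + coef["mpa"] * f_mpa + coef["lys"] * f_lys + coef["gal"] * f_gal
         + coef["mpa_lys"] * f_mpa * f_lys
         + coef["mpa_gal"] * f_mpa * f_gal
         + coef["lys_gal"] * f_lys * f_gal)
    return 1.0 / (1.0 + math.exp(z))


def classify(f_mpa, f_lys, f_gal):
    """Return (label, p_gof, p_lof) using the 0.75 cutoff for both models."""
    p_gof = model_probability(GOF_COEF, f_mpa, f_lys, f_gal)
    p_lof = model_probability(LOF_COEF, f_mpa, f_lys, f_gal)
    if p_gof >= CUTOFF and p_lof >= CUTOFF:
        label = "ambiguous"          # assumed handling; not described in the text
    elif p_gof >= CUTOFF:
        label = "GOF"
    elif p_lof >= CUTOFF:
        label = "LOF"
    else:
        label = "unclassified"
    return label, p_gof, p_lof


print(classify(0.0, 0.0, 0.0))       # WT-like fitness in all three conditions
print(classify(-3.0, 2.0, 0.0))      # toy MPA-sensitive, Lys-positive mutant
```

With fitness of zero in all three conditions, both probabilities fall well below the cutoff, so a WT-like mutant is left unclassified.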

t-SNE projection.

Allele frequencies for all mutants in nine conditions with three replicates were analyzed by t-SNE (perplexity = 50) and k-means (clusters = 20). Thirteen clusters in which ultra-sick to lethal mutants were the majority were eliminated. The remaining mutants were analyzed again with t-SNE (perplexity = 100) and k-means (clusters = 10). The scripts, written in the R language (https://www.R-project.org/) with the R packages Rtsne v0.15 (https://github.com/jkrijthe/Rtsne), ggplot2 v3.3.3 (https://ggplot2.tidyverse.org), and k-means (stats v4.0.3 (https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/kmeans)), are available through GitHub (https://github.com/Kaplan-Lab-Pitt/TLs_Screening.git).
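The cluster elimination step between the two rounds can be illustrated with a small Python sketch. The authors' scripts are in R; this is not a translation of them. It assumes that cluster assignments are already available, that "majority" means strictly more than half of a cluster's members, and that a mutant counts as ultra-sick/lethal when its growth fitness is at or below −6.5. All names and values are hypothetical.

```python
from collections import defaultdict


def drop_lethal_majority_clusters(cluster_of, fitness_of, lethal_threshold=-6.5):
    """Remove every cluster in which more than half of the mutants are lethal.

    cluster_of: dict mapping mutant name -> cluster id (e.g. from k-means).
    fitness_of: dict mapping mutant name -> growth fitness.
    Returns a dict of the retained mutants and their cluster ids.
    """
    members = defaultdict(list)
    for mutant, cluster in cluster_of.items():
        members[cluster].append(mutant)

    kept = {}
    for cluster, mutants in members.items():
        n_lethal = sum(1 for m in mutants if fitness_of[m] <= lethal_threshold)
        if 2 * n_lethal <= len(mutants):       # lethal mutants are not a strict majority
            for m in mutants:
                kept[m] = cluster
    return kept


# Toy assignments for illustration only.
toy_clusters = {"m1": 0, "m2": 0, "m3": 0, "m4": 1, "m5": 1}
toy_fitness = {"m1": -6.5, "m2": -6.5, "m3": -1.0, "m4": -0.5, "m5": -6.5}
kept = drop_lethal_majority_clusters(toy_clusters, toy_fitness)
print(kept)
```

The retained mutants would then be the input to the second round of projection and clustering.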

Statistical coupling analysis.

A published multiple sequence alignment (MSA) containing 5787 eukaryotic homologous sequences of yeast Rpb1 was used in the statistical coupling analysis[90]. 1464 sequences were retained after reducing sequence identity to 90% with the T-coffee package v12.00.7fb08c2[104] through conda v4.6.14. Pol I, II, and III sequences were separated based on an ML tree constructed with FastTree 2[105], and 410 Pol II Rpb1 homologous sequences were re-aligned with T-coffee. The newly generated MSA was used for statistical coupling analysis with the python-based package pySCA v6.1[60]. The scripts were adapted from https://github.com/ranganathanlab/pySCA and are available via GitHub (https://github.com/Kaplan-Lab-Pitt/TLs_Screening.git).

Supplementary Material

Supplement 1

Supplemental Method 1. High efficiency large scale chemical yeast transformation protocol.

Supplemental Method 2. Emulsion PCR set up with EURx Micellula DNA Emulsion & Purification (ePCR) PCR kit.

Supplemental Method 3. Amplification/transformation/screening of mutant libraries and sequencing pool preparation.

Supplemental Method 4. Formulas of calculating functional interactions.

Supplement 2

Supplemental Table 1. Strains and plasmids.

Supplement 3

Supplemental Table 2. Primers.

Supplement 4

Supplemental Table 3. Phenotyping details.

Supplement 5

Supplemental Table 4. Library summary.

Supplement 6

Supplemental Table 5. MLR_models_summary.

Supplement 7

NIHPP2023.02.27.530048v1-supplement-7.pdf (6.4 MB, pdf)

ACKNOWLEDGMENTS

We thank Dr. Anne-Ruxandra Carvunis (U. Pittsburgh) and Dr. Steven Lockless (Texas A&M) for discussions and advice. We thank Zhizhen Wang and Muyao Lin from the Pitt Statistical Consulting Center for their advice on checking the reproducibility of our data. We acknowledge funding from NIH R01GM097260 for initiation of this project and NIH R35GM144116 for this work. This research was supported in part by the University of Pittsburgh Center for Research Computing, RRID:SCR_022735, through the resources provided. Specifically, this work used the HTC cluster, which is supported by NIH award number S10OD028483.

REFERENCES

1.Cramer P., Multisubunit RNA polymerases. Curr Opin Struct Biol, 2002. 12(1): p. 89–97. [DOI] [PubMed] [Google Scholar]
2.Werner F. and Grohmann D., Evolution of multisubunit RNA polymerases in the three domains of life. Nat Rev Microbiol, 2011. 9(2): p. 85–98.
3.Allison L.A., et al., Extensive homology among the largest subunits of eukaryotic and prokaryotic RNA polymerases. Cell, 1985. 42(2): p. 599–610.
4.Zhang G., et al., Crystal Structure of Thermus aquaticus Core RNA Polymerase at 3.3 Å Resolution. Cell, 1999. 98(6): p. 811–824.
5.Vassylyev D.G., et al., Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature, 2002. 417(6890): p. 712–9.
6.Hirata A., Klein B.J., and Murakami K.S., The X-ray crystal structure of RNA polymerase from Archaea. Nature, 2008. 451(7180): p. 851–4.
7.Gnatt A.L., et al., Structural basis of transcription: an RNA polymerase II elongation complex at 3.3 Å resolution. Science, 2001. 292(5523): p. 1876–82.
8.Cramer P., Bushnell D.A., and Kornberg R.D., Structural basis of transcription: RNA polymerase II at 2.8 angstrom resolution. Science, 2001. 292(5523): p. 1863–76.
9.Fernandez-Tornero C., et al., Crystal structure of the 14-subunit RNA polymerase I. Nature, 2013. 502(7473): p. 644–9.
10.Hoffmann N.A., et al., Molecular structures of unbound and transcribing RNA polymerase III. Nature, 2015. 528(7581): p. 231–6.
11.Malinen A.M., et al., Active site opening and closure control translocation of multisubunit RNA polymerase. Nucleic Acids Res, 2012. 40(15): p. 7442–51.
12.Wang D., et al., Structural basis of transcription: role of the trigger loop in substrate specificity and catalysis. Cell, 2006. 127(5): p. 941–54.
13.Kaplan C.D., Basic mechanisms of RNA polymerase II activity and alteration of gene expression in Saccharomyces cerevisiae. Biochim Biophys Acta, 2013. 1829(1): p. 39–54.
14.Dangkulwanich M., et al., Complete dissection of transcription elongation reveals slow translocation of RNA polymerase II in a linear ratchet mechanism. Elife, 2013. 2: p. e00971.
15.Bar-Nahum G., et al., A ratchet mechanism of transcription elongation and its control. Cell, 2005. 120(2): p. 183–93.
16.Weinzierl R.O., The nucleotide addition cycle of RNA polymerase is controlled by two molecular hinges in the Bridge Helix domain. BMC Biol, 2010. 8: p. 134.
17.Qiu C., et al., High-Resolution Phenotypic Landscape of the RNA Polymerase II Trigger Loop. PLoS Genet, 2016. 12(11): p. e1006321.
18.Da L.T., et al., Bridge helix bending promotes RNA polymerase II backtracking through a critical and conserved threonine residue. Nat Commun, 2016. 7: p. 11244.
19.Mazumder A., et al., Closing and opening of the RNA polymerase trigger loop. Proc Natl Acad Sci U S A, 2020. 117(27): p. 15642–15649.
20.Liu X., Bushnell D.A., and Kornberg R.D., RNA polymerase II transcription: structure and mechanism. Biochim Biophys Acta, 2013. 1829(1): p. 2–8.
21.Kaplan C.D. and Kornberg R.D., A bridge to transcription by RNA polymerase. J Biol, 2008. 7(10): p. 39.
22.Tan L., et al., Bridge helix and trigger loop perturbations generate superactive RNA polymerases. J Biol, 2008. 7(10): p. 40.
23.Silva D.A., et al., Millisecond dynamics of RNA polymerase II translocation at atomic resolution. Proc Natl Acad Sci U S A, 2014. 111(21): p. 7665–70.
24.Wang B., et al., Energetic and structural details of the trigger-loop closing transition in RNA polymerase II. Biophys J, 2013. 105(3): p. 767–75.
25.Larson M.H., et al., Trigger loop dynamics mediate the balance between the transcriptional fidelity and speed of RNA polymerase II. Proc Natl Acad Sci U S A, 2012. 109(17): p. 6555–60.
26.Fouqueau T., et al., The RNA polymerase trigger loop functions in all three phases of the transcription cycle. Nucleic Acids Res, 2013. 41(14): p. 7048–59.
27.Kireeva M.L., et al., Transient reversal of RNA polymerase II active site closing controls fidelity of transcription elongation. Mol Cell, 2008. 30(5): p. 557–66.
28.Barnes C.O., et al., Crystal Structure of a Transcribing RNA Polymerase II Complex Reveals a Complete Transcription Bubble. Mol Cell, 2015. 59(2): p. 258–69.
29.Fong N., et al., Pre-mRNA splicing is facilitated by an optimal RNA polymerase II elongation rate. Genes Dev, 2014. 28(23): p. 2663–76.
30.Xu L., et al., Dissecting the chemical interactions and substrate structural signatures governing RNA polymerase II trigger loop closure by synthetic nucleic acid analogues. Nucleic Acids Res, 2014. 42(9): p. 5863–70.
31.Vassylyev D.G., et al., Structural basis for substrate loading in bacterial RNA polymerase. Nature, 2007. 448(7150): p. 163–8.
32.Da L.T., Wang D., and Huang X., Dynamics of pyrophosphate ion release and its coupled trigger loop motion from closed to open state in RNA polymerase II. J Am Chem Soc, 2012. 134(4): p. 2399–406.
33.Liu B., Zuo Y., and Steitz T.A., Structures of E. coli sigmaS-transcription initiation complexes provide new insights into polymerase mechanism. Proc Natl Acad Sci U S A, 2016. 113(15): p. 4051–6.
34.Seibold S.A., et al., Conformational coupling, bridge helix dynamics and active site dehydration in catalysis by RNA polymerase. Biochim Biophys Acta, 2010. 1799(8): p. 575–87.
35.Zhang J., Palangat M., and Landick R., Role of the RNA polymerase trigger loop in catalysis and pausing. Nat Struct Mol Biol, 2010. 17(1): p. 99–104.
36.Wang D., et al., Structural basis of transcription: backtracked RNA polymerase II at 3.4 angstrom resolution. Science, 2009. 324(5931): p. 1203–6.
37.Cheung A.C. and Cramer P., Structural basis of RNA polymerase II backtracking, arrest and reactivation. Nature, 2011. 471(7337): p. 249–53.
38.Mosaei H. and Zenkin N., Two distinct pathways of RNA polymerase backtracking determine the requirement for the Trigger Loop during RNA hydrolysis. Nucleic Acids Res, 2021. 49(15): p. 8777–8784.
39.Nayak D., et al., Cys-pair reporters detect a constrained trigger loop in a paused RNA polymerase. Mol Cell, 2013. 50(6): p. 882–93.
40.Kettenberger H., Armache K.J., and Cramer P., Complete RNA polymerase II elongation complex structure and its interactions with NTP and TFIIS. Mol Cell, 2004. 16(6): p. 955–65.
41.Lennon C.W., et al., Direct interactions between the coiled-coil tip of DksA and the trigger loop of RNA polymerase mediate transcriptional regulation. Genes Dev, 2012. 26(23): p. 2634–46.
42.Sekine S., et al., The ratcheted and ratchetable structural states of RNA polymerase underlie multiple transcriptional functions. Mol Cell, 2015. 57(3): p. 408–21.
43.Hein P.P., et al., RNA polymerase pausing and nascent-RNA structure formation are linked through clamp-domain movement. Nat Struct Mol Biol, 2014. 21(9): p. 794–802.
44.Malagon F., et al., Mutations in the Saccharomyces cerevisiae RPB1 gene conferring hypersensitivity to 6-azauracil. Genetics, 2006. 172(4): p. 2201–9.
45.Kaplan C.D., Larsson K.M., and Kornberg R.D., The RNA polymerase II trigger loop functions in substrate selection and is directly targeted by alpha-amanitin. Mol Cell, 2008. 30(5): p. 547–56.
46.Kaplan C.D., et al., Dissection of Pol II trigger loop function and Pol II activity-dependent control of start site selection in vivo. PLoS Genet, 2012. 8(4): p. e1002627.
47.Larson M.H., et al., Trigger loop dynamics mediate the balance between the transcriptional fidelity and speed of RNA polymerase II. Proc Natl Acad Sci U S A, 2012. 109(17): p. 6555–60.
48.Kireeva M.L., et al., Molecular dynamics and mutational analysis of the catalytic and translocation cycle of RNA polymerase. BMC Biophys, 2012. 5: p. 11.
49.Sydow J.F., et al., Structural basis of transcription: mismatch-specific fidelity mechanisms and paused RNA polymerase II with frayed RNA. Mol Cell, 2009. 34(6): p. 710–21.
50.Schier A.C. and Taatjes D.J., Structure and mechanism of the RNA polymerase II transcription machinery. Genes Dev, 2020. 34(7–8): p. 465–488.
51.Leng X.Y., et al., Organismal benefits of transcription speed control at gene boundaries. Embo Reports, 2020. 21(4).
52.Braberg H., et al., From structure to systems: high-resolution, quantitative genetic analysis of RNA polymerase II. Cell, 2013. 154(4): p. 775–88.
53.Kaster B.C., et al., RNA Polymerase II Trigger Loop Mobility: INDIRECT EFFECTS OF Rpb9. J Biol Chem, 2016. 291(28): p. 14883–95.
54.Windgassen T.A., et al., Trigger-helix folding pathway and SI3 mediate catalysis and hairpin-stabilized pausing by Escherichia coli RNA polymerase. Nucleic Acids Res, 2014. 42(20): p. 12707–21.
55.Tesileanu T., Colwell L.J., and Leibler S., Protein sectors: statistical coupling analysis versus conservation. PLoS Comput Biol, 2015. 11(2): p. e1004091.
56.Phillips P.C., The language of gene interaction. Genetics, 1998. 149(3): p. 1167–71.
57.Breen M.S., et al., Epistasis as the primary factor in molecular evolution. Nature, 2012. 490(7421): p. 535–8.
58.Starr T.N. and Thornton J.W., Epistasis in protein evolution. Protein Sci, 2016. 25(7): p. 1204–18.
59.Lockless S.W. and Ranganathan R., Evolutionarily conserved pathways of energetic connectivity in protein families. Science, 1999. 286(5438): p. 295–9.
60.Rivoire O., Reynolds K.A., and Ranganathan R., Evolution-Based Functional Decomposition of Proteins. PLoS Comput Biol, 2016. 12(6): p. e1004817.
61.Socolich M., et al., Evolutionary information for specifying a protein fold. Nature, 2005. 437(7058): p. 512–8.
62.Russ W.P., et al., Natural-like function in artificial WW domains. Nature, 2005. 437(7058): p. 579–83.
63.Halabi N., et al., Protein sectors: evolutionary units of three-dimensional structure. Cell, 2009. 138(4): p. 774–86.
64.Araya C.L., et al., A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A, 2012. 109(42): p. 16858–63.
65.Salinas V.H. and Ranganathan R., Coevolution-based inference of amino acid interactions underlying protein function. Elife, 2018. 7.
66.Ortlund E.A., et al., Crystal structure of an ancient protein: evolution by conformational epistasis. Science, 2007. 317(5844): p. 1544–8.
67.Karageorgi M., et al., Genome editing retraces the evolution of toxin resistance in the monarch butterfly. Nature, 2019. 574(7778): p. 409–412.
68.Phillips P.C., Epistasis--the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet, 2008. 9(11): p. 855–67.
69.Faure A.J., et al., Mapping the energetic and allosteric landscapes of protein binding domains. Nature, 2022. 604(7904): p. 175–183.
70.Park Y., Metzger B.P.H., and Thornton J.W., Epistatic drift causes gradual decay of predictability in protein evolution. Science, 2022. 376(6595): p. 823–830.
71.Ding D., et al., Co-evolution of interacting proteins through non-contacting and non-specific mutations. Nat Ecol Evol, 2022. 6(5): p. 590–603.
72.Kondrashov A.S., Sunyaev S., and Kondrashov F.A., Dobzhansky-Muller incompatibilities in protein evolution. Proc Natl Acad Sci U S A, 2002. 99(23): p. 14878–83.
73.Lunzer M., Golding G.B., and Dean A.M., Pervasive cryptic epistasis in molecular evolution. PLoS Genet, 2010. 6(10): p. e1001162.
74.Natarajan C., et al., Epistasis among adaptive mutations in deer mouse hemoglobin. Science, 2013. 340(6138): p. 1324–7.
75.Doud M.B., Ashenberg O., and Bloom J.D., Site-Specific Amino Acid Preferences Are Mostly Conserved in Two Closely Related Protein Homologs. Mol Biol Evol, 2015. 32(11): p. 2944–60.
76.Starr T.N., et al., Shifting mutational constraints in the SARS-CoV-2 receptor-binding domain during viral evolution. Science, 2022. 377(6604): p. 420–424.
77.Haddox H.K., et al., Mapping mutational effects along the evolutionary landscape of HIV envelope. Elife, 2018. 7.
78.Viktorovskaya O.V., et al., Divergent contributions of conserved active site residues to transcription by eukaryotic RNA polymerases I and II. Cell Rep, 2013. 4(5): p. 974–84.
79.Mani R., et al., Defining genetic interaction. Proc Natl Acad Sci U S A, 2008. 105(9): p. 3461–6.
80.Qiu C. and Kaplan C.D., Functional assays for transcription mechanisms in high-throughput. Methods, 2019. 159–160: p. 115–123.
81.Lin X., et al., Nested epistasis enhancer networks for robust genome regulation. Science, 2022. 377(6610): p. 1077–1085.
82.Fowler D.M. and Fields S., Deep mutational scanning: a new style of protein science. Nat Methods, 2014. 11(8): p. 801–7.
83.Ioffe S. and Szegedy C., Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167v3, 2015.
84.Hill W.G., Goddard M.E., and Visscher P.M., Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet, 2008. 4(2): p. e1000008.
85.Carvalho A.T., Fernandes P.A., and Ramos M.J., The Catalytic Mechanism of RNA Polymerase II. J Chem Theory Comput, 2011. 7(4): p. 1177–88.
86.Huang X., et al., RNA polymerase II trigger loop residues stabilize and position the incoming nucleotide triphosphate in transcription. Proc Natl Acad Sci U S A, 2010. 107(36): p. 15745–50.
87.Castro C., et al., Nucleic acid polymerases use a general acid for nucleotidyl transfer. Nat Struct Mol Biol, 2009. 16(2): p. 212–8.
88.Unarta I.C., et al., Nucleotide addition and cleavage by RNA polymerase II: Coordination of two catalytic reactions using a single active site. J Biol Chem, 2023. 299(2): p. 102844.
89.Mishanina T.V., et al., Trigger loop of RNA polymerase is a positional, not acid-base, catalyst for both transcription and proofreading. Proc Natl Acad Sci U S A, 2017. 114(26): p. E5103–E5112.
90.Palo M.Z., et al., Conserved Trigger Loop Histidine of RNA Polymerase II Functions as a Positional Catalyst Primarily through Steric Effects. Biochemistry, 2021. 60(44): p. 3323–3336.
91.Melamed D., et al., Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly(A)-binding protein. RNA, 2013. 19(11): p. 1537–51.
92.Harms M.J. and Thornton J.W., Analyzing protein structure and function using ancestral gene reconstruction. Curr Opin Struct Biol, 2010. 20(3): p. 360–6.
93.Fowler D.M., et al., High-resolution mapping of protein sequence-function relationships. Nat Methods, 2010. 7(9): p. 741–6.
94.Scull C.E., et al., A Novel Assay for RNA Polymerase I Transcription Elongation Sheds Light on the Evolutionary Divergence of Eukaryotic RNA Polymerases. Biochemistry, 2019. 58(16): p. 2116–2124.
95.Toulokhonov I., et al., A central role of the RNA polymerase trigger loop in active-site rearrangement during transcriptional pausing. Mol Cell, 2007. 27(3): p. 406–19.
96.Yuzenkova Y., et al., Stepwise mechanism for transcription fidelity. BMC Biol, 2010. 8: p. 54.
97.Wang W., et al., Structural basis of transcriptional stalling and bypass of abasic DNA lesion by RNA polymerase II. Proc Natl Acad Sci U S A, 2018. 115(11): p. E2538–E2545.
98.Cramer P., Organization and regulation of gene transcription. Nature, 2019. 573(7772): p. 45–54.
99.Schier A.C. and Taatjes D.J., Structure and mechanism of the RNA polymerase II transcription machinery. Genes Dev, 2020. 34(7–8): p. 465–488.
100.Vos S.M., et al., Structure of complete Pol II-DSIF-PAF-SPT6 transcription complex reveals RTF1 allosteric activation. Nat Struct Mol Biol, 2020. 27(7): p. 668–677.
101.Castro C., et al., Two proton transfers in the transition state for nucleotidyl transfer catalyzed by RNA- and DNA-dependent RNA and DNA polymerases. Proc Natl Acad Sci U S A, 2007. 104(11): p. 4267–72.
102.Gregory M.T., et al., Multiple deprotonation paths of the nucleophile 3’-OH in the DNA synthesis reaction. Proc Natl Acad Sci U S A, 2021. 118(23).
103.Sze S.-H. and Kaplan C.D., Codon-Based Sequence Alignment for Mutation Analysis by High-Throughput Sequencing. 2018 IEEE 8th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), 2018.
104.Notredame C., Higgins D.G., and Heringa J., T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol, 2000. 302(1): p. 205–17.
105.Price M.N., Dehal P.S., and Arkin A.P., FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One, 2010. 5(3): p. e9490.

Associated Data


Supplementary Materials

Supplement 1

Supplemental Method 1. High-efficiency, large-scale chemical yeast transformation protocol.

Supplemental Method 2. Emulsion PCR setup with the EURx Micellula DNA Emulsion & Purification (ePCR) PCR kit.

Supplemental Method 3. Amplification/transformation/screening of mutant libraries and sequencing pool preparation.

Supplemental Method 4. Formulas for calculating functional interactions.

media-1.pdf (636.5 KB, pdf)

Supplement 2

Supplemental Table 1. Strains and plasmids.

media-2.xlsx (19.8 KB, xlsx)

Supplement 3

Supplemental Table 2. Primers.

media-3.xlsx (28.7 KB, xlsx)

Supplement 4

Supplemental Table 3. Phenotyping details.

media-4.xlsx (9 KB, xlsx)

Supplement 5

Supplemental Table 4. Library summary.

media-5.xlsx (9.5 KB, xlsx)

Supplement 6

Supplemental Table 5. MLR_models_summary.

media-6.xlsx (12.4 KB, xlsx)

Supplement 7

NIHPP2023.02.27.530048v1-supplement-7.pdf (6.4 MB, pdf)


[R71] 71.Ding D., et al. , Co-evolution of interacting proteins through non-contacting and non-specific mutations. Nat Ecol Evol, 2022. 6(5): p. 590–603. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R72] 72.Kondrashov A.S., Sunyaev S., and Kondrashov F.A., Dobzhansky-Muller incompatibilities in protein evolution. Proc Natl Acad Sci U S A, 2002. 99(23): p. 14878–83. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R73] 73.Lunzer M., Golding G.B., and Dean A.M., Pervasive cryptic epistasis in molecular evolution. PLoS Genet, 2010. 6(10): p. e1001162. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R74] 74.Natarajan C., et al. , Epistasis among adaptive mutations in deer mouse hemoglobin. Science, 2013. 340(6138): p. 1324–7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R75] 75.Doud M.B., Ashenberg O., and Bloom J.D., Site-Specific Amino Acid Preferences Are Mostly Conserved in Two Closely Related Protein Homologs. Mol Biol Evol, 2015. 32(11): p. 2944–60. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R76] 76.Starr T.N., et al. , Shifting mutational constraints in the SARS-CoV-2 receptor-binding domain during viral evolution. Science, 2022. 377(6604): p. 420–424. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R77] 77.Haddox H.K., et al. , Mapping mutational effects along the evolutionary landscape of HIV envelope. Elife, 2018. 7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R78] 78.Viktorovskaya O.V., et al. , Divergent contributions of conserved active site residues to transcription by eukaryotic RNA polymerases I and II. Cell Rep, 2013. 4(5): p. 974–84. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R79] 79.Mani R., et al. , Defining genetic interaction. Proc Natl Acad Sci U S A, 2008. 105(9): p. 3461–6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R80] 80.Qiu C. and Kaplan C.D., Functional assays for transcription mechanisms in high-throughput. Methods, 2019. 159–160: p. 115–123. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R81] 81.Lin X., et al. , Nested epistasis enhancer networks for robust genome regulation. Science, 2022. 377(6610): p. 1077–1085. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R82] 82.Fowler D.M. and Fields S., Deep mutational scanning: a new style of protein science. Nat Methods, 2014. 11(8): p. 801–7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R83] 83.Ioffe S. and Szegedy C., Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167v3, 2015. [Google Scholar]

[R84] 84.Hill W.G., Goddard M.E., and Visscher P.M., Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet, 2008. 4(2): p. e1000008. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R85] 85.Carvalho A.T., Fernandes P.A., and Ramos M.J., The Catalytic Mechanism of RNA Polymerase II. J Chem Theory Comput, 2011. 7(4): p. 1177–88. [DOI] [PubMed] [Google Scholar]

[R86] 86.Huang X., et al. , RNA polymerase II trigger loop residues stabilize and position the incoming nucleotide triphosphate in transcription. Proc Natl Acad Sci U S A, 2010. 107(36): p. 15745–50. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R87] 87.Castro C., et al. , Nucleic acid polymerases use a general acid for nucleotidyl transfer. Nat Struct Mol Biol, 2009. 16(2): p. 212–8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R88] 88.Unarta I.C., et al. , Nucleotide addition and cleavage by RNA polymerase II: Coordination of two catalytic reactions using a single active site. J Biol Chem, 2023. 299(2): p. 102844. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R89] 89.Mishanina T.V., et al. , Trigger loop of RNA polymerase is a positional, not acid-base, catalyst for both transcription and proofreading. Proc Natl Acad Sci U S A, 2017. 114(26): p. E5103–E5112. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R90] 90.Palo M.Z., et al. , Conserved Trigger Loop Histidine of RNA Polymerase II Functions as a Positional Catalyst Primarily through Steric Effects. Biochemistry, 2021. 60(44): p. 3323–3336. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R91] 91.Melamed D., et al. , Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly(A)-binding protein. RNA, 2013. 19(11): p. 1537–51. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R92] 92.Harms M.J. and Thornton J.W., Analyzing protein structure and function using ancestral gene reconstruction. Curr Opin Struct Biol, 2010. 20(3): p. 360–6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R93] 93.Fowler D.M., et al. , High-resolution mapping of protein sequence-function relationships. Nat Methods, 2010. 7(9): p. 741–6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R94] 94.Scull C.E., et al. , A Novel Assay for RNA Polymerase I Transcription Elongation Sheds Light on the Evolutionary Divergence of Eukaryotic RNA Polymerases. Biochemistry, 2019. 58(16): p. 2116–2124. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R95] 95.Toulokhonov I., et al. , A central role of the RNA polymerase trigger loop in active-site rearrangement during transcriptional pausing. Mol Cell, 2007. 27(3): p. 406–19. [DOI] [PubMed] [Google Scholar]

[R96] 96.Yuzenkova Y., et al. , Stepwise mechanism for transcription fidelity. BMC Biol, 2010. 8: p. 54. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R97] 97.Wang W., et al. , Structural basis of transcriptional stalling and bypass of abasic DNA lesion by RNA polymerase II. Proc Natl Acad Sci U S A, 2018. 115(11): p. E2538–E2545. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R98] 98.Cramer P., Organization and regulation of gene transcription. Nature, 2019. 573(7772): p. 45–54. [DOI] [PubMed] [Google Scholar]

[R99] 99.Schier A.C. and Taatjes D.J., Structure and mechanism of the RNA polymerase II transcription machinery. Genes Dev, 2020. 34(7–8): p. 465–488. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R100] 100.Vos S.M., et al. , Structure of complete Pol II-DSIF-PAF-SPT6 transcription complex reveals RTF1 allosteric activation. Nat Struct Mol Biol, 2020. 27(7): p. 668–677. [DOI] [PubMed] [Google Scholar]

[R101] 101.Castro C., et al. , Two proton transfers in the transition state for nucleotidyl transfer catalyzed by RNA- and DNA-dependent RNA and DNA polymerases. Proc Natl Acad Sci U S A, 2007. 104(11): p. 4267–72. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R102] 102.Gregory M.T., et al. , Multiple deprotonation paths of the nucleophile 3’-OH in the DNA synthesis reaction. Proc Natl Acad Sci U S A, 2021. 118(23). [DOI] [PMC free article] [PubMed] [Google Scholar]

[R103] 103.Sze S.H. and Kaplan C.D., Codon-Based Sequence Alignment for Mutation Analysis by High-Throughput Sequencing. 2018 IEEE 8th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), 2018. [Google Scholar]

[R104] 104.Notredame C., Higgins D.G., and Heringa J., T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol, 2000. 302(1): p. 205–17. [DOI] [PubMed] [Google Scholar]

[R105] 105.Price M.N., Dehal P.S., and Arkin A.P., FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One, 2010. 5(3): p. e9490. [DOI] [PMC free article] [PubMed] [Google Scholar]
