Significance
Ribosomal RNA is among the most heavily modified RNAs, and its chemical modifications are known to have profound impacts on ribosome function. Recent investigations into archaeal epitranscriptomes have demonstrated that ribosomes from Thermococcus kodakarensis, a heat-loving archaeon, are hypermodified with 4-acetylcytidine and 5-methylcytidine, and that the epitranscriptome supports growth at extreme temperatures. In this study, we describe an addition to the epitranscriptome that extends beyond the Archaea, the modified nucleoside N4,N4-dimethylcytidine, and define the enzyme responsible for its synthesis. Our findings advance the understanding of how unique epitranscriptomic marks are critical for ribosome function under changing environmental conditions.
Keywords: N4,N4-dimethylcytidine; RNA modifications; epitranscriptome; archaea; hyperthermophile
Abstract
Ribosome structure and activity are challenged at high temperatures, often demanding modifications to ribosomal RNAs (rRNAs) to retain translation fidelity. LC-MS/MS, bisulfite-sequencing, and high-resolution cryo-EM structures of the archaeal ribosome identified an RNA modification, N4,N4-dimethylcytidine (m42C), at the universally conserved C918 in the 16S rRNA helix 31 loop. Here, we characterize and structurally resolve a class of RNA methyltransferase that generates m42C and whose function is critical for hyperthermophilic growth. m42C is synthesized by a unique family of RNA methyltransferases that contain a Rossmann fold and target only intact ribosomes. The phylogenetic distribution of the newly identified m42C synthase family implies that m42C is biologically relevant in each domain. Resistance of m42C to bisulfite-driven deamination suggests that efforts to capture m5C profiles via bisulfite sequencing are also capturing m42C.
Translation errors exponentially increase at elevated temperatures (1, 2). Strategies employed by hyperthermophiles to maintain translation accuracy are likely multifaceted and include chemical modifications to RNA (3–8). rRNA is decorated with modifications concentrated in structurally conserved regions that are known to impact the function and molecular interactions of the ribosome (9, 10). Dynamic and unique RNA modification profiles in hyperthermophilic microbes promote thermophily and are often necessary for normal ribosomal biogenesis and function (4–7, 11, 12). The biosynthetic pathways employed and the biological function(s) of site-specific RNA modifications are still emerging as new technologies are only now enabling the mapping of substoichiometric modifications with single nucleotide precision. The unprecedented levels of select RNA modifications in hyperthermophilic archaea (7, 11, 13), including temperature-dependent dynamic modifications to ribosomes (7), established that the generation and maintenance of the epitranscriptome is essential to hyperthermophilic growth (4–6, 12).
The ribosomal decoding center and P-site are epicenters for modified residues that contribute to ribosome stability (14). Helix 31 (H31) is a stem-loop structure in the small subunit rRNA and directly contacts the P-site tRNA, translation initiation factors, and ribosome-associated proteins. Structural and biochemical studies have suggested the importance of H31 nucleotides in cellular viability and translational fidelity (15). The H31 loop often harbors unique modifications that impact translation dynamics. Essential nucleotides in the H31 stem-loop extend toward the P-site (15); despite the conserved location of H31, there is substantial diversity in the modification profile of this region across model species (16). In eukaryotes, a highly modified uridine analog, such as 1-methyl-3-α-amino-α-carboxyl-propylated pseudouridine (m1acp3Ψ) in humans, is present at the apex of the H31 loop, whereas Bacteria bear a conserved 2-methylguanosine (m2G) and 5-methylcytidine (m5C) in the H31 loop (17). Ribosomes from the halophilic model archaeon Haloferax volcanii contain an acp3Ψ in the H31 loop (16), but little else is known about the modification status of H31 across the Archaea.
In the present work, we establish a unique modification signature in H31 in the hyperthermophilic archaeal species Thermococcus kodakarensis. We demonstrate that the H31 loop harbors the adjacent modified nucleosides 1-methylpseudouridine (m1Ψ) and N4,N4-dimethylcytidine (m42C), and that 2’-O-methylguanosine (Gm) sits in the H31 stem. We mapped a single m42C residue to the 16S rRNA H31 loop with single-nucleotide resolution, based on the observation that m42C is largely resistant to chemical deamination by sodium bisulfite. We demonstrate the in vivo and in vitro activities and determine the atomic structure of m42C synthase, the bona fide writer enzyme that generates m42C. Strains lacking m42C exhibit a growth defect with increased severity at higher temperatures, but not at moderate temperatures, suggesting that the single m42C residue stabilizes the ribosome for thermophily and likely impacts translation outcomes under thermal insults.
Results
m42C sits in the 16S rRNA Helix 31 loop.
Efforts to establish the m5C-epitranscriptome of T. kodakarensis via bisulfite-sequencing (BS-seq) suggested deamination resistance normally associated with m5C in the H31 loop (Fig. 1 A, Upper panel). The complex modification profiles of H31 across model species implied that additional modifications may also be present in T. kodakarensis. To establish the modification status of 16S rRNA H31 in T. kodakarensis, we excised the H31-containing fragment from total RNA by tiling DNA oligos with complementarity to the 16S rRNA sequence adjacent to the H31 fragment and digesting the RNA with RNase H (Fig. 1B). The H31-containing fragment was then purified using a complementary, biotinylated DNA oligo (Fig. 1 B and C) and analyzed by LC-MS/MS. As a negative control, an unmodified in vitro transcribed (IVT) RNA with an identical H31 base sequence was analyzed in parallel. The mass of the isolated H31 fragment from T. kodakarensis was consistent with the addition of 4 methyl groups relative to the IVT control RNA (SI Appendix, Fig. S1A). To identify the putative methylation events, we performed a nucleoside analysis of the H31-fragment by gas-phase neutral loss fragmentation and comparison of fragmentation profiles and retention times to authentic nucleoside standards (SI Appendix, Fig. S1B and Table S1). We detected four modifications in the H31 fragment: m1Ψ, Gm, Ψ and unexpectedly, a modified cytidine nucleoside consistent with a dimethylation of the exocyclic amine at the N4 position, N4,N4-dimethylcytidine (m42C; Fig. 1E and SI Appendix, Fig. S1B and Table S1). We reasoned m42C was likely the bisulfite-resistant cytidine detected by BS-seq within the H31 fragment.
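The methyl-group accounting behind the intact-fragment mass can be sketched as follows. This is an illustrative calculation, not the analysis pipeline used in the study: it assumes standard monoisotopic masses for carbon and hydrogen, and the per-modification methyl counts are our own bookkeeping from the nucleoside names (Ψ is an isomer of uridine and adds no mass).

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MASS_C = 12.000000
MASS_H = 1.00782503

# Methylation replaces one hydrogen with CH3, a net gain of CH2.
METHYL_SHIFT = MASS_C + 2 * MASS_H  # about 14.01565 Da

# Methyl groups contributed by each modification detected in the H31 fragment.
# Pseudouridylation is an isomerization, so it contributes no methyl group.
METHYLS_PER_MODIFICATION = {"m1Ψ": 1, "Gm": 1, "Ψ": 0, "m42C": 2}


def expected_mass_shift(n_methyl: int) -> float:
    """Expected mass gain (Da) relative to an unmodified transcript."""
    return n_methyl * METHYL_SHIFT


total_methyls = sum(METHYLS_PER_MODIFICATION.values())
total_shift = expected_mass_shift(total_methyls)
print(f"{total_methyls} methyl groups -> +{total_shift:.4f} Da")
```

Under these assumptions, the four detected modifications together account for the four methyl groups inferred from the comparison with the IVT control.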
Fig. 1.

Bisulfite-sequencing reveals a single m42C residue in the 16S rRNA helix 31 loop and the enzyme responsible for its installation. (A) Bisulfite-sequenced RNA collected from TS559 or ΔTK2045 and visualized using Integrative Genomics Viewer where all cytidines were replaced with thymines in the reference genome. Sites of modification are indicated where a cytidine is retained (red) within a sequenced read. Coordinate genome positions in wild-type strain KOD1 and 16S rRNA positions are listed. (B) Purification schematic to isolate the H31-containing fragment from ribosomal or total RNA. (C) The supernatant (wash) and streptavidin capture fractions were analyzed on a polyacrylamide gel. Only when the biotinylated DNA oligo was included was the H31-containing fragment purified (arrow). (D) LC-MS/MS analysis of H31 fragments either derived from the cell or synthesized by in vitro transcription (IVT). (E) The nucleoside structure of m42C. (F) Sequence and modification profile of the H31 region. The colors of subfragments are consistent with those illustrated in panel D. The H31 sequence is underlined. (G) Cryo-EM structure of H31 nucleotides in the T. kodakarensis 70S ribosome. (H) Genomic deletion of gene TK2045 was confirmed using whole genome sequencing (DNA-seq) and RNA-bisulfite sequencing (RNA-seq). Black lines that connect gray bars represent single reads with a gapped alignment. (I) The frequency of cytidine retention in TS559 (n = 3 replicates) and ΔTK2045 (n = 2 replicates) at position C918 in the 16S rRNA during exponential (expo.) or stationary (stat.) growth phase.
To site specifically map m42C, H31-containing rRNA fragments were digested with RNase T1, and the resulting subfragments were analyzed by LC-MS/MS (Fig. 1 D and F and SI Appendix, Fig. S1C). High-energy collision dissociation (HCD) fragmentation of RNase T1 digests indicated a mass shift corresponding to a dimethylated cytidine at position C918 of H31, which is consistent with the m42C detected by nucleoside analysis (Fig. 1 D and F). We also site-specifically mapped the m1Ψ and Gm (Fig. 1 D and F). The m1Ψ modification was determined to be directly adjacent to m42C, while the Gm was mapped to the stem region. Furthermore, we detected signature fragmentation ions consistent with the presence of a Ψ in the H31 fragment (SI Appendix, Fig. S1D). A close inspection of the reported high-resolution cryo-EM structure of the T. kodakarensis 70S ribosome (PDB: 6TH6) revealed densities within H31 that largely appear consistent with an m42C modification (Fig. 1G). However, the position U917 is annotated as m5U instead of m1Ψ, as determined here, indicating that the cryo-EM structure lacks sufficient resolution to unambiguously identify the isomerization of the uracil base immediately upstream of the m42C residue. Of note, in this study, m42C has been identified in a living organism. A previous report of m42C was associated with altered modification profiles caused by viral infection in human cultured Huh7 cells (18).
TK2045 Encodes a bona fide m42C Synthase.
Efforts to establish the m5C epitranscriptome relied on sequencing of bisulfite-treated RNA collected from T. kodakarensis during exponential or stationary phase growth. Cytidines modified at the C5 position are resistant to bisulfite-driven deamination and retain their cytidine identity, while unmodified cytidines are converted to uridine and sequenced as thymine. We sequenced RNA isolated from the parent strain TS559 as well as from strains each deleted for one of 17 putative RNA methyltransferases in an attempt to correlate a loss in site-specific modifications with the loss of the enzyme responsible for their installation (11). Deletion of the gene encoding a predicted RNA methyltransferase (TK2045) was confirmed using whole genome MinION sequencing (Fig. 1 H, Top panel labeled DNA-seq). The absence of DNA sequencing reads mapping to the TK2045 genomic locus, combined with reads that span the endpoints of the targeted deletion sequences, confirms that the sequences encoding TK2045 were markerlessly deleted from the T. kodakarensis genome. Subsequent RNA BS-seq (Fig. 1 H, Bottom panel labeled RNA-seq) of rRNA-depleted RNA from this strain confirmed that no long RNA reads map to the sequence that previously encoded TK2045, confirming that deletion of the coding sequences of TK2045 results in the loss of TK2045 transcription in vivo.
A T. kodakarensis strain deleted for TK2045 (ΔTK2045) is viable and exhibits a single loss in cytidine retention following BS-seq that maps to C918 in the 16S rRNA (Fig. 1 A, Lower panel). In the parent strain, TS559, the m42C modification appears to resist even harsh bisulfite treatment ~60% of the time (a sequencing depth of >10,000 was achieved in three biological replicates and across two growth conditions) (Fig. 1I). In the ΔTK2045 strain, the modification frequency at C918 was reduced to zero across two biological replicates, regardless of growth phase (Fig. 1I), suggesting that the enzyme encoded by TK2045 is responsible for modifying C918 in vivo. No other epitranscriptomic changes were detected in ΔTK2045 strains, suggesting that TK2045 encodes the writer enzyme that generates m42C at C918 in the T. kodakarensis 16S rRNA, and further, that m42C is highly resistant to bisulfite-driven deamination.
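The cytidine retention frequency reported for C918 is, in essence, the fraction of reads covering the site that still carry a C after bisulfite treatment. A minimal sketch of that calculation follows; the function name and the read counts are hypothetical and were chosen only to mirror the ~60% retention and >10,000× depth described above, not taken from the sequencing data.

```python
def retention_frequency(c_reads: int, t_reads: int) -> float:
    """Fraction of reads at one site that retained cytidine after bisulfite.

    c_reads: reads reporting C (cytidine resisted deamination)
    t_reads: reads reporting T (cytidine was deaminated to uridine)
    """
    depth = c_reads + t_reads
    if depth == 0:
        raise ValueError("site has no coverage")
    return c_reads / depth


# Hypothetical counts at C918 (illustrative only).
parent_c918 = retention_frequency(c_reads=6_600, t_reads=4_400)
deletion_c918 = retention_frequency(c_reads=0, t_reads=11_000)

print(f"TS559:   {parent_c918:.0%} retention")
print(f"ΔTK2045: {deletion_c918:.0%} retention")
```

Because m42C is only partially bisulfite resistant, a retention frequency computed this way is best read as a lower bound on the true modification stoichiometry.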
Interestingly, we did not detect any traces of unmodified cytidine at C918 in rRNA fragments recovered from the parental strain TS559 by LC-MS/MS analyses, indicating that C918 is completely modified in RNAs recovered in vivo; this contrasts with the ~60% modification frequency observed by BS-seq. Whereas m5C is nearly completely resistant to bisulfite deamination, m42C appears to be only partially resistant to the deamination chemistry. Taken together, m42C appears robustly detectable by BS-seq and gene TK2045 encodes an m42C writer, which we have named m42C synthase.
m42C Synthase Methylates Assembled Ribosomes.
The in vivo, stoichiometric, and site-specific loss of m42C upon deletion of TK2045 strongly suggests that the enzyme encoded by TK2045 is directly and likely solely responsible for modifying C918. To better understand the putative role of m42C synthase encoded by TK2045 in the generation of m42C, we sought to demonstrate the bona fide activity of rTK2045 in vitro. rRNA substrates of varying lengths ranging from 201 nt to 966 nt were synthesized by in vitro transcription, resulting in unmodified RNA substrates (SI Appendix, Table S2). We also purchased unmodified 23 nt and modified 46 nt oligonucleotide substrates containing the adjacent modifications detected within H31 (Fig. 1F and SI Appendix, Table S2). Recombinant m42C synthase (rTK2045) (Fig. 2A) was challenged to transfer a radiolabeled methyl group from the common methyl cofactor, S-adenosyl-L-methionine ([3H-methyl]SAM), to these RNA substrates. We did not detect SAM-dependent methylation activity on either modified or unmodified substrates (SI Appendix, Fig. S2 A–C), likely indicating that the strategies for substrate recognition or enzymatic activity may extend beyond RNA sequence and secondary structure.
Fig. 2.

Mature ribosomes are the identified substrate of m42C synthase. (A) SDS-PAGE of recombinant m42C synthase (rTK2045, ~34 kDa). M = marker. (B) The cryo-EM resolved structure of the T. kodakarensis 70S ribosome, inclusive of rRNA (16S in blue, 23S in magenta, 5S in black) and ribosomal proteins (green). SDS-PAGE (protein) and agarose gel electrophoresis (rRNA) analysis of ribosomes purified from the parent TS559 and ΔTK2045 strains. (C) In vitro methyltransferase radiolabel assay schematic. (D) SAM-dependent methyltransferase activity of rTK2045 on ribosomes and rRNA purified from TS559 or ΔTK2045. n = 3 replicates. **P = 0.0014. (E and F) Mass spectrometry analysis of H31 fragments derived from in vitro methylated ribosomes. (G and H) Site-specific analysis of C918 in H31 fragments methylated in vitro. (I and J) Site-specific analysis of m4C918 in H31 fragments methylated in vitro.
As the methyltransferase activity of m42C synthase is not apparent on in vitro synthesized RNA, we speculated that substrate recognition may rely on additional features present in 70S ribosomes or just the small 30S subunit. We purified fully assembled ribosomes from the parent strain TS559 and ΔTK2045 (Fig. 2B), then challenged m42C synthase to methylate either ribosome species under low magnesium conditions where the large and small ribosomal subunits largely dissociate (19). Ribosomes derived from TS559 contain m42C and cannot be modified further at C918, whereas ribosomes derived from ΔTK2045 will have an unmodified C918 available for m42C installation (Fig. 2C). Site-specific activity of m42C synthase should therefore only be apparent in ribosomes derived from ΔTK2045. We also purified rRNA from these ribosomes, generating substrates that are fully modified excluding m42C. Indeed, we detected robust methyltransferase activity of m42C synthase on ribosomes derived from ΔTK2045 but not from the otherwise isogenic parent strain TS559, nor purified and fully modified rRNA regardless of its source (Fig. 2D). Reactions lacking 3H-SAM or m42C synthase displayed background levels of radioactivity recovered on substrates. To ensure that m42C synthase methylates RNA rather than ribosomal proteins, we quenched reactions with either Proteinase K or RNase A to degrade all proteins or RNA, respectively, from the reaction. While reactions treated with Proteinase K retain radiolabel transfer, addition of RNase A results in the complete loss of radiolabel recovery (SI Appendix, Fig. S2C), confirming that methylation is to the RNA. Taken together, these data show that m42C synthase is directly involved in the methylation of C918, and assembled ribosomes emerge as the only identified substrate for m42C synthase.
To further clarify the role of m42C synthase in the installation of m42C, we performed mass spectrometry analysis of ΔTK2045-derived ribosomes methylated by recombinant m42C synthase. Detection of m42C in ΔTK2045-derived ribosomes was confirmed by single nucleoside analysis (Fig. 2 E and F and SI Appendix, Fig. S3 A and B). As expected, reactions that lacked SAM or m42C synthase did not produce any m42C/C ratio changes. In support of m42C synthase as the sole enzyme responsible for direct conversion of C918 to m42C, mass spectrometry analysis of H31 fragments indicated the presence of two methyl groups at C918 (Fig. 2 G and H and SI Appendix, Fig. S3 C–E). A significant portion of H31 fragments were partially modified in the form of 4-methylcytidine (m4C) at C918 (Fig. 2 G–J and SI Appendix, Fig. S3 C–E), suggesting that the m42C synthase installs methyl groups sequentially. These results demonstrate a direct and sole role for the protein product of TK2045 as an m42C synthase and eliminate concerns of a multienzyme pathway or the participation of other methyltransferases in the production of m42C918 in T. kodakarensis rRNA.
m42C Synthase Adopts a Distinct Structure.
To better understand catalysis, SAM binding, and substrate recognition, we determined the crystal structure of the m42C synthase to ~2 Å resolution (PDB: 8VYD, SI Appendix, Table S3). The asymmetric unit consists of two nearly identical monomers (RMSD = 0.494 Å for all aligned atoms), each with two distinct domains (Fig. 3 A and B). The C-terminal domain (red, yellow, and blue; ~⅔ of the protein by mass) adopts a canonical class I Rossmann fold, and this domain likely contains the catalytic residues that bind C918 and SAM (yellow). In contrast, neither the structure nor the sequence of the N-terminal domain (purple; ~⅓ of the protein by mass) is suggestive of its function.
Fig. 3.

Structural and enzymatic characterization of m42C synthase. (A) Atomic structure of m42C synthase at 2 Å. The N and C termini, α-helices, β-sheets, and SAM molecule are labeled. (B) Linear representation of m42C synthase. The N-terminal domain (magenta), C-terminal domain (blue, red, and yellow), methyltransferase domain (yellow and red), and SAM binding residues (yellow) are labeled. Amino acid residues of interest are annotated. (C) Surface structure (D) SAM and substrate binding pocket, and (E) electrostatic potential of m42C synthase. (F) Cytidine and SAM docked into the active site. (G) Proposed interaction between C918 and the DPPR motif. (H) Western blot analysis of recombinant m42C synthase and its variants. (I and J) Methyltransferase activity of each enzyme variant relative to wild-type protein. WT = wild type, ns = not significant, *P-value ≤ 0.05. (K) Methyltransferase activity over time of each enzyme variant. NE = no enzyme. n = 3 replicates.
An asymmetric pocket at the center of the C-terminal domain appears to be capable of accommodating both a SAM cofactor and a cytidine base (Fig. 3 C and D). The calculated electrostatic potential surface shows a highly negatively charged pocket where the SAM cofactor likely sits and several positively charged residues near the putative cytidine binding site (Fig. 3E). The m42C synthase C-terminal domain shares significant structural homology with TGS1 (PDB: 3GDH), a human methyltransferase that dimethylates guanosine at N2 (20, 21). A superposition of the two protein structures showed that the SAM cofactor resolved in the TGS1 structure fits well within the narrow neck of the presumptive m42C synthase active center. Replacing the bound guanosine from the TGS1 structure with a cytidine in the active site pocket allowed us to approximate the relative positions of the cofactor and nucleotide base substrate within the putative active site of m42C synthase (Fig. 3 D and F). There are two stretches of the protein chain in the C-terminal domain that could not be traced in the electron density map. One of these untraced chains (residues 103 to 110) sits at the mouth of the central pocket and may play a role in controlling access to the active site.
The active centers of known RNA m5C methyltransferases often encode a DAPC motif, where an aspartic acid (D) acts as the catalytic residue (22, 23). While exocyclic amino methylating enzymes typically encode a DPPY catalytic motif, a DPPR motif common to many RNA m5U methyltransferases is found in the presumptive active center of m42C synthase (23). The first three amino acids of this motif (DPP) are entirely conserved within the 100 closest homologs (SI Appendix, Fig. S4), indicating these residues are likely important for catalytic function and are a shared signature of enzymes that generate m42C. Previous reaction models lead us to speculate that the aspartic acid (D205 in T. kodakarensis m42C synthase) provides critical hydrogen bonds with the cytidine likely at N4 and N3 and that the carboxyl group linking the subsequent two prolines (P206 and P207) ultimately hydrogen bonds with cytidine at N4 (Fig. 3G). The importance of either proline is unclear, as is that of the terminal arginine within the DPPR motif.
To establish the impact of individual amino acid residues on the activity of m42C synthase, we recombinantly expressed and purified nine variants of m42C synthase (Fig. 3H). A mutant lacking amino acids 2 to 77 (the start codon was retained) was generated to probe the function of the N-terminal domain. Additional mutants included single alanine substitutions at residues of the DPPR motif, as well as at positively charged residues that may interact with negatively charged RNA substrates. None of the variants showed activity on TS559-derived ribosomes, indicating no gain in substrate promiscuity (Fig. 3I). The D205A variant did not show activity above background, establishing D205 as the likely catalytic residue within the DPPR motif (Fig. 3 I–K). The m42C synthase variant lacking the N-terminal domain retains ~20% activity compared to the wild-type enzyme (Fig. 3 I–K), indicating this domain is not required but is important for efficient methylation of the ribosome. To our surprise, P206 and R208 residues within the DPPR motif do not appear to be essential for catalysis, and the R208A substitution results in a slightly higher midpoint activity level compared to the wild-type enzyme (Fig. 3J). Two additional variants, R103A and R106A, were predicted to potentially impact RNA binding due to their positive charges; however, both variants showed nearly threefold higher midpoint activity levels. Time-course assays using purified ribosomes demonstrate that R103A, R106A, and R208A substitutions result in markedly faster activity rather than higher endpoint activity (Fig. 3K). It is possible that single amino acid substitutions that reduce binding affinity to the substrate or SAM cofactor led to faster release of the methylated product and therefore faster reaction times while retaining substrate specificity.
The N-Terminal Domain Is Essential for In Vivo m42C Installation.
Although the C-terminal domain of m42C synthase is highly conserved in sequence and structure, the N-terminal domain lacks homology to any protein with a known function. The closest, albeit poor, structural alignment (Z-score = 4.0) using the DALI server (24) was to the ~40 residue β-sheet region in the N-terminal domain of the vesicular stomatitis virus RNA polymerase (SI Appendix, Fig. S5) (25). Unique domains of unknown function are not uncommon in RNA methylating enzymes, and speculation remains that these unique domains are likely to participate in substrate recognition. We demonstrated in vitro that the N-terminal domain is not required for site-specific methylation and that enzymes lacking the N-terminal domain retain ~20% activity compared to full-length enzymes (Fig. 3J). To probe the impact of the unique N-terminal domain in living cells, we measured the in vivo modification frequency at C918 in strains lacking nucleotides 4 to 231 (amino acids 2 to 77) of gene TK2045 (SI Appendix, Fig. S6A). BS-seq of the RNA isolated from this strain indicated complete loss of modification at C918 at >13,000X coverage (SI Appendix, Fig. S6B), indicating that the N-terminal domain is essential for in vivo installation of m42C. The exact role of the N-terminal domain remains unclear.
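The correspondence between the deleted nucleotides and amino acids follows from simple codon arithmetic, sketched below. The helper name is hypothetical, and the calculation assumes 1-based coordinates in which the start codon occupies nucleotides 1 to 3.

```python
def codon_span_to_nt_range(first_aa: int, last_aa: int) -> tuple[int, int]:
    """Map an inclusive, 1-based amino acid span to its nucleotide range.

    Codon k occupies nucleotides 3*(k - 1) + 1 through 3*k.
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("invalid amino acid span")
    return 3 * (first_aa - 1) + 1, 3 * last_aa


# Deleting amino acids 2 to 77 while retaining the start codon (codon 1).
nt_start, nt_end = codon_span_to_nt_range(2, 77)
deleted_nt = nt_end - nt_start + 1

print(f"nucleotides {nt_start} to {nt_end} ({deleted_nt} nt)")
```

The deleted span is a multiple of three nucleotides, so the remaining coding sequence stays in frame with the retained start codon.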
m42C918 Enhances Thermophilic Growth.
A sequence comparison of the H31 region across common model species revealed a high sequence similarity within and across domains. We compared 16 nucleotides in H31 and an additional 10 nucleotides up and downstream; 22 of the 36 nucleotides (~61%) share sequence identity (Fig. 4A, stars). Despite high sequence and structural conservation, there is an incredible diversity in the modification profile of H31 across domains and species (Fig. 4B). We report the modification status of H31 nucleotides in a hyperthermophilic archaeon. The sequence and structural conservation of H31 across evolutionary lineages and its essentiality (15, 17) are highly suggestive that H31 modifications are biologically important, but the exact role of divergent modifications to H31 is unclear. When viewed with respect to the A, P, and E site tRNAs, both m1Ψ917 and m42C918 interface with the anticodon loop of the P-site tRNA (Fig. 4C). The interface between H31 modifications and the P-site tRNA likely impacts how tRNAs function within the ribosome.
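The identity figure quoted above is the share of alignment columns in which every species carries the same nucleotide. A minimal sketch of that bookkeeping follows; the toy sequences are hypothetical placeholders rather than the actual H31 alignment, and only the window size and the count of 22 conserved positions come from the text.

```python
def conserved_columns(aligned: list[str]) -> int:
    """Count alignment columns where all sequences share one nucleotide."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for column in zip(*aligned) if len(set(column)) == 1)


# Toy alignment (hypothetical): columns 1 and 2 are conserved, 3 and 4 are not.
toy_alignment = ["ACGU", "ACGA", "ACCU"]
toy_conserved = conserved_columns(toy_alignment)

# Window described in the text: 16 nt of H31 plus 10 nt on each side.
WINDOW = 16 + 2 * 10
REPORTED_CONSERVED = 22
percent_identity = 100 * REPORTED_CONSERVED / WINDOW

print(f"{REPORTED_CONSERVED}/{WINDOW} = {percent_identity:.1f}% conserved")
```

Requiring identity across every sequence in a column is the strictest definition of conservation; a majority-rule threshold would report a higher value for the same alignment.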
Fig. 4.

Chemical modifications decorate H31 across domains and are fitness relevant in T. kodakarensis. (A) Sequence alignment of the H31 region across domains (Eu. = Eukarya, Ar. = Archaea, Ba. = Bacteria). Blue and green highlighted nucleotides correspond to H31 stem and loop, respectively. C918 in T. kodakarensis and the equivalent positions in other species are boxed. The coordinate position of the 3'-most aligned nucleotide is listed. Stars indicate conserved nucleotides within the alignment. (B) Modifications to H31 across domains. (C) The interface between H31 nucleotides and the anticodon loop of the P-site tRNA. The E-, P-, and A-site tRNAs are shown in blue, gray, and brown, respectively. The translated mRNA is represented in yellow, and the H31 nucleotides are green and blue, consistent with the color scheme in panel A. (D) Head-to-head growth competition between the parent strain TS559, ΔTK2045, and the D205A mutant strain. CI are represented by ± 1 SE. n = 5 replicates.
To establish the phenotypic impact of m42C at C918, we performed head-to-head growth comparisons between strains TS559 and ΔTK2045. T. kodakarensis grows over a wide temperature range (~50 to ~98 °C), with a growth optimum of 85 °C. We tracked the growth of TS559 and ΔTK2045 at 65°, 75°, 85°, and 95 °C over ~18 to 40 h by measuring the optical density at 600 nm (OD600) of each culture. A severe ~25% reduction in cell culture density is observed at 65° and 95 °C in strains lacking m42C synthase (ΔTK2045, red curve) compared to TS559 (black curve). Only minor reductions in culture density are observed at 75° or 85 °C (Fig. 4D). We then asked whether the m42C synthase enzyme or the presence of the modification per se is important for cell growth. We showed that D205 is the likely catalytic residue and that its substitution abolished m42C generation in vitro (Fig. 3). We therefore generated a strain wherein TK2045 was retained with an in vivo D205A substitution (SI Appendix, Fig. S6C). Head-to-head growth comparisons of this strain (Fig. 4D, TK2045-D205A, blue curve) similarly show no obvious growth defect at moderate temperatures, and growth at 65 °C was partially restored to levels comparable to TS559. At 95 °C, a severe growth defect remained apparent. These data likely indicate that m42C918 imparts fitness benefits under thermal insult.
m42C Is Broadly Distributed Across Domains.
A search for the top one hundred m42C synthase homologs revealed close homologs spanning all domains (SI Appendix, Fig. S4A). To understand the distribution of m42C within the Archaea, we performed nucleoside analysis of total RNA isolated from six species (SI Appendix, Fig. S7 A–F). We detected a robust signal for m42C in the hyperthermophilic species Thermococcus sp. AM4, T. kodakarensis, and Methanocaldococcus jannaschii (SI Appendix, Fig. S7 A–C). BlastP searches through KEGG revealed obvious m42C synthase homologs encoded by genes TAM4_1357 and MJ_1233. We were unable to detect m42C in the hyperthermophilic acidophile Sulfolobus islandicus, the halophile Halobacterium salinarum, or the mesophilic methanogen Methanococcus maripaludis (SI Appendix, Fig. S7 D–F). Likewise, m42C synthase homologs were not detected within these species or their respective genera. Although not all heat-loving archaea generate m42C, all close archaeal homologs identified here were from hyperthermophilic or thermophilic species (SI Appendix, Fig. S4B).
While 77 of the top 100 m42C synthase homologs are from Archaea (SI Appendix, Fig. S4A), m42C synthase is not restricted to the archaeal domain. Two close homologs are encoded within the mesophilic eukaryotes Chlamydomonas reinhardtii (algae) and Emiliania huxleyi (protist). Twenty-one close homologs were found to be encoded in Bacteria, divided among psychrophilic, mesophilic, thermophilic, and hyperthermophilic species from both gram-positive and gram-negative organisms (SI Appendix, Fig. S4B). A recent study detected m42C by mass spectrometry in cultured human hepatocarcinoma (Huh7) cells only when infected with Zika virus (ZKV), dengue virus (DENV), hepacivirus (HPCV), or poliovirus (PV) (18). We failed to detect m42C in human cell lines, yeast, or Escherichia coli grown under laboratory conditions (SI Appendix, Fig. S7 G–I). Further, we did not find obvious homologs encoded in these species or in any viral genomes. Further experimentation is necessary to identify the putative human enzyme responsible for generating m42C.
Phylogenetic comparisons of a representative homolog from each genus established that archaeal homologs share more similarity to each other than to those in other domains, with the exception of a single homolog from a member of the Thaumarchaeota, Ca. Caldiarchaeum subterraneum, which clusters near the eukaryotic homolog in E. huxleyi (SI Appendix, Fig. S4C). Sequence alignment of representative homologs from each genus indicates that the putative catalytic amino acid position within the C-terminal domain is highly conserved across all homologs. The m42C synthase N-terminal domain is largely conserved, but distinctly missing from many bacterial and both eukaryotic homologs (SI Appendix, Fig. S4D). In nonredundant genera, 21 amino acids are entirely conserved within the top one hundred m42C synthase close homologs (SI Appendix, Fig. S4E). These positions are located within the core of the C-terminal domain and include the DPP of the DPPR motif, consistent with these residues being important for catalytic function and a shared signature of enzymes that generate m42C.
The Clusters of Orthologous Genes (COG) database (26) currently includes 1,187 bacterial and 122 archaeal genomes. TK2045 belongs to COG2521 and arCOG00054 and is annotated as a “predicted archaeal methyltransferase”; searches indicate that 4.4% (57/1,309) of prokaryotes with complete and annotated genomes encode an ortholog of m42C synthase. Eukaryotic genomes have not been cataloged in the COG database, and therefore the prevalence of m42C synthase orthologs within eukaryotic genomes cannot be accurately assessed. Within the COG catalog, m42C synthase orthologs appear in 21% (26/122) of Archaea and 2.6% (31/1,187) of Bacteria (SI Appendix, Fig. S4F). Consistent with our analysis by sequence homology, predicted orthologs appear to be specific to heat-loving Archaea, but in Bacteria, there is no obvious growth temperature bias. Bacterial orthologs are more divergent, and many lack homology to the N-terminal domain present in archaeal orthologs (SI Appendix, Fig. S4G).
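The ortholog frequencies above can be reproduced from the genome counts with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable and function names are illustrative.

```python
# Genome counts and ortholog-containing genomes as stated in the text.
GENOMES = {"Archaea": 122, "Bacteria": 1187}
WITH_ORTHOLOG = {"Archaea": 26, "Bacteria": 31}


def percent(hits: int, total: int) -> float:
    """Percentage of genomes that encode an ortholog."""
    return 100 * hits / total


per_domain = {
    domain: percent(WITH_ORTHOLOG[domain], GENOMES[domain]) for domain in GENOMES
}
total_genomes = sum(GENOMES.values())
total_hits = sum(WITH_ORTHOLOG.values())
overall = percent(total_hits, total_genomes)

for domain, value in per_domain.items():
    print(f"{domain}: {WITH_ORTHOLOG[domain]}/{GENOMES[domain]} = {value:.1f}%")
print(f"All prokaryotes: {total_hits}/{total_genomes} = {overall:.2f}%")
```

The combined prokaryotic value works out to about 4.35%, and the roughly eightfold gap between the archaeal and bacterial frequencies is what the per-domain breakdown makes visible.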
Discussion
We report a unique epitranscriptomic H31 marker in the hyperthermophilic archaeon, T. kodakarensis. We mapped m42C with single nucleotide resolution using BS-seq and LC-MS/MS, identified the writer enzyme (here termed m42C synthase), demonstrated a broad distribution of m42C synthases across the tree of life, and established the fitness impact of m42C under heat stress. Thermococcus ribosomes are hypermodified with ac4C and m5C, so it is not surprising that H31 is likewise densely modified to support life in extreme heat (7, 11). The exact molecular function of m42C is enigmatic, but many studies demonstrate that the epitranscriptome confers a fitness advantage under heat stress (3–5, 7, 12, 27–31), and we show that m42C is no exception. m42C is thought to disrupt hydrogen bonding with G (32), so a dimethylation likely prevents the H31 loop from forming base-paired structures and therefore might assist with maintaining the overall structure of the entire H31 region. It is equally probable that the bulkier dimethylation may restrict movement of the P-site bound tRNA that improves ribosome function under thermally denaturing conditions. It is possible that the lack of a single m42C residue in H31 results in inefficient translation under heat stress, and as translation is responsible for large energetic costs, inefficient energy consumption results in poor and slower overall growth.
Although we were unable to identify obvious m42C synthase homologs in humans or viruses, a recent study detected m42C in total RNA from Huh7 cells only after infection with ZIKV, DENV, HPCV, or PV, indicating m42C is dynamic and plays a role in determining virus–host interactions (18). Flaviviruses (such as ZIKV, DENV, and HPCV) encode one methyltransferase (NS5), which generates 7-methylguanosine and 2’-O-methyladenosine at the mRNA 5’ cap (33). The absence of other RNA methyltransferases in these viruses suggests that human-encoded enzyme(s) are responsible for the generation of m42C upon viral infection, although in which RNA(s) is entirely unknown. It remains undefined whether the m42C identified in this previous study is a result of viral or host (or a combination of) encoded activities, whether the nucleoside is within a polymeric RNA or generated as a single nucleotide, and whether its synthesis is a side reaction of combined chemistries or the optimized chemistry of a defined pathway. Whether m42C supports human or viral fitness is also unclear, but it is tempting to hypothesize that m42C could be a factor in human cell function under infection-induced heat stress, where fever symptoms are often present.
BS-seq is currently the standard for quantitative, single-nucleotide precision mapping of m5C in DNA and RNA, and hundreds of BS-seq datasets that purport to report m5C profiles have been published. Bisulfite-driven deamination of cytosine begins with sulfonation at C6 followed by hydrolysis of the exocyclic N4 amino group, resulting in the generation of uracil sulfonate and ultimately uracil. While cytosines are readily deaminated, the deamination of C5-modified cytosines is very slow, typically taking ~12 to 16 h (34). C5-modified cytosines are resistant to the initial conversion step of sulfonation, but m4C is susceptible to bisulfite-driven deamination (35). We found that m42C appears largely resistant to bisulfite-driven deamination, even under the harsh BS-seq protocols employed here. It is unclear how widespread m42C is, but the robust resistance to deamination indicates that existing datasets may be misreporting as m5C sites that are actually occupied by m42C.
Several publications (including the Modomics database) incorrectly inferred that the E. coli enzyme, RsmH, is responsible for generating m42C at C1402 in the 16S rRNA (18, 32, 36). More recent in vivo and in vitro experimentation has demonstrated that C1402 is occupied by N4,2’-O-dimethylcytidine (m4Cm), which is generated by the sequential activities of RsmH (N4 methyltransferase) and RsmI (2’-O methyltransferase) in E. coli (37, 38). C1402 in E. coli is therefore occupied by m4Cm rather than m42C. T. kodakarensis TK2045 is thus the only known gene to encode a bona fide m42C synthase.
Methylation is the most common chemical modification to RNAs (39), but the mechanisms that dictate the chemistry of methyl transfer are only well understood for a small subset of methylation events. Endocyclic modifications (modifications within the nucleotide ring) are often catalyzed by 1 or 2 cysteine residues within well-defined RNA binding and catalytic motifs (23, 40). RNA m5C methyltransferases typically require two cysteine residues, one configured in a DAPC motif and another in a TCS motif, to coordinate methyl transfer. Similarly, RNA m5U methyltransferases usually encode DPPR and SCN motifs for catalysis. However, some exocyclic amino methyltransferases have been shown to encode a DPPY motif while lacking catalytic cysteine residue(s). Here, we show that m42C synthase encodes a DPPR catalytic motif, reminiscent of RNA m5U methyltransferases. No cysteine-containing motifs were detected within close homologs. We speculate that m42C methyltransferases are evolutionarily related to m5U methyltransferases and that the DPPR motif was exapted. The loose conservation of the arginine at the fourth position may suggest that it is an evolutionary remnant rather than an essential amino acid residue.
Materials and Methods
Cell Growth.
All T. kodakarensis strains were grown anaerobically at 85 °C in artificial sea water with yeast, tryptone (ASW-YT) and supplemented with sulfur and agmatine (41). Cultures grown for total ribonucleoside analysis by LC-MS/MS were grown to mid-exponential growth phase (OD600 = ~0.3) in duplicate before cultures were harvested by centrifugation. Cultures prepared for bisulfite sequencing were harvested during early exponential growth (OD600 0.1 to 0.2) and then left to continue growing before harvesting the remaining culture two hours after reaching its maximum optical density (0.6 to 0.8, stationary growth phase); 300 ml of culture were harvested by centrifugation during early exponential growth phase and 200 ml of culture were harvested during stationary phase. Cell pellets were stored at −20 °C.
Strain Construction.
Procedures and primer sequences used for generating deletion plasmids were previously reported in ref. 42. Briefly, the TK2045 genomic locus and ~700 bp upstream and downstream were PCR amplified from genomic DNA and gel purified (Qiagen, cat# 28706X4). Primers were modified to include 13 bp terminal extensions with homology to pTS700 at SwaI. pTS700 was linearized with SwaI (NEB, cat# R0604S) before the insert was cloned into pTS700 by ligation independent cloning. The coding sequence of the target gene or just the N-terminal sequence (nucleotides 4 to 231, NΔ) was then removed from the plasmid using QuikChange site-directed mutagenesis (Agilent, cat# 200516). To minimize unintentional impacts to the genome due to deletion of most or all of the target locus, regions where the target gene overlapped with other genomic elements (i.e., other genes or known promoter sequences) were retained in the plasmid and therefore the genome. Deletion-plasmid sequences were confirmed using Sanger sequencing. The pTS700-based plasmid used to generate the TK2045 D205A strain was purchased from Twist Biosciences.
Procedures for markerlessly editing the genome of T. kodakarensis strain TS559 are published in ref. 43. The pTS700-based plasmid was transformed into T. kodakarensis strain TS559 and cells were plated on agmatine-free, rich medium. After 2 to 5 d of growth at 85 °C, transformants were picked into rich liquid media lacking agmatine and grown overnight. For each isolate, genomic DNA was extracted from 1 ml of fully grown culture by phenol/chloroform/isoamyl alcohol (25/24/1; v/v/v) extraction followed by alcohol precipitation of the aqueous phase with an equal volume of 100% isopropanol. Plasmid integration into the genome was confirmed by PCR using two primer pairs (F: GAAGGGTTGAAAGGGTTATAAAGG, R: CACTTTATGCTTCCGGCTCGTATGTTG; and F: GCAATAGCGGTCGTCGTCATGTTC, R: GTGATACTCAACATACTCTCCAACC); with one primer from each pair having homology to the genome and the other primer to the plasmid. This ensures that the amplicon originates from a genomically integrated sequence.
Isolates where the plasmid had integrated at the predicted locus were plated on minimal medium with the counterselectable marker, 6-methyl purine, and grown anaerobically for 4 d at 85 °C. Colonies were then picked into rich liquid media and grown overnight. Genomic DNA was purified from each strain and screened for deletion of the target locus. Candidate deletion strains were identified via PCR using four sets of primers. First, PCR using a primer pair that flanked the deletion locus (A/B primers; A: AGATCTCAGCCTCGTAAAAACACC, B: GGAGTACAGGAAGTACGGAATAGC) resulted in an amplicon reduced in size, relative to the parent strain, by the length of the deleted region. A primer pair in which both primers had homology to a region internal to the deletion locus (C/D primers; C: GACGATAGAGATAAACGGTATTCG, D: GCTTGCTTGAACTTCTTAACTACC) resulted in PCR amplification in the parent strain but not the deletion strain. Two combinations of external and internal primers (A/D and C/B primer pairs), which should not produce an amplicon in the deletion strain, were used to ensure that the deletion locus was absent in the deleted strain. Candidate deletion strains where PCR indicated a successful deletion were finally confirmed via whole genome sequencing.
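The four-primer-pair screen described above amounts to a small decision rule. The sketch below is one way to encode it, assuming gel results are recorded as an A/B amplicon size plus presence or absence of the C/D, A/D, and C/B products; the function name, the size tolerance, and the example sizes are hypothetical and not taken from the study.

```python
from typing import Optional

def classify_isolate(ab_bp: Optional[int], cd: bool, ad: bool, cb: bool,
                     parent_ab_bp: int, deleted_bp: int,
                     tolerance_bp: int = 50) -> str:
    """Interpret the four diagnostic PCRs for one isolate.

    ab_bp: observed A/B amplicon size (None if no product).
    cd, ad, cb: True if the C/D, A/D, or C/B reaction gave a product.
    parent_ab_bp: A/B amplicon size expected in the parent strain.
    deleted_bp: length of the region targeted for deletion.
    tolerance_bp: assumed gel sizing tolerance (hypothetical value).
    """
    internal = (cd, ad, cb)
    if ab_bp is None:
        return "inconclusive"
    # Deletion: A/B product shrinks by the deleted length, internal primers fail.
    if abs(ab_bp - (parent_ab_bp - deleted_bp)) <= tolerance_bp and not any(internal):
        return "candidate deletion"
    # Parent-like: full-length A/B product and all internal-primer reactions work.
    if abs(ab_bp - parent_ab_bp) <= tolerance_bp and all(internal):
        return "parent-like"
    return "inconclusive"

# Hypothetical sizes for illustration only.
print(classify_isolate(1500, False, False, False, parent_ab_bp=2400, deleted_bp=900))
print(classify_isolate(2400, True, True, True, parent_ab_bp=2400, deleted_bp=900))
```

Isolates classified as candidate deletions would still require the whole genome sequencing confirmation described in the text.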
To ensure the region targeted for deletion was correctly excised and to ensure no off-target genome modifications were present, each candidate strain was screened by whole genome sequencing on a MinION platform. Genomic DNA was purified using the Monarch Genomic DNA Purification kit (NEB, cat# T3010S), and library preparation for each strain was completed using the Rapid Barcoding kit 96 (ONT, cat# SQK-RBK110.96). Sequencing was done on the MinION Mk1C with the R9.4.1 flow cell. Fast5 files were converted to Fastq files using Guppy high accuracy settings. Fastq files were aligned to the reference genome using minimap2 and alignment files were compared to the reference genome using the Medaka neural network-based variant calling pipeline. Visual inspection of the deletion locus on Integrative Genomics Viewer confirmed deletion of the locus. As a third confirmatory measure, the RNA bisulfite-sequencing data (described below) were checked for reads that aligned to the deleted locus. For all libraries, we observed virtually zero coverage at the deleted locus, indicating the gene had been completely deleted.
RNA Preparation for MS.
The universal human reference RNA (Agilent, cat# 740000) was generated by pooling 10 cell lines to reduce cell type bias. T. kodakarensis cells were resuspended in 1 ml TRIzol and 200 μl chloroform with a 5 to 10 min incubation at room temperature preceding each reagent. Cellular debris was removed via centrifugation at 21,000×g for 15 min. The aqueous phase was alcohol precipitated in 3x the volume of ethanol with 2 µl GlycoBlue (ThermoFisher Scientific, cat# AM9515) and incubated at −80 °C for 30 min. RNA pellets were collected via centrifugation at 21,000×g for 30 min at 4 °C. RNA was resuspended in nuclease-free water and treated with DNase I (NEB, cat# M0303) for 30 min at 37 °C. RNA was purified using the Monarch RNA Cleanup Kit (NEB, cat# T2040S or T2030S) or the Zymo RNA Clean and Concentrator kit (Zymo, cat# R1017). A fraction of total RNA was depleted of tRNAs using the Zymo RNA Clean and Concentrator Kit following the manufacturer’s protocol for selective recovery of RNA > 200 nt.
rRNA Depletion.
rRNA was depleted from a fraction of the RNA samples using reagents provided in the NEBNext rRNA Depletion Kit (NEB, cat# E6310) and custom DNA oligos (44). The manufacturer’s protocol was followed with the following changes: The NEBNext rRNA Depletion Solution provided in the kit was replaced with a mixture of 85 oligonucleotides, each at 1 µM, whose sequences were complementary to T. kodakarensis rRNA. All volumes for the probe hybridization, RNase H treatment, and DNase I treatment sections of the protocol were scaled up twofold, and 24 µl of 62.5 ng/µl T. kodakarensis RNA was used as the starting material.
RNA Library Preparation and Bisulfite Sequencing.
Frozen cell pellets were resuspended in a total volume of 7.5 ml TRI reagent RT (MRC Inc. Cincinnati, OH), vortexed, and incubated at room temperature for 5 to 10 min prior to cooling on ice. 375 µl of 4-bromoanisole (MRC Inc.) was added and tubes were mixed by inversion prior to centrifugation at 12,000×g for 15 min at 4 °C. The aqueous layer (~4.5 ml) was removed, 6.75 ml of 100% isopropanol was added, tubes were mixed by inversion and the centrifugation step was repeated for 30 min. Liquid was removed from the tubes, each pellet was washed with 75% ethanol and the centrifugation was repeated for 10 min. The 75% ethanol was removed and pellets were air dried for 5 min or less. Pellets were resuspended in 450 µl nuclease-free water (ThermoFisher), 50 µl DNase I buffer, 5 µl DNase I (NEB, cat# M0303) and incubated at 37 °C for 30 min. An equal volume of acid-phenol:chloroform, pH4.5 with IAA 125:24:1 (ThermoFisher Scientific, cat# AM9720) was added, tubes were mixed by inversion, centrifuged, and aqueous layer removed and precipitated similarly as above. After the 75% ethanol wash step, each pellet was dissolved in nuclease-free water and stored at −80 °C.
rRNA and tRNA depletion were performed as described above. Bisulfite treatment of either RNA recovered from rRNA depletion or total RNA (400 to 750 ng) was carried out using an EZ RNA Methylation Kit (Zymo Research, cat# R5001) with a 65 °C incubation for 120 min in place of the 54 °C for 45 min incubation in the protocol. Eluted RNA was either immediately used for construction of sequencing libraries or stored at −80 °C. Libraries for sequencing were prepared using the NEBNext® Ultra™ or Ultra™ II Directional RNA Library Prep Kit for Illumina® (NEB, cat# E7420 or E7760) following the protocol for use with purified mRNA or rRNA depleted RNA. The recommended fragmentation conditions for partially degraded RNA were used in the fragmentation step. Bisulfite-treated libraries were sequenced on one of three Illumina platforms: replicates 1 and 2 of strain TS559 and replicate 1 of ΔTK2045 were sequenced on the NextSeq. Replicate 3 of strain TS559, replicate 2 of ΔTK2045, and both replicates of TK2045-NΔ were sequenced on the NovaSeq.
Raw Data Processing.
Fastq reads were subjected to adapter trimming and quality control using Trimmomatic. Reads in which any base had a PHRED score <20 were removed from further analysis, as were reads shorter than 15 nt. Reads with >3% cytosine retention (# cytosines/read length) were removed using a custom Python script. Fastq files were then mapped to the reference genome using BS-Seeker2. Two mismatches per alignment were allowed (excluding C-to-T mismatches). Using samtools, alignments were removed when the MAPQ score was <20 or when they were flagged as PCR duplicates. CGmapTools was used to obtain CGmaps, in which C and T coverage are calculated at each genomically encoded cytosine.
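The read-level filters described above can be sketched as a single predicate. This is not the custom script used in the study, which is not reproduced here; it is a minimal illustration that assumes each read is available as a base string plus a list of per-base PHRED scores, and that retained cytosines appear as "C" in the read orientation being examined. The function and parameter names are hypothetical.

```python
from typing import Sequence

def passes_read_filters(seq: str, quals: Sequence[int],
                        min_len: int = 15,
                        min_phred: int = 20,
                        max_c_fraction: float = 0.03) -> bool:
    """Return True if a bisulfite-converted read survives all three filters.

    Thresholds follow the text: length >= 15 nt, every base PHRED >= 20,
    and cytosine retention (# cytosines / read length) <= 3%.
    """
    if len(seq) < min_len:
        return False                      # too short
    if min(quals) < min_phred:
        return False                      # at least one low-quality base
    c_fraction = seq.upper().count("C") / len(seq)
    return c_fraction <= max_c_fraction   # reject poorly converted reads

# Hypothetical 40-nt reads: one retained C (2.5%) passes, two (5%) fail.
good = "T" * 39 + "C"
bad = "T" * 38 + "CC"
print(passes_read_filters(good, [30] * 40))
print(passes_read_filters(bad, [30] * 40))
```

Filtering whole reads on cytosine content removes fragments that largely escaped conversion, at the cost of discarding any genuinely cytosine-rich modified fragments.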
Cytosine Retention Ratio.
The data analysis pipeline for detection of high-confidence and reproducible m5C sites was previously reported in ref. 11. The cytosine retention ratio is used to estimate modification frequency after bisulfite conversion and is calculated at each cytosine position by dividing C coverage by the sum of C and T coverage.
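The retention ratio defined above, C / (C + T), can be computed per position as in the following sketch. The handling of zero-coverage positions (returning None) is an assumption made here for illustration, and the coverage values are hypothetical.

```python
from typing import Optional

def retention_ratio(c_cov: int, t_cov: int) -> Optional[float]:
    """Cytosine retention ratio at one position: C coverage / (C + T coverage).

    Returns None when the position has no coverage, since the ratio is undefined.
    """
    total = c_cov + t_cov
    if total == 0:
        return None
    return c_cov / total

# Hypothetical coverage at three positions: (C reads, T reads).
sites = {"site_a": (90, 10), "site_b": (2, 198), "site_c": (0, 0)}
for name, (c, t) in sites.items():
    print(name, retention_ratio(c, t))
```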
Excision and Purification of the H31 Fragment.
The H31 fragment was purified from total RNA or ribosomes methylated in vitro (see below) using the rRNA-depletion method described above but only including DNA oligos flanking the H31 fragment (as depicted in Fig. 1B). After RNase H digestion, reactions were purified using the Monarch RNA Cleanup Kit (NEB, cat# T2030S or equivalent). A 60 nt complementary, biotinylated DNA oligonucleotide was hybridized to the H31 fragment by adding the DNA probe in tenfold excess to the RNA and lowering the temperature from 95 °C to 22 °C at ~0.1 °C/s. The hybridized complex was purified using streptavidin-coated magnetic beads. The RNA was then eluted by digesting the DNA oligo with DNase I (NEB, cat# M0303) before the RNA was again purified using the Monarch RNA Cleanup Kit. The H31 fragment was visualized on a 15% TBE-Urea polyacrylamide gel with SYBR Gold staining.
LC-MS/MS.
Cellular total RNA, ribosomal RNA, or synthetic RNA substrates for the in vitro methyltransferase assays (see below) were digested to nucleosides at 37 °C overnight using a Nucleoside Digestion Mix (NEB, cat# M0649S). Tandem liquid chromatography–mass spectrometry (LC-MS/MS) analysis was performed by injecting the digested RNAs on an Agilent 1290 Infinity II UHPLC equipped with a G7117A diode array detector and a 6495C triple quadrupole mass detector operating in the positive electrospray ionization mode (+ESI). UHPLC was carried out on a Waters XSelect HSS T3 XP column (2.1 × 100 mm, 2.5 µm) with a gradient mobile phase consisting of methanol and 10 mM aqueous ammonium acetate (pH 4.5). MS data acquisition was performed in the dynamic multiple reaction monitoring (DMRM) mode. Each nucleoside was identified in the extracted chromatogram associated with its specific MS/MS transition (m42C precursor ion m/z: <272.1>; product ion m/z: <140.1>). The abundance of each nucleoside derived from the digested samples was quantified based on chromatogram peak integration standard curves. The relative abundance of each modified nucleoside was determined by further dividing the peak integrations of modified and unmodified nucleosides (for example: m42C/C).
N4-methylcytidine (m4C) and C5-methylcytidine (m5C) are positional isomers sharing the same retention time and primary MS/MS transition (precursor ion m/z: <258.1>; product ion m/z: <126.1>) in the gradient mobile phase consisting of methanol and 10 mM aqueous ammonium acetate (pH 4.5). As m5C is one of the most abundant modifications in T. kodakarensis RNA, we sought to identify unique product ions of m4C using nucleoside standards. A unique product ion at m/z: <95> was found in m4C, but not in m5C, product ion spectra (SI Appendix, Fig. S8A). To separate m4C and m5C, digested T. kodakarensis RNA samples were eluted in a gradient mobile phase containing methanol and 0.1% formic acid (pH 2.7). In the formic acid gradient, m4C and m5C nucleoside standards are separated in two discrete chromatographic peaks within the primary MS/MS transition window (precursor ion m/z: <258.1>; product ion m/z: <126.1>) (SI Appendix, Fig. S8B).
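The MS/MS transitions quoted above are consistent with protonated nucleoside precursors that lose the ribose moiety to give the protonated nucleobase. The sketch below reproduces the nominal m/z values from monoisotopic masses under that assumption; the fragmentation model and the helper names are illustrative and are not taken from the instrument method. The calculation also shows why m4C and m5C cannot be distinguished by this transition alone: both share the same elemental composition.

```python
# Monoisotopic atomic masses (u) and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

CYTIDINE = {"C": 9, "H": 13, "N": 3, "O": 5}   # C9H13N3O5
RIBOSE_LOSS = {"C": 5, "H": 8, "O": 4}         # assumed neutral loss on glycosidic cleavage

def mono_mass(formula: dict) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(MONO[element] * count for element, count in formula.items())

def transition(n_methyl: int) -> tuple:
    """Return ([M+H]+ precursor m/z, base-ion product m/z) for a methylated cytidine.

    Each methylation replaces H with CH3, a net addition of CH2.
    """
    formula = dict(CYTIDINE)
    formula["C"] += n_methyl
    formula["H"] += 2 * n_methyl
    precursor = mono_mass(formula) + PROTON
    product = precursor - mono_mass(RIBOSE_LOSS)
    return round(precursor, 1), round(product, 1)

for label, n in [("C", 0), ("m4C or m5C", 1), ("m42C", 2)]:
    print(label, transition(n))
```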
For site-specific analysis of H31, the isolated 60 nt fragment was digested with RNase T1 (Thermo Fisher Scientific, cat# EN0542) at 37 °C for 30 min. The resultant digest was subjected to analysis on a Q Exactive Plus or an Eclipse Fusion Orbitrap mass spectrometer (Thermo Fisher Scientific) coupled with a Vanquish Horizon UHPLC (Thermo Fisher Scientific) or Vanquish Neo UHPLC, respectively. Digested oligonucleotides were separated in a mobile phase gradient consisting of buffer A (1% hexafluoroisopropanol (HFIP), 0.1% N,N-diisopropylethylamine (DIEA), and 1 μM EDTA) and buffer B (90% methanol, 10% water, 0.075% HFIP, 0.0375% DIEA, and 1 μM EDTA) on a C18 column (Vanquish Horizon UHPLC: Waters ACQUITY Premier Oligonucleotide, 1.7 µm, 2.1 × 100 mm or Vanquish Neo UHPLC: Waters nanoEase M/Z Peptide BEH C18 column, 1.7 µm, 100 µm × 100 mm).
MS/MS spectra were acquired by both data-dependent acquisition and parallel reaction monitoring modes. Oligonucleotide mass spectrometry data on the Q Exactive Plus mass spectrometer were collected in negative ion mode with a spray voltage of 2,700 V, capillary temperature of 320 °C, auxiliary gas heater temperature of 300 °C, S-lens RF level of 55, sheath gas of 35 (arbitrary units) and auxiliary flow of 8 (arbitrary units). MS1 data were acquired at a resolution of 70 K, an AGC target of 1e6, a maximum IT of 200 ms, and scanning between 500 and 2,500 m/z. MS2 data were acquired at a resolution of 70 K, an AGC target of 1e5, a maximum IT of 240 ms, an isolation window of 1.5 m/z and a normalized HCD collision energy of 20%. Oligonucleotide mass spectrometry data on the Orbitrap Eclipse Fusion Tribrid mass spectrometer were collected in negative ion mode with a spray voltage of 2,500 V and ion transfer tube temperature of 275 °C using a nano bullet emitter (Thermo Fisher Scientific Cat #ES994). MS1 data were collected at a resolution of 120 K with automatic AGC and maximum IT settings, scanning between 450 and 2,000 m/z. MS2 data were collected at a resolution of 60 K with automatic AGC and maximum IT settings and stepped normalized HCD collision energies of 22, 24, and 26% with 2 microscans.
The resulting oligonucleotide fragmentation data were searched utilizing the Nucleic Acid Search Engine (NASE) with a 5% FDR, including only hits with a minimum spectral score above 100 to map methylated and dimethylated nucleotide residues within the H31 loop (SI Appendix, Supplementary File 1). Oligonucleotide intact mass deconvolution was performed utilizing BioPharmaFinder 5.1 (Thermo Fisher) utilizing the Xtract algorithm and ProMass HR software package (Novatia LLC, USA). Estimation of relative oligonucleotide abundance in enzymatic reactions was performed utilizing oligonucleotide mass intensity estimates computed by BioPharmaFinder 5.1.
Phenotypic Growth Analysis.
Individual cultures were passaged at least twice before being inoculated into 10 ml of rich medium with agmatine and sulfur in triplicate. Cultures were grown at 65, 75, 85, or 95 °C in a water or oil bath. The optical density at 600 nm (OD600) was measured for each culture every 60 or 120 min using UV-Vis spectroscopy. The OD600 and the SE were plotted in R using the ggplot2 function geom_smooth(method = "loess", se = TRUE).
Recombinant Protein Expression and Purification.
The gene sequences for TK2045 and its variant sequences were designed such that each end included 17 nt of homology to the expression vector. The 5’ end also included an E. coli ribosome binding site (AGGAGATAATTAA) before the start codon, and the 3’ end included a 6x histidine tag (6xHis, CACCATCACCATCACCAT) to aid in affinity purification. Each insert was cloned into a pQE80 expression vector at the EcoRI restriction site using In-Fusion cloning (Takara Bio., Cat # 638910). Expression vectors conferring ampicillin resistance were transformed into the Rosetta2 E. coli cell line. We note here that the pRARE plasmid in Rosetta2 cells, which overcomes codon bias, is required for the expression of TK2045 and confers resistance to chloramphenicol. Cells were grown on ampicillin (100 µg/ml)- and chloramphenicol (25 µg/ml)-containing solid LB medium. A single colony was picked into liquid broth containing the necessary antibiotics and grown overnight. Overnight cultures were used to inoculate larger cultures in a 1:100 ratio and grown at 37 °C with shaking. At an OD600 of ~0.3, cultures were spiked with isopropyl β-d-1-thiogalactopyranoside (IPTG, 400 µM final concentration) to induce protein expression and D-sorbitol (3% w/v final concentration) to improve protein solubility. Cultures were shaken at 37 °C for an additional 2 h before cells were harvested via centrifugation at 12,000×g for 10 min. Cell pellets were stored at −20 °C.
Cell pellets were thawed in buffer A (25 mM Tris-HCl pH 8.0, 500 mM NaCl, and 10% glycerol), completely resuspended, and sonicated on and off on ice for 30 min to lyse the cells. Cellular debris was removed by centrifugation at 12,000×g for 30 min. The supernatant was heated to 65 °C for 15 min to denature most of the E. coli proteins before centrifugation at 12,000×g for 30 min. The 6xHis-tagged protein in the supernatant was purified on an AKTA system using a 1 ml HiTrap Chelating HP column (Cytiva Life Sciences, cat# 17040901) charged with Ni2+. The column was washed with 50 column volumes of wash buffer (25 mM Tris-HCl pH 8.0, 500 mM NaCl, 10% glycerol, and 40 mM imidazole) to remove nonspecific proteins retained in the column. The protein was then eluted at increasing concentrations of buffer B. Protein elution typically peaked at 165 mM imidazole. Elution fractions were analyzed by SDS-PAGE (BioRad, cat# 5678095) under denaturing conditions followed by Western Blot with antibodies against the 6xHis tag (1°: Mouse anti-6x-His Tag Monoclonal antibody, Invitrogen, cat # MA1-21315; and 2°: Goat anti-Mouse Phosphatase, KPL, cat # 05-18-18). Fractions in which the target protein was cleanly eluted were pooled and dialyzed into storage buffer (25 mM Tris-HCl pH 7.5, 100 mM NaCl, and 50% glycerol v/v) and stored at −20 °C.
In Vitro Generation of RNA Substrates.
Unmodified 23 nt RNA substrates were purchased from IDT. Modified 46 nt substrates were purchased from TriLink. Longer substrates (200 nt, 300 nt, 400 nt, and 966 nt) were generated by in vitro transcription using an SP6 system (NEB cat# E2070S) according to the manufacturer’s protocol. RNA pellets were suspended in nuclease-free water and stored at −20 °C. Sequences for each RNA substrate are provided in SI Appendix, Table S2.
Purification of Ribosomes and rRNA.
Ribosomes were purified from the TS559 and ΔTK2045 strains. Cultures were grown in 2 l artificial sea water with yeast, tryptone, and pyruvate (ASW-YTP), supplemented with agmatine and sulfur, and grown anaerobically to an OD600 of ~0.5. ΔTK2045 was also grown in a 40 l batch in media lacking sulfur using a fermenter. Tightly coupled ribosomes were purified following the protocol outlined in ref. 19. Briefly, cells were resuspended in ribosome buffer 1 (10 mM MgCl2, 20 mM Tris-HCl pH 7.5, 300 mM NH4Cl, 0.5 mM EDTA, and 6 mM β-mercaptoethanol) and stored at −20 °C. The cell resuspension was subjected to three freeze-thaw cycles at −80 °C and 50 °C to lyse cells in the presence of DNase I. Cellular debris was removed by centrifugation at 21,000×g for 15 min. The supernatant was then centrifuged at 150,000×g for 1 h to pellet the crude ribosome fraction. Crude ribosome pellets were gently washed, fully resuspended in ribosome buffer 1, then layered onto a 30% sucrose cushion (ribosome buffer 1 with 30% (w/v) sucrose). The sample was centrifuged at 150,000×g for 18 h, and the supernatant was poured off. The ribosome pellet was resuspended in ribosome buffer 2 (1 mM MgCl2, 20 mM Tris-HCl pH 7.5, 300 mM NH4Cl, 0.5 mM EDTA, and 6 mM β-mercaptoethanol) and stored at −20 °C.
A portion of each ribosome preparation was subjected to organic extraction using phenol/chloroform/isoamyl alcohol (PCI, 25/24/1 v/v/v). An equal volume of PCI was added to the ribosome preparation and mixed vigorously by pulse vortexing. Samples were centrifuged at 21,000×g for 5 min, and the aqueous phase was transferred into 3x the volume of 100% ethanol. Samples were mixed by inversion and incubated at −20 °C for a minimum of 30 min before centrifugation at 21,000×g for 30 min. The supernatant was decanted, and rRNA pellets were gently washed with 100% ethanol, aspirated completely, and air dried for no more than 5 min before being resuspended in nuclease-free water. rRNA preparations were stored at −20 °C. All RNA preparations (including ribosome preparations) were quantified using the Qubit RNA BR kit.
Methyltransferase Activity Assay.
The ability of recombinant m42C synthase (rTK2045) to methylate synthetic RNA, rRNA, or ribosomes was measured in vitro. Assays presented in Fig. 2A and SI Appendix, Fig. S2 were performed in 20 µl reactions that included 6 or 3 µg RNA, respectively. Also included were 1 µM enzyme, ~1 µM [3H-methyl]-SAM (PerkinElmer, cat# NET155V001MC), and 1X MTase buffer (25 mM Tris-HCl pH 7.5, 100 mM NaCl, 1 mM MgCl2, and 1 mM DTT). Negative control reactions in which one component was replaced with water were performed in parallel. Reactions were incubated at 65 °C for 45 min in biological triplicate before 15 µl of each reaction was transferred to a 1 square inch positively charged nylon membrane (Cytiva, cat # RPN2020B). Midpoint and time course assays (Fig. 4) were performed in triplicate where a 1:20 enzyme-to-substrate ratio was achieved using 50 nM enzyme and 1 µM substrate. Time course assays were performed at the same enzyme and substrate ratios but in 80 µl of total volume, from which 10 µl was drawn at each time point.
Each nylon membrane was air dried and washed in 300 ml 5% w/v trichloroacetic acid (TCA) for 10 min to remove free 3H-SAM while 3H-methylated RNA remained bound to the membrane. The wash was performed 5 times, each time replacing the TCA. Membranes were air dried and transferred to individual vials containing 3 to 4 ml Ecoscint scintillation cocktail fluid. The β-decay of each reaction was measured in counts per minute (CPM) using the Liquid Scintillation counter TRI-CARB 2900TR (Packard). CPMs between reactions were compared and P-values were calculated using a two-sample unpaired t test.
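For readers unfamiliar with the comparison, the sketch below computes the test statistic for a two-sample unpaired t test on triplicate CPM values. The text does not state whether equal variances were assumed; this example assumes the pooled-variance (Student) form, and the CPM values are hypothetical. Converting the statistic to a P-value requires the t distribution, which is left to a statistics package.

```python
import math
from statistics import mean, variance

def students_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom.

    Assumes equal variances in both groups (an assumption of this sketch).
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical CPM readings for a complete reaction and a no-enzyme control.
reaction_cpm = [5200.0, 4800.0, 5000.0]
control_cpm = [310.0, 290.0, 300.0]
t_stat, dof = students_t(reaction_cpm, control_cpm)
print(f"t = {t_stat:.2f}, df = {dof}")
```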
Single Crystal Structure of m42C Synthase.
Recombinant protein was dialyzed into crystallography buffer (25 mM Tris-HCl pH 8.0, 50 mM NaCl, and 1 mM DTT), concentrated to 2 mg/ml, and screened at the Hauptman-Woodward Medical Research Institute Crystallization Screening Center. The mother liquor that resulted in crystal growth contained 0.5 M AMPD pH 9.2 and 15% (v/v) PEG 3350, and crystals emerged overnight at 4 °C. Crystal growth was recapitulated using 1:1 and 1:2 ratios of the mother liquor and protein via the hanging drop vapor diffusion method in 24-well plates where AMPD pH (8.7 to 9.7) (Hampton Research, cat # HR2-254) and PEG 3350 concentrations (10 to 20% v/v) (Hampton Research, cat # HR2-527) were titrated. Crystals were grown at 4 °C and collected from wells containing 0.5 M AMPD pH 9.3 and 10% v/v PEG 3350 where a 1:2 mother liquor-to-protein ratio was used. Crystals were stored in liquid nitrogen, and diffraction data were collected under a cold nitrogen stream at the Advanced Light Source of the Berkeley Lab, Macromolecular Crystallography Beamline (MBC 4.2.2).
The structure was solved by molecular replacement using the polypeptide backbone of an AlphaFold predicted structure as the initial search model. Side chains were placed manually in the calculated electron densities during refinement. The structure was solved to 1.95 Å resolution with final Rwork = 23.8% and Rfree = 28.6%.
Structure Analysis.
The functions of amino acid residues and domains of the m42C synthase structure were annotated using the EMBL-EBI InterPro webserver (45). Cytidine and SAM were docked into the putative active site pocket based on homologous positions of GTP and SAH in TGS1 (PDB: 3GHD), using the “align” function in PyMOL. Electrostatic potential surfaces were calculated using the APBS Electrostatics plugin for PyMOL.
Homology and Orthology.
The amino acid sequence corresponding to TK2045 was subjected to a sequence similarity search using the BlastP function through the Kyoto Encyclopedia of Genes and Genomes (KEGG) web database. The top 100 hits were cataloged as close homologs, each with an e-value of ≤1e−47. A phylogenetic tree of the closest match from each unique genus was generated on KEGG. Within the Clusters of Orthologous Genes (COG) and arCOG databases, TK2045 corresponds to COG2521 and arCOG00054, respectively. Each ortholog was cataloged. Alignments of nonredundant homologs and orthologs were performed using BlastP multiple sequence alignment through NCBI and visualized on NCBI MSA viewer. Amino acids are colored according to the “conservation” method, which highlights highly conserved and less conserved amino acid positions based on the relative entropy threshold of the residue. Only alignment positions with no gaps are colored. Red indicates highly conserved positions and blue indicates lower conservation.
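The notion of an entirely conserved, gap-free alignment position can be made concrete with a short sketch. This is not the NCBI MSA viewer's relative-entropy coloring, whose exact implementation is not described here; it only identifies columns where every sequence carries the same residue. The toy alignment and function name are hypothetical.

```python
from typing import List

def fully_conserved_columns(alignment: List[str], gap: str = "-") -> List[int]:
    """Return 0-based indices of columns that are gap-free and identical in all rows."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {row[i] for row in alignment}
        # A single distinct symbol that is not a gap means full conservation.
        if len(column) == 1 and gap not in column:
            conserved.append(i)
    return conserved

# Toy alignment (hypothetical sequences, not real homologs).
toy = ["MDPPRK", "ADPPYK", "M-PPRK"]
print(fully_conserved_columns(toy))
```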
Supplementary Material
Appendix 01 (PDF)
Dataset S01 (XLSX)
Acknowledgments
We thank members of the NSF-funded archaeal epitranscriptomics consortium for helpful comments that improved the manuscript and figures. This work was supported by funding from the US NSF, award #2022065 (to T.J.S.) and #MCB-2124202 (to P.S.H.), and the US NIH, R35-GM143963 (to T.J.S.). This study was also privately funded by New England Biolabs, Inc. Authors R.T.F., Y.-L.T., N.D., E.J.W., I.R.C., and G.B.R. are employees of New England Biolabs, Inc. K.A.F. was supported by a T32 training grant from the NIH, GM132057, and a departmental GAANN fellowship.
Author contributions
K.A.F., N.D., R.T.F., and T.J.S. designed research; K.A.F., N.D., E.J.W., R.T.F., P.S.H., V.T., L.E., Y.-L.T., J.S., H.P.F., and R.C. performed research; K.A.F. contributed new reagents/analytic tools; K.A.F., N.D., R.T.F., P.S.H., Y.-L.T., I.R.C., and T.J.S. analyzed data; G.B.R. supervised efforts at New England Biolabs, Inc.; and K.A.F., P.S.H., G.B.R., I.R.C., and T.J.S. wrote the paper.
Competing interests
K.A.F., P.S.H., V.T., L.E., J.S., H.P.F., R.C., and T.J.S. do not have any competing financial interests or conflicts of interest to report. N.D., E.J.W., R.T.F., Y.-L.T., G.B.R., and I.R.C. are employed and funded by New England Biolabs, Inc., a manufacturer and vendor of molecular biology reagents, including nucleic acid modifying and synthesis enzymes. The authors state that this affiliation does not affect their impartiality, objectivity of data generation or interpretation, adherence to journal standards and policies, or availability of data.
Footnotes
This article is a PNAS Direct Submission. S.J.B. is a guest editor invited by the Editorial Board.
Contributor Information
Ivan R. Corrêa, Jr., Email: correa@neb.com.
Thomas J. Santangelo, Email: thomas.santangelo@colostate.edu.
Data, Materials, and Software Availability
Bisulfite-sequencing fastq files have been deposited into NCBI Sequence Read Archive under BioProject PRJNA937301 (46). Whole genome sequencing fastq files have been deposited into NCBI Sequence Read Archive under BioProject PRJNA1154289 (47). The atomic structure of m42C synthase has been deposited into Protein Data Bank under accession code 8VYD (48). The atomic structure of the T. kodakarensis 70S ribosome was queried from Protein Data Bank under accession code 6TH6 (49).
References
- 1. Martinez-Miguel V. E., et al., Increased fidelity of protein synthesis extends lifespan. Cell Metab. 33, 2288–2300.e12 (2021).
- 2. Evans C. R., Fan Y., Ling J., Increased mistranslation protects E. coli from protein misfolding stress due to activation of a RpoS-dependent heat shock response. FEBS Lett. 593, 3220–3227 (2019).
- 3. Orita I., et al., Random mutagenesis of a hyperthermophilic archaeon identified tRNA modifications associated with cellular hyperthermotolerance. Nucleic Acids Res. 47, 1964–1976 (2019).
- 4. Shigi N., et al., Temperature-dependent biosynthesis of 2-thioribothymidine of Thermus thermophilus tRNA. J. Biol. Chem. 281, 2104–2113 (2006).
- 5. Droogmans L., et al., Cloning and characterization of tRNA (m1A58) methyltransferase (TrmI) from Thermus thermophilus HB27, a protein required for cell growth at extreme temperatures. Nucleic Acids Res. 31, 2148–2156 (2003).
- 6. Hirata A., et al., Distinct modified nucleosides in tRNATrp from the hyperthermophilic archaeon Thermococcus kodakarensis and requirement of tRNA m2G10/m22G10 methyltransferase (Archaeal Trm11) for survival at high temperatures. J. Bacteriol. 201, e00448-19 (2019). 10.1128/jb.00448-19.
- 7. Sas-Chen A., et al., Dynamic RNA acetylation revealed by quantitative cross-evolutionary mapping. Nature 583, 638–643 (2020).
- 8. Kowalak J. A., Dalluge J. J., McCloskey J. A., Stetter K. O., The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochemistry 33, 7869–7876 (1994).
- 9. Sharma S., Lafontaine D. L. J., “View from A Bridge”: A new perspective on eukaryotic rRNA base modification. Trends Biochem. Sci. 40, 560–575 (2015).
- 10. Liang X.-H., Liu Q., Fournier M. J., Loss of rRNA modifications in the decoding center of the ribosome impairs translation and strongly delays pre-rRNA processing. RNA 15, 1716–1728 (2009).
- 11. Fluke K. A., et al., The extensive m5C epitranscriptome of Thermococcus kodakarensis is generated by a suite of RNA methyltransferases that support thermophily. Nat. Commun. 15, 7272 (2024).
- 12. Tomikawa C., Yokogawa T., Kanai T., Hori H., N7-Methylguanine at position 46 (m7G46) in tRNA from Thermus thermophilus is required for cell viability at high temperatures through a tRNA modification network. Nucleic Acids Res. 38, 942–957 (2010).
- 13. Lui L. M., et al., Methylation guide RNA evolution in archaea: Structure, function and genomic organization of 110 C/D box sRNA families across six Pyrobaculum species. Nucleic Acids Res. 46, 5678–5691 (2018).
- 14. Sloan K. E., et al., Tuning the ribosome: The influence of rRNA modification on eukaryotic ribosome biogenesis and function. RNA Biol. 14, 1138–1152 (2017).
- 15. Saraiya A. A., Lamichhane T. N., Chow C. S., SantaLucia J. Jr., Cunningham P. R., Identification and role of functionally important motifs in the 970 loop of Escherichia coli 16S ribosomal RNA. J. Mol. Biol. 376, 645–657 (2008).
- 16. Kowalak J. A., Bruenger E., Crain P. F., McCloskey J. A., Identities and phylogenetic comparisons of posttranscriptional modifications in 16 S ribosomal RNA from Haloferax volcanii. J. Biol. Chem. 275, 24484–24489 (2000).
- 17. Lamichhane T. N., Abeydeera N. D., Duc A.-C.E., Cunningham P. R., Chow C. S., Selection of peptides targeting helix 31 of bacterial 16S ribosomal RNA by screening M13 phage-display libraries. Molecules 16, 1211–1239 (2011).
- 18. McIntyre W., et al., Positive-sense RNA viruses reveal the complexity and dynamics of the cellular and viral epitranscriptomes during infection. Nucleic Acids Res. 46, 5776–5791 (2018).
- 19. Mehta P., Woo P., Venkataraman K., Karzai A. W., Ribosome purification approaches for studying interactions of regulatory proteins and RNAs with the ribosome. Methods Mol. Biol. 905, 273–289 (2012).
- 20. Monecke T., Dickmanns A., Ficner R., Structural basis for m7G-cap hypermethylation of small nuclear, small nucleolar and telomerase RNA by the dimethyltransferase TGS1. Nucleic Acids Res. 37, 3865–3877 (2009).
- 21. Hausmann S., Shuman S., Specificity and mechanism of RNA cap guanine-N2 methyltransferase (Tgs1). J. Biol. Chem. 280, 4021–4024 (2005).
- 22. Liu R.-J., Long T., Li J., Li H., Wang E.-D., Structural basis for substrate binding and catalytic mechanism of a human RNA:m5C methyltransferase NSun6. Nucleic Acids Res. 45, 6684–6697 (2017).
- 23. Bujnicki J. M., Feder M., Ayres C. L., Redman K. L., Sequence-structure-function studies of tRNA:m5C methyltransferase Trm4p and its relationship to DNA:m5C and RNA:m5U methyltransferases. Nucleic Acids Res. 32, 2453–2463 (2004).
- 24. Holm L., Laiho A., Törönen P., Salgado M., DALI shines a light on remote homologs: One hundred discoveries. Protein Sci. 32, e4519 (2023).
- 25. Qiu S., Ogino M., Luo M., Ogino T., Green T. J., Structure and function of the N-terminal domain of the vesicular stomatitis virus RNA polymerase. J. Virol. 90, 715–724 (2016).
- 26. Galperin M. Y., et al., COG database update: Focus on microbial diversity, model organisms, and widespread pathogens. Nucleic Acids Res. 49, D274–D281 (2020).
- 27. Ohira T., et al., Reversible RNA phosphorylation stabilizes tRNA for cellular thermotolerance. Nature 605, 372–379 (2022).
- 28. Navarro I. C., et al., Identification of putative reader proteins of 5-methylcytosine and its derivatives in Caenorhabditis elegans RNA. Wellcome Open Res. 7, 282 (2022).
- 29. Tang Y., et al., OsNSUN2-mediated 5-methylcytosine mRNA modification enhances rice adaptation to high temperature. Dev. Cell 53, 272–286.e7 (2020).
- 30. Huber S. M., Leonardi A., Dedon P. C., Begley T. J., The versatile roles of the tRNA epitranscriptome during cellular responses to toxic exposures and environmental stress. Toxics 7, 17 (2019).
- 31. Turner B., et al., Archaeosine modification of archaeal tRNA: Role in structural stabilization. J. Bacteriol. 202, e00748-19 (2020).
- 32. Mao S., et al., Base pairing, structural and functional insights into N4-methylcytidine (m4C) and N4,N4-dimethylcytidine (m42C) modified RNA. Nucleic Acids Res. 48, 10087–10100 (2020).
- 33. Dong H., Zhang B., Shi P.-Y., Flavivirus methyltransferase: A novel antiviral target. Antiviral Res. 80, 1–10 (2008).
- 34. Hayatsu H., Negishi K., Shiraishi M., DNA methylation analysis: Speedup of bisulfite-mediated deamination of cytosine in the genomic sequencing procedure. Proc. Jpn. Acad. Ser. B Phys. Biol. Sci. 80, 189–194 (2004).
- 35. Huo W., Adams H. M., Zhang M. Q., Palmer K. L., Genome modification in Enterococcus faecalis OG1RF assessed by bisulfite sequencing and single-molecule real-time sequencing. J. Bacteriol. 197, 1939–1951 (2015).
- 36. Boccaletto P., et al., MODOMICS: A database of RNA modification pathways. 2021 update. Nucleic Acids Res. 50, D231–D235 (2021).
- 37. Kimura S., Suzuki T., Fine-tuning of the ribosomal decoding center by conserved methyl-modifications in the Escherichia coli 16S rRNA. Nucleic Acids Res. 38, 1341–1352 (2010).
- 38. Wei Y., et al., Crystal and solution structures of methyltransferase RsmH provide basis for methylation of C1402 in 16S rRNA. J. Struct. Biol. 179, 29–40 (2012).
- 39. Sibbritt T., Patel H. R., Preiss T., Mapping and significance of the mRNA methylome. Wiley Interdiscip. Rev. RNA 4, 397–422 (2013).
- 40. Bheemanaik S., Reddy Y. V. R., Rao D. N., Structure, function and mechanism of exocyclic DNA methyltransferases. Biochem. J. 399, 177–190 (2006).
- 41. Scott K. A., Williams S. A., Santangelo T. J., Thermococcus kodakarensis provides a versatile hyperthermophilic archaeal platform for protein expression. Methods Enzymol. 659, 243–273 (2021).
- 42. Hileman T. H., Santangelo T. J., Genetics techniques for Thermococcus kodakarensis. Front. Microbiol. 3, 195 (2012).
- 43. Gehring A. M., Sanders T. J., Santangelo T. J., Markerless gene editing in the hyperthermophilic archaeon Thermococcus kodakarensis. Bio. Protoc. 7, e2604 (2017).
- 44. Morlan J. D., Qu K., Sinicropi D. V., Selective depletion of rRNA enables whole transcriptome profiling of archival fixed tissue. PLoS One 7, e42882 (2012).
- 45. Paysan-Lafosse T., et al., InterPro in 2022. Nucleic Acids Res. 51, D418–D427 (2023).
- 46. Fluke K. A., Thermococcus kodakarensis bisulfite-sequenced RNA. NCBI Sequence Read Archive. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA937301. Deposited 21 February 2023.
- 47. Fluke K. A., Thermococcus kodakarensis TK2045 project, whole genome sequence reads. NCBI Sequence Read Archive. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1154289. Deposited 29 August 2024.
- 48. Ho P. S., Santangelo T. J., Fluke K. A., 8VYD: A novel synthase generates m4(2)C to stabilize the archaeal ribosome. Protein Data Bank. https://www.rcsb.org/structure/8VYD. Deposited 8 February 2024.
- 49. Matzov D., et al., 6TH6: Cryo-EM Structure of T. kodakarensis 70S ribosome. Protein Data Bank. https://www.rcsb.org/structure/6TH6. Deposited 18 November 2019.
