Proceedings of the National Academy of Sciences of the United States of America. 2009 Apr 28;106(19):7768–7773. doi: 10.1073/pnas.0900528106

A chemical screen for biological small molecule–RNA conjugates reveals CoA-linked RNA

Walter E Kowtoniuk 1, Yinghua Shen 1, Jennifer M Heemstra 1, Isha Agarwal 1, David R Liu 1,1
PMCID: PMC2674394  PMID: 19416889

Abstract

Compared with the rapidly expanding set of known biological roles for RNA, the known chemical diversity of cellular RNA has remained limited primarily to canonical RNA, 3′-aminoacylated tRNAs, nucleobase-modified RNAs, and 5′-capped mRNAs in eukaryotes. We developed two methods to detect in a broad manner chemically labile cellular small molecule–RNA conjugates. The methods were validated by the detection of known tRNA and rRNA modifications. The first method analyzes small molecules cleaved from RNA by base or nucleophile treatment. Application to Escherichia coli and Streptomyces venezuelae RNA revealed an RNA-linked hydroxyfuranone or succinyl ester group, in addition to a number of other putative small molecule–RNA conjugates not previously reported. The second method analyzes nuclease-generated mononucleotides before and after treatment with base or nucleophile and also revealed a number of new putative small molecule–RNA conjugates, including 3′-dephospho-CoA and its succinyl-, acetyl-, and methylmalonyl-thioester derivatives. Subsequent experiments established that these CoA species are attached to E. coli and S. venezuelae RNA at the 5′ terminus. CoA-linked RNA cannot be generated through aberrant transcriptional initiation by E. coli RNA polymerase in vitro, and CoA-linked RNA in E. coli is only found among smaller (≲200 nucleotide) RNAs that have yet to be identified. These results provide examples of small molecule-RNA conjugates and suggest that the chemical diversity of cellular RNA may be greater than previously understood.

Keywords: mass spectrometry, RNA modifications, coenzyme A


Over the past few decades, RNA has emerged as much more than an intermediary in biology's central dogma. Ribozymes (1), riboswitches (2), microRNAs (miRNAs) (3), small interfering RNAs (siRNAs) (4), Piwi-interacting RNAs (piRNAs) (5), small nuclear RNAs (snRNAs) (6), CRISPR sRNAs (7), RNA transcriptional regulators (8), and long noncoding RNAs (9, 10) are all examples of RNAs that are thought to play a wide range of catalytic, regulatory, or defensive roles in the cell. Models of early biotic systems have proposed even broader roles for RNA, including the possibility that RNA-tethered molecules participated in RNA-templated chemical reactions as an early form of metabolism (11–16).

In contrast with these newer insights into its functional diversity, the known chemical diversity of natural RNA has remained limited primarily to canonical polyribonucleotides, 3′-aminoacylated tRNAs (17), modified nucleobases in a variety of RNAs (18), and 5′-capped mRNAs in eukaryotes (19, 20). This disparity between functional and chemical diversity, coupled with the powerful functional properties of synthetic small molecule-nucleic acid conjugates (21–24), led us to speculate that small molecule–RNA conjugates beyond those previously described may exist in modern cells as evolutionary fossils or even as novel RNAs with functions enabled by their modifications.

To begin to explore this possibility, we have developed and implemented a general approach to discovering small molecule-RNA conjugates that does not depend on a specific type of small-molecule structure or a particular biological function of the conjugate. Our method uses simple chemical reactions on RNA to liberate small molecules or small molecule-conjugated nucleotides. Comparative high-resolution liquid chromatography and mass spectrometry (LC/MS) of these species identifies the masses of labile small molecules or small molecule-nucleotide conjugates that are putatively linked to cellular RNA. MS/MS fragmentation, isotope labeling, and comparison with authentic standards are then used to elucidate the structures of small molecules derived from conjugates with biological RNAs.

Using this approach, we have identified CoA and several CoA thioesters as covalent conjugates to cellular RNA in Escherichia coli and Streptomyces venezuelae. Experiments with E. coli RNA polymerase in vitro suggest that the observed CoA groups are not installed through aberrant, nonspecific transcriptional initiation. In addition, experiments indicate that the CoA-derived RNA(s) are ≲200 nt in length. Although the identity of the corresponding RNA(s) and their possible biological relevance are not yet known, our findings collectively suggest that the chemical diversity of biological RNA in modern cells is greater than previously understood.

Results

Small-Molecule Cleavage Method Detects Known RNA Modifications.

We subjected whole cellular RNA from E. coli or S. venezuelae to size-exclusion chromatography and retained the macromolecular fraction (≳2,500 Da). One half of the resulting material was treated with mild aqueous base (pH 8.0) or with a simple alkyl amine nucleophile (500 mM n-butylamine in acetonitrile) to cleave base-labile and nucleophile-labile small molecules, respectively. The other half was subjected to control conditions (pH 4.5, or acetonitrile with no n-butylamine, respectively) designed to leave small molecule–RNA conjugates intact (Fig. 1).

Fig. 1.


Small-molecule cleavage method for small molecule–RNA conjugate discovery.

Each sample was separately subjected to size-exclusion chromatography as before, but this time the small-molecule fraction (≲2,500 Da) was retained. The cleavage condition and control condition samples were then analyzed by LC/MS. Peaks with corresponding retention times containing species with similar mass:charge ratios (m/z) from the two samples were computationally paired, and their relative abundances were calculated (25). Species that were more abundant in the base- or nucleophile-treated sample relative to the control sample were considered candidate small molecules cleaved from cellular macromolecules (Fig. 1). This comparative analysis proved essential because each separate sample contained thousands of detectable chemical species.
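The pairing step described above can be illustrated with a minimal sketch. This is not the software cited in the text (25); the `Peak` class, the `pair_peaks` function, the tolerance values, and the abundance floor are all hypothetical choices made only to show the idea of matching peaks by m/z and retention time and then computing a treated:control abundance ratio.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mz: float         # mass:charge ratio
    rt: float         # retention time (minutes)
    abundance: float  # integrated ion count


def pair_peaks(treated, control, mz_tol_ppm=10.0, rt_tol=0.2, floor=1.0):
    """Pair each treated-sample peak with the closest control peak.

    Tolerances and the abundance floor are illustrative assumptions,
    not values taken from the paper. Returns (peak, fold_enrichment) tuples.
    """
    results = []
    for t in treated:
        # Control peaks whose m/z and retention time both fall within tolerance.
        matches = [
            c for c in control
            if abs(c.mz - t.mz) / t.mz * 1e6 <= mz_tol_ppm
            and abs(c.rt - t.rt) <= rt_tol
        ]
        if matches:
            best = min(matches, key=lambda c: abs(c.rt - t.rt))
            control_abundance = max(best.abundance, floor)
        else:
            # No counterpart in the control sample: use the floor value
            # so the ratio stays finite.
            control_abundance = floor
        results.append((t, t.abundance / control_abundance))
    return results


# Hypothetical peaks for illustration only.
treated = [Peak(166.0863, 5.00, 6500.0), Peak(300.0000, 7.00, 100.0)]
control = [Peak(166.0864, 5.05, 100.0)]
paired = pair_peaks(treated, control)
```

Species with a high ratio in the treated sample would then be carried forward as candidates, which is what makes the comparison tractable when each sample contains thousands of detectable species.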

To account for the possible presence of contaminating non-RNA macromolecules in our RNA preparations, we also pretreated a third whole cellular RNA sample with a mixture of RNase A and RNase T1 before the first size-exclusion step to confirm that candidate small molecules arose from a small molecule–RNA conjugate rather than from small molecules conjugated to other macromolecules. The ion abundance of a genuine small molecule-RNA conjugate, but not that of a contaminating small molecule-macromolecule conjugate, should decrease significantly in samples pretreated with RNase. Small molecules that were more abundant on treatment with base or nucleophile, but that were less abundant when pretreated with RNase, were considered candidate small molecules cleaved from cellular RNA.

The small-molecule cleavage method was validated in vitro by using aminoacylated tRNAs as positive controls. Phenylalanine-charged tRNAPhe was prepared in vitro from purified phenylalanine-tRNA aminoacyl synthetase (PheRS) and E. coli tRNA (26). Comparative high-resolution LC/MS analysis revealed a 65-fold greater abundance of phenylalanine arising from base cleavage conditions compared with the control conditions (Fig. 2A). Similarly, 52-fold and 42-fold ratios of amino acid abundance in base-treated versus control samples were observed with tRNA charged in vitro with LeuRS and AspRS, respectively. These results demonstrate that the small-molecule cleavage method is able to detect aminoacylated tRNAs generated in vitro.

Fig. 2.


Initial application of the small-molecule cleavage method and structural elucidation of [M+H]+ m/z = 101.023. (A) Purified E. coli tRNA was aminoacylated in vitro with phenylalanine and PheRS, then subjected to the small-molecule cleavage method and analyzed by LC/MS. The extracted ion chromatogram (EIC) at [M+H]+ m/z = 166.08 (corresponding to phenylalanine) for both samples is shown here; the cleavage conditions (pH 8.0) result in 65-fold higher Phe abundance than the control conditions (pH 4.5). (B) EIC for the experiment in A using total RNA isolated from E. coli instead of aminoacylated tRNA. (C) The EICs for the unknown ion of [M+H]+ m/z = 101.0232 from E. coli RNA subjected to pH 8.0 cleavage conditions, pH 4.5 control conditions, or pretreatment with RNase A and RNase T1 before pH 8.0 cleavage conditions. (D) Possible carboxylic acids of the formula C4H4O3, excluding ketenes, allenes, allene oxides, and oxocyclopropanes. (E) Coinjection of the n-butyl amide variants of candidates 1, 2, or 3 (compounds 9, 10, or 11, respectively) with the cellular butyl amide reveals that the cellular butyl amide matches hydroxyfuranone butyl amide 11. (F) MS/MS fragmentation of cellular n-butyl amide (Top) and synthetic 12 (Bottom) confirms that the [M+H]+ m/z = 101.023 species is the hydroxyfuranone 4 or its aqueous tautomers.

Next, we validated the small-molecule cleavage method by detecting amino acids conjugated to endogenous cellular RNA. Freshly isolated RNA from E. coli was subjected to the small-molecule cleavage method (Fig. 2B). To narrow the resulting list of candidates, we defined three criteria: (i) ≥4.5-fold enrichment in the base-treated samples relative to the control samples; (ii) ≥3.0-fold enrichment of a corresponding butyl amine addition product in the nucleophile-treated samples relative to their control samples; and (iii) ≥2-fold lower base enrichment values on pretreatment with RNases A and T1. These thresholds were empirically determined to be sufficiently low to enable detection of most of the amino acid positive controls, while sufficiently high to avoid most false positives such as canonical mononucleotides predicted to be stable to the base and nucleophiles used.
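The three criteria above can be expressed as a simple filter. The sketch below is illustrative only: the function name and argument names are hypothetical, and criterion (iii) is interpreted here as "the base enrichment after RNase pretreatment is at most half of the enrichment without pretreatment", which is one reading of the text rather than a confirmed implementation detail.

```python
def is_candidate(base_fold, amine_fold, base_fold_after_rnase):
    """Apply the three enrichment criteria to one species.

    base_fold: base-treated vs. control abundance ratio
    amine_fold: ratio for the butylamine addition product
    base_fold_after_rnase: base enrichment after RNase A/T1 pretreatment
    """
    meets_base = base_fold >= 4.5                # criterion (i)
    meets_amine = amine_fold >= 3.0              # criterion (ii)
    # Criterion (iii), as interpreted here: enrichment drops at least 2-fold.
    meets_rnase = base_fold_after_rnase <= base_fold / 2.0
    return meets_base and meets_amine and meets_rnase
```

Requiring all three conditions at once is what lets the thresholds stay fairly permissive individually while still excluding species, such as canonical mononucleotides, that should be stable to the treatments.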

When this approach was applied to E. coli and S. venezuelae RNA, 14 of the 20 aa (70%) were found to meet all three of the above criteria (supporting information (SI) Appendix, Fig. S4). The successful detection of the majority of the amino acids conjugated to RNA validates the ability of the small-molecule cleavage method to detect the presence of known small molecule–RNA conjugates from whole cellular RNA.

Small-Molecule Cleavage Method Detects Unknown RNA Conjugates.

In addition to the expected amino acids and known labile nucleobase modifications (data not shown), the small-molecule cleavage method applied to E. coli RNA also revealed 5 unknown species that met all three of the above criteria for putative base- or nucleophile-labile small molecule–RNA conjugates (SI Appendix, Fig. S5). When S. venezuelae RNA was subjected to the same treatment, 14 aa and 5 unknown species were found to meet the same 3 criteria. The smallest of these unknown species, with [M+H]+ m/z = 101.0232, was found in both E. coli and S. venezuelae. Independent biological replicates of total E. coli RNA subjected to this method generated consistent enrichment factors with trial-to-trial correlation coefficients of ≈0.85; similarly, two independent trials of the small-molecule cleavage method applied to S. venezuelae RNA produced datasets with a correlation coefficient of 0.87.
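For readers who want to reproduce this kind of replicate comparison, a trial-to-trial correlation coefficient can be computed as sketched below. The text does not state how the coefficients were calculated (for example, whether enrichment factors were log-transformed first), so the plain Pearson formula and the example values are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical enrichment factors for the same species in two trials.
trial_1 = [12.0, 6.5, 30.0, 4.8, 9.1]
trial_2 = [10.5, 7.2, 26.0, 5.5, 11.0]
r = pearson_r(trial_1, trial_2)
```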

A library of synthetic RNA 45-mers of random sequence was subjected to the complete RNA isolation and small-molecule cleavage method described above. None of the putative small molecule–RNA conjugates arising from the analysis of E. coli or S. venezuelae RNA were significantly enriched when using synthetic RNA. These results indicate that the putative small-molecule RNA conjugates from bacterial RNA arise from cellular processes, and not from RNA degradation or rearrangement reactions that occur during the small-molecule cleavage method.

Structural Elucidation of m/z = 101.0232.

We chose the [M+H]+ m/z = 101.0232 species (Fig. 2C and SI Appendix, Fig. S6) as our initial target for structural elucidation. The observed mass and isotope profile of the base-cleaved species suggested a molecular formula of C4H4O3 (expected [M+H]+ m/z = 101.0233). A corresponding n-butylamine-treated cleavage product was observed with [M+H]+ m/z = 156.1013, suggesting that this unidentified species could arise from an RNA-linked oxyester or thioester that undergoes hydrolysis in the presence of mild base to form a carboxylic acid, and aminolysis in the presence of n-butylamine to form an n-butylamide.

In addition, analysis of the n-butylamine-treated RNA from both E. coli and S. venezuelae revealed the presence of a species consistent with the double addition of n-butylamine to the C4H4O3 unknown and the loss of two molecules of water (observed [M+H]+ m/z = 211.1826; expected m/z for C12H23N2O = 211.1805; see also SI Appendix). Excluding ketenes, allenes, allene oxides, and oxocyclopropanes because of their rarity among biological molecules, we proposed 7 possible carboxylic acids consistent with the molecular formula C4H4O3, of which only three (molecules 1, 2, and 3) are capable of undergoing a second n-butylamine addition with loss of water (Fig. 2D).
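The expected m/z values quoted above follow from monoisotopic atomic masses. The sketch below is a worked check, not the authors' software: it uses standard monoisotopic masses, a hypothetical `mz_protonated` helper, and the neutral formula C12H22N2O, which is inferred here by removing one proton from the quoted ion formula C12H23N2O.

```python
import re

# Standard monoisotopic masses (Da).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}
PROTON = 1.00727647  # mass of a proton (Da)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral formula such as 'C4H4O3'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula):
    """Expected m/z of the singly charged [M+H]+ ion."""
    return neutral_mass(formula) + PROTON


acid_mz = mz_protonated("C4H4O3")             # base-cleaved species
double_adduct_mz = mz_protonated("C12H22N2O")  # double n-butylamine addition
```

Rounded to four decimal places, these reproduce the expected values given in the text (101.0233 and 211.1805).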

Authentic samples of the n-butyl amides of 1, 2, and 3 (n-butyl amides 9, 10, and 11, respectively) were prepared by chemical synthesis (SI Appendix, Figs. S1–S3). The LC/MS spectrum of the cellular n-butyl amide of [M+H]+ m/z = 101.0232 did not match those of synthetic n-butyl amides 9 or 10, indicating that the unknown is not ketone 1 or trans-alkene 2 (Fig. 2E). The remaining candidate, compound 3, preferentially exists as the hydroxyfuranone 4, which can spontaneously tautomerize in aqueous solution to form succinic anhydride (27, 28). The n-butyl amide of 4 (compound 11) was synthesized and found to spontaneously isomerize to n-butyl succinimide (12). LC/MS analysis revealed that 12 matched the n-butyl amide of the cellular unknown. Importantly, the MS/MS ion fragmentation patterns of 12 and the n-butyl amide of the cellular unknown were virtually identical (Fig. 2F). Taken together, these results are consistent with a model in which the observed 100.0154 Da base-cleaved species is hydroxyfuranone 4 or its tautomer, succinic anhydride (SI Appendix, Fig. S7).

Subsequent experiments did not reveal any RNA nucleotides directly conjugated to a 100.0154-Da small molecule (see below). Therefore, we hypothesized that 4 is not directly conjugated to RNA, but instead arises from a base- and nucleophile-labile succinyl group of a larger small molecule–RNA conjugate, the identity of which we sought to reveal by using a method capable of detecting small molecule-linked nucleotides.

Nucleotide Cleavage Method Detects Known Small Molecule-Nucleotide Conjugates.

The small-molecule cleavage method described above can reveal labile small molecules that are directly or indirectly conjugated to RNA, but does not characterize intact small molecule-nucleotide conjugates as they might exist in cellular RNA. We therefore developed a complementary method to identify nucleotides conjugated to base-labile or nucleophile-labile small molecules. In this second method, the macromolecular fraction of whole cellular RNA was treated with nuclease P1, an endonuclease that cleaves RNA to generate mononucleotides with a 3′-hydroxyl group and a 5′-phosphate (29). As in the first method, one half of the resulting sample was treated with base (pH 10.5) or nucleophile (500 mM n-butylamine in acetonitrile) and the other half was treated with control conditions (pH 4.5, or acetonitrile, respectively). The samples were subjected to size-exclusion chromatography again and the small-molecule fraction from each was retained. Following comparative high-resolution LC/MS, species with greater abundance in control samples versus the base- or nucleophile-treated samples were considered candidate nucleotides linked to labile small molecules (SI Appendix, Fig. S8).

As with the small-molecule cleavage method, the nucleotide cleavage method was also validated by detection of amino acid-linked RNAs from whole cellular E. coli and S. venezuelae RNA. An enrichment threshold of ≥2-fold was empirically found to distinguish the 3′-aminoacyl adenosine monophosphates, which serve as base-labile positive controls, from known nucleotide modifications, such as N6,N6-dimethyladenosine, that should not be base labile. Species enriched ≥2-fold included 15–16 of the 20 major 3′-aminoacyl adenosine monophosphates (SI Appendix, Fig. S9), as well as many species consistent with rRNA and tRNA nucleoside modifications (SI Appendix, Fig. S10) that have been previously reported, although not necessarily known to exist in E. coli or S. venezuelae (18). These results validate the ability of the nucleotide cleavage method to detect the presence of known small molecule-RNA conjugates.

Nucleotide Cleavage Method Detects Unknown Small Molecule-Nucleotide Conjugates.

In addition to 3′-aminoacyl adenosine monophosphates and known nucleotide modifications, 17 unknown species were enriched ≥2-fold in E. coli nucleotide samples treated with control conditions relative to base-treated samples (SI Appendix, Fig. S11). For S. venezuelae RNA, the nucleotide cleavage method revealed 18 unknown species that were enriched 2-fold or more (SI Appendix, Fig. S11). Independent biological replicates of the nucleotide cleavage method generated enrichment factors with trial-to-trial correlation coefficients of 0.93–0.95 (Fig. 3 and SI Appendix, Fig. S12). None of the observed unknown species were detected from total E. coli or S. venezuelae RNA when nuclease P1 was omitted, or when nuclease P1 treatment was replaced with incubation in formamide and/or 10 mM EDTA at 95 °C, conditions expected to impair RNA secondary structure. These results suggest that the species in Fig. 3 arise from nuclease P1-mediated RNA cleavage, and not from the liberation of small molecules noncovalently bound to RNA.

Fig. 3.


Result of two independent trials (r = 0.95) of the nucleotide-cleavage method applied to total S. venezuelae RNA. The observed species include 16 3′-aminoacyl adenosine monophosphates, 16 known nucleotide modifications, the four canonical RNA nucleotides, 3′-dephospho-CoA and its three thioester derivatives discovered in this work, and 18 additional unknown species with a control:base ratio ≥2-fold. Note that 3′-dephospho-CoA is observed with a control:base ratio <1 due to the base-labile nature of the 3′-dephospho-CoA thioesters.

Structural Elucidation of m/z = 786.1582 and 686.1432.

One unknown species from both E. coli and S. venezuelae detected by the nucleotide cleavage method was [M−H]− m/z = 786.1582 (Fig. 4A and SI Appendix, Fig. S13). We became especially interested in this species because its MS/MS spectrum always included a major fragment with [M−H]− m/z = 686.1209 (Fig. 4B), and because a second unknown species with a similar observed mass ([M−H]− m/z = 686.1432) was slightly more abundant on treatment with base in both bacteria. We hypothesized that these two species might represent a larger, base- and nucleophile-labile small molecule-nucleotide conjugate (787.1660 Da), and a smaller version (687.1510 Da) that is left behind after the loss of a 100.0154-Da C4H4O3 moiety such as the succinyl group hypothesized above to be conjugated to RNA. In support of this model, the MS/MS fragmentation daughter ions of the [M−H]− m/z = 686.1432 species represented a subset of the daughter ions arising from fragmentation of the [M−H]− m/z = 786.1582 ion (SI Appendix, Figs. S14 and S15).

Fig. 4.


Two small molecule-linked nucleotides of [M−H]− m/z = 786.1582 and 686.1432 from E. coli and S. venezuelae RNA. (A) The EICs for [M−H]− m/z = 786.1582 from E. coli RNA digested with nuclease P1 and subjected to cleavage conditions (pH 10.5) or control conditions (pH 4.5). (B) MS/MS fragmentation of the [M−H]− m/z = 786.1582 species from E. coli and of authentic 3′-dephospho-succinyl-CoA. See SI Appendix, Figs. S14 and S15 for a plausible complete fragment assignment. (C) EIC comparison of the E. coli cellular RNA nuclease P1 digest and authentic 3′-dephospho-succinyl-CoA. (D) Spiking large quantities of CoA thioesters into E. coli cell lysate before RNA isolation and the nucleotide cleavage method does not change the observed ion counts of these species, indicating that the observed 3′-dephospho-CoA signals do not arise from small-molecule CoA thioester contaminants. (E) Total E. coli RNA was separated into RNAs of length ≳200 nt (fraction I) and RNAs of length ≲200 nt (fraction II) using a silica column (Qiagen RNeasy). Each fraction was subjected to nuclease P1 digestion and analyzed by LC/MS. The presence of 3′-dephospho-CoA in fraction II suggests that the CoA-linked RNA(s) are primarily ≲200 nt in length.

The molecular weights of these two unknowns are too large to unambiguously assign empirical formulas. We therefore cultured S. venezuelae in media containing 13C-glucose as the sole carbon source, or in media containing 15N-ammonium sulfate as the sole nitrogen source. Total S. venezuelae RNA from each culture was separately treated with nuclease P1 and analyzed by LC/MS. The resulting shifts in observed m/z values allowed us to assign a molecular formula of C25H38N7O16P2S to the compound with [M−H]− m/z = 786.1582 and a molecular formula of C21H34N7O13P2S to the compound with [M−H]− m/z = 686.1432 (SI Appendix, Fig. S16). The smaller compound therefore represents the loss of C4H4O3 from the larger, consistent with the above model.
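The logic of the isotope-labeling experiment can be sketched as follows: under complete labeling, each carbon shifts the ion by the 13C−12C mass difference and each nitrogen by the 15N−14N difference, so the observed shift gives the atom count. The function names are hypothetical, complete incorporation and a singly charged ion are assumed, and the labeled m/z values below are calculated from the assigned formulas, not measured values from the paper.

```python
C13_SHIFT = 1.0033548  # 13C minus 12C mass difference (Da)
N15_SHIFT = 0.9970349  # 15N minus 14N mass difference (Da)


def labeled_mz(unlabeled_mz, n_atoms, per_atom_shift, charge=1):
    """Predicted m/z after replacing n_atoms with the heavy isotope."""
    return unlabeled_mz + n_atoms * per_atom_shift / charge


def atoms_from_shift(unlabeled_mz, heavy_mz, per_atom_shift, charge=1):
    """Number of labeled atoms implied by an observed m/z shift."""
    return round((heavy_mz - unlabeled_mz) * charge / per_atom_shift)


# Predicted (not measured) m/z values for the two assigned formulas.
large_13c = labeled_mz(786.1582, 25, C13_SHIFT)  # C25 species
small_13c = labeled_mz(686.1432, 21, C13_SHIFT)  # C21 species
large_15n = labeled_mz(786.1582, 7, N15_SHIFT)   # N7 in both species

carbons_large = atoms_from_shift(786.1582, large_13c, C13_SHIFT)
carbons_small = atoms_from_shift(686.1432, small_13c, C13_SHIFT)
nitrogens_large = atoms_from_shift(786.1582, large_15n, N15_SHIFT)
```

The difference of four carbons and zero nitrogens between the two species is what ties the smaller compound to loss of C4H4O3.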

Inspection of the MS/MS fragmentation patterns of both unknowns strongly suggested that both species contain ADP. Therefore, we reasoned that the 787.1660-Da species likely consists of a 100.0154-Da group and a 260.1211-Da group attached to the pyrophosphate of ADP (Fig. 4 and SI Appendix, Figs. S14 and S15). The fragmentation data also suggested that the 687.1510-Da species is the same 260.1211-Da group attached to the pyrophosphate of ADP, but lacking the 100.0154-Da group. Collectively, these observations led us to propose that the 786.1582 species and 686.1432 species are 3′-dephosphosuccinyl-CoA (expected [M−H]− m/z = 786.1577) and 3′-dephospho-CoA (expected [M−H]− m/z = 686.1416), respectively. These hypotheses were confirmed by LC/MS comparison of the cellular species with authentic 3′-dephosphosuccinyl-CoA and authentic 3′-dephospho-CoA (Fig. 4C).
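One way to judge how well the observed masses agree with the proposed structures is the mass error in parts per million. The helper below is a generic calculation, not something described in the text; it uses only the observed and expected values quoted above.

```python
def ppm_error(observed_mz, expected_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6


# (observed, expected) m/z pairs quoted in the text.
assignments = {
    "3'-dephosphosuccinyl-CoA": (786.1582, 786.1577),
    "3'-dephospho-CoA": (686.1432, 686.1416),
}

errors = {name: ppm_error(obs, exp) for name, (obs, exp) in assignments.items()}
```

Both errors come out at a few ppm or less, which is the scale of agreement usually expected from high-resolution instruments, although the text itself relies on comparison with authentic standards for confirmation.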

A search for additional related CoA derivatives in our datasets together with LC/MS analysis of authentic standards revealed that RNA from E. coli and S. venezuelae both contain 3′-dephosphoacetyl-CoA (observed [M−H]− m/z = 728.1532; expected [M−H]− m/z = 728.1522), and that RNA from S. venezuelae contains 3′-dephosphomethylmalonyl-CoA (observed [M−H]− m/z = 786.1585; expected [M−H]− m/z = 786.1577). The absence of a 3′-dephosphomethylmalonyl-CoA signal from E. coli RNA is consistent with the known inability of E. coli to biosynthesize methylmalonyl-CoA, assuming that biosynthesis of methylmalonyl-CoA precedes covalent attachment to RNA.

During the base-induced cleavage of small molecules from RNA, the succinyl group of succinyl-CoA-RNA can undergo cyclization to generate succinic anhydride while cleaving itself from the CoA-RNA. Indeed, analysis of either succinic anhydride or succinic acid under the LC/MS conditions used for the small-molecule cleavage method resulted in an LC peak with [M+H]+ m/z, retention time, and MS/MS fragmentation pattern identical to that of the cellular 100.0154-Da species (SI Appendix, Fig. S7). Together, these findings suggest that the [M+H]+ m/z = 101.0232 ion discovered through the small-molecule cleavage method is derived from the base-induced cleavage of the succinyl group from succinyl-CoA-RNA.

Characterization of the Attachment of CoA Derivatives to RNA.

CoA and its thioester derivatives are common cellular metabolites (30). To ensure that the detected CoA species did not simply arise from the 3′-phosphatase activity of nuclease P1 on intracellular CoA and CoA esters that had unexpectedly survived RNA purification and size exclusion, we spiked varying quantities of CoA thioesters into E. coli and S. venezuelae cell lysates, and repeated the RNA isolation, nuclease P1 digestion, and LC/MS analysis. Despite adding up to 10,000-fold more acetyl-CoA and succinyl-CoA than we observed in unspiked samples, no significant changes in the abundance of the corresponding 3′-dephosphorylated species were detected (Fig. 4D). Likewise, the addition to cell lysates of similarly large quantities of benzoyl-CoA, butyryl-CoA, and crotonyl-CoA, three CoA esters that were not observed in our original experiments, did not result in corresponding detectable levels of those compounds (Fig. 4D). These results demonstrate that the CoA species observed in our experiments on E. coli and S. venezuelae RNA cannot be accounted for by endogenous small-molecule contaminants, and further support the conclusion that these species arise from cellular small molecule–RNA conjugates.

Based on the structure of 3′-dephospho-CoA, we hypothesized that these modifications are present at the 5′ termini of one or more cellular RNAs. To test this hypothesis we digested total RNA from both E. coli and S. venezuelae with nuclease P1 in the presence of 18O-enriched water. Because nuclease P1 catalyzes the attack of a water molecule on RNA to generate 5′-phosphonucleotides (29), in the presence of 18O water all nuclease P1 digestion products other than the nucleotides at the 5′ termini will have a mass shift of +2 Da compared with products of digestion in 16O water. Indeed, the expected +2 Da shift was observed for 3′-Phe-AMP (observed [M−H]− m/z = 495.1290; expected [M−H]− m/z = 495.1285). In contrast, no mass shift from nuclease P1 digestion in the presence of 18O water was observed for any of the 3′-dephospho-CoA derivatives, consistent with a model in which these species are originally present at the 5′ termini of RNA molecules (SI Appendix, Fig. S17).
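The expected values in this experiment can be checked with a short calculation. The sketch below assumes standard monoisotopic masses and neutral formulas that are not stated in the text: C19H23N6O8P for 3′-Phe-AMP (AMP esterified with phenylalanine, minus water) and C21H35N7O13P2S for 3′-dephospho-CoA (the assigned ion formula plus one hydrogen). The helper names are hypothetical.

```python
import re

# Standard monoisotopic masses (Da).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
}
PROTON = 1.00727647
O18_MINUS_O16 = 17.99915961 - 15.99491462  # about +2.0042 Da per 18O


def neutral_mass(formula):
    """Monoisotopic mass of a neutral formula such as 'C19H23N6O8P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_deprotonated(formula, n_18o=0):
    """Expected m/z of [M-H]- with n_18o oxygens replaced by 18O."""
    return neutral_mass(formula) - PROTON + n_18o * O18_MINUS_O16


# Internal nucleotide: one 18O is incorporated during hydrolysis.
phe_amp_18o = mz_deprotonated("C19H23N6O8P", n_18o=1)
# 5'-terminal species: no 18O is expected to be incorporated.
dephospho_coa = mz_deprotonated("C21H35N7O13P2S", n_18o=0)
```

Under these assumptions the calculation reproduces the expected values quoted in the text for labeled 3′-Phe-AMP (495.1285) and unlabeled 3′-dephospho-CoA (686.1416).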

By comparing signal intensities of cellular and authentic samples of known concentration, we estimate that there are 80–120 total copies of CoA-RNA and CoA-thioester-RNA per E. coli or S. venezuelae cell. The amount of total CoA linked per μg of total RNA is ≈8 fmol for E. coli and ≈13 fmol for S. venezuelae.

Transcriptional Initiation by E. coli RNA Polymerase in Vitro Cannot Account for Observed Levels of CoA-RNA.

Because 3′-dephospho-CoA shares structural features with ATP and is a known biosynthetic precursor of CoA (30), we speculated that CoA might be incorporated into RNA at the 5′ terminus through aberrant transcriptional initiation with 3′-dephospho-CoA or its thioesters instead of ATP. Indeed, this mechanism of CoA incorporation into a transcript in vitro has been reported with T7 RNA polymerase (31). To explore this possibility, we carried out in vitro transcription with E. coli RNA polymerase and two different templates in the presence of high concentrations of 3′-dephospho-CoA. For the first template, we modified a pUC19 plasmid to encode an adenosine at the +1 position of each of its four predicted transcripts. An in vitro transcription reaction containing 0.5 mM of each NTP and either 0.5 mM or 5 mM 3′-dephospho-CoA yielded 555 μg or 544 μg of RNA, respectively. When this RNA was purified, digested with nuclease P1, and analyzed by LC/MS, no 3′-dephospho-CoA was detected. The second template used was E. coli genomic DNA. In vitro transcription in the presence of either 0.5 mM or 5 mM of 3′-dephospho-CoA yielded 89 μg or 95 μg of RNA, respectively. Once again, this material contained no detectable 3′-dephospho-CoA after nuclease P1 digestion (SI Appendix, Fig. S18). In contrast, when a 5′-CoA-linked transcript (generated by using T7 RNA polymerase) was spiked into an in vitro transcription reaction and processed in the same way, CoA-linked RNA was readily detected (SI Appendix, Fig. S18).

Based on the observed abundances of CoA-RNA from E. coli cells, we would expect to obtain >2.8 pmol of 3′-dephospho-CoA from ≈550 μg of A-initiated RNA, and 0.45 pmol of 3′-dephospho-CoA from ≈90 μg of RNA transcribed from the E. coli genome, if aberrant transcriptional initiation were predominantly responsible for the CoA-RNA conjugates. These quantities should be readily detected by our methods, which can reliably detect ≤0.1 pmol of 3′-dephospho-CoA (SI Appendix, Fig. S18). If one assumes that the inability of E. coli RNA polymerase to incorporate these levels of 3′-dephospho-CoA in vitro reflects an inability to do so in vivo, these results suggest that CoA groups are installed posttranscriptionally.
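The margin between the expected quantities and the detection limit can be made explicit with simple arithmetic. The sketch uses only the figures quoted above; because the text gives >2.8 pmol as a lower bound and ≤0.1 pmol as an upper bound on the detection limit, the resulting ratios should be read as minimum margins. The names are hypothetical.

```python
DETECTION_LIMIT_PMOL = 0.1  # upper bound on the detection limit from the text

# Expected 3'-dephospho-CoA (pmol) if aberrant initiation were responsible.
expected_pmol = {
    "A-initiated pUC19 transcripts (~550 ug RNA)": 2.8,
    "E. coli genomic DNA transcripts (~90 ug RNA)": 0.45,
}


def fold_above_limit(amount_pmol, limit_pmol=DETECTION_LIMIT_PMOL):
    """How many times the expected amount exceeds the detection limit."""
    return amount_pmol / limit_pmol


margins = {name: fold_above_limit(pmol) for name, pmol in expected_pmol.items()}
```

The plasmid-derived RNA would be expected to exceed the detection limit by at least about 28-fold and the genomic-DNA-derived RNA by about 4.5-fold, which is why the absence of signal is informative.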

Size Distribution of CoA-Linked RNAs.

Both methods described above subject RNA to size exclusion to remove molecules of molecular mass ≲2,500 Da. To establish an upper size limit on CoA-linked RNAs, we subjected the macromolecule fraction to further size fractionation by using silica-based RNA purification columns (Qiagen RNeasy columns), which separate RNA molecules into two fractions that are less than or greater than ≈200 nt in length (SI Appendix, Fig. S19). Each of the two fractions was then subjected to nuclease P1 digestion and LC/MS analysis.

As expected, 3′-aminoacyl adenosine monophosphates conjugated to tRNAs (≈76 nt) were present predominantly in the <200-nt flow-through fraction (Fig. 4E), and the rRNA nucleoside modification N6,N6′-dimethyladenine (conjugated to 1.5–2.9 kb rRNAs) was detected in the >200-nt fraction (Fig. 4E) (32, 33). Like the 3′-aminoacyl adenosine monophosphates, the CoA-linked nucleotides were predominantly detected in the flow-through RNA fraction. This result suggests that the CoA-linked RNA(s) from E. coli and S. venezuelae are not widely distributed in their size but instead are ≲200 nt in length. In addition, this finding further supports the hypothesis that the CoA modifications arise through a mechanism other than nonspecific transcriptional initiation, which would be expected to generate a broad size distribution of CoA-linked RNAs.

Discussion

We have developed and validated two methods that, in principle, enable the detection of any base- or nucleophile-labile small molecule–RNA conjugate. Application of these methods led to the discovery of a hydroxyfuranone or succinyl group, as well as a series of CoA derivatives including succinyl-CoA, linked to E. coli and S. venezuelae RNA. These findings represent new examples of biological small molecule–RNA conjugates beyond aminoacylated tRNAs, RNAs containing modified nucleobases, and 5′-capped mRNA in eukaryotes. More generally, our results suggest that the chemical diversity of cellular RNA is greater than previously understood. Because E. coli and S. venezuelae represent two different phyla, our findings suggest that the presence of these newly discovered conjugates is not limited to a narrow range of species.

The 3′-dephospho-CoA group is attached to the 5′ terminus of cellular RNA(s) of length ≲200 nt. On average we observe ≈100 CoA-RNA molecules per E. coli cell, which suggests that CoA-linked RNAs together are approximately 10-fold less abundant than Phe-linked tRNA in E. coli (34) and ≈10- to 100-fold less abundant than the E. coli 6S RNA (35). Although we currently do not know the biological role, if any, that these CoA-RNA conjugates might play, it is tempting to speculate that they might play a role in RNA stability, RNA localization, or gene regulation, or even in mediating chemical reactions involving CoA groups linked to RNA strands that serve to direct reactivity (21, 24). The last possibility highlights an unusual feature of these groups compared with most previously discovered RNA modifications—namely, that CoA and CoA thioesters are substantially more reactive.

From E. coli RNA we observe 3′-dephospho-CoA, succinyl-dephospho-CoA, and acetyl-dephospho-CoA as RNA conjugates. In addition, we observe methylmalonyl-dephospho-CoA as an RNA conjugate from S. venezuelae. These observations suggest that CoA attachment to RNA occurs after thioesterification. The liberation of 3′-dephospho-CoA derivatives from cellular RNA by nuclease P1 digestion together with their presence on the 5′ terminus of RNA (SI Appendix, Fig. S17) strongly suggests that the CoA-RNA linkage is a phosphodiester bond linking the 3′-phosphate of CoA to the 5′ end of the RNA.

Although our in vitro transcription experiments suggest that nonspecific transcriptional initiation is not the primary mechanism for CoA-RNA formation, they do not exclude the possibility of a gene-specific transcriptional initiation pathway for CoA incorporation, or even a nonspecific transcriptional pathway in which other cellular components beyond those present in the in vitro transcription reactions are required. DNA primase synthesizes short RNAs that prime DNA synthesis (36) and in theory could also serve as a possible source of CoA-linked RNAs. However, primase-generated RNAs have been observed to be 10-fold less abundant in E. coli (37) than CoA-linked RNAs, which argues against this possibility. Finally, this study suggests the need to identify small molecule–RNA conjugates, characterize the RNA species to which these groups are attached, and evaluate their possible functional roles in living systems.

Experimental Methods

See SI Appendix for additional experimental details.

Small-Molecule Cleavage Method.

One milligram of E. coli RNA or 750 μg of S. venezuelae RNA as prepared above was subjected to cleavage conditions (base: 500 mM NH4HCO3, pH 8.0, 37 °C, 2.5 h; nucleophile: 500 mM n-butylamine in acetonitrile, 37 °C, 8.0 h). An equal quantity of RNA was subjected to control conditions (base control: 500 mM NH4OAc, pH 4.5, 37 °C, 2.5 h; nucleophile control: acetonitrile, 300 μL, 37 °C, 8.0 h). After treatment, the samples were acidified with 200 μL of 3 M NH4OAc, pH 4.5. The small-molecule fraction was isolated by size-exclusion chromatography by using a NAP5 column and lyophilized. The lyophilized product was redissolved in 20 μL of 0.1% aqueous ammonium formate and analyzed by LC/MS. For the experiment using synthetic RNA, 2.3 μmol of a library of random synthetic N45 RNAs (IDT) was dissolved in cell lysis buffer and processed as described above.

LC/MS Data Collection and Analysis.

LC/MS was performed by using a Waters Acquity UPLC Q-TOF Premier instrument with an Acquity UPLC BEH C18 column. See SI Appendix for a detailed description of LC/MS and MS/MS conditions. The analysis of total ion chromatograms was performed by using the XCMS program (25). Integrated ion abundances were averaged among replicates, and the ratios of these average ion intensities between cleavage conditions and control conditions were the enrichment values reported.
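
The enrichment calculation described above can be illustrated with a minimal Python sketch. This is not the XCMS workflow used in the study; the function name `enrichment` and the abundance values are hypothetical and serve only to show the arithmetic (replicate averaging followed by a cleavage-to-control ratio).

```python
from statistics import mean


def enrichment(cleavage_abundances, control_abundances):
    """Return the ratio of replicate-averaged integrated ion abundances
    (cleavage conditions over control conditions).

    Assumption: an ion with zero average abundance in the control is
    reported as infinitely enriched rather than raising an error.
    """
    avg_cleavage = mean(cleavage_abundances)
    avg_control = mean(control_abundances)
    if avg_control == 0:
        return float("inf")
    return avg_cleavage / avg_control


# Hypothetical integrated abundances for one ion across three replicates.
cleavage = [1200.0, 1500.0, 1350.0]   # average 1350
control = [100.0, 150.0, 125.0]       # average 125
example_enrichment = enrichment(cleavage, control)
print(example_enrichment)
```

Averaging before taking the ratio, as the text describes, gives one enrichment value per ion rather than one value per replicate pair.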

Base-cleaved species were matched with corresponding nucleophile-cleaved species by subtracting 55.07858 ± 0.020 Da (the mass of butylamine minus the mass of water) from the masses of the n-butyl amide cleavage products. For the purpose of this study, base-cleaved species without a corresponding nucleophile-cleaved partner were discarded, even though some small molecule–RNA conjugates may have been overlooked as a result.
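
The matching rule can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the analysis code used in the study: the function names and the example masses are hypothetical, and only the 55.07858 Da shift and the ±0.020 Da tolerance are taken from the text.

```python
BUTYLAMINE_MINUS_WATER = 55.07858  # Da; mass shift stated in the text
TOLERANCE = 0.020                  # Da; matching window stated in the text


def match_pairs(base_masses, amide_masses,
                shift=BUTYLAMINE_MINUS_WATER, tol=TOLERANCE):
    """Pair each base-cleaved mass with any n-butyl amide mass whose
    value minus the shift falls within the tolerance."""
    pairs = []
    for amide in amide_masses:
        expected_base = amide - shift
        for base in base_masses:
            if abs(base - expected_base) <= tol:
                pairs.append((base, amide))
    return pairs


def unmatched_base(base_masses, pairs):
    """Base-cleaved masses with no nucleophile-cleaved partner; in the
    text these species are discarded."""
    matched = {base for base, _ in pairs}
    return [m for m in base_masses if m not in matched]


# Hypothetical masses chosen only to illustrate the rule.
base_species = [118.0266, 200.0000]
amide_species = [173.1052, 300.0000]

pairs = match_pairs(base_species, amide_species)
discarded = unmatched_base(base_species, pairs)
print(pairs)
print(discarded)
```

Requiring both a base-cleaved and a nucleophile-cleaved partner trades sensitivity for confidence, which is the limitation the text acknowledges.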

Nucleotide Cleavage Method.

Three hundred fifty micrograms of E. coli RNA or 250 μg of S. venezuelae RNA was digested with 10 U of nuclease P1 (Sigma-Aldrich) in 200 μL of 50 mM NH4OAc, pH 4.5, at 37 °C for 20 min. The digestion products were purified by size-exclusion chromatography (NAP5) and the small-molecule fraction was retained. Half of the resulting nucleotides were subjected to cleavage conditions (base: 500 mM (NH4)2CO3, pH 10.5, 37 °C, 2.5 h; nucleophile: 500 mM n-butylamine in acetonitrile, 37 °C, 8.0 h), whereas the other half was subjected to control conditions (base control: 500 mM NH4OAc, pH 4.5, 37 °C, 2.5 h; nucleophile control: acetonitrile, 300 μL, 37 °C, 8.0 h). The samples were acidified with 200 μL of 3 M NH4OAc, pH 4.5, lyophilized, redissolved in 20 μL of 0.1% aqueous ammonium formate, and analyzed by LC/MS. The nuclease P1 digestion with H₂¹⁸O (Cambridge Isotope Laboratories) was performed as described above except in buffer with a final composition containing 86% H₂¹⁸O and 14% H₂¹⁶O.

Supplementary Material

Supporting Information

Acknowledgments.

We thank Jack Szostak and Matt Hartman for aminoacyl-tRNA synthetase enzymes. This work was supported by the Howard Hughes Medical Institute and National Institutes of Health/National Institute of General Medical Sciences Grant R01GM065865. W.E.K. received a National Science Foundation Graduate Research Fellowship.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0900528106/DCSupplemental.

References

  • 1. Doudna JA, Cech TR. The chemical repertoire of natural ribozymes. Nature. 2002;418:222–228. doi: 10.1038/418222a.
  • 2. Mandal M, Breaker RR. Gene regulation by riboswitches. Nat Rev Mol Cell Biol. 2004;5:451–463. doi: 10.1038/nrm1403.
  • 3. Chen K, Rajewsky N. The evolution of gene regulation by transcription factors and microRNAs. Nat Rev Genet. 2007;8:93–103. doi: 10.1038/nrg1990.
  • 4. Matzke MA, Birchler JA. RNAi-mediated pathways in the nucleus. Nat Rev Genet. 2005;6:24–35. doi: 10.1038/nrg1500.
  • 5. Brower-Toland B, et al. Drosophila PIWI associates with chromatin and interacts directly with HP1a. Genes Dev. 2007;21:2300–2311. doi: 10.1101/gad.1564307.
  • 6. Patel SB, Bellini M. The assembly of a spliceosomal small nuclear ribonucleoprotein particle. Nucleic Acids Res. 2008;36:6482–6493. doi: 10.1093/nar/gkn658.
  • 7. Sorek R, Kunin V, Hugenholtz P. CRISPR—A widespread system that provides acquired resistance against phages in bacteria and archaea. Nat Rev Microbiol. 2008;6:181–186. doi: 10.1038/nrmicro1793.
  • 8. Storz G, Altuvia S, Wassarman KM. An abundance of RNA regulators. Annu Rev Biochem. 2005;74:199–217. doi: 10.1146/annurev.biochem.74.082803.133136.
  • 9. Dinger ME, et al. NRED: A database of long noncoding RNA expression. Nucleic Acids Res. 2009;37:D122–126. doi: 10.1093/nar/gkn617.
  • 10. Mattick JS, Makunin IV. Non-coding RNA. Hum Mol Genet. 2006;15:R17–R29. doi: 10.1093/hmg/ddl046.
  • 11. Illangasekare M, Yarus M. Specific, rapid synthesis of Phe-RNA by RNA. Proc Natl Acad Sci USA. 1999;96:5470–5475. doi: 10.1073/pnas.96.10.5470.
  • 12. Szostak JW, Bartel DP, Luisi PL. Synthesizing life. Nature. 2001;409:387–390. doi: 10.1038/35053176.
  • 13. Benner SA, Ellington AD, Tauer A. Modern metabolism as a palimpsest of the RNA world. Proc Natl Acad Sci USA. 1989;86:7054–7058. doi: 10.1073/pnas.86.18.7054.
  • 14. Jeffares DC, Poole AM, Penny D. Relics from the RNA world. J Mol Evol. 1998;46:18–36. doi: 10.1007/pl00006280.
  • 15. Visser CM, Kellogg RM. Bioorganic chemistry and the origin of life. J Mol Evol. 1978;11:163–169. doi: 10.1007/BF01733891.
  • 16. White HB. Coenzymes as fossils of an earlier metabolic state. J Mol Evol. 1976;7:101–104. doi: 10.1007/BF01732468.
  • 17. Hoagland MB, Stephenson ML, Scott JF, Hecht LI, Zamecnik PC. A soluble ribonucleic acid intermediate in protein synthesis. J Biol Chem. 1958;231:241–257.
  • 18. Dunin-Horkawicz S, et al. MODOMICS: A database of RNA modification pathways. Nucleic Acids Res. 2006;34:D145–D149. doi: 10.1093/nar/gkj084.
  • 19. Wei CM, Moss B. Methylated nucleotides block 5′-terminus of vaccinia virus messenger RNA. Proc Natl Acad Sci USA. 1975;72:318–322. doi: 10.1073/pnas.72.1.318.
  • 20. Furuichi Y, Miura K-I. A blocked structure at the 5′ terminus of mRNA from cytoplasmic polyhedrosis virus. Nature. 1975;253:374–375. doi: 10.1038/253374a0.
  • 21. Li X, Liu DR. DNA-templated organic synthesis: nature's strategy for controlling chemical reactivity applied to synthetic molecules. Angew Chem Int Ed. 2004;43:4848–4870. doi: 10.1002/anie.200400656.
  • 22. Kanan MW, Rozenman MM, Sakurai K, Snyder TM, Liu DR. Reaction discovery enabled by DNA-templated synthesis and in vitro selection. Nature. 2004;431:545–549. doi: 10.1038/nature02920.
  • 23. Gartner ZJ, et al. DNA-templated organic synthesis and selection of a library of macrocycles. Science. 2004;305:1601–1605. doi: 10.1126/science.1102629.
  • 24. Gartner ZJ, Liu DR. The generality of DNA-templated synthesis as a basis for evolving non-natural small molecules. J Am Chem Soc. 2001;123:6961–6963. doi: 10.1021/ja015873n.
  • 25. Smith CA, Want EJ, O'Maille G, Abagyan R, Siuzdak G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem. 2006;78:779–787. doi: 10.1021/ac051437y.
  • 26. Hartman MCT, Josephson K, Szostak JW. Enzymatic aminoacylation of tRNA with unnatural amino acids. Proc Natl Acad Sci USA. 2006;103:4356–4361. doi: 10.1073/pnas.0509219103.
  • 27. Poskonin VV, Badovskaya LA. Unusual conversion of 5-hydroxy-2(5H)furanone in aqueous solution. Chem Heterocycl Compd. 2003;39:594–597.
  • 28. Skrinrov Z, Bowden K, Fabian WMF. An ab initio and density functional study on the ring-chain tautomerism of (Z)-3-formyl-acrylic acid. Chem Phys Lett. 2000;316:531–535.
  • 29. Romier C, Dominguez R, Lahm A, Dahl O, Suck D. Recognition of single-stranded DNA by nuclease P1: high resolution crystal structures of complexes with substrate analogs. Proteins Struct Funct Bioinf. 1998;32:414–424.
  • 30. Leonardi R, Zhang Y-M, Rock CO, Jackowski S. Coenzyme A: Back in action. Prog Lipid Res. 2005;44:125–153. doi: 10.1016/j.plipres.2005.04.001.
  • 31. Huang F. Efficient incorporation of CoA, NAD and FAD into RNA by in vitro transcription. Nucleic Acids Res. 2003;31:e8. doi: 10.1093/nar/gng008.
  • 32. Brosius J, Palmer ML, Kennedy PJ, Noller HF. Complete nucleotide sequence of a 16S ribosomal RNA gene from Escherichia coli. Proc Natl Acad Sci USA. 1978;75:4801–4805. doi: 10.1073/pnas.75.10.4801.
  • 33. Brosius J, Dull TJ, Noller HF. Complete nucleotide sequence of a 23S ribosomal RNA gene from Escherichia coli. Proc Natl Acad Sci USA. 1980;77:201–204. doi: 10.1073/pnas.77.1.201.
  • 34. Jakubowski H, Goldman E. Quantities of individual aminoacyl-tRNA families and their turnover in Escherichia coli. J Bacteriol. 1984;158:769–776. doi: 10.1128/jb.158.3.769-776.1984.
  • 35. Wassarman KM, Storz G. 6S RNA Regulates E. coli RNA polymerase activity. Cell. 2000;101:613–623. doi: 10.1016/s0092-8674(00)80873-9.
  • 36. Frick DN, Richardson CC. DNA PRIMASES. Annu Rev Biochem. 2001;70:39. doi: 10.1146/annurev.biochem.70.1.39.
  • 37. Ogawa T, Hirose S, Okazaki T, Okazaki R. Mechanism of DNA chain growth: XVI. Analyses of RNA-linked DNA pieces in Escherichia coli with polynucleotide kinase. J Mol Biol. 1977;112:121–140. doi: 10.1016/s0022-2836(77)80160-5.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
