Author manuscript; available in PMC: 2009 Aug 15.
Published in final edited form as: Anal Chim Acta. 2008 Jun 26;623(2):117–125. doi: 10.1016/j.aca.2008.06.027

MASS SPECTROMETRY OF THE FIFTH NUCLEOSIDE

A REVIEW OF THE IDENTIFICATION OF PSEUDOURIDINE IN NUCLEIC ACIDS

Anita Durairaj 1, Patrick A Limbach 1,*
PMCID: PMC2597214  NIHMSID: NIHMS60067  PMID: 18620915

Abstract

Pseudouridine, the so-called fifth nucleoside due to its ubiquitous presence in RNAs, remains among the most challenging modified nucleosides to characterize. As an isomer of the major nucleoside uridine, pseudouridine cannot be detected by standard reverse-transcriptase based DNA sequencing or RNase mapping approaches. Thus, over the past 15 years, investigators have focused on the unique structural properties of pseudouridine to develop selective derivatization or fragmentation strategies for its determination. While the N-cyclohexyl-N’-β-(4-methylmorpholinium) ethylcarbodiimide p-tosylate (CMCT)-reverse transcriptase assay remains both a popular and powerful approach to screen for pseudouridine in larger RNAs, mass spectrometry-based approaches are poised to play an increasingly important role in either confirming the findings of the CMCT-reverse transcriptase assay or in characterizing pseudouridine sequence placement and abundance in smaller RNAs. This review includes a brief discussion of pseudouridine including a summary of its biosynthesis and known importance within various RNAs. The review then focuses on chemical derivatization approaches that can be used to selectively modify pseudouridine to improve its detection, and the development of mass spectrometry-based assays for the identification and sequencing of pseudouridine in various RNAs.

Keywords: CMCT derivatization, cyanoethylation, Pseudouridine synthase, MALDI-MS, LC-MS, modified nucleosides

INTRODUCTION

Post-transcriptional modifications and their function in ribonucleic acids (RNAs) have been investigated for many years. Post-transcriptional modifications are modified nucleosides that are found in all phylogenetic domains of life (Archaea, Eubacteria and Eukaryotes) and encompass all types of RNAs. Roughly 100 different post-transcriptionally modified nucleosides are presently known [1-3]. The most frequently occurring post-transcriptional modification in RNA is pseudouridine (Ψ). Pseudouridine has been a target of interest since its discovery in 1960 [4]. Although it was discovered over 40 years ago, its specific functional roles remain difficult to identify. A prerequisite for understanding the function of pseudouridine is its identification and knowledge of its exact sequence location in RNA. The detection and sequence placement of pseudouridine in RNA are challenging because it is an isomer of uridine [5, 6].

In this review, we will present a brief overview of pseudouridine including a summary of its biosynthesis and known importance within various RNAs. We will also cover various chemical derivatization approaches that can be used to selectively modify pseudouridine to improve its detection. Finally, the use of mass spectrometry-based methods for the characterization of pseudouridine in various RNAs will be presented.

Biosynthesis of pseudouridine

Post-transcriptional modifications are modified nucleosides that are formed during the process of RNA maturation. They are derivatives of the four major nucleosides (uridine, cytidine, adenosine and guanosine), which are the principal constituents of stable, mature RNA (Figure 1). The most common type of post-transcriptional modification involves modification of the nucleobase, but the ribose moiety of the nucleotide can also be modified. These modifications range from simple to more complex chemical transformations. Figure 2 shows examples of simple chemical modifications involving base or ribose methylations and the isomerization of uridine.

Figure 1.

Structures of the four major nucleosides present in RNA.

Figure 2.

Examples of common posttranscriptional modifications in RNA including pseudouridine. 7-methylguanosine is an example of a base modified nucleoside and 2′-O-methylguanosine is an example of a sugar modified nucleoside.

The isomerization of uridine results in the formation of pseudouridine [7-9] and its unique structural aspects (Figure 3). These include a new hydrogen bond donor at the N-1 position of pseudouridine and the presence of a C-C bond linking the base and sugar rather than the typical N-C glycosyl bond found in other nucleosides [6]. The formation of pseudouridine involves enzymatic cleavage of the N-glycosyl bond of uridine that links the base and sugar, rotation of the enzyme-bound uracil ring resulting in C-5 occupying the position that was previously held by N-1, and a reformation of the glycosyl bond as a carbon-carbon bond. This conversion of uridine into pseudouridine is catalyzed by a group of enzymes called pseudouridine-synthases [10, 11].

Figure 3.

Isomerization of uridine into pseudouridine. The glycosidic bond between N-1 and C-1′ is broken and a new C-C glycosidic bond is created between C-5 and C-1′.

The pseudouridine-synthases from all three domains of life are classified into five families: RluA, RsuA, TruA, TruB, and TruD [12]. Crystallographic studies reveal that the pseudouridine-synthases exhibit structural similarity in that they share a common core protein fold and active site structure. However, they differ in their substrate specificities and share limited sequence similarities [12]. Substrates for the different pseudouridine-synthases include transfer RNA (tRNA), ribosomal RNA (rRNA) and small nuclear RNA (snRNA) [6, 12, 13].

Bacterial pseudouridine-synthases function by recognizing the target uridines in the context of the RNA sequence or structure of the site of interest. Eukaryotic and archaeal pseudouridine synthases are part of an elaborate ribonucleoprotein (RNP) complex that includes RNA and protein accessory factors [12, 14, 15]. As with other post-transcriptional modifications for eukaryotes, pseudouridylation is facilitated by guide RNAs that recognize specific features that direct the RNP complex containing pseudouridine synthase to the appropriate site for modification [12]. While the mode of action for pseudouridine-synthases may vary in bacterial and eukaryotic systems, the resulting pseudouridines seem to impact biologically important processes such as cell growth [16], cell viability [17], and diseased phenotypic states [18-21].

Significance of pseudouridine in RNA

Particular pseudouridine residues have been found to be physiologically essential in various organisms including bacteria, yeast, and mammals [16-18, 21]. However, a molecular function for pseudouridine remains unknown. Instead, its potential functional role is largely implied from its specific location within the RNA structure. Pseudouridines are found in tRNA, rRNA, snRNA, small nucleolar RNA (snoRNA) and transfer messenger RNA (tmRNA). These RNAs are classified as structured molecules because their tertiary structures are important to their function. Thus, pseudouridine may act as a structural stabilizer at critical sites in these RNAs [6].

For instance, pseudouridine is found in specific structural motifs of tRNA such as the TΨC loop, the D stem, the anticodon stem and the anticodon loop. Studies have found that the pseudouridine present in the TΨC loop of tRNA participates in the specific geometry of T-loops and may play a role in stabilizing D-loop/T-loop tertiary interactions [22]. Other studies show that pseudouridine in single-stranded loop regions is slightly destabilizing, in contrast to pseudouridine at stem-loop junctions [23]. Pseudouridines are also found in acceptor branches (acceptor stem and T-stem loop) of other structured RNAs, such as tmRNA [24]. Again, a possible role for pseudouridine here is stabilization of these branches.

The stabilizing effect of pseudouridine formation in RNA has been suggested to arise from improved base stacking conferred by pseudouridine on neighboring nucleosides due to its additional hydrogen bond donor. Findings from X-ray crystallography, NMR, and molecular dynamics simulations show that the additional H-bond donor on pseudouridine confers rigidity in both single- and double-stranded regions of the RNA [6, 25-27].

Another important function for pseudouridine is its potential role in the process of peptide bond formation in the ribosome. This is largely implied from its localization in rRNA. Pseudouridine is a ubiquitous constituent of both small subunit (SSU) rRNA and large subunit (LSU) rRNA. However, in LSU rRNA, pseudouridines seem to be clustered around functionally important regions of the RNA. This localization is conserved across all three phylogenetic domains of life. The functionally important regions are Domains II, IV and V. Domain V constitutes the peptidyltransferase center (PTC), the site of peptide bond formation. Domains II and IV interact with Domain V in forming the three-dimensional structure of the PTC within the ribosome [28-30]. Because pseudouridines in LSU rRNA are clustered around the PTC, they may specifically play a catalytic role in peptide bond synthesis and affect the rates of ribosome and PTC assembly [8].

More recent studies of pseudouridine in the LSU rRNA have focused on the specific location of pseudouridine residues within the three-dimensional structure of the ribosome, as identification of these sites could aid in determining pseudouridine function. In one study, pseudouridines were found to be densely packed around the center of the ribosome and to be in contact with RNA residues from the SSU and tRNA. Thus, another function for pseudouridine could be modulation of intermolecular RNA-RNA contacts [31]. Similarly, pseudouridine residues were found in segments involved in intermolecular RNA-RNA or RNA-protein contacts in snRNAs [32]. In snRNAs, pseudouridine has been implicated in the functional assembly of snRNPs and the spliceosome [33]. Regardless of the type of RNA involved, none of these studies has succeeded in assigning a specific functional role to pseudouridine. Therefore, deducing a functional role for pseudouridine remains a work in progress.

Identification and Sequence Mapping of Pseudouridine in RNA

One prerequisite for deciphering the function of pseudouridine is a knowledge of the exact sequence location of pseudouridine in the RNA. However, the sequencing of RNA and identification of pseudouridine in the process is difficult because pseudouridine is formed by an isomerization of uridine. Because there are no specific functional groups that are added in the isomerization, there is nothing unique to pseudouridine that can be radioactively tagged for ease of identification. Moreover, pseudouridine is mass-neutral with its precursor, uridine, making its identification problematic [6, 8, 29].

Prior to 1993, classical methods used in RNA sequencing and pseudouridine identification involved nuclease and chromatography procedures. These procedures were laborious and time-consuming, as a combination of techniques was used, including radiolabeling, nuclease digestions, gel electrophoresis and chromatography [34-37]. These approaches continued to be used extensively for pseudouridine detection until 1993, when Bakin and Ofengand introduced a chemical derivatization/reverse transcriptase-based approach for the detection and sequence localization of pseudouridine residues in RNA [5]. Current biochemical methods for mapping pseudouridines in RNA are based on the method of Bakin and Ofengand and utilize chemical modification [6]. The chemistry of pseudouridine allows it to be selectively modified by certain reagents under a specific set of conditions. The selectivity and specificity of certain reagents for pseudouridine aid in its identification.

Chemistry of Pseudouridine: Reaction with N-cyclohexyl-N’-β-(4-methylmorpholinium) ethylcarbodiimide p-tosylate (CMCT)

CMCT is the most commonly used reagent for pseudouridine modification. It is a water-soluble carbodiimide that can be used to selectively label pseudouridine in a two-step reaction procedure. In the first step, the RNA is reacted with CMCT in an aqueous solution at a pH of 8-8.5. CMCT reacts with guanosine, uridine, guanosine-like and uridine-like residues under these conditions. The weakly alkaline conditions result in the formation of a reactive nucleobase anion, with either N-1 or N-3 of these nucleosides being deprotonated, followed by nucleophilic addition to the carbodiimide. The products of these reactions have been shown to be adducts containing one molecule of the carbodiimide reagent attached to the bases of the nucleotides [38]. The CMCT adduct attaches to the N-1 position of the base in guanosine and guanosine-like residues and to the N-3 position in uridine and uridine-like residues. However, in the case of pseudouridine, CMCT adducts form at both the N-1 and N-3 positions of the base (Figure 4). In the second step of the reaction, the CMCT-reacted RNA is incubated in an alkaline solution of pH 10.4. This results in the selective modification of pseudouridine by cleavage of the carbodiimide groups from all the guanosine, uridine, guanosine-like and uridine-like residues. It also eliminates the CMCT adduct from the N-1 position of pseudouridine but leaves the second CMCT adduct attached to the N-3 position of the base in pseudouridine. The single CMCT adduct remaining on the pseudouridine serves as a tag for identification [5, 29, 38, 39].
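
The two-step logic of this reaction can be summarized in a short Python sketch. The adduct positions simply restate those given in the paragraph above; the dictionary layout and the function name are illustrative assumptions and are not part of any published protocol.

```python
# Illustrative model of the two-step CMCT labeling described in the text.
# All names and the data layout are assumptions made for this sketch.

# Step 1 (pH 8-8.5): ring positions carrying a CMCT adduct after reaction.
STEP1_ADDUCTS = {
    "G": {"N-1"},
    "U": {"N-3"},
    "Psi": {"N-1", "N-3"},
}

# Step 2 (pH 10.4): only the N-3 adduct on pseudouridine survives.
ALKALI_STABLE = {("Psi", "N-3")}


def adducts_after_step2(nucleoside):
    """Return the adduct positions that remain after alkaline treatment."""
    sites = STEP1_ADDUCTS.get(nucleoside, set())
    return {site for site in sites if (nucleoside, site) in ALKALI_STABLE}


for name in ("G", "U", "Psi"):
    print(name, sorted(STEP1_ADDUCTS[name]), "->", sorted(adducts_after_step2(name)))
```

Only pseudouridine retains an adduct after the second step, which is the property that makes the single remaining CMCT group usable as a tag.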

Figure 4.

Structure of N-cyclohexyl-N’-β-(4-methylmorpholinium)ethylcarbodiimide p-tosylate (CMCT) and its site of attachment on the ribonucleosides uridine, guanosine and pseudouridine. Pseudouridine is derivatized at both N-1 and N-3, although only N-3 is stable to alkaline hydrolysis.

Chemistry of Pseudouridine: Reaction with Acrylonitrile

In addition to reaction with carbodiimides, cyanoethylation of pseudouridine with acrylonitrile can also result in selective derivatization. Acrylonitrile reacts with adenosine, guanosine, uridine, cytidine and pseudouridine residues at a pH of 11.5. However, selective cyanoethylation of pseudouridine at the N-1 position is possible at a pH of 8.8. The resulting product is 1-cyanoethylpseudouridine (Figure 5) [40]. Unlike modification with CMCT, which is a two-step procedure, the cyanoethylation proceeds in a single step. However, it is similar to the CMCT-derivatization in that the cyanoethyl adduct on pseudouridine can serve as a tag or label for identification [41].

Figure 5.

Structure of 1-cyanoethylpseudouridine.

Chemistry of Pseudouridine: Reaction with Methyl Vinyl Sulfone

Pseudouridine can also be reacted with methyl vinyl sulfone (MVS) (pH 7 and 70 °C for 3 h) to yield primarily an N-1 substituted pseudouridine, MVS-pseudouridine (Figure 6) [42]. Unlike derivatization with CMCT or acrylonitrile, MVS-pseudouridine cannot serve as a selective tag because MVS also derivatizes uridine and both nucleosides are identical in mass. However, the derivatization does lead to improved detection sensitivity for pseudouridine upon chromatographic separation that allows for distinguishing between pseudouridine and other nucleosides [42].

Figure 6.

Structure of 1-methylvinylsulfone-pseudouridine.

Detecting Chemically Modified Pseudouridine Using Conventional Methods

The conventional method for mapping pseudouridines was first introduced by Bakin and Ofengand in 1993 [5]. The method utilizes the CMCT derivatization approach with reverse transcriptase/gel electrophoresis as the mode of detection. In this approach, the RNA is first subjected to the two-step CMCT derivatization for selective modification of pseudouridine. After derivatization, the enzyme reverse transcriptase creates a DNA copy of the RNA via polymerization. The reverse transcriptase terminates polymerization when it encounters a CMCT-derivatized pseudouridine. The sizes, and hence the sequence positions, of the resulting termination products can then be determined by gel electrophoresis, as each point of termination is visualized as a gel band [5, 29]. Such termination products are referred to as CMCT-dependent stops. To identify the reverse transcriptase stop sites that result from the CMCT reaction only, a second series of reverse transcriptase/gel electrophoresis analyses is conducted, which generates CMCT-independent stops. These stop sites can arise from various base-modified nucleosides or from circumstances where the enzyme stalls and cannot read through the RNA template. A comparison of CMCT-dependent and CMCT-independent stop sites can reveal those locations which are potential sites of pseudouridine. This approach is widely used for pseudouridine sequencing as it is convenient and effective [6]. However, there are some limitations. The method depends on visual observation of the intensity of gel bands to identify pseudouridine, which often leads to errors in identification. Also, the method cannot completely distinguish pseudouridine from other modified uridines which also inhibit the action of reverse transcriptase [29]. Other disadvantages include difficulties in distinguishing pseudouridine residues that are adjacent to each other or to uridines, such as ΨΨ and UΨ. In addition, the experimental procedure cannot cover the complete sequence of the RNA due to the lack of suitable reverse transcriptase primer binding sites. Hence an additional step utilizing a poly(A) tailing procedure is required for primer binding and reverse transcriptase characterization of the 3′ end of the RNA [6].
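
The comparison of stop sites amounts to a set difference, which the following Python sketch illustrates. The function name and the example positions are hypothetical, and treating the second series of analyses as a control without CMCT is an assumption of this sketch. The sketch takes stop positions as given and does not model band intensity, which the text identifies as a source of error.

```python
def candidate_pseudouridine_stops(stops_with_cmct, stops_without_cmct):
    """Return stop positions seen only after CMCT derivatization.

    Stops that also appear without CMCT are attributed to other
    modifications or to stalling of the enzyme, and are discarded.
    """
    return sorted(set(stops_with_cmct) - set(stops_without_cmct))


# Hypothetical stop positions read from the two series of gel lanes.
with_cmct = [35, 55, 88, 102]
without_cmct = [88]
print(candidate_pseudouridine_stops(with_cmct, without_cmct))  # [35, 55, 102]
```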

Detecting Pseudouridine Using Mass Spectrometry-Based Methods

An alternate approach for the identification and localization of post-transcriptional modifications involves the use of mass spectrometry. Mass spectrometry has played a notable role in the characterization of nucleic acids since the discovery of electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) as ionization techniques [43].

LC-MS Determination of RNA Hydrolysates

The most common approach for the determination of post-transcriptional modifications in RNA involves liquid chromatography/electrospray ionization mass spectrometry (LC-ESI-MS) of nucleoside digests [42, 44, 45]. In this approach, the RNA is digested into mixtures of nucleosides by total enzymatic hydrolysis and then analyzed by LC-ESI-MS. The combination of chromatographic retention times and mass measurements allows identification of nearly all modified nucleosides that are present in the intact RNA. While modified nucleosides can be detected with this approach, sequence placement of the identified post-transcriptional modifications is not possible.

An alternate detection strategy utilizing methyl vinyl sulfone has been implemented with LC-MS to identify pseudouridine in a mixture of nucleosides. Identification of pseudouridine was found to be straightforward after the derivatization, as detection sensitivity was improved upon LC-MS determination [42]. MVS derivatization is more suited for incorporation into a suite of LC-MS based protocols that are designed for application to mixtures of RNA nucleosides. While the sensitivity and specificity of pseudouridine detection in a mixture of nucleosides is enhanced upon MVS derivatization, no information can be obtained about the specific sequence placement of pseudouridine within the RNA.

RNase Mapping with Mass Spectrometry

For sequence placement of the modified nucleosides, an RNase mapping approach is typically implemented [46]. This procedure utilizes base-specific endoribonucleases that can perform selective enzymatic digestions to generate a set of oligonucleotides that are amenable to accurate mass measurements by ESI or MALDI mass spectrometry. Once accurate molecular mass measurements are obtained, comparisons of the experimental mass data with the data predicted from the gene sequence of the RNA can aid in the identification and sequence location of post-transcriptional modifications. This is done by attributing any deviations of experimental molecular masses from the predicted mass values to the presence of post-transcriptional modifications [46]. As nearly all modifications cause a change in mass, their detection and sequence placement are fairly straightforward. While this approach has been used successfully for mapping post-transcriptional modifications [45, 47, 48], it is not applicable to all modifications.
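
As a rough illustration of this comparison, the Python sketch below digests a sequence in silico and flags observed masses that match no predicted unmodified product. It assumes RNase T1 (cleavage after every G) as the base-specific endoribonuclease, 5′-OH and 3′-phosphate termini on every product, and a fixed mass tolerance. The residue masses are monoisotopic values supplied for this sketch and should be verified before any real use, and all function names are hypothetical.

```python
import re

# Monoisotopic masses (Da) of nucleotide residues (nucleoside monophosphate
# minus water). These values are assumptions of this sketch.
RESIDUE_MASS = {"A": 329.0525, "C": 305.0413, "G": 345.0474, "U": 306.0253}
WATER = 18.0106


def rnase_t1_digest(sequence):
    """Split an RNA sequence after every G (RNase T1 specificity)."""
    return re.findall(r"[ACU]*G|[ACU]+$", sequence)


def product_mass(oligo):
    """Neutral monoisotopic mass, assuming 5'-OH and 3'-phosphate termini.

    Simplification: a 3'-terminal fragment of the intact RNA would not
    carry a 3'-phosphate, which this sketch ignores.
    """
    return sum(RESIDUE_MASS[base] for base in oligo) + WATER


def unexplained_masses(sequence, observed, tol=0.5):
    """Return observed masses that match no predicted unmodified product."""
    predicted = [product_mass(p) for p in rnase_t1_digest(sequence)]
    return [m for m in observed if all(abs(m - p) > tol for p in predicted)]


print(rnase_t1_digest("AUGCCGUAG"))  # ['AUG', 'CCG', 'UAG']
```

Because uridine and pseudouridine have the same elemental composition, a U-to-Ψ change leaves `product_mass` unchanged, which is why a mass comparison of this kind cannot reveal pseudouridine on its own.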

Cyanoethylation and MALDI-MS

The determination of pseudouridine by mass spectrometry remains problematic because it is the only known mass-silent post-transcriptional modification and it does not manifest itself as a change in mass. Alternate mass spectrometry based procedures have been reported for the detection and sequence placement of pseudouridine in RNA [41, 42, 49, 50].

As mentioned above, pseudouridine can be chemically derivatized with reagents such as methyl vinyl sulfone (MVS), acrylonitrile, and the carbodiimide, CMCT. These derivatization reactions have been implemented with MALDI-MS and LC-ESI-MS. In particular, the use of MALDI-MS in conjunction with chemical derivatization has proven to be an effective means of detecting chemically modified pseudouridine.

In one approach, the reaction of pseudouridine with acrylonitrile (cyanoethylation) was used in conjunction with MALDI-MS to detect pseudouridine in tRNA [41]. This approach involved a combination of chemical derivatization, enzymatic hydrolysis and mass spectrometric steps to determine pseudouridine in RNA. First, intact tRNA was cyanoethylated and then subjected to endoribonuclease (RNase) digestion. Next, MALDI-MS was used to identify cyanoethylated fragments based on the mass spectrometric comparison of un-treated and acrylonitrile-treated samples. Differences of 53.0 Da between the two samples denoted the presence of fragments containing pseudouridine.
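
The comparison of untreated and acrylonitrile-treated spectra can be sketched as a search for peak pairs separated by a whole number of 53.0 Da shifts. In the Python sketch below, the tolerance, the maximum number of adducts considered, the peak lists, and the function name are all assumptions made for illustration. Peaks are treated as masses of singly charged ions, so that peak differences equal mass differences.

```python
CYANOETHYL_SHIFT = 53.0  # Da, mass increase per cyanoethyl adduct (from the text)


def find_shifted_peaks(untreated, treated, shift=CYANOETHYL_SHIFT,
                       max_adducts=3, tol=0.3):
    """Pair peaks whose masses differ by a whole number of adduct shifts.

    Returns a list of (untreated_mass, treated_mass, n_adducts) tuples.
    """
    hits = []
    for m0 in untreated:
        for m1 in treated:
            for n in range(1, max_adducts + 1):
                if abs((m1 - m0) - n * shift) <= tol:
                    hits.append((m0, m1, n))
    return hits


# Hypothetical peak lists (Da), for illustration only.
untreated_peaks = [998.1, 1303.2, 1632.2]
treated_peaks = [998.1, 1356.2, 1738.2]
print(find_shifted_peaks(untreated_peaks, treated_peaks))
# [(1303.2, 1356.2, 1), (1632.2, 1738.2, 2)]
```

A pair with `n_adducts` equal to 2 would be consistent with a digestion product containing two pseudouridines, although coincidental matches are possible in a crowded spectrum.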

Kirpekar and co-workers have utilized cyanoethylation to identify specific pseudouridines within large ribosomal RNAs [51]. The 23S rRNA from the large subunit of Haloarcula marismortui was found to contain three pseudouridine residues. First, the CMCT derivatization followed by reverse transcriptase/gel electrophoresis was used to identify CMCT-dependent stop sites, which are indicative of pseudouridine. These analyses suggested a total of four pseudouridine residues, U1956, U1958, U2607 and U2621 (H. marismortui numbering). To validate the reverse transcriptase assay, cyanoethylation of selected regions of 23S rRNA was performed with the resulting RNase digestion products analyzed by MALDI-MS. Cyanoethylation, which results in a 53.0 Da mass increase after derivatization, yielded confirmation of pseudouridine only at positions 1956, 1958 and 2621; the putative pseudouridine at 2607 was not detected by cyanoethylation. As will be discussed below, these findings agree with those published earlier that same year by McCloskey, Ofengand and co-workers using a different mass spectrometry-based approach for pseudouridine identification [52]. Similarly, Kirpekar and co-workers used cyanoethylation and MALDI-MS to discount putative pseudouridine modifications within the 23S rRNA of Thermus thermophilus. As for H. marismortui, initial screening for pseudouridine was conducted using CMCT derivatization followed by reverse transcriptase/gel electrophoresis to identify CMCT-dependent stop sites. Sites identified in this manner included U918, U1932, U1938, U2201, U2210 and U2616 (T. thermophilus numbering). The sites at positions 1932, 1938 and 2616 were considered definitive pseudouridines due to strong CMCT-dependent stops after primer extension. The sites at positions 918, 2201 and 2210 yielded less conclusive information, and were further analyzed by cyanoethylation and MALDI-MS. No 53.0 Da increase in the RNase digestion products containing these sites was detected, and the authors therefore concluded that these stop sites were not pseudouridine.

Multiple Reaction Monitoring (MRM) and LC-MS

Pomerantz and McCloskey recently described an MS/MS approach for pseudouridine detection that is compatible with LC-MS based RNase mapping of nucleic acids [50]. The success of this approach arises from the unique C-C glycosidic bond in pseudouridine. Compared to typical C-N glycosidic bonds found for other nucleosides, including uridine, the C-C glycosidic bond is less labile during collision-induced dissociation leading to unique fragmentation pathways when pseudouridine is present within an oligonucleotide.

Figure 7 illustrates the major fragment ions arising from pseudouridine. The diagnostic ions that can be used to identify pseudouridine include the doubly dehydrated nucleoside anion (m/z 207) and its MS/MS product ion at m/z 164, as well as the fragmentation products at m/z 225 and 165. By selectively detecting these fragmentation pathways using multiple reaction monitoring (MRM), the presence of pseudouridine in an oligonucleotide can be determined. Specifically, the m/z 207→164 fragmentation pathway can be used during an MRM scan to identify the presence of internal pseudouridines, and the m/z 225→165 transition can be used during an MRM scan to identify the presence of pseudouridine at the 5′-terminus of RNase digestion products (or other oligonucleotides). Pomerantz and McCloskey also noted a distinctive feature of the MS/MS spectra of RNase digestion products when pseudouridine is present: backbone cleavage between the C-3′ and O-3′ of the pseudouridine residue yields a- and w-type cleavage products at the site of pseudouridine that are more abundant than in uridine-containing samples.

Figure 7.

Major fragment ions for pseudouridine. All structures are drawn as the neutral species, and m/z values are for the anions.

This MRM-based assay for identifying pseudouridine in large RNAs has been implemented in two reports from the McCloskey laboratory [45, 52]. One report focused on the pseudouridines present in the 23S rRNAs of H. marismortui and Deinococcus radiodurans [52]. In that publication, CMCT derivatization followed by reverse transcriptase/gel electrophoresis was used to identify likely sites of pseudouridine residues within these two organisms. Complementary mass spectrometry analyses, which combined both MRM transitions with an examination of the relative abundances of a- and w-type ions, confirmed the sequence location of pseudouridines in D. radiodurans at positions 1894 and 1900 (D. radiodurans numbering) and in H. marismortui at positions 1956, 1958 and 2621 (H. marismortui numbering). In addition, these authors demonstrated the application of this approach for identifying the methylated pseudouridine, m3Ψ, at position 1898 in D. radiodurans, which is of particular interest because CMCT-independent stops are noted in the reverse transcriptase assay for this modified pseudouridine.

In another report, McCloskey and coworkers used a combination of MRM-based assays along with cyanoethylation to determine the exact sequence location of pseudouridine(s) in the 16S rRNA of Thermus thermophilus [45]. A total of three pseudouridines were detected at positions 516, 1540 and 1541 (E. coli numbering). Of note in that publication was the use of a unique MRM transition, the m/z 668→ 207 fragmentation pathway to identify ΨGp containing RNase T1 digestion products, which was the sequence context of pseudouridine 516. Cyanoethylation was used as a confirmatory step in the placement of the adjacent pseudouridine residues at positions 1540 and 1541.
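
The MRM transitions quoted in this section can be collected into a small lookup, as in the Python sketch below. The m/z values and their assignments are taken from the text; the matching tolerance, the function name, and the table layout are assumptions, and the sketch is not a description of any instrument vendor's software.

```python
# Precursor -> product transitions (m/z, anions) quoted in the text.
MRM_TRANSITIONS = {
    (207, 164): "internal pseudouridine",
    (225, 165): "pseudouridine at the 5'-terminus",
    (668, 207): "PsiGp-containing RNase T1 digestion product",
}


def interpret_transition(precursor_mz, product_mz, tol=0.5):
    """Return the assignment for an observed transition, or None if unknown."""
    for (q1, q3), meaning in MRM_TRANSITIONS.items():
        if abs(precursor_mz - q1) <= tol and abs(product_mz - q3) <= tol:
            return meaning
    return None


print(interpret_transition(207.0, 164.0))  # internal pseudouridine
print(interpret_transition(300.0, 100.0))  # None
```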

CMCT-Derivatization and MALDI-MS

Similar to the cyanoethylation reaction, MALDI-MS has been used with CMCT derivatization to identify pseudouridine residues through a characteristic mass shift [49]. This approach involves a combination of enzymatic hydrolysis, chromatographic separation and mass spectrometric steps to determine pseudouridine in RNA. Pseudouridine was chemically modified using CMCT and alkaline buffer. Upon mass spectrometric determination, pseudouridine could be identified by its characteristic mass shift of 252 Da due to the CMCT adduct. This mass shift could be observed by making a comparison of the masses of the unreacted and CMC-reacted oligonucleotides after HPLC purification of endonuclease digested RNA products. In this way, the number of pseudouridine residues in the oligonucleotide could also be determined. To determine the exact sequence location of pseudouridine, endonuclease digest products known to contain pseudouridine were further isolated by HPLC and subjected to enzymatic sequencing [49].

More recently, we optimized the CMCT derivatization conditions to improve the compatibility of this derivatization approach with direct MALDI-MS determination [53]. After these improvements, the derivatization strategy was found to be applicable in an RNase mapping approach for identifying pseudouridine in transfer RNAs, although it was also found that CMCT is not specific to pseudouridine, as it also modifies thiouridines and 2-methylthio-N6-isopentenyladenosine, which are common posttranscriptional modifications in tRNAs. These improvements have also enabled the application of a CMCT derivatization followed by MALDI-MS determination to identify pseudouridines from mixtures of tRNAs (Durairaj and Limbach, unpublished results) through a combination of RNase mapping and signature digestion products [54].

Modified Pseudouridines

Pseudouridine can also contain additional structural modifications that may affect its detection by either traditional biochemical assays or through the use of mass spectrometry-based assays [55]. To date, the known modified pseudouridines include 1-methylpseudouridine, 2′-O-methylpseudouridine and 1-methyl-3-(3-amino-3-carboxypropyl) pseudouridine, which have been detected in archaeal or eukaryotic RNAs, and 3-methylpseudouridine, which has been detected in bacterial RNAs [3].

Each of these modified pseudouridines can be identified using the standard nucleoside digestion/HPLC assay discussed earlier [30]. The MVS derivatization of modified pseudouridines, however, is likely only applicable to 2′-O-methylpseudouridine and 3-methylpseudouridine, as derivatization occurs at the N-1 position. As the CMCT-based assays using either reverse transcriptase or mass spectrometry require selective modification of the N-3 site on pseudouridine, these strategies will only be applicable to 1-methylpseudouridine and 2′-O-methylpseudouridine. In contrast, the cyanoethylation derivatization approach selectively modifies the N-1 site on pseudouridine, thus it can only be used to identify 2′-O-methylpseudouridine and 3-methylpseudouridine. The MS/MS based approaches should be the most general for identifying modified pseudouridines. Although the specific MRM transitions for the four modified pseudouridines have not been established, LC-MS/MS has been used to identify 3-methylpseudouridine in D. radiodurans [52].
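
The reasoning in this paragraph follows from which ring nitrogen each reagent must reach and which nitrogen each modification blocks. The Python sketch below encodes that logic; the blocked positions are inferred from the nucleoside names, and the data layout and function name are assumptions made for illustration.

```python
# Ring nitrogens blocked by each modification, inferred from the names.
BLOCKED = {
    "1-methylpseudouridine": {"N-1"},
    "3-methylpseudouridine": {"N-3"},
    "2'-O-methylpseudouridine": set(),  # ribose modification only
    "1-methyl-3-(3-amino-3-carboxypropyl)pseudouridine": {"N-1", "N-3"},
}

# Ring nitrogen that each derivatization must modify, as stated in the text.
REQUIRED_SITE = {"CMCT": "N-3", "cyanoethylation": "N-1", "MVS": "N-1"}


def applicable(reagent):
    """List the modified pseudouridines whose required site is still free."""
    site = REQUIRED_SITE[reagent]
    return sorted(name for name, blocked in BLOCKED.items() if site not in blocked)


for reagent in REQUIRED_SITE:
    print(reagent, "->", applicable(reagent))
```

Under these assumptions the sketch reproduces the assignments given in the text: CMCT for 1-methylpseudouridine and 2′-O-methylpseudouridine, and cyanoethylation or MVS for 2′-O-methylpseudouridine and 3-methylpseudouridine.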

CONCLUSIONS

Pseudouridine, the so-called fifth nucleoside due to its ubiquitous presence in RNAs, remains among the most challenging modified nucleosides to characterize. To date, sequence placement of pseudouridine in large RNAs by mass spectrometry has been demonstrated using either the MRM assay developed by the McCloskey group or the cyanoethylation derivatization approach developed by Kirpekar and co-workers. Similarly, our lab has focused on new mass spectrometry approaches using CMCT derivatization to identify pseudouridine in small RNAs. An overall review of these recent publications suggests that the MALDI-based derivatization strategies (cyanoethylation or CMCT) may be most useful for characterizing pseudouridines within complex mixtures of small RNAs. In contrast, the MRM-based assays are particularly useful for identifying pseudouridine from larger rRNAs, with confirmation of the MRM results being possible via MALDI-based derivatization strategies and/or CMCT-reverse transcriptase assays.

The last five years have yielded a number of new mass spectrometry-based approaches for pseudouridine detection which complement the existing reverse transcriptase/gel electrophoresis method. While these mass spectrometry approaches have demonstrated utility, there remain several challenges for identifying pseudouridine and modified pseudouridines. Probably the most significant challenge remains the amount of starting material required for pseudouridine identification and detection. LC-MS methodologies for analyzing total nucleoside digests require the most sample, typically 10 to 100 micrograms or more of initial RNA. Moreover, as pseudouridine elutes early under standard HPLC separation conditions [56], it is often difficult to separate, which led to the development of MVS-derivatization [42]. Due to the polarity of pseudouridine, the use of hydrophilic interaction liquid chromatography (HILIC) may prove to be more advantageous for its separation and mass spectrometric determination [57].

For mass spectrometry-based assays that are dependent upon chemical derivatization (e.g., cyanoethylation or CMCT), the challenge is to have sufficient starting material such that the products of the derivatization reaction can be purified and analyzed to give unambiguous results. Similarly, for pseudouridine sequencing, the challenge is to have sufficient starting material to work above the instrumental limits of detection during MRM scans. Whether analyzing small RNAs, such as miRNAs or tRNAs, or large ribosomal RNAs, the analytical challenges grow as the complexity of the initial sample increases. Here, the development of affinity purification methods could reduce detection limits by eliminating most, if not all, of the RNA components which do not contain pseudouridine. Alternatively, targeted enzymatic digestion, such as with RNase H, can be used to limit the overall size of the RNA being analyzed for pseudouridine.

The widespread abundance and structural significance of pseudouridine, along with as yet unanswered questions regarding its functional significance, suggest that future investigations will require robust and sensitive approaches for detecting pseudouridine and identifying its sequence location in a variety of RNAs. The recently developed approaches are clearly poised for even further applications, and as chromatographic and mass spectrometric approaches and instrumentation improve, the field can expect these methods to be more widely applied for answering important biological questions regarding the importance of the fifth nucleoside.

ACKNOWLEDGEMENTS

Financial support of this work was provided by the National Institutes of Health (GM58843) and the University of Cincinnati.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

REFERENCES
