Author manuscript; available in PMC: 2024 Dec 6.
Published in final edited form as: J Am Soc Mass Spectrom. 2023 Nov 10;34(12):2785–2792. doi: 10.1021/jasms.3c00298

Automated Identification of Modified Nucleosides during HRAM-LC-MS/MS using a Metabolomics ID Workflow with Neutral Loss Detection

Robert L Ross 1, Ningxi Yu 2, Ruoxia Zhao 2, Andrew Wood 2, Patrick A Limbach 2,*
PMCID: PMC11587168  NIHMSID: NIHMS2033577  PMID: 37948765

Abstract

The role of post-transcriptional modification in biological processes has been an ongoing field of study for the past several decades. Improvements in liquid chromatography platforms and mass spectrometry instrumentation have resulted in enhanced identification, characterization, and quantification of modified nucleosides in biological systems. One consequence of the rapid technological improvements in the analytical acquisition of modified nucleosides has been a dearth of robust data processing workflows for analyzing more than a handful of samples at a time. To improve the utility of LC-MS/MS for batch analyses of modified nucleosides, a workflow for automated nucleoside identification has been developed. We have adapted the Thermo Fisher Scientific metabolomics identification software package, Compound Discoverer, to accurately identify modified nucleosides from batch LC-MS/MS acquisitions. Three points of identification are used: accurate mass from a monoisotopic mass list, spectral matching from a spectral library, and neutral loss identification. This workflow was applied to a batch (n=24) of urinary nucleosides resulting in the accurate identification and relative quantification of 16 known nucleosides in less than one hour.

Graphical Abstract

Workflow for using Compound Discoverer for the automated detection of modified nucleosides during LC-MS/MS

INTRODUCTION

There are over 150 modifications found in RNA throughout the three kingdoms of life, with the greatest density residing on transfer RNAs (tRNA). Many of these modifications are homologous to all tRNAs, such as the TΨC loop containing pseudouridine (Ψ), or the D-Loop containing dihydrouridine (D). Under normal homeostasis, it can be reasoned that the levels of many of these modifications would remain static unless some perturbation of the system (e.g., stress or disease) intervened, making RNA modified nucleosides candidates for clinical analysis in the early detection of disease and its progression. Historically, early disease screening of post-transcriptional modifications has focused on feature detection in urine,1–3 where the nucleoside abundances of healthy and sick patients are compared. As with all clinical assays, adoption and application depend on the sensitivity and selectivity of the biomarker(s) and the robustness, throughput and overall cost of the analytical platform and measurements.

The use of liquid chromatography-tandem mass spectrometry (LC-MS/MS) is ideal for clinical analysis.4 Current high-resolution accurate mass (HRAM) instruments can accurately differentiate molecules whose molecular mass differs by 1 ppm or more. Tandem mass spectrometry allows one to discern the structure,5 differentiating between positional isomers or characterizing functional group position.6 This methodology was applied by Jiang for profiling of urinary nucleosides by boronate affinity chromatography coupled to a HRAM mass spectrometer.7 Kuskovsky quantified deoxyribonucleotides in fibroblasts using stable isotope labeling and ion-pairing liquid chromatography HRAM mass spectrometry.8 An HRAM LC-MS/MS platform was used by Bao to detect elevated levels of nucleosides in plasma following irinotecan exposure.9

While the technical performance of HRAM and LC-MS/MS of nucleosides is sufficiently robust to warrant applications in a clinical setting, there remains a lack of appropriate automated data analysis methods that would improve throughput without significantly increasing assay expense. Historically, global analysis of RNA modified nucleosides has been performed manually by compiling a list of known nucleoside mass values, which can be tabulated from online databases,10, 11 followed by a manual or semi-automated (e.g., neutral loss calculations) examination of the entire data set. Confirmation of nucleoside identifications is then performed by manual inspection of the fragmentation pattern generated during MS/MS.

Nucleoside fragmentation is described by loss of the nucleobase as a charged ion accompanied by loss of the ribose ring as a neutral species. Three forms of the neutral ribose structure can be generated during collisional dissociation depending on the nucleoside identity: the ribose sugar itself, having a mass of 132 Da; a 2’-O-methylated ribose with a mass of 146 Da; or a 2’-deoxyribose with a mass of 116 Da. By monitoring for these neutral loss masses, one can identify the possible nucleosides in a sample. This approach is most often implemented using a triple quadrupole mass spectrometer.12, 13 Kellner used the neutral loss approach to develop a stable isotope dilution protocol, which allows for quantification of nucleosides in a sample by stable isotope labeling (SIL).14, 15 Grobe applied the neutral loss approach for the detection of 32 post-transcriptionally modified nucleosides in the organism P. aeruginosa.16 The addition of neutral loss identification lends specificity to nucleoside profiling while also allowing for discovery of previously unknown modifications.17

Another recent development in the field of nucleoside analysis by mass spectrometry has been the implementation of spectral libraries to assist in the identification of modified nucleosides. This approach is most often implemented using HRAM at varying collision energies (e.g., using higher-energy collisional dissociation (HCD)).18, 19 One advantage of this approach is that positional isomers (i.e., modified nucleosides having the same mass and neutral loss mass but different structures) are more readily differentiated due to the generation of more fragments than just dissociation of the glycosidic bond. Additionally, spectral matching approaches can significantly enhance the discovery of previously unknown modifications through similarity searching techniques.18, 20

While there are many database search programs available,21, 22 most are aimed at peptide identification,23 lack robust statistical packages,24 or require some knowledge of coding to augment the software for the needed analysis.25 Here we demonstrate a user-friendly approach for automating nucleoside data analysis, built in this instance within Thermo Scientific Compound Discoverer, which uses multiple layers of redundancy to identify nucleosides in a sample, compare their relative abundance across a sample set, and report the data in a user-friendly output. This workflow was shown on the qualitative analysis of urinary nucleosides obtained from a single subject over 24 days.

EXPERIMENTAL

Urine Samples.

Urine samples were collected over 24 days and were frozen immediately after collection until analysis. Urine was thawed at room temperature, and 1 mL from each sample was placed in a 1.5 mL Eppendorf tube and centrifuged at 1000 × g for 10 min. A 100 μL aliquot was then transferred to a clean 1.5 mL Eppendorf tube, to which 300 μL of acetonitrile (Honeywell) was added. This tube was then vortexed and centrifuged at 1000 × g for 30 min. A 100 μL portion of the supernatant was removed and dried in a speed vac (Thermo Fisher). Samples were reconstituted in 20 μL of 5.3 mM ammonium acetate (pH 5.3) prior to injection. A 5 μL aliquot of each sample was taken and combined to create a separate pooled sample, which the program uses for normalization (QC). It should be noted that this QC sample is used to compensate for time-dependent batch effects that can be observed when acquiring large, uninterrupted sets of samples26 and is not a QC sample as used in traditional quantification assays. In total, 24 time point samples and one QC sample were injected.

LC-MS/MS Analysis.

Separation was accomplished by reversed-phase liquid chromatography using an Acquity UPLC HSS T3, 1.8 μm, 1 mm X 100 mm column, (Waters, Milford, MA) on a Vanquish Horizon Quaternary UHPLC system (Thermo Fisher Scientific, San Jose, CA). Mobile phase A consisted of 5.3 mM ammonium acetate in LC-MS grade water, pH 5.3. Mobile phase B consisted of a 60:40 mixture of 5.3 mM ammonium acetate and acetonitrile with a gradient of 0% B (from 0 to 7.6 min), 2% B at 15.7 min, 3% B at 19.2 min, 5% B at 25.7 min, 25% B at 29.5 min, 50% B at 32.3 min, 75% B at 36.4 min, 99% B at 36.6 min (hold for 3 min), returning to 0% B at 46.8 min at a flow rate of 100 μL min−1. Flow is then ramped to 200 μL min−1 at 46.9 min, returned to 100 μL min−1 at 55.1 min then finished at 60 min. The column temperature was set at 40 °C.

HRAM analyses were performed on an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific) interfaced with a heated electrospray (H-ESI) source in positive polarity mode. Full scan data was acquired at a resolution of 120,000, mass range 220-900 m/z, automatic gain control (AGC) 7.5e4, and injection time (IT) 100 ms. Two data-dependent top speed MS/MS spectra (1 s cycle) filters were acquired in the orbitrap at a resolution of 15,000, AGC 1.0e4, and IT 150 ms. One filter was for CID and the second was for HCD; the two differed only in fragmentation type and energy, with CID set to 42 and HCD set to 80. The other instrumental conditions were quadrupole isolation of 1 m/z; radio frequency (RF) 35%; sheath gas, auxiliary gas, and sweep gas of 30, 10 and 0 arbitrary units, respectively; ion transfer tube temperature of 289 °C; vaporizer temperature of 92 °C; and spray voltage of 3500 V. Data was analyzed using Xcalibur 4.0, Compound Discoverer 3.0 and mzVault 2.1.

Spectral Library Construction.

To create the spectral library, transfer RNA from Haloferax volcanii, Cyberlindnera jadinii, Saccharomyces cerevisiae, human liver and placenta were purified and digested to nucleosides as previously described.27 Transfer RNA from Escherichia coli MRE600 was purchased from Sigma Aldrich (St. Louis, MO). Nucleoside standards were prepared at concentrations of 1 μg μL−1, and mixtures of multiple standards were prepared at a final concentration of ~10 ng μL−1. Positional isomers (e.g., 3-methylcytidine (m3C) and N4-methylcytidine (m4C)) were analyzed in separate mixtures. LC-MS/MS analysis was performed at five different collision energies (20, 40, 60, 80 and 100 a.u.) in the ion routing multipole to generate higher-energy collisional dissociation (HCD)-based fragmentation,28 as well as 42% in the dual-pressure linear ion-trap to generate collision-induced dissociation (CID)-based fragmentation.29 RAW format files were loaded into mzVault 2.1 (Thermo Fisher Scientific). The spectral library was created by importing the spectra for each nucleoside at the activation energies chosen during data acquisition (20-100 a.u. HCD, 42% CID). The workflow for importing the spectral library is explained in Supplemental Figure S1. In total, eighty-three nucleosides or their metabolic analogs were imported into the spectral library.

Nucleoside Mass List.

To identify detected signals against nucleoside hits, a nucleoside mass list was created using ribonucleoside chemical formulas from MODOMICS10 and included 2’-deoxynucleosides as well as other relevant molecules such as S-Adenosyl methionine (SAM), methylthioadenosine (MTA) and 8-oxoguanosine (8-oxoG). In total, the mass list contained 138 entries.
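As an illustration of how a mass list entry can be derived from a chemical formula, the following minimal Python sketch computes a monoisotopic mass from a space-separated formula string. The helper names (`parse_formula`, `monoisotopic_mass`) and the element table are assumptions made for this example and are not part of Compound Discoverer or MODOMICS.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
    "Se": 79.9165218,
}


def parse_formula(formula: str) -> dict:
    """Parse a formula such as 'C11 H15 N3 O6' into {'C': 11, 'H': 15, ...}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula: str) -> float:
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


# Hypothetical usage: N4-acetylcytidine (ac4C) and mcm5s2U
print(round(monoisotopic_mass("C11 H15 N3 O6"), 5))
print(round(monoisotopic_mass("C12 H16 N2 O7 S"), 5))
```

The values produced this way can be compared against the molecular weights reported by the software for the same formulas.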

Neutral Loss Mode.

A neutral loss annotation node was written in C# with Visual Studio 15, using the node development kit available for CD3, and implemented into our data analysis workflow. This node compares the HRAM MS spectrum to the fragment ion spectrum acquired during MS/MS to determine if any expected neutral loss values might be present. For ribonucleosides, neutral loss values of 132.0422 Da for the ribose, 146.0579 Da for a methylated ribose and 36.02113 Da for pseudouridine were used. For deoxyribonucleosides, a neutral loss value of 116.0473 Da for deoxyribose was used. In newer releases of Compound Discoverer, the neutral loss node feature is included.
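The logic of such a neutral loss check can be sketched in a few lines of Python. This is not the C# node itself; the function name `annotate_neutral_loss`, the 10 ppm tolerance, and the assumption of singly charged ions are choices made for this illustration only. The neutral loss values are those listed above.

```python
# Neutral loss values (Da) as given in the text.
NEUTRAL_LOSSES = {
    "ribose": 132.0422,
    "methylated ribose": 146.0579,
    "deoxyribose": 116.0473,
    "pseudouridine": 36.02113,
}


def annotate_neutral_loss(precursor_mz, fragment_mzs, tol_ppm=10.0):
    """Return the labels of expected neutral losses observed in an MS/MS spectrum.

    Assumes singly charged precursor and fragment ions, so the loss is
    simply precursor m/z minus fragment m/z. tol_ppm is an assumed tolerance.
    """
    hits = []
    for label, loss in NEUTRAL_LOSSES.items():
        expected = precursor_mz - loss
        tolerance = expected * tol_ppm * 1e-6
        if any(abs(mz - expected) <= tolerance for mz in fragment_mzs):
            hits.append(label)
    return hits


# Illustrative values: a methyladenosine [M+H]+ (calculated, ~282.1197)
# fragmenting to the nucleobase ion at m/z 150.0774.
print(annotate_neutral_loss(282.1197, [150.0774]))
```

A feature whose fragment list returns an empty result would carry no neutral loss annotation and would be excluded by a filter that requires one.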

Compound Discoverer.

Workflow:

One feature of the software is the ability to tailor the workflow for a specific analysis. This is accomplished by decoupling the workflow into individual nodes (Figure 1) where each individual step in the data processing workflow can be optimized. Results can be filtered based on multiple parameters. For this acquisition, we filtered all results based on name, formula, and neutral loss. Results are shown in the main window, where chromatogram, spectral identity, chemical properties, statistical scoring, etc. can be viewed (Figure 2). The generic workflow file, “Untargeted Metabolomics with Statistics Detect Unknowns with ID Using Local Databases,” was used to build the process used in this work. Changes to the generic workflow are presented in Table 1. To enhance data processing, some settings were specifically adjusted for nucleoside chemistry. In the Select Spectra node, retention times and masses were limited to the acquisition time and scan range used in this study. For the Detect Compounds node settings, the signal-to-noise ratio (S/N Threshold) was raised while Min. Peak Intensity was lowered to limit the number of peaks arising from noise that were processed. Under the Ions dropdown for feature detection, all singly charged positive ions were chosen except for “[M+DMSO+H]+1” as there was no dimethyl sulfoxide (DMSO) used anywhere in the workup or acquisition. “C8 H12 N2 O5” was chosen as Min. Element Counts based on the dihydrouridine (D) molecule, as it represents a skeletal nucleoside structure. “C30 H60 N10 O18 P3 S5 Se” was chosen for Max. Element Counts, as no nucleoside with more atoms than those listed has been reported.

Figure 1.

Representative image of the Compound Discoverer 3.0 workflow nodes. Each node allows for fine tuning of the parameters used in the analysis.

Figure 2.

Representative image of the Compound Discoverer 3.0 main window output. The top left pane shows sample chromatograms; MS and MS/MS spectra, with the isotopic pattern highlighted in green, are shown top right. Detected features, after filtering, are shown in the bottom panel, where each detected nucleoside's chemical formula, RT, neutral mass and area are shown.

Table 1.

Nucleoside chemistry specific parameters used in the analysis.

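Node | Parameter | Value
Select Spectra | Lower RT Limit | 1
Select Spectra | Upper RT Limit | 50
Select Spectra | Lowest Charge State | 1
Select Spectra | Highest Charge State | 1
Select Spectra | Min. Precursor Mass | 200
Select Spectra | Max. Precursor Mass | 600
Detect Compounds | Intensity Tolerance % | 20
Detect Compounds | S/N Threshold | 10
Detect Compounds | Min. Peak Intensity | 50000
Detect Compounds | Ions | All +1*
Detect Compounds | Min. Element Count | C8 H12 N2 O5
Detect Compounds | Max. Element Count | C30 H60 N10 O18 P3 S5 Se
Group Compounds | Mass Tolerance | 10 ppm
Group Compounds | Preferred Ions | [M+H]+1
Search mzVault | mzVault library | In house
Search mzVault | Match Activation Energy | Any
Search mzVault | Match Factor Threshold | 70
Search mzVault | Use Retention Time | True
Predict Composition | Min. Element Count | C8 H12 N2 O5
Predict Composition | Max. Element Count | C30 H60 N10 O18 P3 S5 Se
Predict Composition | Min. RDBE | 2.5
Predict Composition | Max. RDBE | 15
Search Mass List | Mass List | In house
Search Mass List | Use Retention Time | False
Assign Compound Annotation | Data Source #1 | mzVault search
Assign Compound Annotation | Data Source #2 | Predicted Compositions
Assign Compound Annotation | Data Source #3 | Mass List search
Normalize Area | Normalization Type | Median Absolute Deviation (MAD)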

The Search mzVault parameters used only the in-house spectral library, with “any” for Match Ion Activation Energy and Use Retention Time set to “True.” The Predict Composition node settings used the same elements for Min. and Max. Element Count as in the Detect Compounds node. The Min. RDBE was set to 2.5 and the Max. RDBE to 15. These were chosen, again, based on the dihydrouridine structure and the maximum envisioned nucleoside structure. The Search Mass List node used only the in-house mass list, with Use Retention Time set to “False.” Assign Compound Annotation used “mzVault Search” (Data Source #1), “Predicted Compositions” (Data Source #2) and “Mass List Search” (Data Source #3). Under the Normalize Areas node, the Normalization type was set to “Median Absolute Deviation (MAD).”
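To make the effect of the element count and RDBE limits concrete, the sketch below checks whether a candidate formula falls inside the bounds listed in Table 1. It uses the common ring and double bond equivalent expression, RDBE = C − H/2 + (N + P)/2 + 1; whether Compound Discoverer computes RDBE in exactly this way is an assumption, and the function names are hypothetical.

```python
import re

# Bounds taken from Table 1 (Detect Compounds / Predict Composition).
MIN_COUNTS = {"C": 8, "H": 12, "N": 2, "O": 5}
MAX_COUNTS = {"C": 30, "H": 60, "N": 10, "O": 18, "P": 3, "S": 5, "Se": 1}
MIN_RDBE, MAX_RDBE = 2.5, 15.0


def parse_formula(formula: str) -> dict:
    """Parse 'C11 H15 N3 O6' into {'C': 11, 'H': 15, 'N': 3, 'O': 6}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def rdbe(counts: dict) -> float:
    """Ring and double bond equivalents; divalent O, S and Se do not contribute."""
    c = counts.get("C", 0)
    h = counts.get("H", 0)
    n_p = counts.get("N", 0) + counts.get("P", 0)
    return c - h / 2 + n_p / 2 + 1


def is_plausible_nucleoside(formula: str) -> bool:
    """True if the formula lies within the element count and RDBE limits."""
    counts = parse_formula(formula)
    if any(el not in MAX_COUNTS for el in counts):
        return False  # element outside the allowed set
    if any(counts.get(el, 0) < n for el, n in MIN_COUNTS.items()):
        return False  # smaller than the skeletal nucleoside
    if any(n > MAX_COUNTS[el] for el, n in counts.items()):
        return False  # larger than the maximum envisioned structure
    return MIN_RDBE <= rdbe(counts) <= MAX_RDBE


print(is_plausible_nucleoside("C11 H15 N3 O6"))  # ac4C
print(is_plausible_nucleoside("C6 H12 O6"))      # a hexose, too few C and N
```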

Data Processing:

Once the workflow has been created, the samples to be processed (IDs, Blanks, Samples)30 are moved into the Analysis Pane and then processed. Processing (~28 files) on a Dell Precision 5550 laptop was completed in less than 1 hour.

RESULTS AND DISCUSSION

To establish an improved data processing workflow for nucleoside analysis acquired by UHPLC-MS/MS, we adapted the commercially available software, Compound Discoverer 3.0 and mzVault, to generate a new, automated workflow for nucleoside identification. Using accurate mass, isotopic distribution and MS/MS spectra results in an enhanced and accurate identification of nucleosides in a format applicable for identification from a single sample to batch analysis.

Mass List Matching

During data acquisition a detected feature’s m/z is measured in the orbitrap analyzer at a high degree of precision based on the resolution settings defined in the instrument’s acquisition method. In a small molecule analysis, such as a nucleoside acquisition, this can result in thousands of “hits” being recorded as the acquisition mass window (here 200-600 Da) is small enough that other molecules, like solvent background ions, are also detected. Using the node “Assign Compound Annotation” with the Mass List Search as a data option, the software will “match” observed mass from the MS spectra with that of a user created mass list containing name and chemical formula for each entry. Tolerance settings for matching are set within the node with the matched feature annotated and reported in the main window (Figure 2) under the name column. The entire table can then be filtered to display only those cells that contain an annotation.
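The matching step described here amounts to a tolerance search of an observed neutral mass against the list. A minimal Python sketch follows; the 5 ppm default tolerance and the names `ppm_error` and `match_mass` are assumptions for illustration, and the four list entries are taken from Table 2.

```python
# (name, formula, monoisotopic mass in Da); a small excerpt for illustration.
MASS_LIST = [
    ("ac4C", "C11 H15 N3 O6", 285.09609),
    ("Cm", "C10 H15 N3 O5", 257.10117),
    ("m3C", "C10 H15 N3 O5", 257.10117),
    ("m7G", "C11 H15 N5 O5", 297.10732),
]


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_mass(observed_mass, mass_list=MASS_LIST, tol_ppm=5.0):
    """Return the names of all mass list entries within tol_ppm of the observed mass."""
    return [
        name
        for name, _formula, mass in mass_list
        if abs(ppm_error(observed_mass, mass)) <= tol_ppm
    ]


print(match_mass(285.0961))  # one candidate
print(match_mass(257.1012))  # isomers share a mass, so both are returned
```

The second call shows why accurate mass alone cannot separate positional isomers: every entry with the same formula is an equally valid match, which is the motivation for the spectral and neutral loss layers.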

Spectral Library Matching.

A spectral library was created from 83 RNA nucleosides or analogs. A collision energy of 42% for CID along with a range for HCD (20 – 100 a.u.) were chosen to populate the spectral library. HCD fragmentation is generated in a multipole instead of an ion trap,31 and at low energies this fragmentation mimics that of ion-trap fragmentation. HCD fragmentation at higher energies is useful for identifying positional isomers when standards are not available,19 and provides fragmentation that can be used for identification of unknowns.32, 33 Given that many families of RNA nucleosides exist as isomers,34 this approach takes advantage of HCD fragmentation to identify positional isomers.18, 35

Spectra were imported into mzVault either from the Compound Discoverer graphical user interface (GUI) or directly from the RAW file. Different permutations of the Compound Discoverer Search mzVault node settings were tested to identify the best parameters for detection. These parameters included the “Match Ion Activation Type,” “Match Ion Activation Energy,” and “Use Retention Time.” To determine if CID and low energy HCD could be used interchangeably in spectral matching, a 5-μg injection of E. coli nucleoside digest acquired with HCD 20 a.u. was processed against a spectral library containing only CID 42% spectra. Conversely, a 5-μg injection of E. coli nucleoside digest acquired with CID 42% was processed with a spectral library containing only HCD 20 a.u. spectra. Each returned 40 spectral matches, the only difference being that dihydrouridine was not matched in the HCD 20 a.u. acquisition (data not shown). These results show that low collision energies that cause N-glycosidic bond cleavage, whether imparted in an ion trap (CID 42%) or an ion routing multipole (HCD 20 a.u.), are interchangeable for spectral matching if the spectral library contains only base loss. However, given that the nucleobases of positional isomers can have the same mass, fragmentation acquired at higher energies is necessary for correct identification of the isomer. Even at a higher fragmentation energy, the software can still return a false positive when not considering RT, as some fragmentation profiles of isomers may vary only in the fragment ion ratios. For example, the molecule 1-methyladenosine (m1A), which elutes at ~4 min, was identified in both HCD 20 a.u. and CID 42% outputs at the elution time of N6-methyladenosine (m6A). Two other positional isomers, 2-methyladenosine (m2A) and 8-methyladenosine (m8A), were also identified at the RT of m6A. This result is not surprising as all three positional isomers (m1A, m2A, and m8A) yield the same low energy MS/MS spectra, i.e., the nucleobase ion at m/z 150.0774. Thus, using only low energy HCD or CID for nucleoside identification risks mis-annotation of the components of the sample.

We then tested whether using a higher fragmentation energy (HCD 80) to positively identify the molecules through their unique fragmentation patterns would reduce the false positives. In general, the “Use Retention Time” set to “True” in the mzVault node was found to be necessary for accurate spectral matching. Furthermore, setting the spectral match score lower in the node results in more identifications at the cost of higher false positive rates. We have found that using the default Best Match score of 70 is adequate for global analyses and was the setting used in the urine assay.

Of interest was whether higher fragmentation energy improves the differentiation and identification of positional isomers. The results here were a reversal of the low-energy output: m2A and m8A were listed as low-scoring matches for the 1-methyladenosine (m1A) spectrum at ~4 min (Figure 3). When retention time is a parameter, the false positives are minimized, yet false identifications of the positional isomers are still seen, except here it is because of their elution window. The elution profile of the singly methylated adenosines is m1A, Am, m2A, m8A and m6A, with m2A and m8A eluting within a minute of the m6A peak using our gradient. When using RT for spectral matching, the setting in the node allows for a ± RT window with a maximum setting of 10 minutes. Restricting this window can eliminate false positives between positional isomers such as m3C and m5C, which elute a few minutes apart, but will still identify m2A, m8A and m6A when only one is present. However, a quick visual inspection of their spectral match will identify which species are present.
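The interplay between the match score threshold and the RT window can be illustrated with a small Python sketch. The scoring algorithm used by mzVault is not described here, so a plain cosine similarity scaled to 0-100 is substituted as a stand-in; the library retention time for m6A (30.0 min), the peak-pairing tolerance, and all function names are hypothetical.

```python
import math


def cosine_score(query, library, tol=0.005):
    """Cosine similarity (0-100) between two centroided spectra [(mz, intensity), ...].

    Each query peak is paired with the closest unused library peak within tol (Da).
    This is a stand-in metric, not the mzVault scoring algorithm.
    """
    dot = 0.0
    used = set()
    for mz_q, int_q in query:
        best = None
        for j, (mz_l, _int_l) in enumerate(library):
            if j in used or abs(mz_q - mz_l) > tol:
                continue
            if best is None or abs(mz_q - mz_l) < abs(mz_q - library[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_q * library[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_l = math.sqrt(sum(i * i for _, i in library))
    if norm_q == 0 or norm_l == 0:
        return 0.0
    return 100.0 * dot / (norm_q * norm_l)


def library_hits(query, query_rt, library, min_score=70.0, rt_window=None):
    """Return (name, score) for library entries passing the score and optional RT filter."""
    hits = []
    for entry in library:
        if rt_window is not None and abs(query_rt - entry["rt"]) > rt_window:
            continue
        score = cosine_score(query, entry["peaks"])
        if score >= min_score:
            hits.append((entry["name"], round(score, 1)))
    return sorted(hits, key=lambda h: -h[1])


# Low-energy spectra of methyladenosine isomers: only the nucleobase ion.
LIBRARY = [
    {"name": "m1A", "rt": 4.0, "peaks": [(150.0774, 100.0)]},
    {"name": "m6A", "rt": 30.0, "peaks": [(150.0774, 100.0)]},  # RT is hypothetical
]
query = [(150.0774, 100.0)]

print(library_hits(query, 4.1, LIBRARY))                 # both isomers match
print(library_hits(query, 4.1, LIBRARY, rt_window=1.0))  # RT filter removes m6A
```

With identical low-energy spectra, only the RT window separates the two entries; isomers that elute inside the same window would still both be returned, which is consistent with the need for visual inspection noted above.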

Figure 3.

mzVault false positive spectral match of m8A (bottom spectra).

Urinary Nucleosides.

To establish a method for batch processing of nucleoside data, we applied our CD3 workflow to a UHPLC-HRAM-MS/MS analysis of urine samples (n=24) from a single subject. The analysis took ~1 hour to process and generated over 10,000 features. The resulting output was first filtered to include all features with an annotation in the name column. Any pseudouridine-containing nucleosides annotated in the name column were manually inspected before removal through further filtering. Pseudouridine and its methylated isomers contain a C-C bond between sugar and nucleobase and therefore do not lose the signature neutral loss mass of 132 or 146. Further filtering to include only those cells that contained an annotation in the Neutral Loss, Name and Formula columns, along with having MS/MS spectra, resulted in 23 identities under the “Compounds” output tab. As previously stated, name annotations in the “Compounds” columns are the result of the hierarchical choice made in the Assign Compound Annotation node. Order of preference in annotation choice is crucial for accurate identification. To assign the correct name to a feature while minimizing errors, the spectral library needs to be chosen as preference #1 followed by Mass List and Predicted Formula.
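The hierarchical annotation and the column-based filtering described in this paragraph can be sketched as follows. The feature records, field names and function names are hypothetical; the priority order follows the one stated in the paragraph above (spectral library, then mass list, then predicted formula).

```python
# Order of preference for naming a feature, as stated in the text.
SOURCE_PRIORITY = ("mzvault", "mass_list", "predicted")


def assign_name(feature, priority=SOURCE_PRIORITY):
    """Return the annotation from the highest-priority source that has one."""
    for source in priority:
        name = feature["annotations"].get(source)
        if name:
            return name
    return None


def filter_features(features):
    """Keep features with a name, a formula, a neutral loss annotation and MS/MS."""
    kept = []
    for feature in features:
        name = assign_name(feature)
        if name and feature.get("formula") and feature.get("neutral_loss") and feature.get("has_msms"):
            kept.append({**feature, "name": name})
    return kept


# Hypothetical feature table with three entries.
features = [
    {"annotations": {"mzvault": "ac4C", "mass_list": "ac4C"},
     "formula": "C11 H15 N3 O6", "neutral_loss": "ribose", "has_msms": True},
    {"annotations": {"mass_list": "m7G"},
     "formula": "C11 H15 N5 O5", "neutral_loss": None, "has_msms": True},
    {"annotations": {}, "formula": None, "neutral_loss": None, "has_msms": False},
]

for kept in filter_features(features):
    print(kept["name"], kept["neutral_loss"])
```

A filter of this kind also drops C-glycosides that lack a ribose-type neutral loss annotation, which is why pseudouridine-containing entries are inspected manually before the neutral loss filter is applied.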

Under the “mzVault Results” tab, the number of molecules identified by spectral match was 20, with ac4C at the top of the list with a spectral identification Best Match score of 97.7. By highlighting each entry in the table, its spectral match can be verified against the library entry and its identity quickly confirmed. In our results, all four singly methylated cytidines were listed (m3C, m4C, m5C and Cm); through spectral matching it was found that only m3C and Cm were present in the batch. In total, sixteen nucleosides were accurately identified in the batch through spectral matching (Table 2), with Um and m3Ψ having low Best Match scores (63 and 75, respectively) and removed from the results. The number detected is around half of what was reported previously by He,36 which is attributed to the lack of enrichment through SPE.

Table 2.

Final mzVault results: Seventeen modified nucleosides identified through accurate mass, neutral loss, and spectral match. The Match Factor Threshold for this analysis was set to 70.

Name | Formula | Molecular Weight | RT [min] | Best Match
ac4C | C11 H15 N3 O6 | 285.09609 | 19.92 | 97.7
m6t6A | C16 H22 N6 O8 | 426.14991 | 31.71 | 96.9
mcm5s2U | C12 H16 N2 O7 S | 332.06782 | 30.04 | 96.3
L-t6A | C15 H20 N6 O8 | 412.13426 | 30.47 | 95.8
m2,2G | C12 H17 N5 O5 | 311.12297 | 29.19 | 94.7
Cm | C10 H15 N3 O5 | 257.10117 | 6.36 | 94.7
Y | C9 H12 N2 O6 | 244.06954 | 1.53 | 94.2
m6Am | C12 H17 N5 O4 | 295.12805 | 32.43 | 94.1
acp3U | C13 H19 N3 O8 | 345.11721 | 2.67 | 93.6
m2,2,7G | C13 H19 N5 O5 | 325.13862 | 29.47 | 93.5
ncm5Um | C12 H17 N3 O7 | 315.10665 | 2.72 | 92.9
ncm5U | C11 H15 N3 O7 | 301.091 | 2.776 | 92.8
m3C | C10 H15 N3 O5 | 257.10117 | 3.51 | 92.3
ncm5U | C11 H15 N3 O7 | 301.091 | 2.66 | 92.1
m7G | C11 H15 N5 O5 | 297.10732 | 7.84 | 91.2
Ψm | C10 H14 N2 O6 | 258.08519 | 5.06 | 85.1

While most of the species identified from urine are as expected, we encountered several cases that challenged this automated data processing workflow. First, if spectral matching (mzVault Search) is not chosen as the first Data Source preference in the “Assign Compound Annotations” node, mis-annotation can occur for identical masses arising from other molecules within the sample, especially if the sample is a biofluid. Second, mis-annotation can happen if the target molecule has positional isomers whose elution times are very close. For example, the nucleoside m3C appears three times under the Name column, at 3.7, 4.2 and 5.5 min, respectively, suggesting other isomers are present in the batch, yet possibly mis-annotated. To verify which isomer or isomers are present, the spectral matches should be compared under the mzVault Results tab in the main window. Figure 4 shows a comparison of the m3C spectra detected in the samples against its library spectrum and versus an m5C library spectrum. Comparison of the library spectra shows both are very similar, each containing primarily the same fragment ions except for the additional fragment ion 109.05595 Da. This similarity is enough for the software to mis-annotate the feature should one positional isomer drift into the elution window of a second. In this analysis, over multiple injections, m3C experienced chromatographic drift, causing it to fall outside of its assigned RT window and resulting in incorrect annotation. This behavior was also observed for 1-methyladenosine and 7-methylguanosine, although it was not as pronounced. As this drifting was not observed for 2’-O-methylcytidine, which elutes after m3C (Supplemental Figure S2), nor for other positional isomers of 1-methyladenosine or 7-methylguanosine (data not shown), nor for any other nucleoside, zwitterionic interactions may be leading to the change in retention time,37 as all three molecules are charged species due to the methylation position on the base.

Figure 4.

Spectral match comparison between m3C and m5C. The zwitterion m3C showed chromatographic drift during analysis causing the software to annotate it as m4C and m5C. By comparing the spectral match in the mzVault tab, the nucleoside can be accurately identified.

Quantitative Analysis.

The focus of this work was to develop an automated data processing workflow which could be applied toward routine batch analysis of RNA nucleosides. Urine samples were chosen for workflow testing as urine is one of the easiest biofluids to acquire. The preliminary results shown here are in general agreement with past studies.36 To illustrate how this workflow can be integrated with quantitative analyses, Figure 5 shows whisker plots of seven nucleosides that were detected in every sample. Of the seven modifications in Figure 5, five are specifically tRNA modifications. Of the five tRNA nucleosides, one, the modification OHyW, is found solely at position 37 of the tRNAPhe, while the nucleoside mcm5s2U is a wobble position modification found in tRNAGLN, tRNALYS, and tRNAGLU in yeast and implicated in translation fidelity.38, 39 Interestingly, the modification ms2t6A was also detected at a relatively constant level, at concentrations similar to mcm5s2U. The nucleoside ms2t6A is a position 37 modification that is also found on tRNALYS, where hypomodification may be linked to type 2 diabetes.40 The modification m2,2,7G is a cap modification associated with the small nuclear RNAs U1-U5, which are responsible for pre-mRNA splicing recognition by the spliceosome,41 and is found in some viral RNA transcripts.42 Finally, the modification m1acp3Ψ, a pseudouridine derivative, is found in rRNA at a single nucleotide.43 The relatively static concentration of these modifications in urine suggests, for this data set, a macroscopic view of RNA homeostasis in the individual. This creates the possibility of monitoring these few, very specific modifications quantitatively for possible early detection of disease.

Figure 5.

Whisker plots of relative abundances of modified nucleosides detected in every sample. From left to right: (1) N6-threonylcarbamoyladenosine (L-t6A) (2) N2,2,7-trimethylguanosine (m2,2,7G), (3) 3-(3-amino-3-carboxypropyl) uridine (acp3U) (4) 2-methylthio-N6-threonylcarbamoyladenosine (ms2t6A), (5) 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U), (6) hydroxywybutosine and (7) 1-methyl-3-(3-amino-3-carboxypropyl) pseudouridine (m1acp3Ψ). Five of the seven occur primarily in tRNA, with m2,2,7G and m1acp3Ψ occurring in mRNA and rRNA respectively.

CONCLUSIONS

This work is designed as a proof of concept of an automated data processing workflow for the accurate detection of post-transcriptional modifications in any sample. The use of an automated detection and identification workflow using accurate mass, spectral matching and neutral loss identification yields high confidence results. Equally important to confidence in identification is the amount of time necessary for post-acquisition data processing. Development of this workflow has reduced the amount of time for processing batch acquisition of nucleoside data to a few hours. Furthermore, this workflow creates the ability to selectively monitor the change of the nucleoside pool, both known and unknown, when a single modification is perturbed through experimentation.

Supplementary Material

SI Material

ACKNOWLEDGMENT

The authors would like to thank Ralf Tautenhahn (Thermo Fisher Scientific) for discussion and advice on creation of this workflow. Financial support for this work is provided by the National Institutes of Health (NIH GM058843) and the National Science Foundation (NSF 1507357). The generous support of the Rieveschl Eminent Scholar Endowment and the University of Cincinnati for these studies is also appreciated.

Footnotes

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website.

Description of steps involved to create the spectral library of modified nucleosides (Figure S1) and evaluation of methylated cytidines (Figure S2).

REFERENCES

  • (1) Dudley E; Lemiere F; Van Dongen W; Esmans E; El-Sharkawi AMM; Games DE; Brenton AG; Newton RP. Urinary modified nucleosides as tumor markers. Nucleosides, Nucleotides and Nucleic Acids 2003, 22 (5–8), 987–989. DOI: 10.1081/NCN-120022719.
  • (2) Patejko M; Struck-Lewicka W; Siluk D; Waszczuk-Jankowska M; Markuszewski MJ. Chapter One - Urinary Nucleosides and Deoxynucleosides. In Advances in Clinical Chemistry, Makowski GS Ed.; Vol. 83; Elsevier, 2018; pp 1–51.
  • (3) Struck W; Siluk D; Yumba-Mpanga A; Markuszewski M; Kaliszan R; Markuszewski MJ. Liquid chromatography tandem mass spectrometry study of urinary nucleosides as potential cancer markers. Journal of Chromatography A 2013, 1283, 122–131. DOI: 10.1016/j.chroma.2013.01.111.
  • (4) Grebe SK; Singh RJ. LC-MS/MS in the Clinical Laboratory - Where to From Here? The Clinical Biochemist Reviews 2011, 32 (1), 5–31.
  • (5) Grützmacher HF. Fragmentation in Mass Spectrometry. In Encyclopedia of Spectroscopy and Spectrometry (Third Edition), Lindon JC, Tranter GE, Koppenaal DW Eds.; Academic Press, 2017; pp 730–740.
  • (6) McLafferty F. Tandem mass spectrometry. Science 1981, 214 (4518), 280–287. DOI: 10.1126/science.7280693.
  • (7) Jiang H-P; Qi C-B; Chu J-M; Yuan B-F; Feng Y-Q. Profiling of cis-Diol-containing Nucleosides and Ribosylated Metabolites by Boronate-affinity Organic-silica Hybrid Monolithic Capillary Liquid Chromatography/Mass Spectrometry. Scientific Reports 2015, 5, 7785. DOI: 10.1038/srep07785.
  • (8) Kuskovsky R; Buj R; Xu P; Hofbauer S; Doan MT; Jiang H; Bostwick A; Mesaros C; Aird KM; Snyder NW. Simultaneous isotope dilution quantification and metabolic tracing of deoxyribonucleotides by liquid chromatography high resolution mass spectrometry. Analytical Biochemistry 2019, 568, 65–72. DOI: 10.1016/j.ab.2018.12.023.
  • (9) Bao X; Wu J; Kim S; Lorusso P; Li J. Pharmacometabolomics Reveals Irinotecan Mechanism of Action in Cancer Patients. Journal of Clinical Pharmacology 2019, 59 (1). DOI: 10.1002/jcph.1275.
  • (10) Boccaletto P; Machnicka MA; Purta E; Piątkowski P; Bagiński B; Wirecki TK; de Crécy-Lagard V; Ross R; Limbach PA; Kotter A; et al. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic Acids Research 2018, 46 (D1), D303–D307. DOI: 10.1093/nar/gkx1030.
  • (11) Cantara WA; Crain PF; Rozenski J; McCloskey JA; Harris KA; Zhang X; Vendeix FA; Fabris D; Agris PF. The RNA Modification Database, RNAMDB: 2011 update. Nucleic Acids Research 2011, 39 (Database issue), D195–201. DOI: 10.1093/nar/gkq1028.
  • (12) Su D; Chan CT; Gu C; Lim KS; Chionh YH; McBee ME; Russell BS; Babu IR; Begley TJ; Dedon PC. Quantitative analysis of ribonucleoside modifications in tRNA by HPLC-coupled mass spectrometry. Nat Protoc 2014, 9 (4), 828–841. DOI: 10.1038/nprot.2014.047.
  • (13) Thüring K; Schmid K; Keller P; Helm M. Analysis of RNA modifications by liquid chromatography–tandem mass spectrometry. Methods 2016, 107, 48–56. DOI: 10.1016/j.ymeth.2016.03.019.
  • (14) Kellner S; Neumann J; Rosenkranz D; Lebedeva S; Ketting RF; Zischler H; Schneider D; Helm M. Profiling of RNA modifications by multiplexed stable isotope labelling. Chemical Communications 2014, 50 (26), 3516–3518. DOI: 10.1039/C3CC49114E.
  • (15) Heiss M; Borland K; Yoluc Y; Kellner S. Quantification of Modified Nucleosides in the Context of NAIL-MS. Methods Mol Biol 2021, 2298, 279–306. DOI: 10.1007/978-1-0716-1374-0_18.
  • (16) Grobe S; Doberenz S; Ferreira K; Krueger J; Brönstrup M; Kaever V; Häussler S. Identification and Quantification of (t)RNA Modifications in Pseudomonas aeruginosa by Liquid Chromatography–Tandem Mass Spectrometry. ChemBioChem 2019, 20 (11), 1430–1437. DOI: 10.1002/cbic.201800741.
  • (17) Yoluc Y; Ammann G; Barraud P; Jora M; Limbach PA; Motorin Y; Marchand V; Tisne C; Borland K; Kellner S. Instrumental analysis of RNA modifications. Crit Rev Biochem Mol Biol 2021, 56 (2), 178–204. DOI: 10.1080/10409238.2021.1887807.
  • (18) Espadas G; Morales-Sanfrutos J; Medina R; Lucas MC; Novoa EM; Sabidó E. High-performance nano-flow liquid chromatography column combined with high- and low-collision energy data-independent acquisition enables targeted and discovery identification of modified ribonucleotides by mass spectrometry. J Chromatogr A 2022, 1665, 462803. DOI: 10.1016/j.chroma.2022.462803.
  • (19) Jora M; Burns AP; Ross RL; Lobue PA; Zhao R; Palumbo CM; Beal PA; Addepalli B; Limbach PA. Differentiating Positional Isomers of Nucleoside Modifications by Higher-Energy Collisional Dissociation Mass Spectrometry (HCD MS). J Am Soc Mass Spectrom 2018, 29 (8), 1745–1756. DOI: 10.1007/s13361-018-1999-6.
  • (20) Jora M; Corcoran D; Parungao GG; Lobue PA; Oliveira LFL; Stan G; Addepalli B; Limbach PA. Higher-Energy Collisional Dissociation Mass Spectral Networks for the Rapid, Semi-automated Characterization of Known and Unknown Ribonucleoside Modifications. Anal Chem 2022, 94 (40), 13958–13967. DOI: 10.1021/acs.analchem.2c03172.
  • (21) Kind T; Tsugawa H; Cajka T; Ma Y; Lai Z; Mehta SS; Wohlgemuth G; Barupal DK; Showalter MR; Arita M; et al. Identification of small molecules using accurate mass MS/MS search. Mass Spectrometry Reviews 2018, 37 (4), 513–532. DOI: 10.1002/mas.21535.
  • (22) Blaženović I; Kind T; Ji J; Fiehn O. Software Tools and Approaches for Compound Identification of LC-MS/MS Data in Metabolomics. Metabolites 2018, 8 (2), 31.
  • (23) Perkins DN; Pappin DJC; Creasy DM; Cottrell JS. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551–3567.
  • (24) Clasquin MF; Melamud E; Rabinowitz JD. LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine. Current Protocols in Bioinformatics 2012, 37 (1), 14.11.11–14.11.23. DOI: 10.1002/0471250953.bi1411s37.
  • (25) Eng JK; Jahan TA; Hoopmann MR. Comet: An open-source MS/MS sequence database search tool. Proteomics 2013, 13 (1), 22–24. DOI: 10.1002/pmic.201200439.
  • (26) Dunn WB; Broadhurst D; Begley P; Zelena E; Francis-McIntyre S; Anderson N; Brown M; Knowles JD; Halsall A; Haselden JN; et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols 2011, 6 (7), 1060–1083. DOI: 10.1038/nprot.2011.335.
  • (27) Ross R; Cao X; Yu N; Limbach PA. Sequence mapping of transfer RNA chemical modifications by liquid chromatography tandem mass spectrometry. Methods 2016, 107, 73–78. DOI: 10.1016/j.ymeth.2016.03.016.
  • (28) Senko MW; Remes PM; Canterbury JD; Mathur R; Song Q; Eliuk SM; Mullen C; Earley L; Hardman M; Blethrow JD; et al. Novel Parallelized Quadrupole/Linear Ion Trap/Orbitrap Tribrid Mass Spectrometer Improving Proteome Coverage and Peptide Identification Rates. Analytical Chemistry 2013, 85 (24), 11710–11714. DOI: 10.1021/ac403115c.
  • (29) Olsen JV; Schwartz JC; Griep-Raming J; Nielsen ML; Damoc E; Denisov E; Lange O; Remes P; Taylor D; Splendore M; et al. A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Molecular & Cellular Proteomics 2009, 8 (12), 2759–2769. DOI: 10.1074/mcp.M900375-MCP200.
  • (30) Zelena E; Dunn WB; Broadhurst D; Francis-McIntyre S; Carroll KM; Begley P; O’Hagan S; Knowles JD; Halsall A; Wilson ID; et al. Development of a Robust and Repeatable UPLC–MS Method for the Long-Term Metabolomic Study of Human Serum. Analytical Chemistry 2009, 81 (4), 1357–1364. DOI: 10.1021/ac8019366.
  • (31) Olsen JV; Macek B; Lange O; Makarov A; Horning S; Mann M. Higher-energy C-trap dissociation for peptide modification analysis. Nature Methods 2007, 4, 709. DOI: 10.1038/nmeth1060.
  • (32) Yu N; Jora M; Solivio B; Thakur P; Acevedo-Rocha CG; Randau L; de Crecy-Lagard V; Addepalli B; Limbach PA. Transfer RNA Modification Profiles and Codon Decoding Strategies in Methanocaldococcus jannaschii. Journal of Bacteriology 2019. DOI: 10.1128/jb.00690-18.
  • (33) Jora M; Borland K; Abernathy S; Zhao R; Kelley M; Kellner S; Addepalli B; Limbach PA. Chemical Amination/Imination of Carbonothiolated Nucleosides During RNA Hydrolysis. Angew Chem Int Ed Engl 2021, 60 (8), 3961–3966. DOI: 10.1002/anie.202010793.
  • (34) You C; Dai X; Wang Y. Position-dependent effects of regioisomeric methylated adenine and guanine ribonucleosides on translation. Nucleic Acids Research 2017, 45 (15), 9059–9067. DOI: 10.1093/nar/gkx515.
  • (35) Jensen SS; Ariza X; Nielsen P; Vilarrasa J; Kirpekar F. Collision-induced dissociation of cytidine and its derivatives. Journal of Mass Spectrometry 2007, 42 (1), 49–57. DOI: 10.1002/jms.1136.
  • (36) He L; Wei X; Ma X; Yin X; Song M; Donninger H; Yaddanapudi K; McClain CJ; Zhang X. Simultaneous Quantification of Nucleosides and Nucleotides from Biological Samples. J Am Soc Mass Spectrom 2019, 30 (6), 987–1000. DOI: 10.1007/s13361-019-02140-7.
  • (37) Agris PF. The importance of being modified: roles of modified nucleosides and Mg2+ in RNA structure and function. Prog Nucleic Acid Res Mol Biol 1996, 53, 79–129. DOI: 10.1016/s0079-6603(08)60143-9.
  • (38) Björk GR; Huang B; Persson OP; Byström AS. A conserved modified wobble nucleoside (mcm5s2U) in lysyl-tRNA is required for viability in yeast. RNA 2007, 13 (8), 1245–1255. DOI: 10.1261/rna.558707.
  • (39) Schaffrath R; Leidel SA. Wobble uridine modifications-a reason to live, a reason to die?! RNA Biol 2017, 14 (9), 1209–1222. DOI: 10.1080/15476286.2017.1295204.
  • (40) Wei FY; Suzuki T; Watanabe S; Kimura S; Kaitsuka T; Fujimura A; Matsui H; Atta M; Michiue H; Fontecave M; et al. Deficit of tRNA(Lys) modification by Cdkal1 causes the development of type 2 diabetes in mice. J Clin Invest 2011, 121 (9), 3598–3608. DOI: 10.1172/JCI58056.
  • (41) McGrail JC; O’Keefe RT. The U1, U2 and U5 snRNAs crosslink to the 5’ exon during yeast pre-mRNA splicing. Nucleic Acids Research 2008, 36 (3), 814–825. DOI: 10.1093/nar/gkm1098.
  • (42) HsuChen CC; Dubin DT. Di-and trimethylated congeners of 7-methylguanine in Sindbis virus mRNA. Nature 1976, 264 (5582), 190–191. DOI: 10.1038/264190a0.
  • (43) Bakin A; Ofengand J. Mapping of the 13 pseudouridine residues in Saccharomyces cerevisiae small subunit ribosomal RNA to nucleotide resolution. Nucleic Acids Research 1995, 23 (16), 3290–3294. DOI: 10.1093/nar/23.16.3290.
