Abstract
We developed decision rule sets for Lipid Data Analyzer (LDA; http://genome.tugraz.at/lda2), enabling automated and reliable annotation of lipid species and their molecular structures in high-throughput data from chromatography-coupled tandem mass spectrometry. Platform independence was proven in various mass spectrometric experiments, comprising low- and high-resolution instruments and several collision energies. We propose that this independence and the capability to identify novel lipid molecular species render current state-of-the-art lipid libraries now obsolete.
Lipidomics is a rapidly evolving scientific discipline that provides high-throughput data for elucidating lipid structure, metabolism and dynamics at cellular and tissue-level scales1,2. Liquid chromatography-linked tandem mass spectrometry (LC-MS/MS) enables analyses including simultaneous high precision quantitative measurements of hundreds to thousands of lipids in complex mixtures3. Such “profiling” can be carried out at six levels of structural information: (i) the lipid subclass level, (ii) bond type level, (iii) fatty acyl level, (iv) fatty acyl position level, (v) fatty acyl/sphingoid base structure level and (vi) the LIPID MAPS level; the latter adheres to full structural elucidation including double bond location and geometry4. Throughout this paper the term lipid species refers to lipid subclass including bond type level which identifies lipids by numbers of carbons and double bonds of constituent fatty acyl and/or alkyl/1-alkenyl chains (e.g. PI 38:4). The term lipid molecular species corresponds to fatty acyl level (e.g. PI 20:4_18:0) and/or fatty acyl position level (e.g. PI 18:0/20:4), in which structural information such as identification of constituent chains and determination of their respective regio-selectivities at the glycerol backbone is obtained. In these approaches for lipid profiling, automated lipid annotation relies currently on spectral libraries5–7. However, variables such as the type of mass spectrometer, the collision energy applied, the type of adduct ion, and the charge state, all cause substantial variation in the MS/MS spectra of lipid molecular species (Fig. 1).
Figure 1.
Tandem mass spectra of lipid molecular species depend on platform and collision energy. Spectra of deprotonated PI 18:0/20:4 from two platforms and two collision energy settings are shown: (a) Orbitrap Velos Pro, CID mode, 30%, precursor m/z 885.545, damping gas He; (b) 4000 QTRAP, CID mode, 30 eV, precursor m/z 885.93, collision gas N2; (c) Orbitrap Velos Pro, CID mode, 60%, precursor m/z 885.549, damping gas He; (d) 4000 QTRAP, CID mode, 60 eV, precursor m/z 885.85, collision gas N2.
Thus, matching of spectral data to experimentally- or in silico-generated spectral libraries is problematic for the following reasons: (i) it is not possible to detect novel lipid molecular species which are absent from spectral libraries (with novel acyl and/or alkyl/1-alkenyl constituents, or an unusual sn-position); (ii) it is challenging to obtain decisive information from low-abundance signals (e.g. fatty acyl and/or alkyl/1-alkenyl chain fragments from phospholipids in positive ion mode), because the matching algorithms are geared mainly toward higher-intensity signals; (iii) the sn-positions of fatty acyl and/or alkyl/1-alkenyl constituents are extremely difficult to determine, because general matching algorithms are not designed to discriminate the intensity relationships of low-abundance fragments that would reveal stereochemistry; (iv) it is not possible to discriminate between isobaric lipid species and between structural isomers of lipid molecular species; (v) users are, to a certain extent, precluded from setting up their own spectral libraries tailored to their platform because of the impracticality of having to generate thousands of in silico MS/MS spectra for each adduct of each single lipid subclass.
Here we describe a universal and flexible solution to the above limitations by introducing decision rule sets for lipid subclasses/adducts, including an algorithm to apply these rules for identification of lipid species and lipid molecular species. This enables lipid annotation in high-throughput data derived from chromatography-coupled tandem mass spectrometry. The tool, which we call Lipid Data Analyzer (LDA), adapts not only to specific parameters of the various MS platforms, but also to changes in collision energies and to different adduct ions. Consequently, lipid annotation is based on well-defined fragments (fragment rules) and their intensity relationships (intensity rules), allowing for routine profiling of known lipid targets and for detection of novel lipids (Online Methods). As such, the software flexibly accommodates differences in fragmentation behavior. Importantly, the decision rule sets allow identification of fatty acyl and/or alkyl/1-alkenyl constituents and determination of their respective sn-positions at the glycerol backbone (in the case of co-eluting regio-isomers, the assignment is based on the more abundant regio-isomer), even with low-abundance lipid molecular species, as well as the definition of fragments from isobaric/isomeric lipid subclasses for their differentiation.
The basis for the fragment rules is derived from available information about lipid fragmentation8. To gather further evidence supporting the reliability of the fragments and to establish the intensity rules, we conducted three control experiments containing lipid standards of known constituent fatty acyl and/or alkyl/1-alkenyl chains including their respective sn-positions and one biological experiment on the lipidome of murine liver samples. In total, we performed more than 600 LC-MS/MS runs on eight different MS/MS platforms (AB Sciex, Agilent Technologies, Thermo Scientific, and Waters – Supplementary Table 1, Supplementary Note 1), summarized as follows:
In control experiment 1, which included 78 non-isobaric/non-isomeric standard lipids from 14 lipid subclasses (Supplementary Table 2), we generated respective decision rule sets for each lipid subclass/adduct and successfully validated the algorithm in MS/MS spectra. This pertained to the identification of the stereochemistry of lipid molecular species as well.
In control experiment 2, with eight isomeric lipid molecular species (Supplementary Table 3), we verified the ability of the algorithm to discriminate between isomeric species from different lipid subclasses/adducts in MS and MS/MS spectra (Supplementary Table 4).
In control experiment 3, with 16 structural isomers of lipid molecular species originating from different subclasses mixed at various concentrations (Supplementary Table 5), we demonstrated that the algorithm appropriately assigned the respective structural isomers (Supplementary Tables 6 and 7).
The biological experiment with the lipidome of murine liver allowed us to confirm the capability of LDA to deal with complex biological samples. The algorithm clearly identified low-abundance species (Supplementary Fig. 1), isobaric species and structural isomers contained in these samples (see http://www.ebi.ac.uk/metabolights/MTBLS396). This approach also allowed the identification of 109 novel lipid molecular species and 6 novel regio-isomeric species (Supplementary Table 8 and Supplementary Fig. 2); we consider a lipid molecular species as “novel” if it is present in none of the LIPID MAPS Structure Database9, ChEBI10, CyberLipid (http://www.cyberlipid.org), HMDB11, and YMDB12. Details about identified lipid molecular species on the various platforms, including a cross-platform comparison, are given in Supplementary Tables 9 and 10.
We used data from control experiment 1 and the biological experiment (acquired on Orbitrap Velos Pro in CID mode and on 4000 QTRAP with collision energy settings of +50% and -50%, and +45 eV and -45 eV, respectively) to verify our approach and to benchmark the LDA algorithm against the state-of-the-art in silico library LipidBlast7 (Online Methods and Supplementary Note 2). Compared with LipidBlast, LDA typically identified considerably more lipid (molecular) species with higher confidence (Table 1 and Supplementary Tables 11-13). Data at lipid species level revealed that ‘stringent’ LipidBlast conditions identified only a third of the lipid species identified by LDA (n=1041; 97% of 1077 manually identified species). When we used ‘relaxed’ LipidBlast settings, the number of correctly identified lipid species increased at the cost of drastically reduced positive predictive values. More dramatic were the findings at lipid molecular species level, for which LDA identified 2862 (80% of 3567) lipid molecular species (see Table 1), underlining its power to discriminate lipid structural details. In addition to its broader scope to quantitatively analyze lipid molecular species13 even at low abundance (Supplementary Fig. 1), a further important advantage of LDA is the greatly improved detectability of unanticipated fatty acyl and/or alkyl/1-alkenyl combinations (Supplementary Table 8 and Supplementary Fig. 2). Moreover, LDA unambiguously assigned the sn-positions for almost all standards (positive ion mode: 104/110; negative ion mode: 105/105), whereas LipidBlast using ‘relaxed’ settings consistently reported an erroneous positional isomer in addition to the correct species (http://www.ebi.ac.uk/metabolights/MTBLS397). In the case of co-eluting regio-isomeric lipid molecular species, the assignment is based on the more abundant regio-isomer (Supplementary Fig. 3 and 4); chromatographic approaches exist to solve this issue14.
Table 1.
Sensitivity and positive predictive value (PPV) of LDA and LipidBlast (LB) in positive ion mode based on data acquired on Orbitrap Velos Pro in CID mode. LDA outperforms the in silico library approach of LipidBlast with matching factors 450 (stringent) and 10 (relaxed). The lipidome of murine liver samples was determined five times. “Total lipid (molecular) species” in the column headings below represent the sum of all species manually identified in the five MS runs.
| | Total lipid species identified: 1077 | | | Total lipid molecular species identified: 3567 | | |
|---|---|---|---|---|---|---|
| | LDA | LB 450 | LB 10 | LDA | LB 450 | LB 10 |
| Sensitivity (%)a | 97 | 36 | 85 | 80 | 15 | 57 |
| PPV (%)b | 97 | 91 | 70 | 92 | 91 | 58 |

a Sensitivity: percent of total species identified by the software.
b Positive predictive value: percent of correct identifications.
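As a worked check of the two footnote definitions, the following minimal sketch recomputes the LDA sensitivities from the counts quoted in the text (1041 of 1077 lipid species; 2862 of 3567 lipid molecular species). The function names are illustrative and not part of LDA; raw counts for the PPV are not quoted in the text, so only the formula is shown for that metric.

```python
def sensitivity(identified: int, total_manual: int) -> float:
    """Percent of the manually identified species that the software identified."""
    return 100.0 * identified / total_manual


def positive_predictive_value(correct: int, reported: int) -> float:
    """Percent of the software's reported identifications that are correct."""
    return 100.0 * correct / reported


# Counts quoted in the text for LDA (positive ion mode, Orbitrap Velos Pro).
species_level = round(sensitivity(1041, 1077))      # lipid species level
molecular_level = round(sensitivity(2862, 3567))    # lipid molecular species level
print(species_level, molecular_level)
```

Both rounded values agree with the LDA sensitivity row of Table 1.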
Of note, sophisticated software programs for lipid identification in direct infusion (shot-gun) MS have been developed15–18; however, unlike LDA, they do not support chromatography-linked approaches, which are now frequently used in lipidomics19. The LDA approach correctly identifies isobaric and isomeric lipids, and structural isomers, enabling their use as diagnostic markers in routine analyses on the one hand, and as key indicators of healthy versus aberrant metabolism on the other. Owing to the high sensitivity attainable with LDA, information derived from low-abundance fragments (e.g. in positive ion mode) is now made accessible and can be converted into lipid structures. Moreover, the software reports structural annotations based solely on spectral evidence (Supplementary Fig. 5 and 6) and avoids misleading structural overdetermination20.
LDA offers platform independence and utmost flexibility by circumventing the need for experimental and in silico spectral libraries. Indeed, users can easily adapt existing decision rule sets or generate new ones (even for other metabolite classes), as LDA features a graphical user interface for such rule definition that provides direct visual feedback on acquired spectra (Online Methods and Supplementary Fig. 7). Generally, further decision rule set development should be based on measured standards and subsequent validation in pertinent biological settings, but can also be performed on biological data directly, if the lipid subclasses/adducts are sufficiently separated by chromatography.
LDA currently offers highly reliable decision rule sets for various adducts of 14 major lipid subclasses acquired using platforms from multiple major instrument vendors. LDA annotations reflect the level of structural details inherent to the analyzed spectra, while avoiding reporting of unsubstantiated structural details. The simplicity of defining and handling decision rule sets allows for easy application of LDA by bioinformaticians and mass spectrometrists.
Online Methods
We first provide the bioinformatics background, including an explanation of how data are processed by LDA via “decision rule sets”; we then illustrate the versatility of LDA through various application notes.
Decision rule sets
MS/MS spectra of lipids vary greatly, depending on the type of mass spectrometer used, the collision energy applied, the adduct ions and charge state. Taking these factors into account, we developed flexible decision rule sets that enable automated annotation of lipids in the generally accepted format4 at multiple levels of structural detail (Supplementary Fig. 5 and 6).
A decision rule set for a lipid subclass/adduct consists of a [GENERAL] section, pertaining to general lipid subclass information, and of three sections ([HEAD], [CHAINS], and [POSITION]) containing information on structural details. The latter three sections do not apply to subclasses/adducts lacking a head group or chain fragment. In these three sections, fragment rules and intensity rules reflect the pattern of MS/MS spectra, as we exemplify for deprotonated diacylglycerophosphoinositol (PI) in Supplementary Figure 8. This figure demonstrates fragment rules (‘!FRAGMENTS’) consisting of an arbitrary name, a chemical formula (for m/z value calculation), the charge state, the MSn level where the fragment might be observed (‘2’ corresponds to MS/MS), and whether the presence of a fragment is required for positive identification at a certain structural level. Moreover, the parameter ‘formula’ allows for the placeholder ‘$PRECURSOR’ (corresponding to the mass of the precursor) to define neutral losses. ‘$CHAIN’ designates any possible fatty acyl chain, and ‘$ALKYLCHAIN’/‘$ALKENYLCHAIN’ any alkylated/1-alkenylated forms, respectively. Previously defined fragments can be reused (e.g. section ‘[CHAINS]’). The parameter ‘mandatory’ is set to ‘true’ for characteristic fragments, such as the neutral loss of the phosphoethanolamine head fragment (neutral loss of 141 Da) in spectra of protonated diacylglycerophosphoethanolamines (PE). The parameter ‘mandatory’ is set to ‘false’ for fragments observed infrequently. Even though the presence of such fragments is not essential to any annotated structural level, usage of infrequent fragments in intensity rules considerably improves the reliability of annotations. A third option for this parameter is ‘other’, which designates fragments originating from isobaric or isomeric lipid species not belonging to the lipid subclass of the rule set. This is used to discard false positive identifications.
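To make the anatomy of a fragment rule concrete, the sketch below models the five parameters described above and evaluates a ‘$PRECURSOR’-based neutral loss. It is a hypothetical Python representation, not the LDA rule-file syntax or its Java implementation: the class `FragmentRule`, the string form `$PRECURSOR-C2H8NO4P`, and the example precursor m/z are assumptions made for illustration. The only chemistry relied upon is that the phosphoethanolamine neutral loss (C2H8NO4P) has a monoisotopic mass of about 141.019 Da.

```python
import re
from dataclasses import dataclass

# Monoisotopic masses (Da) of the elements needed for this example.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "P": 30.9737620}


def formula_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C2H8NO4P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


@dataclass
class FragmentRule:
    name: str       # arbitrary label
    formula: str    # plain formula, or '$PRECURSOR-<formula>' for a neutral loss
    charge: int     # charge state of the fragment
    ms_level: int   # 2 corresponds to MS/MS
    mandatory: str  # 'true', 'false' or 'other'

    def mz(self, precursor_mz: float) -> float:
        """Expected fragment m/z (hypothetical simplification).

        A neutral loss is subtracted from the precursor m/z; a plain formula is
        treated as the ion's elemental composition, neglecting the electron mass.
        """
        prefix = "$PRECURSOR-"
        if self.formula.startswith(prefix):
            loss = formula_mass(self.formula[len(prefix):])
            return precursor_mz - loss / abs(self.charge)
        return formula_mass(self.formula) / abs(self.charge)


# Neutral loss of the phosphoethanolamine head group from a protonated PE.
pe_head_loss = FragmentRule("NL_PEhead_141", "$PRECURSOR-C2H8NO4P", 1, 2, "true")
print(round(formula_mass("C2H8NO4P"), 3))
print(round(pe_head_loss.mz(718.538), 3))  # 718.538 is an illustrative precursor m/z
```

Keeping the formula as a string with a placeholder mirrors the idea that one rule serves every species of a subclass, because the expected m/z is only resolved once a precursor is known.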
Intensity rules (‘!INTENSITIES’) consist of ‘equation’ parameters representing allowed intensity relationships of fragments, and the parameter ‘mandatory’. The parameter ‘equation’ utilizes any previously defined fragments, including a placeholder called ‘$BASEPEAK’, to define a minimum intensity for fragments. Furthermore, an optional number in square brackets defines the sn-position of the fragment. For the parameter ‘mandatory’, only true or false is allowed. The effect of this parameter depends on the section in which it is used, as will be discussed in the following paragraphs.
The algorithm processes the sections in the aforementioned order. A decision rule set is applied to a consolidated spectrum, i.e., a spectrum consisting of the sum of denoised spectra (Supplementary Note 3) within a detected MS1 peak. Quantification of MS1 peaks and removal of isotopic peaks are performed as described by Hartler et al.13. Moreover, for lipid species verified by reversed-phase LC-MS/MS, LDA offers a non-linear fitting approach to predict retention times of lipid species determined by MS1 only. This allows removal of peaks with implausible retention times (Supplementary Note 4).
Starting with the [HEAD] section, the algorithm calculates the m/z values of the fragments and interrogates the consolidated spectrum for their presence (Supplementary Fig. 9a). When mandatory fragments cannot be detected in the spectrum, the algorithm discards the associated MS1 peak. Otherwise, fragment intensities are checked for compliance with the intensity rules (Supplementary Fig. 9b). Again, if a mandatory intensity rule is not fulfilled in the [HEAD] section, the MS1 identification will be discarded. [HEAD] section rules are the primary check for verification of a lipid subclass/adduct. Note that in cases where subclasses/adducts lack head group specific fragments (e.g. ammoniated triacylglycerols), false positive MS1 identification will be discarded by the spectrum coverage. The spectrum coverage is controlled by an adjustable threshold for the percentage of annotated fragment intensities.
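The order of checks in the [HEAD] section can be sketched as follows. This is a simplified illustration under stated assumptions rather than the LDA implementation: intensity rules are represented as Python callables instead of ‘equation’ strings, fragments are matched with a fixed absolute m/z tolerance, and the coverage is computed from the head fragments only. All names and the example peak list are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Spectrum = List[Tuple[float, float]]  # (m/z, intensity) pairs of a consolidated spectrum
IntensityRule = Tuple[Callable[[Dict[str, float]], bool], bool]  # (rule, mandatory?)


def find_intensity(spectrum: Spectrum, mz: float, tol: float) -> float:
    """Summed intensity of all peaks within +/- tol of the expected m/z (0.0 if none)."""
    return sum(i for m, i in spectrum if abs(m - mz) <= tol)


def check_head(spectrum: Spectrum,
               fragments: Dict[str, Tuple[float, bool]],  # name -> (m/z, mandatory?)
               intensity_rules: List[IntensityRule],
               tol: float = 0.01,
               min_coverage: float = 0.0) -> bool:
    """Return False if the MS1 identification should be discarded."""
    if not spectrum:
        return False
    found = {name: find_intensity(spectrum, mz, tol)
             for name, (mz, _) in fragments.items()}
    # 1. Every mandatory fragment must be present.
    if any(mandatory and found[name] == 0.0
           for name, (_, mandatory) in fragments.items()):
        return False
    # 2. Every mandatory intensity rule must hold; BASEPEAK is the largest intensity.
    values = dict(found, BASEPEAK=max(i for _, i in spectrum))
    if any(mandatory and not rule(values) for rule, mandatory in intensity_rules):
        return False
    # 3. Spectrum coverage: share of total intensity explained by annotated fragments.
    coverage = sum(found.values()) / sum(i for _, i in spectrum)
    return coverage >= min_coverage


# Illustrative peak list and rule; the numbers are not taken from measured data.
spectrum = [(241.01, 500.0), (283.26, 1000.0), (303.23, 300.0)]
fragments = {"PI_head": (241.01, True)}
rules = [(lambda v: v["PI_head"] > 0.05 * v["BASEPEAK"], True)]
print(check_head(spectrum, fragments, rules))
```

Returning early after each failed step mirrors the text: a missing mandatory fragment or an unfulfilled mandatory intensity rule is sufficient to discard the MS1 peak.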
For the [CHAINS] section, the algorithm computes all possible chain combinations pertaining to the total number of carbon atoms and double bonds of the particular lipid species; e.g., PI 18:1_20:3 is appropriate for PI 38:4. The same procedure as for head group is applied for each potential chain (Supplementary Fig. 10 shows an example for PI 38:4 containing a 20:4 residue). The algorithm will typically report chain combinations only if all chains in the combination comply with the decision rules. However, there are subclasses/adducts where acyl- or alkyl/1-alkenyl chains at certain positions show low-abundance fragments only. An example is deprotonated 1-(1Z-alkenyl),2-acylglycerophosphoethanolamine, where the deprotonated alkenyl chain is of extremely low intensity (due to resonance stabilization of the carboxylate anion). For such cases, a parameter is available in section [GENERAL] to allow acceptance of a certain combination with only one verified chain. If low/high abundance chains comply with the rules, the algorithm will advance to the [POSITION] rules.
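The enumeration of candidate chain combinations can be illustrated with a short sketch. The candidate ranges (12 to 24 carbons, up to 6 double bonds) and the function names are assumptions chosen for the example; LDA's actual chain libraries may differ.

```python
from itertools import combinations_with_replacement
from typing import List, Tuple

Chain = Tuple[int, int]  # (number of carbons, number of double bonds)


def chain_combinations(total_c: int, total_db: int, n_chains: int = 2,
                       c_range: Tuple[int, int] = (12, 24),
                       max_db: int = 6) -> List[Tuple[Chain, ...]]:
    """Unordered chain sets whose carbons and double bonds add up to the species level."""
    chains = [(c, d)
              for c in range(c_range[0], c_range[1] + 1)
              for d in range(max_db + 1)]
    return [combo
            for combo in combinations_with_replacement(chains, n_chains)
            if sum(c for c, _ in combo) == total_c
            and sum(d for _, d in combo) == total_db]


def label(subclass: str, combo: Tuple[Chain, ...]) -> str:
    """Fatty acyl level notation, e.g. 'PI 18:0_20:4' (sn-positions not assigned)."""
    return subclass + " " + "_".join(f"{c}:{d}" for c, d in combo)


candidates = chain_combinations(38, 4)
print(label("PI", ((18, 0), (20, 4))) in [label("PI", c) for c in candidates])
```

Each candidate would then be tested against the [CHAINS] fragment and intensity rules; the underscore separator expresses that no sn-position has been assigned at this stage.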
[POSITION] rules consist of intensity comparisons of previously defined fragments (Supplementary Fig. 11a). If mandatory intensity rules are defined, all of them must be fulfilled for sn-position assignment, whereas for optional rules (‘mandatory’ false), a majority already suffices for an assignment (Supplementary Fig. 11b). This approach is preferable, because in some cases the most reliable position information is derived from low-abundance, rare fragments. If these fragments are present, they are decisive by a mandatory intensity rule; otherwise, position assignment is based on less reliable, optional intensity rules.
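One possible reading of this decision logic is sketched below. It assumes that mandatory rules are evaluated only when their fragments are present, that applicable mandatory rules alone decide the outcome, and that a strict majority of optional rules is needed otherwise; these are interpretations of the description above, and the function name is hypothetical.

```python
from typing import List


def position_supported(mandatory_results: List[bool],
                       optional_results: List[bool]) -> bool:
    """Decide whether an sn-position assignment is supported.

    mandatory_results: outcomes of applicable mandatory intensity rules
                       (empty if the decisive fragments were not detected).
    optional_results:  outcomes of optional intensity rules ('mandatory' false).
    """
    if mandatory_results:
        # All mandatory rules must be fulfilled.
        return all(mandatory_results)
    if not optional_results:
        return False
    # Otherwise a strict majority of the optional rules suffices.
    return sum(optional_results) > len(optional_results) / 2


print(position_supported([], [True, True, False]))
```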
The decision rule sets and the algorithm for their interpretation allow for utmost flexibility, such as inclusion of isotopically labeled standards (used in TG rule development – Supplementary Table 2), and even for the detection of co-eluting lipid molecular species, which is encountered frequently (Supplementary Fig. 12). Although the rules use a syntax easily comprehensible for mass spectrometrists, we recognized the need for adapting and extending the existing rules provided thus far, and for generating rules for further lipid subclasses/adducts. Consequently, we implemented a graphical user interface for rule definition which provides direct visual feedback on acquired spectra (Supplementary Fig. 7).
Experiments carried out for rule development and verification
The experimental execution is described in detailed protocols presented in Supplementary Note 1 and Supplementary Table 1. Data obtained are discussed in the main text and shown in Supplementary Tables 2 to 10, and Supplementary Figure 13. No statistical testing was applied. Further information is provided in the Life Sciences Reporting Summary published alongside this paper. Detailed data and further details on results are available online (see “Data availability” at the end of this section).
Application note 1
Collision energies in mass spectrometry are considered optimal for a subclass/adduct when both the head group and chain fragments are equally well represented. Since these energies vary depending on the subclass/adduct, as a tradeoff we selected energies which delivered the best overall result with the platform we used for the given lipidome in control experiment 1.
Application note 2
The basic fragment rules are based on published results8,21. They were adapted and extended by visual inspection of spectra from control experiment 1 and biological data. We determined detectable fragments, identified mandatory fragments, derived intensity rules, and extracted decisive differences for many isobaric/isomeric subclasses/adducts. Further, we determined intensity relationships characteristic for sn-position assignment. Finally, we found novel fragment ion relationships, such as the relative intensity of the sodiated form of a carboxylated chain fragment that allowed for differentiation between 1,2- and 1,3-diacylglycerols at optimal collision energies, and demonstrated the software’s capability to distinguish regio-isomers under certain chromatographic conditions (Supplementary Fig. 4 and Supplementary Table 14).
We defined more than 1,000 decision rule sets for lipid subclasses/adducts for various MS platforms and experimental conditions. These decision rule sets cover the major lipid subclasses and mass spectrometers commonly used today and will serve as a point of entry for investigators unfamiliar with lipid data analysis. The direct visual feedback particularly provides an easy introduction to fragmentation patterns of lipids. Importantly, decision rule sets developed are provided along with software for the algorithm, which can be downloaded from http://genome.tugraz.at/lda2. In addition, raw data, results of detailed analysis, comments about information content that can be derived from the various adduct ions, and suggestions about optimal collision energies for subclasses/adducts are available.
Application note 3
In general, isobars or isomers from different lipid subclasses/adducts are chromatographically separated only slightly, and such separation cannot be judged from MS1 spectra. Thus, we expanded our algorithm to separate MS1 peaks consisting of pairs of isobaric or isomeric subclasses/adducts. The algorithm extracts ion chromatograms from the absolute intensities of distinct fragments belonging solely to one or the other species, and computes the retention time (RT) maxima of either lipid species. A weighted mean (based on abundances of the fragments) is used to estimate the RT of such maxima. If there is at least one MS/MS scan between the maxima, or the maxima are in the vicinity of two different adjacent MS/MS spectra (in the range of 20% of the distance between the spectra), the mean of the two RTs is defined as the position of the split of the MS1 peak. If the RT cannot be determined using absolute intensities, the same procedure is applied to relative intensities. If this also fails, the MS1 peak intensity is distributed according to the intensities of the distinct fragments. However, for isobars/isomers of different subclasses/adducts (e.g. protonated PC and PE), this approach is highly inaccurate and should be avoided, because intensities of the fragments typically do not reflect the MS1 intensities. To verify this, we pooled lipid standards of PC, PE, LPC, and LPE subclasses, where isomeric species can be observed as protonated and sodiated adduct ions, and deliberately worsened chromatography parameters to generate overlapping MS1 peaks (Supplementary Fig. 14). This experiment revealed that a successful peak split primarily depends on the availability of MS/MS scans. Whereas successful splits were frequent for PC/PE species and for platforms with high MS/MS scan rates, less well-separated LPC/LPE species were often left unsplit (Supplementary Table 4). Nonetheless, the presence of either isomer was detected in almost all cases (97%).
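The core of the peak-split decision can be sketched as follows. This simplified illustration implements only the abundance-weighted retention time of the fragment maxima and the first split criterion (at least one MS/MS scan between the two maxima); the 20% vicinity criterion and the fallbacks to relative intensities are omitted. Function names and the example values are hypothetical.

```python
from typing import List, Optional, Tuple

Maximum = Tuple[float, float]  # (retention time of a fragment's maximum, abundance)


def weighted_rt(maxima: List[Maximum]) -> float:
    """Abundance-weighted mean retention time of the distinct fragments of one species."""
    total = sum(abundance for _, abundance in maxima)
    return sum(rt * abundance for rt, abundance in maxima) / total


def split_position(maxima_a: List[Maximum], maxima_b: List[Maximum],
                   msms_scan_rts: List[float]) -> Optional[float]:
    """RT at which to split the MS1 peak, or None if this criterion is not met."""
    rt_a, rt_b = weighted_rt(maxima_a), weighted_rt(maxima_b)
    low, high = sorted((rt_a, rt_b))
    # Require at least one MS/MS scan between the two estimated maxima.
    if any(low < rt < high for rt in msms_scan_rts):
        return (rt_a + rt_b) / 2
    return None


# Illustrative values (minutes, arbitrary abundance units).
species_a = [(10.0, 100.0), (10.2, 300.0)]
species_b = [(10.6, 200.0)]
print(split_position(species_a, species_b, [10.1, 10.4, 10.7]))
```

Because the split depends on MS/MS scans falling between the maxima, a sparse scan list returns `None`, which is in line with the observation that successful splits depend primarily on the availability of MS/MS scans.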
Application note 4
LDA detects and assigns structural isomers of the same lipid subclass/adduct from a shared MS1 peak, whereupon the abundance of the MS1 peak is split according to intensities detected in the MS/MS spectra (Supplementary Fig. 12). To this end we determined detection rate, accuracy, and variability of results obtained from experiments with structural isomers mixed in concentration ratios up to 1:20. As expected, results varied depending on MS ion mode and ionization mode (Supplementary Tables 6 and 7, and Supplementary Fig. 13). Whereas negative ion mode generally produced results with low coefficients of variation, positive ion mode led to results with high coefficients of variation due to low abundance chain fragments. In fact, at higher concentration ratios, chain fragments of low-abundance species are not detectable at all. An interesting showcase for a potential pitfall due to a low MS/MS sampling rate was found for PE acquired on QTOF in positive ion mode. Whereas chromatographically separated isomeric PE 36:4 species produced excellent lipid molecular species ratios, mixed PE 36:2 species produced much higher abundances for PE 18:0/18:2 in comparison to PE 18:1/18:1. The reason is that, in this particular case, the QTOF instrument reported MS/MS spectra only at the end of the MS1 peak; therefore, the slightly earlier eluting PE 18:1/18:1 yielded lower 18:1 chain intensities. Interestingly, chain fragments of some species reflect the true ratios quite well (e.g. PC 34:0 in negative ion mode), while others usually underestimate the true ratio (e.g. PE 36:4). Generally, to derive absolute intensities of pairs of structural isomers from the same MS1 peak, calibration curves are strongly recommended.
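The proportional split of a shared MS1 peak can be written in a few lines. The sketch assumes that the MS/MS evidence for each structural isomer has already been reduced to a single intensity value; how LDA aggregates chain fragment intensities is not specified here, and the names and numbers are illustrative.

```python
from typing import Dict


def split_ms1_abundance(ms1_abundance: float,
                        msms_intensity: Dict[str, float]) -> Dict[str, float]:
    """Distribute an MS1 peak abundance in proportion to per-isomer MS/MS intensities."""
    total = sum(msms_intensity.values())
    if total <= 0:
        raise ValueError("no MS/MS intensity available to split the MS1 peak")
    return {name: ms1_abundance * intensity / total
            for name, intensity in msms_intensity.items()}


# Illustrative numbers only.
shares = split_ms1_abundance(1000.0, {"PE 18:0/18:2": 300.0, "PE 18:1/18:1": 100.0})
print(shares)
```

As noted above, such shares reflect fragment intensities rather than true concentrations, which is why calibration curves are recommended when absolute values are needed.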
Application note 5
In a benchmark test of LDA versus LipidBlast7, we used data from both the first control experiment and the biological experiment, both acquired on Orbitrap Velos Pro in CID +50% and -50%, and on 4000 QTRAP +45 eV and -45 eV, respectively. For LipidBlast evaluation, we used the recommended MSPepSearchGUI (http://peptide.nist.gov/software/ms_pep_search_gui/MSPepSearch.html). The same m/z tolerances were applied in both LDA and LipidBlast. The specificity and sensitivity of LipidBlast depend on a so-called matching factor22, a value ranging from 0 to 999. Using the default setting of 450 for the matching factor, many lipid standards in control experiment 1 were not detected. Consequently, the matching factor was lowered to 10, in which case LipidBlast detected almost all of the lipid standards in negative ion mode. Further reduction did not improve the sensitivity of LipidBlast. In positive ion mode, irrespective of the matching factor setting, LipidBlast was not able to identify as many lipid molecular species as LDA. Details about the LipidBlast parameters are given in Supplementary Note 2. In this benchmark, we used only lipid subclasses/adducts that both LDA and LipidBlast are able to detect. Correct assignment of lipid species and lipid molecular species identified in liver lipidomes was verified by manual inspection of the spectra, and by aligning them with the respective retention time data23.
Code availability and technical details
The algorithm presented is embedded in the Java software package LDA (version 2.5.2), which performs MS1 peak deconvolution13 and supports several operating systems such as Windows, MacOS, Linux and other Unix-based systems. Calculations were performed on a desktop PC running 64-bit Windows 7, equipped with an Intel Core i7-2600 CPU at 3.4 GHz and 16 GB RAM. Decision rule sets were tested on the following MS/MS platforms: 4000 QTRAP and QTRAP 6500 from AB Sciex; G6550A QTOF from Agilent Technologies; Orbitrap Elite, Orbitrap Velos Pro in CID and HCD mode, and Q Exactive from Thermo Fisher Scientific; SYNAPT G1 HDMS QTOF from Waters. The primary raw data format is mzXML24; however, the software allows for direct processing of vendor formats from AB Sciex, Agilent Technologies, Bruker Daltonics and Thermo Fisher Scientific by an integrated version of msConvert25, as we obtained permission for redistribution of vendor-provided libraries from the respective mass spectrometer manufacturers. For Waters “.raw” directories, installation of Mass++ (http://masspp.jp) is required. LDA including the decision rule sets is freely available from http://genome.tugraz.at/lda2. The source code is released under a GNU GPL v3 license and is available from https://github.com/ThallingerLab/LDA2/releases/tag/2.5.2.
Data availability
Data and analysis of results from control experiments 1-3, the biological experiment, LipidBlast benchmarking, HCD characterization, and detection of regio-isomers are available from the MetaboLights26 repository with accession numbers MTBLS394 (Control experiment 1), MTBLS391 (Control experiment 2), MTBLS398 (Control experiment 3), MTBLS396 (Biological experiment), MTBLS397 (Benchmarking), and MTBLS462 (HCD characterization and regio-isomers), respectively. Raw data and results are also available from the authors’ website (http://genome.tugraz.at/lda2). Detailed data documentation can be found in Supplementary Note 5.
Acknowledgements
Support by the Austrian Science Fund (FWF Project Grant P26148 to G.G.T.) and the Austrian Ministry for Science, Research and Economy (HSRSM Grant Omics Center Graz, BioTechMed-Graz to G.G.T.) is gratefully acknowledged. M.J.O.W. and Q.Z. were funded by the BBSRC (UK; Grant BBS/E/B/000C0415). We thank R. Salek for his extensive help with the MetaboLights upload. Furthermore, we thank AB Sciex, Agilent Technologies, Bruker Daltonics and Thermo Fisher Scientific for providing permission to distribute the WiffReader SDK, the MassHunter DAC, the CompassXtract, and the MSFileReader libraries in the software.
Footnotes
Author Contributions
J.H., A.T., M.T., F.S., G.H., H.C.K. and G.G.T. designed the study. J.H., A.T., M.T., G.N.R., and H.C.K. designed the experiments. K.A.Z. and G.H. provided the biological samples. A.T., M.T., G.N.R., F.T., A.C., M.R.W., A.F., C.E.W., A.M.A., O.Q., Q.Z., and M.J.O.W. designed and performed the mass spectrometric experiments. J.H. and A.Z. implemented the algorithm and the software. J.H., A.T., O.A.Z. and H.C.K. developed the decision rule sets. J.H. and A.T. benchmarked the algorithm in comparison to LipidBlast and prepared the spectral evidence for the novel species. J.H. and H.C.K. prepared and uploaded the data in MetaboLights. J.H., A.T., F.S., and G.G.T. wrote the manuscript in cooperation with all contributing authors.
Competing Financial Interests
The authors declare no competing financial interests.
References
- 1. Wenk MR. Cell. 2010;143:888–895. doi: 10.1016/j.cell.2010.11.033.
- 2. Quehenberger O, Dennis EA. N Engl J Med. 2011;365:1812–1823. doi: 10.1056/NEJMra1104901.
- 3. Dove A. Science. 2015;347:788–790.
- 4. Liebisch G, et al. J Lipid Res. 2013;54:1523–1530. doi: 10.1194/jlr.M033506.
- 5. Song H, Hsu FF, Ladenson J, Turk J. J Am Soc Mass Spectrom. 2007;18:1848–1858. doi: 10.1016/j.jasms.2007.07.023.
- 6. Taguchi R, Ishikawa M. J Chromatogr A. 2010;1217:4229–4239. doi: 10.1016/j.chroma.2010.04.034.
- 7. Kind T, et al. Nat Methods. 2013;10:755–758. doi: 10.1038/nmeth.2551.
- 8. Hsu FF, Turk J. J Chromatogr B Analyt Technol Biomed Life Sci. 2009;877:2673–2695. doi: 10.1016/j.jchromb.2009.02.033.
- 9. Sud M, et al. Nucleic Acids Res. 2007;35:D527–D532. doi: 10.1093/nar/gkl838.
- 10. Degtyarenko K, et al. Nucleic Acids Res. 2008;36:D344–D350.
- 11. Wishart DS, et al. Nucleic Acids Res. 2009;37:D603–D610.
- 12. Jewison T, et al. Nucleic Acids Res. 2012;40:D815–D820. doi: 10.1093/nar/gkr916.
- 13. Hartler J, et al. Bioinformatics. 2011;27:572–577. doi: 10.1093/bioinformatics/btq699.
- 14. Holcapek M, Jandera P, Zderadicka P, Hrubá L. J Chromatogr A. 2003;1010:195–215. doi: 10.1016/s0021-9673(03)01030-6.
- 15. Han X, Yang K, Gross RW. Mass Spectrom Rev. 2012;31:134–178. doi: 10.1002/mas.20342.
- 16. Herzog R, et al. PLoS One. 2012;7:e29851. doi: 10.1371/journal.pone.0029851.
- 17. Yang K, Cheng H, Gross RW, Han X. Anal Chem. 2009;81:4356–4368. doi: 10.1021/ac900241u.
- 18. Husen P, et al. PLoS One. 2013;8:e79736. doi: 10.1371/journal.pone.0079736.
- 19. Cajka T, Fiehn O. Trends Analyt Chem. 2014;61:192–206. doi: 10.1016/j.trac.2014.04.017.
- 20. Liebisch G, Ejsing CS, Ekroos K. Clin Chem. 2015;61:1542–1544. doi: 10.1373/clinchem.2015.244830.
- 21. Murphy RC, Axelsen PH. Mass Spectrom Rev. 2011;30:579–599. doi: 10.1002/mas.20284.
- 22. Stein SE, Scott DR. J Am Soc Mass Spectrom. 1994;5:859–866. doi: 10.1016/1044-0305(94)87009-8.
- 23. Fauland A, et al. J Lipid Res. 2011;52:2314–2322. doi: 10.1194/jlr.D016550.
- 24. Pedrioli PG, et al. Nat Biotechnol. 2004;22:1459–1466. doi: 10.1038/nbt1031.
- 25. Chambers MC, et al. Nat Biotechnol. 2012;30:918–920. doi: 10.1038/nbt.2377.
- 26. Haug K, et al. Nucleic Acids Res. 2013;41:D781–D786. doi: 10.1093/nar/gks1004.