Author manuscript; available in PMC: 2018 Aug 1.
Published in final edited form as: Biochim Biophys Acta. 2017 Mar 2;1862(8):766–770. doi: 10.1016/j.bbalip.2017.02.016

Common cases of improper lipid annotation using high-resolution tandem mass spectrometry data and corresponding limitations in biological interpretation

Jeremy P Koelmel a, Candice Z Ulmer b, Christina M Jones b, Richard A Yost a,c, John A Bowden b,*
PMCID: PMC5584053  NIHMSID: NIHMS868451  PMID: 28263877

1. Introduction

Lipids have a wide range of biological functions enabled by enormous diversity in lipid structure. Combinations of structural motifs (e.g., fatty acids and head group moieties linked to various backbones), linkages (e.g., ester, ether, vinyl ether), and complexes with non-lipid species (e.g., carbohydrates and proteins) result in countless possible structures. The lipidome - the entire collection of individual lipid species in cells, tissues or biofluids - has been estimated to be composed of 1000 to more than 180,000 molecular lipid species [1,2], but many of these species are likely very low in abundance or have not been observed [3]. These estimates do not consider isomeric lipid species with different fatty acyl double-bond positions and configurations (cis or trans), positional isomers (e.g., sn1, sn2), and stereoisomers (R or S). Ekroos et al. determined that the number of phosphatidylcholine (PC) positional isomers in Madin-Darby canine kidney II cells nearly doubled the total number of individual lipid species [4], which highlights the substantial presence of lipid isomers in nature. Furthermore, these lipid isomers can also exhibit a variety of specific biological roles. For example, the acyl position of membrane lipids can impact the enzymatic activity that occurs within cellular membranes [5]. Shinzawa-Itoh et al. found biological specificity of acyl chain double bond configurations; although the mitochondrial inner membrane where bovine cytochrome c oxidase (CcO) acquires its phospholipids contains trans-vaccenate, only cis-vaccenate is incorporated into subunit III of CcO [6]. Researchers have also shown differing roles of individual conjugated linoleic acid (CLA) isomers; while the cis-9,trans-11 isomer has been shown to more broadly inhibit tumorigenesis in vitro, the trans-10,cis-12 isomer has been shown to increase concentrations of human blood lipids, such as triglycerides (TG) and the ratio of LDL to HDL cholesterol, when compared to the cis-9,trans-11 isomer [7–9]. Structural elucidation is vital in ensuring that biological properties are properly associated with the correct lipid species.

Lipidomics, the quantitative measurement of the lipidome, is a massive undertaking owing to the enormous compositional complexity of lipids. Lipid species have generally been measured and identified using tandem mass spectrometry (MS/MS). Since many lipids are isobaric species sharing the same nominal mass [10,11], fragmentation provides more detailed structural information. Liquid chromatography is commonly used to separate lipid classes and isomeric species before mass spectrometric detection to further aid in their structural elucidation and quantification. The advent of high-resolution hybrid mass spectrometers, for example, time-of-flight and orbitrap mass spectrometers [12,13], has allowed for enhanced lipid identification and structural annotation when compared to unit resolution mass spectrometers because of improved specificity, sensitivity, and reproducibility [14,15]. High mass accuracy (often sub-ppm) can narrow the list of possible molecular formulae by providing the isotopic structure detail of precursor lipid ions. The addition of resolved isobaric fragment ions reduces false positive and false negative identifications [16]. While ultra-high performance liquid chromatography high-resolution tandem mass spectrometry (UHPLC-HRMS/MS) provides more accurate identifications than traditional approaches such as those implementing triple quadrupoles, data acquired using UHPLC-HRMS/MS often does not provide sufficient information to characterize all structural details of lipid molecules. Incorrect annotation will lead to erroneous biological interpretation of the data. Therefore, we provide guidelines for annotating lipids, discuss the limitations in biological interpretation of lipid species, and summarize both software and instrumental advancements in lipid identification. Software that applies these guidelines is essential to their implementation and harmonization by the wider lipidomics community.

1.1. Guidelines for lipid annotation

Community-accepted guidelines for lipid annotations [17–19] generated/accepted by the International Lipids Classification and Nomenclature Committee have been implemented and promoted by the LIPID Metabolites And Pathways Strategy (LIPID MAPS) consortium [20–22], and are meant to completely characterize the lipid molecule as shown in Fig. 1. However, conventional tandem mass spectrometric experiments cannot be used to generate all structural information of a given lipid molecule. Therefore, shorthand notation has been proposed to confer only the level of structural detail known based on experimental data [23]. We define the level of structural detail known for a given lipid species as its structural resolution. Moreover, we summarize existing guidelines supplemented by new recommendations to prevent over-reporting of lipid structural resolution and to further encourage the use of a common nomenclature system for lipidomics.

Fig. 1. Annotation of a phosphatidylcholine (PC) species outlining how to annotate each structural detail of glycerophospholipids. The lipid is annotated using LIPID MAPS nomenclature, which is based on the recommendations of the International Union of Pure and Applied Chemistry and the International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) Commission on Biochemical Nomenclature. The [R] configuration is often not indicated in annotation, while [S], being the less common form, is specifically referred to.

1.1.1. Do not annotate lipids using only exact mass

Often researchers entering the lipidomics field will annotate peaks and features based on exact mass only, for example as in Gerspach et al. [24]. A feature is a peak, or group of peaks across numerous samples, represented by a specific m/z and any other measurements, such as a specific retention time if chromatography is used, or drift time if ion mobility is used [25]. Since the lipidome is diverse, with enormous overlap in exact mass, we strongly warn against annotating features using only exact mass, especially for previously uncharacterized sample types. It is important to note that exact mass search engines, such as Metlin and LIPID MAPS, provide lipid matches annotated as fully characterized molecular species, which cannot be elucidated from exact mass alone.
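The ambiguity of exact-mass-only annotation can be illustrated with a minimal sketch. It assumes a toy, hypothetical three-entry library (not taken from Metlin or LIPID MAPS), rounded monoisotopic atomic masses, and an arbitrary 5 ppm tolerance; the function names are illustrative only. PC(33:1) and PE(36:1) share the elemental formula C41H80NO8P, so no mass tolerance, however narrow, can separate them.

```python
# Rounded monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
}

# Toy library (hypothetical): sum-composition names and neutral elemental formulas.
LIBRARY = {
    "PC(33:1)": {"C": 41, "H": 80, "N": 1, "O": 8, "P": 1},
    "PE(36:1)": {"C": 41, "H": 80, "N": 1, "O": 8, "P": 1},
    "PC(34:1)": {"C": 42, "H": 82, "N": 1, "O": 8, "P": 1},
}


def neutral_mass(formula):
    """Monoisotopic neutral mass from an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def match_by_mass(measured, library=LIBRARY, tol_ppm=5.0):
    """Return every library entry whose mass lies within tol_ppm of the measurement."""
    return sorted(
        name
        for name, formula in library.items()
        if abs(ppm_error(measured, neutral_mass(formula))) <= tol_ppm
    )


query = neutral_mass(LIBRARY["PE(36:1)"])
print(match_by_mass(query))  # both isomeric entries are returned
```

Even in this three-entry library a single mass returns two lipid classes; a real library contains many more isomeric candidates per mass.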

1.1.2. Annotate by sum composition when class specific fragmentation is observed

The most basic annotation of lipids is by lipid class and the sum composition of carbons and double bonds in the lipid fatty acyl chains (Table 1). The sum composition annotation is useful in cases where the majority of fragmentation intensity is in class specific fragments. Examples include phosphatidylethanolamine (PE) [M + H]+ (neutral loss (NL) of m/z 141.0191) [26], phosphatidylinositol (PI) [M + NH4]+ (NL of m/z 277.0562) [27], and sulfatide [M − H]− (m/z 96.9601) [28], with these base peaks in the fragmentation specific to the lipid head group. Annotation by lipid class can often lead to false positives if fragments are not specific to only that lipid class. A common case is the incorrect annotation of protonated adducts of sphingomyelin (SM) and phosphatidylcholine (PC) and their lysolipid, oxidized lipid, and ether-linked lipid counterparts using m/z 184.0733, for example as in Jin et al. [29]. Isobaric isotopic peaks of co-eluting SM and PC species will be co-isolated for fragmentation, and hence the lipid class represented by m/z 184.0733 is ambiguous. In this case, identifications should be noted as tentative unless reconstructed ion chromatograms of the PC and SM species within 3 Da do not overlap or fatty acyl constituents are observed.
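As a small illustration of the sum composition level, the sketch below adds up carbons and double bonds over the fatty acyl chains. The function names and the "carbons:double bonds" string format for each chain are assumptions made for this example, not part of any lipidomics software.

```python
def sum_composition(chains):
    """Add carbons and double bonds over chains written as 'carbons:double_bonds'."""
    carbons = 0
    double_bonds = 0
    for chain in chains:
        c, db = chain.split(":")
        carbons += int(c)
        double_bonds += int(db)
    return f"{carbons}:{double_bonds}"


def annotate_by_sum(lipid_class, chains):
    """Sum-composition annotation, e.g. PC plus 16:0 and 18:2 gives PC(34:2)."""
    return f"{lipid_class}({sum_composition(chains)})"


print(annotate_by_sum("PC", ["16:0", "18:2"]))  # PC(34:2)
```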

Table 1. Different degrees of structural resolution.

Structural resolution        Annotation
Carbons and double bonds     PC(34:2)
Fatty acyl constituents      PC(16:0_18:2)
Positional isomers           PC(16:0/18:2)
Double bond position         PC(16:0/18:2(10,12))
Double bond Cis/Trans        PC(16:0/18:2(10E,12Z))
Stereochemistry              PC(16:0/18:2(10E,12Z)[R])

1.1.3. Denote fatty acyl chains only when fatty acyl fragment(s) are observed

Lipids can often be annotated based on fatty acyl constituents (Table 1). Technically, in lipids with two fatty acyl constituents, only one fatty acyl chain is needed for identification, as the other can be deduced using the exact mass of the precursor. This can be a helpful strategy when sn1- and sn2-linked fatty acyl chains have different fragmentation efficiencies, as in PC [M + H]+ adducts [30]. Without assumptions based on biology or specific approaches, identification by fatty acyl chain constituents is often the limit of structural resolution that can be obtained.

1.1.4. Use the underscore to annotate lipid species with unknown positional isomers

Traditional UHPLC-HRMS platforms with tandem MS do not provide information about the double bond position or orientation, stereochemistry, and in many instances, the position of the fatty acyl chain on the glycerol backbone (sn1 or sn2). Lipid identifications in which the positions of the fatty acyl chains are known are indicated by a slash “/”. The underscore “_” was proposed by Liebisch et al. (2013) [31] for instances where there is certainty in the composition of the fatty acyl constituents, but not their placement on the glycerol backbone. Despite the proposed shorthand notation, there has been a mix of annotation styles in the literature. For example, of the lipidomics research articles published in 2017 and found on ScienceDirect (accessed 02/09/2017), 3 articles [29,32,33] incorrectly used “/”, 3 articles [34–36] used “-”, and 1 article [37] used “_” between fatty acyl constituents when positional isomers were not identified. One potential source of confusion in annotating lipids is that current lipid identification software, for example LipidSearch, MS-DIAL [38], LipidBlast [39], and Greazy [40], all employ “/” when fatty acyl positions are not known. To further advance the lipidomics community, all lipid identification software should improve lipid annotation by applying the slash “/” or underscore “_” correctly based on the MS/MS data. Otherwise, an incorrect level of structural detail is assigned to the lipid annotation, giving the user a misleading level of certainty for biomarker discovery, disease etiology studies, and translational science with other omics areas.
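The choice between sum composition, underscore, and slash can be sketched as a simple rule driven by the evidence available. The function and its two boolean flags are hypothetical simplifications for illustration; real software would derive them from the observed MS/MS fragments.

```python
def annotate(lipid_class, chains, acyl_fragments_observed, positions_known=False):
    """Report only the structural resolution supported by the evidence.

    chains: fatty acyl chains as 'carbons:double_bonds' strings (sn1 first if known).
    acyl_fragments_observed: True if fatty acyl fragment(s) were seen in MS/MS.
    positions_known: True only if sn1/sn2 positions were experimentally determined.
    """
    if not acyl_fragments_observed:
        # Only class-specific fragments: fall back to sum composition.
        carbons = sum(int(chain.split(":")[0]) for chain in chains)
        double_bonds = sum(int(chain.split(":")[1]) for chain in chains)
        return f"{lipid_class}({carbons}:{double_bonds})"
    # Slash asserts positional knowledge; underscore does not.
    separator = "/" if positions_known else "_"
    return f"{lipid_class}({separator.join(chains)})"


print(annotate("PC", ["16:0", "18:2"], acyl_fragments_observed=False))  # PC(34:2)
print(annotate("PC", ["16:0", "18:2"], acyl_fragments_observed=True))   # PC(16:0_18:2)
print(annotate("PC", ["16:0", "18:2"], True, positions_known=True))     # PC(16:0/18:2)
```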

1.1.5. Report plasmanyl species using O- and plasmenyl species using P-

Some of the most problematic cases for lipid annotation include plasmenyl and plasmanyl ether-linked species, which are depicted in Fig. 2. One problem is the use of varying annotation styles. Plasmanyl lipid species are often annotated using an “e” or an “O-”, while plasmenyl lipids are often annotated using a lowercase “p” or a capital “P-”. We suggest using “O-” and “P-”, the annotation style used by LIPID MAPS [20,22,31]. Another problem arises because the vinyl ether linkage in plasmenyl species and the ether linkage in plasmanyl species differ only by one degree of unsaturation, leading to differing structures with the same molecular formula. For example, plasmanyl PE(O-16:0/22:6) has exactly the same mass as plasmenyl PE(P-16:0/22:5), and the two cannot be distinguished based on class specific fragments. In this case, we suggest including both annotations by sum composition, for example PE(P-38:5) and PE(O-38:6). In the case of ether-linked PC, the formate adduct will yield an abundant sn2 fatty acyl fragment when fragmented in negative ion mode; the ether-linked PE species can also be distinguished using fragmentation [41]. Hence, the vinyl ether- and ether-linked lipids can be distinguished using fragmentation, although co-elution of plasmenyl and plasmanyl species often occurs, in which case both species should be reported.
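The plasmanyl/plasmenyl ambiguity can be written down as a short rule: an O- species with n double bonds is isomeric with the P- species having n − 1 double bonds, because the vinyl ether double bond is implied by "P-". The sketch below is a hypothetical helper; joining the two candidates with a pipe follows guideline 1.1.6 and is an assumption of this example.

```python
def ether_sum_candidates(lipid_class, carbons, o_double_bonds):
    """List both sum-composition annotations sharing one molecular formula.

    o_double_bonds: total double bonds when the species is written in O- form.
    A plasmenyl (P-) equivalent exists only if at least one double bond is present.
    """
    candidates = []
    if o_double_bonds >= 1:
        candidates.append(f"{lipid_class}(P-{carbons}:{o_double_bonds - 1})")
    candidates.append(f"{lipid_class}(O-{carbons}:{o_double_bonds})")
    return " | ".join(candidates)


print(ether_sum_candidates("PE", 38, 6))  # PE(P-38:5) | PE(O-38:6)
```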

Fig. 2. General structure for plasmanyl and plasmenyl phospholipid species containing a glycerol backbone.

1.1.6. Report all possible lipid candidates for a feature separated by a pipe “|”, not just the top few lipid candidates

TGs are the most common case where co-elution of isomeric species occurs. For example, our laboratory tentatively identified 2607 TG ions ([M + Na]+ and [M + NH4]+) in human plasma across 370 features, meaning that, on average, each feature had 7 co-eluting TGs identified (unpublished data). For one feature at m/z 920.8635 in human plasma, 49 TGs were tentatively identified. TGs are just one example of co-eluting molecules; for the same human plasma analysis in positive ion mode, we found that 40% of features with lipid annotations had at least two co-eluting lipids identified. It is important to note that most software includes only one lipid identification for a given feature in the final report, which is based on the false assumption that there are few instances of co-eluting lipids. Examples of lipids annotated using pipes can be found in Supplementary Table S-4 of Koelmel et al. [42]; for example, the feature at m/z 766.5391 and retention time 7.06 was annotated as PE(18:0_20:5)+H | PE(18:1_20:4)+H | PE(16:0_22:5)+H, with annotations ranked by a score based on the MS/MS spectra.
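A pipe-separated report for one feature can be sketched as a sort followed by a join. The scores below are invented placeholders used only to fix an ordering; they are not the scores from the cited supplementary table, and the function name is hypothetical.

```python
def join_candidates(scored_candidates):
    """Rank all candidates for one feature by score (highest first) and join with pipes."""
    ranked = sorted(scored_candidates, key=lambda pair: pair[1], reverse=True)
    return " | ".join(name for name, _score in ranked)


# Hypothetical MS/MS-based scores for three co-eluting isomers of one feature.
feature_candidates = [
    ("PE(18:1_20:4)+H", 0.7),
    ("PE(18:0_20:5)+H", 0.9),
    ("PE(16:0_22:5)+H", 0.4),
]
print(join_candidates(feature_candidates))
# PE(18:0_20:5)+H | PE(18:1_20:4)+H | PE(16:0_22:5)+H
```

Every candidate is kept in the output; the score only determines the order, so a reader can still see how many co-eluting species contribute to the feature.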

1.1.7. Use comprehensive MS/MS libraries whenever possible

Even when annotations include all lipids identified for a respective feature, co-eluting lipids not contained in that software's libraries may still exist. In this case, biological interpretation will be confounded by multiple uncharacterized lipids or other molecules contributing to a feature's intensity. One potential example is oxidized species, which can overlap with non-oxidized species, but are not contained in most lipidomics software.

1.1.8. Use pre-analytical steps to prevent degradation and interconversion

Pre-analytical steps can also influence the correct annotation of features by affecting the stability and intensity of lipids or leading to interconversion of the lipid species observed. Sample handling and preparation techniques involving homogenization, freeze-thawing, and/or exposure to air or light can result in lipid oxidation or (non)enzymatic degradation or interconversion [43,44]. For example, our studies have shown that not quenching enzymatic activity during sample preparation leads to increased lysophosphatidylcholines (LPCs) (+19.3 ± 1.8%) and decreased phosphatidylcholine precursors (−13.4 ± 2.6%) (unpublished data), likely caused by phospholipase A activity. In this case, stabilization techniques (e.g., heat treatment, additives such as antioxidants, and freeze drying) can be employed, and common byproducts of degradation and interconversion can be measured. For certain lyso-lipids, such as LPCs, acyl migration between the sn1 and sn2 isomers occurs during sample preparation, complicating annotation and quantification [45]. Therefore, sn1 and sn2 isomers of lyso species should be combined and reported as sum composition.
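Combining the sn1 and sn2 isomers of lyso species might look like the sketch below. It assumes, purely for illustration, that the isomers arrive labeled in the style LPC(18:1/0:0) and LPC(0:0/18:1), with "0:0" marking the empty position, and that intensities are simply additive; the function name and input layout are hypothetical.

```python
import re
from collections import defaultdict

# Matches annotations such as LPC(18:1/0:0); nested parentheses are not handled.
ANNOTATION = re.compile(r"^(?P<cls>\w+)\((?P<chains>[\d:/_]+)\)$")


def combine_lyso_isomers(intensities):
    """Sum intensities of sn1/sn2 lyso isomers under one sum-composition name."""
    combined = defaultdict(float)
    for name, intensity in intensities.items():
        match = ANNOTATION.match(name)
        if match is None:
            combined[name] += intensity
            continue
        # Drop the empty position ("0:0") and see whether a single chain remains.
        chains = [c for c in re.split(r"[/_]", match["chains"]) if c != "0:0"]
        if len(chains) == 1:
            combined[f"{match['cls']}({chains[0]})"] += intensity
        else:
            combined[name] += intensity
    return dict(combined)


print(combine_lyso_isomers({
    "LPC(18:1/0:0)": 100.0,
    "LPC(0:0/18:1)": 25.0,
    "PC(16:0_18:2)": 50.0,
}))
```

Species with two fatty acyl chains pass through unchanged, so the helper only collapses the lyso isomers affected by acyl migration.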

1.2. Lipid Annotation: Implications for Biological Interpretation

Currently, biochemical databases, for example, Kyoto Encyclopedia of Genes and Genomes (KEGG), are unable to capture the varying lipid structural resolution established by mass spectrometry. It is important to establish identifiers to query only biological information pertaining to known structural motifs. For example, for PC(16:0_18:1), the ideal case would be an identifier specific to the lipid class and to the fatty acyl constituents, but not the sn1 and sn2 positions, nor the double bond position. There is a general KEGG entry for the phosphatidylcholine class, C00157, but no KEGG entry for specific PCs. In this case, searching KEGG reduces the scope of biological inference to mechanisms general to all PCs. For the Human Metabolome Database (HMDB), identifiers exist for the specific lipid molecule, for example PC(16:0/18:1(9Z)). In this case, biological inferences can be too specific (i.e., based on sn1 and sn2 position) and thus lead to false interpretation of the data. It is important to note that currently, while specific lipid molecules exist in databases such as LIPID MAPS and HMDB, the curated pathways predominantly contain general lipid class biology. Therefore, current biological inference in lipidomics relies either on expertise or solely on lipid class and fatty acid profile-based trends.

Universal chemical identifiers, which can convert a chemical structure into a machine-readable string, and vice versa, such as the widely used International Chemical Identifier (InChI) [46], would be extremely useful for electronic record finding of mass spectrometry-based lipid annotations. The InChI consists of layers, each presenting additional information on the molecular structure. For example, layers signifying cis versus trans double bonds, or chirality of the lipid molecule, can be omitted, in which case any lipid isomers will be found. The current limitation of InChI is that 3 layers are necessary, one of which is absolute bond connectivity; therefore, the position of a double bond and fatty acid on the backbone cannot be left undetermined. Chemical query languages, such as SYBYL [47], could be used, with the possibilities of storing Boolean logic, wild cards (unknown atoms and R-groups), and other functionalities allowing lipid annotations to be stored in a machine-readable string that can cover all the different levels of structural resolution provided by mass spectrometry. While these identifiers exist, they are not widely implemented; in order for them to be useful they should be implemented in annotation software, databases (such as LIPID MAPS), and by chemical manufacturers.
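Omitting the stereochemical layers of an InChI can be sketched with plain string handling. The example assumes that stereo information sits in the layers prefixed /b (double bond), /t, /m, and /s (tetrahedral), and it ignores stereo sublayers nested inside an isotopic layer. The two strings are intended to be the standard InChIs of trans- and cis-2-butene, chosen only because a lipid InChI would be very long; they should be checked against an InChI generator before being relied on, and the truncated result is meant as a lookup key rather than a valid standard InChI.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")


def strip_stereo_layers(inchi):
    """Drop double-bond and tetrahedral stereo layers, keeping the other layers."""
    parts = inchi.split("/")
    head, layers = parts[:2], parts[2:]  # head = version prefix and formula layer
    kept = [layer for layer in layers if layer[:1] not in STEREO_PREFIXES]
    return "/".join(head + kept)


trans_2_butene = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3+"
cis_2_butene = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3-"

# Without the /b layer the two geometric isomers map to the same key.
print(strip_stereo_layers(trans_2_butene) == strip_stereo_layers(cis_2_butene))
```

The connectivity layer (/c) survives the truncation, which mirrors the limitation described above: double bond position and chain placement cannot be left open in this way.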

1.3. Advancements and directions for lipid annotation

1.3.1. Instrumentation

Mass spectrometry techniques have been developed to enable detailed structural resolution of lipids, including fatty acyl positional isomers, double bond position, and double bond cis/trans isomerism (Table 1). Sn1 and sn2 positional isomers can be distinguished in tandem mass spectrometric approaches based on the relative ratios of the fatty acyl fragments. Because these relative ratios can vary between instruments, fragmentation methods, and lipid classes, internal standards characterized by varying ratios of sn1 and sn2 isomers must be used for quantitative approaches. Standards are often impure and hence must first be characterized by measuring the ratio of the fatty acid concentrations after treatment with phospholipase A2, which only removes fatty acids from the sn2 position [4]. In addition, the lack of synthetic lipid standards to represent the diversity of lipid structures prohibits certainty in the intensity of lipid fragments derived from the glycerol backbone [31]. For identification of double bond positions, one promising technique is ozone-induced dissociation (OzID) [48], although specialized equipment for onsite generation of ozone and flow control is needed. In OzID, a traditional tandem mass spectrometric approach to characterize lipids by fatty acyl constituents is followed by the introduction of ozone, which induces fragmentation indicative of double bond positions. Another promising method for identification of sn1 and sn2 fatty acyl positions, double bond positions, and cis and trans isoforms is to apply ion mobility spectrometry (IMS), a rapid and predictable separation device, in tandem with UHPLC-HRMS/MS studies [49]. Currently, these lipid isomers are not often baseline-resolved using IMS, but the resolving power of IMS is expected to increase with technological advances such as structures for lossless ion manipulation (SLIM) [50]. Because IMS is easily combined with various liquid chromatographic and mass spectrometric techniques, it could revolutionize molecular characterization of lipids in the near future.

1.3.2. Software

In untargeted studies, manual annotation of all lipid species is unrealistic; consequently, the lipidomics community relies on software to provide correctly annotated lipid species. Therefore, harmonization in the lipidomics community can only occur when software adopts commonly accepted naming conventions. Currently there is no common annotation method in lipidomics software for UHPLC-HRMS/MS studies. Open source software, for example Greazy [40], MS-DIAL [38], and LipidBlast [39], report species with fatty acyl constituents and positions, regardless of experimentally derived mass spectrometric information. For example, in MS-DIAL, PC(16:0/18:1) will be reported even if only the m/z 184.0733 ion is observed, as long as the modified dot product score is high. In this case, the researcher must manually browse MS/MS spectra to determine the level of structure that can be reported. In software that implements rule-based identification such as LipidSearch (Thermo Scientific), annotation is based on fragments observed, and lipids will be reported either by lipid class (class, total carbons, and total double bonds), or by fatty acyl constituents. For MS/MS based identification, LipidMatch [42] is the only lipid identification software to date that employs all the annotation guidelines presented here, including using pipes “|” for multiple identifications, using “_” when fatty acyl position on the glycerol backbone is unknown, and annotating lipids by total carbons and degrees of unsaturation when only class specific fragments are observed. For exact mass searching, LipidPioneer [51], designed as a Microsoft Excel workbook, is the only template where users can generate exact mass libraries that provide exact masses and adducts for lipid species annotated with the slash “/”, the underscore “_”, or only by class and total carbons and degrees of unsaturation, depending on the user's end use.
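When software reports a slash without positional evidence, the annotation can be relaxed after the fact. The sketch below is a hypothetical post-processing helper, not a feature of any of the tools named above; it rewrites "/" as "_" for simple annotations and deliberately leaves names with nested parentheses, such as PC(16:0/18:1(9Z)), untouched, because those carry further detail that would need separate review.

```python
import re

# Simple annotations only: class name followed by one parenthesized chain list.
SIMPLE_ANNOTATION = re.compile(r"^(?P<cls>[A-Za-z]+)\((?P<chains>[^()]+)\)$")


def relax_positions(annotation):
    """Replace '/' with '_' when sn positions were not experimentally determined."""
    match = SIMPLE_ANNOTATION.match(annotation)
    if match is None:
        return annotation  # unrecognized or more detailed name: leave as reported
    chains = match["chains"].replace("/", "_")
    return f"{match['cls']}({chains})"


print(relax_positions("PC(16:0/18:1)"))  # PC(16:0_18:1)
print(relax_positions("PC(34:1)"))       # PC(34:1)
```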

As annotation of lipid species becomes more accurate, we will continue to advance our understanding of the precise roles of individual lipid species in biological systems, advancing the utility of lipidomics.

Acknowledgments

Southeastern Center for Integrated Metabolomics (SECIM, NIH grant U24 DK097209).

Footnotes

Disclaimer

Certain commercial equipment, instruments, or materials are identified in this paper to specify adequately the experimental procedures. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology; nor does it imply that the materials or equipment identified are necessarily the best for the purpose. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Transparency document

The Transparency document associated with this article can be found in the online version.

References
