Abstract
The success of non-targeted analysis often depends on libraries containing reference mass spectra of known chemical compounds; the mass spectra of unknown compounds are compared to these reference mass spectra, leading to a probable compound identity. Typical calculations include the mean measured values for each ion m/z and intensity with no estimation of the variability of the measurement. This study presents a novel tool for the calculation of the variability of a measured mass spectrum, including the various data parameters that can impact the measured variability. Using perfluorooctanoic acid (PFOA) as the model compound, the variability of measured data-dependent fragmentation mass spectra (ddMS2) was calculated within replicate measurements of a simple solution of PFOA and a complex mixture (house dust extract) containing PFOA. The variability of the measured ddMS2 for PFOA in the solution and house dust extract was similar: the standard deviations about the measured m/z value ranged from m/z 0.00003 to m/z 0.00015, and the standard deviations about the measured relative intensity ranged from 0.0077 to 0.0211 relative intensity units. In addition, the selected parameters for the extraction of ddMS2 from a single analytical run varied between the sample types due to the increased presence of background ions in the house dust extract. Finally, the variability of the ddMS2 spectra for PFOA in both samples was used to calculate a more robust similarity factor, informing the confidence of the identification of unknown compounds.
Introduction
Non-targeted analysis (NTA) is defined as the identification of compounds in a complex mixture without a priori knowledge regarding the chemical composition of the mixture.1–3 These NTA techniques have been applied to a variety of materials that include environmental, biological, and nutritional matrices.2–5 The broad commercial availability of high-resolution accurate-mass mass spectrometers (HRAM-MS) has substantially improved the ability of researchers to identify unknown compounds in complex mixtures when paired with advanced data analytical techniques. These innovations complement the similar use of gas chromatography with electron ionization mass spectrometry (GC-EI-MS), which has been used for decades for the identification of unknown compounds using reference libraries.6,7
An integral component of an NTA technique is the data analysis method that translates raw instrument data into compound identifications.1 The unambiguous identification of a compound requires confirmation with a reference standard, but a probable chemical identity can be achieved through comparison of a measured mass spectrum with a reference mass spectrum.8 Oberacher et al. described fragmentation mass spectral databases as “indispensable tools” for non-targeted analysis.9 Other studies have examined the calculation of similarity between unknown and known fragmentation mass spectra.10 Conventionally, similarity calculations compare the mean measured fragmentation mass spectrum of the unknown compound to the mean measured fragmentation mass spectra of known compounds, which determines the similarity of the paired spectra.10
The calculation of the mean fragmentation mass spectrum from individually measured mass spectra (often collected in a single sample) is typical, but does not estimate the variability associated with the measurement. Mass spectrometry, especially electrospray ionization mass spectrometry, is notorious for signal instability, resulting in variable ionization of the precursor compound.11,12 Variability of fragmentation mass spectra for proteins has been previously reported.13 Without an estimate of measurement variability, comparison of two mass spectra provides a single value that represents spectral similarity, but does not provide statistical insight into the precision of that similarity value.
Analytical methods for targeted, quantitative analysis represent the quality of their measurement results through reported metrics associated with accuracy and precision, typically determined by replication of the measurement. With non-targeted, qualitative analysis, the measurement variability is not as easily communicated, although there is research addressing the use of quality control materials for determining accuracy and precision.4,5,11,14,15 Precision for NTA is often communicated as the whole-method reproducibility of the detection and/or identification of certain compounds in a test sample or quality control material.11 Currently, there is no approach for calculating and reporting the precision of the identification protocol (i.e., library spectral matching). In addition, there have been recent criticisms of NTA methods with regard to the reproducibility of their results.16
The main objectives of this study were to: 1) present a novel tool for the determination of the variability of fragmentation mass spectra, 2) evaluate the parameters that impact the variability calculation, and 3) demonstrate the use of the variability tool for the identification of a compound in a complex mixture. Using perfluorooctanoic acid (PFOA) as the model compound for this study, the variability of the fragmentation mass spectrum in a simple solution and complex environmental sample will be presented. The fragmentation behavior of PFOA has been well studied elsewhere.17 The intent of this study is to encourage future NTA methods to report the variability of the measured fragmentation mass spectra and use this information to better communicate the confidence of their compound identifications.
Materials and Methods
Chemicals
All solvents (methanol, acetonitrile, and water) used for extraction, dilution, and mobile phases were LCMS grade and acquired from Fisher Scientific (Waltham, MA). Formic acid was acquired from Fisher Scientific. NIST Reference Material (RM) 8446 Perfluorinated Carboxylic Acids and Perfluorooctane Sulfonamide in Methanol18 was used as the PFOA standard. RM 8446 contains two different solutions; only the ampoule containing PFOA was used for this study. NIST Standard Reference Material (SRM) 2585 Organic Contaminants in House Dust19 was used as the environmental matrix material.
Sample Preparation
The PFOA solution consisted of a gravimetric dilution of RM 8446 with methanol for a final PFOA mass fraction of 0.74 mg/kg. The solution was lightly shaken to mix and stored at 4 °C until analysis.
The dust extract solutions consisted of four individual extracts of SRM 2585 that were spiked with PFOA. Four 0.5 g subsamples of SRM 2585 were transferred to clean 15 mL polypropylene centrifuge tubes. Then 9 g of methanol was added to each centrifuge tube. The samples were vortexed briefly to mix and then sonicated at 40 °C for 30 min. After sonication, the samples were centrifuged at approximately 3000 g for 15 min. A 4 g aliquot of the supernatant was transferred to a clean 15 mL centrifuge tube. The samples were heated and evaporated to dryness under a stream of nitrogen. The dried residue was reconstituted with 0.4 g of the PFOA solution; the concentration of PFOA in the dust extract was approximately the same as that of the PFOA solution. The reconstituted solutions were vortexed briefly to dissolve the remaining residue and stored at 4 °C until analysis. All samples were transferred to clean 2 mL autosampler vials for analysis.
Instrumental Analysis
Chromatography was performed using an UltiMate 3000 liquid chromatograph (Thermo Fisher Scientific, Waltham, MA) with dual-gradient pump, temperature-controlled autosampler, and temperature-controlled column compartment. All samples were maintained at 7 °C during analysis. A 5 μL sample was drawn from the autosampler vial using the μL-pickup program (with water as the transfer solvent) and injected. Separation was performed using an Agilent Technologies (Santa Clara, CA) Zorbax Eclipse Plus C18 column with dimensions of 2.1 mm inner diameter, 50 mm length, and 1.8 μm particle size, mobile phases consisting of 0.1 % formic acid in water (A) and 0.1 % formic acid in acetonitrile (B), and a flow rate of 0.350 mL/min. The column temperature was maintained at 40 °C. The mobile phase gradient program was as follows: initial composition of 25 % B by volume followed by a linear increase to 100 % B in 10 min, then held at 100 % B for 5 min while increasing the flow rate to 0.5 mL/min for cleaning. The column was then equilibrated for 5 min at the initial composition prior to injection.
Mass spectrometry was performed using a Q-Exactive Hybrid Mass Spectrometer (Thermo Fisher Scientific) with heated electrospray ionization (HESI) interface. The instrument mass accuracy calibration was checked within 48 hours of all analyses as described in the user’s manual. The ionization chamber settings were negative polarity electrospray ionization with spray voltage of −2500 V, capillary temperature of 380 °C, sheath gas and aux gas settings of 60 and 20, respectively, and probe heater temperature of 360 °C. After injection, all flow was diverted to waste for the first 1.5 min and switched back to waste after 9.9 min. MS data were collected using a full scan MS1, followed by a data-dependent MS2 (ddMS2) experiment. The MS1 settings included a scan range of m/z 150 to 1000, resolution of 70 000, automatic gain control (AGC) target of 3e6 ions, and maximum inject time (IT) of 100 ms. The ddMS2 settings were a resolution of 17 500, AGC target of 1e5 ions, and maximum IT of 50 ms. The top 3 ions were selected for fragmentation with an isolation window of m/z 4.0 and stepped normalized collision energy (NCE) of 15, 30, and 45. For the purposes of this experiment, only ions listed in an inclusion list were included in ddMS2 fragmentation (Table S1 in Supporting Information (SI)). Instrument control and preliminary data analysis were performed using Xcalibur software version 2.2 (Thermo Fisher Scientific).
Data Pre-Processing
All Thermo raw data files (.raw) were converted into mzML format using ProteoWizard’s MSConvert program.20 In addition to conversion, MSConvert was used to perform vendor peak-picking (centroiding) and thresholding to m/z signals above 1 intensity unit. All samples were inspected for the presence of PFOA (m/z 412.9664), and the chromatographic peak start and end times were manually selected and recorded for further data processing.
Data Analysis
All data analysis and uncertainty calculations were performed using R programming language (version 4.0.0)21 with RStudio (version 1.3.959).22 All data and scripts used for this analysis are available at https://doi.org/10.18434/mds2-2299.
For the development of a computational tool to estimate the variability of mass spectra, a single analysis of the PFOA solution was used. The MS1 extracted ion chromatogram for PFOA (m/z 412.9664) is shown in Figure 1, with ddMS2 scans indicated across the peak. The retention times of the start and end of the MS1 chromatographic peak were determined by manual inspection of the chromatogram; the results are shown in Table S2. There were 49 ddMS2 scans for the PFOA precursor ion (m/z 412.9664) within the time of the MS1 chromatographic peak.
Figure 1.
Extracted ion chromatogram for PFOA; red circles indicate the retention time of ddMS2 scans for PFOA.
All scans were normalized to either a) the maximum peak height within the mass spectrum (referred to as max normalization) or b) the sum of the peak heights within the mass spectrum (referred to as sum normalization), resulting in intensity values ranging between 0 and 1.0. An initial list of m/z bins was created based on the ddMS2 mass spectrum nearest to the apex of the MS1 chromatographic peak. Then, proceeding to the next nearest ddMS2 scan by scan number (selecting the lower scan number in the case of a tie), the bins were populated based on the mass accuracy of the instrument (in this study, 10 ppm error with a minimum mass error of 0.002 u, where u is the unified atomic mass unit, equivalent to the dalton). If there was no bin for an m/z value, an additional bin was created. A demonstration of this approach is shown in Figure S1.
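The normalization and binning steps described above can be sketched in a few lines. The following Python example is illustrative only and is not the authors' R implementation: the function names are hypothetical, list position is used as a stand-in for scan number, and each peak is compared against the first (seed) member of a bin, which is an assumption about a detail the text does not specify.

```python
def normalize(spectrum, method="sum"):
    """Scale intensities by the summed ('sum') or maximum ('max') peak height."""
    heights = [inten for _, inten in spectrum]
    denom = sum(heights) if method == "sum" else max(heights)
    return [(mz, inten / denom) for mz, inten in spectrum]


def tolerance(mz, ppm=10.0, floor=0.002):
    """Mass tolerance in u: 10 ppm of m/z, but never below 0.002 u."""
    return max(mz * ppm * 1e-6, floor)


def bin_spectra(scans, apex_index):
    """Group peaks from all scans into m/z bins.

    scans: list of spectra (lists of (m/z, intensity)) ordered by scan number.
    apex_index: position of the scan nearest the MS1 peak apex.
    Returns a list of bins; each bin is a list of (m/z, intensity) members.
    """
    # Seed one bin per peak in the apex scan.
    bins = [[peak] for peak in scans[apex_index]]
    # Visit the other scans by distance from the apex; lower index wins ties.
    order = sorted(
        (k for k in range(len(scans)) if k != apex_index),
        key=lambda k: (abs(k - apex_index), k),
    )
    for k in order:
        for mz, inten in scans[k]:
            best = None
            for members in bins:
                seed_mz = members[0][0]  # assumption: compare to the seed peak
                diff = abs(seed_mz - mz)
                if diff <= tolerance(seed_mz) and (best is None or diff < best[0]):
                    best = (diff, members)
            if best is not None:
                best[1].append((mz, inten))
            else:
                bins.append([(mz, inten)])  # no matching bin: create a new one
    return bins
```

With these settings the 0.002 u floor applies below m/z 200, and the 10 ppm term takes over above it (about 0.0041 u at m/z 412.9664).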
Once all ddMS2 mass spectra were binned, the mean values of the binned m/z values and their paired normalized intensity values were calculated; then the standard deviations (for bins containing two or more values) of the respective m/z and normalized intensity values were determined. The total number of values within each bin was calculated to show the observational frequency of the specific m/z value. This resulted in a table containing five columns (mean m/z (mz), standard deviation m/z (mz.u), mean intensity (int), standard deviation of the intensity (int.u), and observational frequency (n)). This table, and the graphical representation of the table, is referred to as a consensus mass spectrum. A representative consensus mass spectrum for the 49 ddMS2 scans for PFOA in the single sample is shown in Figure 2 using both methods for intensity normalization. The tables for the respective consensus mass spectra are listed in Tables S3 and S4. As can be seen in the expanded section of the sum-normalized consensus mass spectrum in the inset graphic in Figure 2, the standard deviations of the m/z and normalized intensity values are represented by error bars.
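As a minimal sketch of how such a five-column table could be assembled from binned peaks, the Python example below uses the column names given in the text (mz, mz.u, int, int.u, n). The function name and input layout are hypothetical, and the use of the sample standard deviation (n − 1 denominator, the behavior of R's `sd`) is an assumption.

```python
from statistics import mean, stdev


def consensus(bins):
    """Summarize m/z bins as consensus rows.

    bins: list of bins, each a list of (m/z, normalized intensity) pairs.
    Returns one dict per bin, sorted by mean m/z.
    """
    rows = []
    for members in bins:
        mzs = [mz for mz, _ in members]
        ints = [inten for _, inten in members]
        n = len(members)
        rows.append({
            "mz": mean(mzs),
            # The standard deviation is only defined for two or more values.
            "mz.u": stdev(mzs) if n >= 2 else None,
            "int": mean(ints),
            "int.u": stdev(ints) if n >= 2 else None,
            "n": n,  # observational frequency
        })
    return sorted(rows, key=lambda row: row["mz"])
```

Bins containing a single value keep their mean but carry no standard deviation, mirroring the "two or more values" condition above.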
Figure 2.
Consensus Mass Spectra for PFOA using sum normalized intensity (A), max normalized intensity (B), and an expanded view of a single ion in the sum normalized intensity (inset in A). The black lines and circles represent the mean m/z and intensity pair. Red error bars represent the standard deviation about the m/z value (x-axis) and the normalized intensity value (y-axis). Sum normalization was used for the rest of this study.
As the intensity normalization function does not impact the m/z values, both types of consensus mass spectra had standard deviations of the binned m/z values that ranged from 0 u to 0.0033 u. The standard deviation for the intensities ranged from 0.0001 to 0.0664 relative intensity units and 0 to 0.1430 relative intensity units for sum and max normalization, respectively. While it is most common to use max normalization to represent mass spectra, setting the base ion intensity to 1.0 for each mass spectrum, this approach misrepresents the variability of the base ion relative to the other ions in the mass spectrum. Sum normalization was used for this study to determine the uncertainty of all fragment ions.
Results
Impact of Selected Parameters on Mass Spectrum Variability
Using the uncertainty estimation tool described above for PFOA in the PFOA solution, there were 49 ddMS2 mass spectra within the MS1 chromatographic peak. As the quadrupole in the mass spectrometer did not have the same resolution as the Orbitrap, and to fragment the precursor ion with its isotopes, the isolation width of the quadrupole was set to m/z 4.0. With the presence of background compounds in the sample, the chromatograph, and the mass spectrometer, it was probable that other compounds with m/z values near the precursor ion (and eluting at the same retention time) could be fragmented simultaneously with the precursor ion. Fragment ions unrelated to the precursor ion of interest would affect the overall variability of the mass spectrum. Therefore, additional tools to remove fragment ions not associated with the precursor ion (referred to as background ions) were developed and evaluated for their impact on the mass spectrum variability.
First, as shown in the tabulated mass spectrum in Table S3, fragment ions could be observed from 2 % (observed in 1 mass spectrum out of 49 mass spectra) to 100 % (observed in 49 mass spectra out of 49 mass spectra). Less frequently observed fragment ions may be due to the fragmentation of background ions. To evaluate the impact of observational frequency from 0 % to 100 % (in bins of 5 %), the distributions of the standard deviations for the m/z values and normalized intensity values were calculated at each observational frequency threshold. The results of this evaluation are shown in Figures 3A-C. A visual explanation of the boxplot distribution is shown in Figure S1. As observed in this figure, observational frequencies at or above 15 % did not have a markedly different distribution about the variability of the m/z value of the mass spectrum. For the normalized intensity values, the distribution changed until approximately 35 %. In Figure 3C, the number of fragment ions observed dropped sharply at 5 %, from 461 fragment ions to 27 fragment ions. At 35 %, the total number of ions in the mass spectrum was 8.
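The observational frequency threshold can be illustrated with a short Python sketch. The names are hypothetical, and treating the threshold as inclusive ("at or above") is an assumption based on the wording above.

```python
def filter_by_frequency(rows, n_scans, min_fraction):
    """Keep consensus rows observed in at least min_fraction of the scans."""
    return [row for row in rows if row["n"] / n_scans >= min_fraction]


def sweep_frequency(rows, n_scans, step_percent=5):
    """Count surviving fragment ions at each threshold from 0 % to 100 %."""
    counts = {}
    for percent in range(0, 101, step_percent):
        kept = filter_by_frequency(rows, n_scans, percent / 100)
        counts[percent] = len(kept)
    return counts
```

For 49 scans, an ion seen once has a frequency of about 2 % and is therefore removed by the first non-zero threshold (5 %), which is consistent with the sharp drop in ion count reported at that setting.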
Figure 3.
Distribution of the standard deviations of the m/z values (A, D, G) and relative intensity (B, E, H) for the measured fragment ions, and number of fragment ions (C, F, I) in the ddMS2 spectrum for PFOA in solution as dependent on observation frequency, chromatographic peak height, and fragment ion correlation limits (shown as the Pearson correlation coefficient), respectively.
The Gaussian distribution of a chromatographic peak represents the transient mass of the compound of interest as it is detected by the mass spectrometer. The largest amount of the compound is detected at the apex of the chromatographic peak. At the edges of the peak, a smaller amount of the compound is present; the associated ddMS2 mass spectra at those edges may be more contaminated with background ions, which affects the variability of the mass spectrum. By limiting the ddMS2 mass spectra to those located at specific points across the chromatographic peak, defined by the fraction of the peak height (from 0 % to 90 % peak height, in bins of 5 %), the effect of their location can be observed. The result of this experiment is shown in Figures 3D-F. Due to the determination of peak height, there were some ddMS2 mass spectra that occurred below 0 % peak height; hence, the number of fragment ions was different in Figure 3F than in Figure 3C. The distribution of the variability of the m/z values was consistent at or above 0 % peak height, although changes in the distribution were observed at 60 % peak height. The distribution of the variability of the normalized intensity values was consistent at or above 0 % peak height, but also changed around 60 % peak height. Similar to the observational frequency, there was a large decrease in the number of fragment ions from 0 % to 5 % peak height.
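One possible way to select ddMS2 scans by peak height fraction is sketched below in Python. The text does not state how peak height was determined, so everything here is an assumption made for illustration: the MS1 intensity at each ddMS2 retention time is linearly interpolated, and the baseline is a straight line between the first and last points of the manually selected peak window. All function names are hypothetical.

```python
from bisect import bisect_left


def interpolate(xs, ys, x):
    """Linearly interpolate y at x from sorted xs (clamped at both ends)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)
    x0, x1, y0, y1 = xs[j - 1], xs[j], ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def height_fraction(rt, xic_rt, xic_int):
    """Baseline-corrected MS1 height at rt as a fraction of the apex height."""
    ends_rt = [xic_rt[0], xic_rt[-1]]
    ends_int = [xic_int[0], xic_int[-1]]

    def baseline(t):
        # Assumed baseline: straight line between the peak start and end.
        return interpolate(ends_rt, ends_int, t)

    apex = max(i - baseline(t) for t, i in zip(xic_rt, xic_int))
    return (interpolate(xic_rt, xic_int, rt) - baseline(rt)) / apex


def select_scans(scan_rts, xic_rt, xic_int, min_fraction):
    """Indices of ddMS2 scans at or above min_fraction of the peak height."""
    return [
        k for k, rt in enumerate(scan_rts)
        if height_fraction(rt, xic_rt, xic_int) >= min_fraction
    ]
```

Under a baseline correction of this kind, scans near the peak edges can fall below 0 % peak height, which is one way the situation described above could arise.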
Finally, an approach to reduce the impact of background ions on the consensus mass spectrum calculation was to identify and remove fragment ions that are not correlated with the precursor ion. This was performed by calculating the correlation (Pearson correlation coefficient) of the intensity of individual fragment ions to the intensity of the precursor ion over the chromatographic peak. Highly correlated ions are likely due to the direct fragmentation of the precursor ion, while background ions would be less correlated. To evaluate the impact of correlation on the variability of the mass spectrum, Pearson correlation coefficients between the intensities of individual fragment ions and the MS1 chromatographic peak were calculated and the distribution of the variability was determined with specific correlation thresholds (from −1.0 to 0.9, in bins of 0.1). No ions had a Pearson correlation coefficient of 1.0 with the MS1 chromatographic peak. The results of this experiment are shown in Figures 3G-I. Because a Pearson correlation coefficient cannot be calculated from a single observation, fragment ions observed fewer than two times were automatically removed. This resulted in the removal of over 400 fragment ions, as seen from the difference in the initial number of fragment ions between Figure 3C and Figure 3I on the leftmost side of the respective plots. Generally, there was no change in the distribution of the variability of the m/z values at any correlation threshold. This was similarly observed for the distribution of the variability of the normalized intensity values, although there was a slight change in the distribution at a correlation of 0.3. A nearly linear trend in the total number of fragment ions related to the correlation limit was observed in Figure 3I. There was no observed difference between using the Pearson or Spearman correlation coefficient calculation, as shown in Figure S4.
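The correlation filter can be sketched as follows in Python. The input layout (each fragment ion paired with the precursor intensity at the same scan) and the function names are hypothetical, and the inclusive threshold is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; None if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def filter_by_correlation(fragments, min_r):
    """Keep fragment ions whose intensity tracks the precursor intensity.

    fragments: dict mapping fragment m/z to a list of
               (fragment intensity, precursor intensity) pairs, one per scan.
    Returns a dict of fragment m/z -> correlation for the retained ions.
    """
    kept = {}
    for mz, pairs in fragments.items():
        if len(pairs) < 2:
            continue  # correlation is undefined for a single observation
        r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if r is not None and r >= min_r:
            kept[mz] = r
    return kept
```

Ions observed only once drop out before any threshold is applied, which matches the automatic removal described above.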
The MS experiment described above, a targeted ddMS2 experiment, is not representative of a true non-targeted method (typically performed without an inclusion list). A non-targeted method may not generate as many ddMS2 mass spectra as this experiment. All the above-described parameters reduce the number of ddMS2 mass spectra available for creating a consensus mass spectrum. When a conservative approach was used, the selected parameter settings for observational frequency, peak height, and correlation were 55 %, 10 %, and 0.4, respectively (shown in Figure 4). At these settings, there were 21 ddMS2 mass spectra present in the PFOA solution; greater than 50 % of the ddMS2 mass spectra were removed. This removal could impact the number of ddMS2 mass spectra available in a true non-targeted method. From the 21 ddMS2 mass spectra, subsets of 2, 3, 4, 5, 7, 10, and 15 mass spectra were randomly selected using the random sampling function within R (function sample). This experiment was performed 1,000 times at each subset size to obtain a distribution of the possible results. The result of this experiment is shown in Figure 5. The standard deviations of the m/z and normalized intensity values for the 1,000 mass spectra at each point were used to display a distribution of the variability. For m/z and normalized intensity values, the distribution was consistent at 7 or more selected ddMS2 mass spectra. It was assumed that more mass spectra included in the calculation of the consensus mass spectra would result in values more representative of the true measurement variability. To determine the measurement variability of a consensus mass spectrum, a minimum of 7 individual ddMS2 mass spectra within a single sample should therefore be used. This is similar to the approach at NIST to use between 5 and 500 individual mass spectra to create a single consensus mass spectrum for the NIST library.23
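The random subsampling experiment can be illustrated with a simplified Python sketch. It is not the authors' R code: it follows a single fragment ion (a list of values, one per ddMS2 scan) rather than whole mass spectra, uses Python's `random.Random.sample` in place of R's `sample`, and the function name and fixed seed are hypothetical choices made for reproducibility.

```python
import random
from statistics import stdev


def subsample_sd(values, k, n_draws=1000, seed=1):
    """Standard deviations of k values drawn without replacement, repeated.

    values: one measurement per ddMS2 scan (e.g., the normalized intensity
            of a single fragment ion in each of the retained scans).
    Returns a list of n_draws standard deviations.
    """
    rng = random.Random(seed)
    return [stdev(rng.sample(values, k)) for _ in range(n_draws)]


def spread_by_subset_size(values, sizes=(2, 3, 4, 5, 7, 10, 15)):
    """Range (max - min) of the subsampled standard deviations per size."""
    spread = {}
    for k in sizes:
        draws = subsample_sd(values, k)
        spread[k] = max(draws) - min(draws)
    return spread
```

Plotting the distribution returned by `subsample_sd` for each subset size would give a picture analogous to Figure 5, with the spread narrowing as more spectra are included.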
Figure 4.
Consensus ddMS2 spectrum for PFOA in the PFOA solution; red error bars represent the standard deviation of the measured m/z and intensity values. The inset figure is an expanded view of a single ion in the spectrum.
Figure 5.
Distribution of the standard deviations of the m/z values (A) and normalized intensity (B) as dependent on the number of ddMS2 spectra used to calculate the consensus mass spectrum.
The parameter selection experiment was applied to a spiked dust extract and the results are shown in Figure S2. In summary, similar trends were observed, although the selected parameter settings were more restrictive due to the greater number of background ions from the complex matrix. The selected parameter settings were determined to be 80 % observational frequency, 10 % peak height, and a correlation coefficient limit of 0.85. The consensus mass spectrum for PFOA in the dust extract, based on the selected parameter settings, is shown in Figure S3.
Estimate of Variability of Mass Spectra Between Samples
Using the selected parameter settings for the PFOA solution, the consensus mass spectra for PFOA in all the solution replicates were calculated. The summary of these samples is shown as distributions of the standard deviations for m/z and normalized intensity values, separately, in Figures 6A-C. Generally, the distributions of the standard deviations for both sets of values were similar, although there were slight differences in a few samples. As observed in Figure 6C, six fragment ions were observed in all consensus mass spectra using the selected parameter settings.
Figure 6.
Distribution of the standard deviations of the m/z values (A,D) and the normalized intensity (B,E), and the number of fragment ions (C,F) over the replicate samples of the PFOA solution (left) and spiked dust extract (right). The x-axis represents the sequence order of the replicates for the 10 PFOA solution replicates (left) and the 40 spiked dust extract replicates (right).
The same experiment was performed for the spiked dust extract samples. The summary is shown in Figures 6D-F. Overall, the distribution of the variability of the m/z values was consistent, with a few samples showing a wider distribution. The distribution of the variability of the normalized intensity values was also consistent, with a few samples showing varied distributions. The number of fragment ions included in the consensus mass spectra varied between samples.
As means and standard deviations are calculated for the m/z and normalized intensity values for each consensus mass spectrum, a combined consensus mass spectrum can be generated by calculating the mean of the mean values and pooling the standard deviations via the root sum of the squares. The observational frequencies of each fragment ion across each mass spectrum were added together. The consensus mass spectra for PFOA within replicates of the PFOA solution and spiked dust extract samples, separately, are shown in Figure 7A.
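A minimal Python sketch of this pooling step for a single fragment ion is given below. It takes the description literally (unweighted mean of the means, square root of the sum of the squared standard deviations, summed frequencies) and assumes that the rows for the same ion have already been matched across replicates; the function name and row layout are hypothetical.

```python
from math import sqrt


def pool_rows(rows):
    """Pool consensus rows for one fragment ion across replicate samples.

    rows: list of dicts with keys mz, mz.u, int, int.u, and n, one per
          replicate, all describing the same fragment ion.
    """
    k = len(rows)
    return {
        "mz": sum(r["mz"] for r in rows) / k,              # mean of means
        "mz.u": sqrt(sum(r["mz.u"] ** 2 for r in rows)),   # root sum of squares
        "int": sum(r["int"] for r in rows) / k,
        "int.u": sqrt(sum(r["int.u"] ** 2 for r in rows)),
        "n": sum(r["n"] for r in rows),                    # summed frequency
    }
```

For example, two replicates with m/z standard deviations of 0.0003 and 0.0004 would pool to 0.0005 under this reading.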
Figure 7.
Comparison of the consensus ddMS2 spectrum (A) for PFOA in the spiked house dust (top, black) and PFOA solution (bottom, red). The distribution of the similarity scores (C) of 10 000 simulated spectra based on these consensus mass spectra is shown with the dot product (DP) and the reverse dot product (RDP). The comparison of the consensus ddMS2 spectrum (B) for PFOA in the spiked house dust with the less frequently (< 50 %) observed fragments removed (top, black) and the PFOA solution (bottom, red). The distribution of the similarity scores (D) of 10 000 simulated spectra based on these consensus mass spectra is shown with the dot product (DP) and the reverse dot product (RDP).
Comparison of Consensus Mass Spectra
To calculate the similarity between two mass spectra, the dot product was calculated between paired m/z-intensity values; the calculation is described in previous literature.10 The weightings for the dot product were 1.0 and 0.5 for m/z and intensity values, respectively. The resulting scores range from 0 to 1.0, with a greater value indicating more similar mass spectra. A visual comparison of the consensus mass spectra for PFOA in the PFOA solution and the spiked dust extract is shown in Figure 7A. The comparison of the mean mass spectra between the PFOA solution and the spiked dust extract resulted in a dot product of 0.5852, with a reverse dot product (PFOA solution to spiked dust extract) of 0.9998. The dot product score was not considered a “good match” based on the previously reported criterion of a dot product greater than or equal to 0.7.24 With the calculation of a consensus mass spectrum with variability, a more comprehensive similarity factor was calculated. A comparison tool was developed that generates a simulated mass spectrum using random number generation based on a normal distribution (rnorm function in R) with the mean and standard deviation values for m/z and normalized intensity. The dot product and reverse dot product are calculated between each pair of simulated mass spectra. This simulation can be performed numerous times to create a distribution of similarity scores.
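The simulated comparison can be sketched in Python as shown below. This is a sketch under stated assumptions rather than the authors' tool: the score is written as a plain cosine of weighted peaks (weight = m/z^1.0 × intensity^0.5), which may differ in form from the published dot product; peaks are paired with a fixed 0.002 u tolerance; the reverse score ignores query peaks that have no partner in the reference; negative simulated intensities are clipped to zero; and `random.gauss` stands in for R's `rnorm`. All function names are hypothetical.

```python
import random
from math import sqrt


def weight(mz, inten, m=1.0, n=0.5):
    """Weighted peak: m/z^m * intensity^n (negative intensities clipped)."""
    return (mz ** m) * (max(inten, 0.0) ** n)


def align(query, reference, tol=0.002):
    """Pair each reference peak with the nearest unused query peak."""
    pairs, used = [], set()
    for r_mz, r_int in reference:
        hit = None
        for j, (q_mz, _) in enumerate(query):
            if j in used or abs(q_mz - r_mz) > tol:
                continue
            if hit is None or abs(q_mz - r_mz) < abs(query[hit][0] - r_mz):
                hit = j
        if hit is None:
            pairs.append((0.0, weight(r_mz, r_int)))
        else:
            used.add(hit)
            pairs.append((weight(*query[hit]), weight(r_mz, r_int)))
    # Query peaks with no partner in the reference.
    extras = [(weight(*q), 0.0) for j, q in enumerate(query) if j not in used]
    return pairs, extras


def cosine(pairs):
    """Cosine similarity of paired weights (0.0 if either side is empty)."""
    num = sum(a * b for a, b in pairs)
    den = sqrt(sum(a * a for a, _ in pairs) * sum(b * b for _, b in pairs))
    return num / den if den > 0 else 0.0


def dot_products(query, reference, tol=0.002):
    """Return (dot product, reverse dot product) for two peak lists."""
    pairs, extras = align(query, reference, tol)
    return cosine(pairs + extras), cosine(pairs)


def simulate(consensus, rng):
    """Draw one spectrum from the consensus means and standard deviations."""
    return [
        (rng.gauss(r["mz"], r["mz.u"]), rng.gauss(r["int"], r["int.u"]))
        for r in consensus
    ]


def similarity_distribution(cons_query, cons_ref, n_sim=10000, seed=1):
    """Dot product and reverse dot product for n_sim simulated pairs."""
    rng = random.Random(seed)
    return [
        dot_products(simulate(cons_query, rng), simulate(cons_ref, rng))
        for _ in range(n_sim)
    ]
```

Summarizing the returned scores (for example, their mean and standard deviation) gives a similarity estimate with an associated spread rather than a single value.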
For this study, the simulated comparison was performed 10 000 times and the distributions of the dot products (comparison of spiked dust extract to PFOA solution) and the reverse dot products (PFOA solution to spiked dust extract) are shown in Figure 7B. The dot product for the comparison ranged from 0.4422 to 0.9917. Within the 10 000 simulated mass spectra, there were instances where the dot product was below 0.7, the threshold for a “good” spectral match.24 The mean dot product was 0.5973 with a standard deviation of 0.0732. As PFOA was spiked into the dust extract, it can be assumed that the ddMS2 mass spectra attributed to precursor ion m/z 412.9664 were due to the fragmentation of PFOA and, ideally, should provide a good spectral match.
Previous studies have examined methods to “clean up” fragmentation mass spectra by looking at fragments that are consistently present across multiple samples.25 While all fragments in the PFOA solution consensus mass spectrum were consistently observed between samples, the frequency of fragment ions in the replicate analyses of the spiked dust extract ranged from 10 to 746 individual observations (summed over all extract samples). The lower frequency observations were indicative of background ions not associated with the precursor ion that were not removed with the selected parameter settings. By removing all fragment ions observed less than 50 % of the time across all replicate samples (visually demonstrated in Table S4), the consensus mass spectrum for PFOA in the spiked dust extract matches that of the PFOA solution, as shown in Figure 7C. For the simulated comparison, the mean dot product was 0.9998 with a standard deviation of 2.5 × 10⁻⁵, an excellent match score. The distribution of the dot product and reverse dot product of this comparison is shown in Figure 7D. By using the pooled mass spectral values and observational frequency, the match score was substantially improved in a complex matrix like house dust.
Conclusions
Libraries containing reference mass spectra continue to be an important tool for NTA techniques for the probable identification of unknown compounds, although, prior to this study, the variability of these measured mass spectra was not included in the identification scheme. The presented study was an approach to determine the variability of a measured mass spectrum, along with tools to pool multiple mass spectra and compare mass spectra that include variability as a key component in the match scoring. While the included R scripts are not the only method that could be used for determining variability of measured mass spectra, the results of this study suggest that this approach can be a powerful tool for compound identification, especially in complex environmental matrices. Additional application of this approach to different compounds and matrices will be performed to evaluate the robustness of the tool. Further exploration of the variability between one or more instruments of the same model, and the possibility of comparing disparate mass spectra (e.g., mass spectra from two different brands of mass spectrometers), is needed.
Footnotes
Disclaimer
Certain commercial equipment, instruments, software, or materials are identified in this communication to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.
Supporting Information. Additional instrumental method and sample information, graphical representation of data analysis workflow, and additional uncertainty estimations are supplied as Supporting Information.
References
- (1).Hollender J; Schymanski EL; Singer HP; Ferguson PL. Nontarget Screening with High Resolution Mass Spectrometry in the Environment: Ready to Go? Environ. Sci. Technol 2017, 51 (20), 11505–11512. 10.1021/acs.est.7b02184.
- (2).Hernández F; Sancho JV; Ibáñez M; Abad E; Portolés T; Mattioli L. Current Use of High-Resolution Mass Spectrometry in the Environmental Sciences. Anal. Bioanal. Chem 2012, 403 (5), 1251–1264. 10.1007/s00216-012-5844-7.
- (3).Hogenboom AC; Niessen WMA; Brinkman UAT. Characterization of Photodegradation Products of Alachlor in Water by On-Line Solid-Phase Extraction Liquid Chromatography Combined with Tandem Mass Spectrometry and Orthogonal-Acceleration Time-of-Flight Mass Spectrometry. Rapid Commun. Mass Spectrom 2000, 14 (20), 1914–1924.
- (4).Du B; Lofton JM; Peter KT; Gipe AD; James CA; McIntyre JK; Scholz NL; Baker JE; Kolodziej EP. Development of Suspect and Non-Target Screening Methods for Detection of Organic Contaminants in Highway Runoff and Fish Tissue with High-Resolution Time-of-Flight Mass Spectrometry. Environ. Sci. Process. Impacts 2017, 19 (9), 1185–1196. 10.1039/C7EM00243B.
- (5).Naz S; Vallejo M; García A; Barbas C. Method Validation Strategies Involved in Non-Targeted Metabolomics. J. Chromatogr. A 2014, 1353, 99–105. 10.1016/j.chroma.2014.04.071.
- (6).Stein S. Mass Spectral Reference Libraries: An Ever-Expanding Resource for Chemical Identification. Anal. Chem 2012, 84 (17), 7274–7282. 10.1021/ac301205z.
- (7).Kind T; Tsugawa H; Cajka T; Ma Y; Lai Z; Mehta SS; Wohlgemuth G; Barupal DK; Showalter MR; Arita M; Fiehn O. Identification of Small Molecules Using Accurate Mass MS/MS Search. Mass Spectrom. Rev 2018, 37 (4), 513–532. 10.1002/mas.21535.
- (8).Schymanski EL; Jeon J; Gulde R; Fenner K; Ruff M; Singer HP; Hollender J. Identifying Small Molecules via High Resolution Mass Spectrometry: Communicating Confidence. Environ. Sci. Technol 2014, 48 (4), 2097–2098. 10.1021/es5002105.
- (9).Oberacher H; Sasse M; Antignac J-P; Guitton Y; Debrauwer L; Jamin EL; Schulze T; Krauss M; Covaci A; Caballero-Casero N; Rousseau K; Damont A; Fenaille F; Lamoree M; Schymanski EL. A European Proposal for Quality Control and Quality Assurance of Tandem Mass Spectral Libraries. Environ. Sci. Eur 2020, 32 (1), 43. 10.1186/s12302-020-00314-9.
- (10).Stein SE; Scott DR. Optimization and Testing of Mass Spectral Library Search Algorithms for Compound Identification. J. Am. Soc. Mass Spectrom 1994, 5 (9), 859–866. 10.1021/jasms.8b00613.
- (11).Ng B; Quinete N; Gardinali PR. Assessing Accuracy, Precision and Selectivity Using Quality Controls for Non-Targeted Analysis. Sci. Total Environ 2020, 713, 136568. 10.1016/j.scitotenv.2020.136568.
- (12).Crews B; Wikoff WR; Patti GJ; Woo H-K; Kalisiak E; Heideker J; Siuzdak G. Variability Analysis of Human Plasma and Cerebral Spinal Fluid Reveals Statistical Significance of Changes in Mass Spectrometry-Based Metabolomics Data. Anal. Chem 2009, 81 (20), 8538–8544. 10.1021/ac9014947.
- (13).Venable JD; Yates JR. Impact of Ion Trap Tandem Mass Spectra Variability on the Identification of Peptides. Anal. Chem 2004, 76 (10), 2928–2937. 10.1021/ac0348219.
- (14).Schymanski EL; Singer HP; Slobodnik J; Ipolyi IM; Oswald P; Krauss M; Schulze T; Haglund P; Letzel T; Grosse S; Thomaidis NS; Bletsou A; Zwiener C; Ibáñez M; Portolés T; de Boer R; Reid MJ; Onghena M; Kunkel U; Schulz W; Guillon A; Noyon N; Leroy G; Bados P; Bogialli S; Stipaničev D; Rostkowski P; Hollender J. Non-Target Screening with High-Resolution Mass Spectrometry: Critical Review Using a Collaborative Trial on Water Analysis. Anal. Bioanal. Chem 2015, 407 (21), 6237–6255. 10.1007/s00216-015-8681-7.
- (15).Begley P; Francis-McIntyre S; Dunn WB; Broadhurst DI; Halsall A; Tseng A; Knowles J; Goodacre R; Kell DB. Development and Performance of a Gas Chromatography−Time-of-Flight Mass Spectrometry Analysis for Large-Scale Nontargeted Metabolomic Studies of Human Serum. Anal. Chem 2009, 81 (16), 7038–7046. 10.1021/ac9011599.
- (16).Hites RA; Jobst KJ. Is Nontargeted Screening Reproducible? Environ. Sci. Technol 2018, 52 (21), 11975–11976. 10.1021/acs.est.8b05671.
- (17).Arsenault G; McAlees A; McCrindle R; Riddell N. Analysis of Perfluoroalkyl Anion Fragmentation Pathways for Perfluoroalkyl Carboxylates and Sulfonates during Liquid Chromatography/Tandem Mass Spectrometry: Evidence for Fluorine Migration Prior to Secondary and Tertiary Fragmentation. Rapid Commun. Mass Spectrom 2007, 21 (23), 3803–3814. 10.1002/rcm.3274.
- (18).National Institute of Standards & Technology. Report of Investigation Reference Material 8446 Perfluorinated Carboxylic Acids and Perfluorooctane Sulfonamide in Methanol; Gaithersburg, MD, 2014.
- (19).National Institute of Standards & Technology. Certificate of Analysis Standard Reference Material 2585 Organic Contaminants in House Dust; Gaithersburg, MD, 2018.
- (20).Chambers MC; Maclean B; Burke R; Amodei D; Ruderman DL; Neumann S; Gatto L; Fischer B; Pratt B; Egertson J; Hoff K; Kessner D; Tasman N; Shulman N; Frewen B; Baker TA; Brusniak M-Y; Paulse C; Creasy D; Flashner L; Kani K; Moulding C; Seymour SL; Nuwaysir LM; Lefebvre B; Kuhlmann F; Roark J; Rainer P; Detlev S; Hemenway T; Huhmer A; Langridge J; Connolly B; Chadick T; Holly K; Eckels J; Deutsch EW; Moritz RL; Katz JE; Agus DB; MacCoss M; Tabb DL; Mallick P. A Cross-Platform Toolkit for Mass Spectrometry and Proteomics. Nat. Biotechnol 2012, 30 (10), 918–920. 10.1038/nbt.2377.
- (21).R Core Team. R: A Language and Environment for Statistical Computing https://www.r-project.org.
- (22).RStudio Team. RStudio: Integrated Development for R http://www.rstudio.com/.
- (23).Yang X; Neta P; Stein SE. Quality Control for Building Libraries from Electrospray Ionization Tandem Mass Spectra. Anal. Chem 2014, 86 (13), 6393–6400. 10.1021/ac500711m.
- (24).Champarnaud E; Hopley C. Evaluation of the Comparability of Spectra Generated Using a Tuning Point Protocol on Twelve Electrospray Ionisation Tandem-in-Space Mass Spectrometers. Rapid Commun. Mass Spectrom 2011, 25 (8), 1001–1007. 10.1002/rcm.4940.
- (25).Lam H; Deutsch EW; Eddes JS; Eng JK; Stein SE; Aebersold R. Building Consensus Spectral Libraries for Peptide Identification in Proteomics. Nat. Methods 2008, 5 (10), 873–875. 10.1038/nmeth.1254.