Science Advances. 2025 Sep 5;11(36):eadw2799. doi: 10.1126/sciadv.adw2799

Machine learning– and multilayer molecular network–assisted screening hunts fentanyl compounds

Changzhi Shi 1,2, Wanli Li 1, Yang Wang 1,3, Xi Chen 1, Meixiang Yu 4, Hai Zhang 4, Zecang You 2, Maoyong Song 5, Xiaojun Deng 1,3,*, Mingliang Fang 2,*
PMCID: PMC12412648  PMID: 40911666

Abstract

Fentanyl and its analogs are a global concern, making their accurate identification essential for public health. Here, we introduce Fentanyl-Hunter, a screening platform that uses a machine learning classifier and multilayer molecular network to select and annotate fentanyl compounds using mass spectrometry (MS). Our classification model, based on 772 fentanyl spectra and spectral binning feature engineering, achieved an F1 score of 0.868 ± 0.02. The multilayer network, based on spectral similarity and paired mass distances, covers more than 87% of known fentanyls. Fentanyl-Hunter identified fentanyl members in biological and environmental samples. During biotransformation, 35 metabolites from four widely consumed fentanyl derivatives were identified. Norfentanyl was the major fentanyl compound in wastewater. Retrospective screening of these biomarkers across more than 605,000 MS files in public datasets revealed fentanyl, sufentanil, norfentanyl, or remifentanil acid in more than 250 samples from eight major countries, indicating the potential widespread presence of fentanyl.


The Fentanyl-Hunter platform uses machine learning and molecular networking to identify fentanyl compounds.

INTRODUCTION

Fentanyl, a widely used synthetic opioid, can be fatal even at low exposure levels. In recent years, the illegal abuse of fentanyl has become increasingly prominent, evolving into a serious social issue (1). Approximately 61 million people worldwide were engaged in non–medical opioid use in 2022 (2). In Europe, fentanyl and its metabolites have been detected in the influent wastewater of 10 of 12 cities monitored (3). In 2023, fentanyl abuse accounted for ~75,000 deaths in the US, making up nearly 70% of all drug overdose fatalities (4). Aligned with the fourth wave of the drug overdose crisis, more than 1400 fentanyl analogs with similar or even greater potency have been synthesized. They are gradually becoming prevalent in the drug market (5, 6). These previously unidentified fentanyls were designed to evade law enforcement and bypass analytical detection, posing challenges for global regulation and control.

Over the past decade, high-resolution mass spectrometry (HRMS) and nontargeted analysis (NTA) have demonstrated the capability to identify fentanyl with high throughput and high confidence (7). Liquid chromatography (LC)-quadrupole time-of-flight mass spectrometry combined with diagnostic product ion–based suspect screening analysis (SSA) allows for the identification of fentanyls in biological samples (8). In a recent analysis of 100 hair samples taken from opiate users, the NTA approach identified fentanyl analogs β-hydroxyfentanyl and methoxyacetylfentanyl, which were not on the panel of target analytes (9). The number of previously unknown fentanyl analogs is constantly increasing; however, most have a low detection rate. As structural modifications become more complex, rapid screening of fentanyls is becoming increasingly challenging. In addition, the metabolism of fentanyl by the human body introduces further complexity, and monitoring only parent compounds may not accurately reflect human exposure (10). While stable isotopes and NTA in metabolomics have successfully identified biotransformation products, these methods are costly and time-intensive and require the accurate structure of the parent compound (11, 12). Little is known about the metabolites of the fentanyl analogs (13). These findings demonstrate the need to develop an NTA platform to identify known and unknown fentanyl family members among thousands of features in complex samples from athletes suspected of doping, intoxication cases, drug abusers, and polluted environments.

The large-scale and unambiguous annotation of unknowns remains a challenging task in NTA (14, 15). The widely used strategy in the identification of knowns is to match the experimental tandem mass spectrometry (MS2) spectrum with those from the spectral library (16). However, numerous transformation products are absent from the spectral and structural databases. In addition, only a small number of fentanyls have data available for training biotransformation prediction tools (17). Molecular networking is a promising data-driven approach for finding and identifying unknown compounds, including antimicrobial agents and pharmaceutical transformation products, within the same subnetwork based on MS2 spectral similarity (18, 19). Alternatively, potential biological or abiotic pathways can be inferred by calculating mass changes between known structures during transformation, aiding in the identification of unknown compounds through a knowledge-guided multilayer network (20). Therefore, a platform designed to identify potential fentanyl metabolites accurately must be supported by a comprehensive spectral database of known fentanyl compounds, supplemented with multiple layers of orthogonal information.

In this study, we introduce a rapid, accurate, and comprehensive screening platform for fentanyl and its transformation products, named Fentanyl-Hunter, which outperforms previous analytical workflows. The platform contains a fentanyl-only MS feature filter (Fentanyl_Finder) based on machine learning and a fentanyl family identification module (Fentanyl_ID) assisted by a multilayer network. The screened features were annotated not only by matching chemical standards or known spectra used as seeds but also by identifying neighboring compounds in the multilayer network. We demonstrated method generalization in screening for metabolites and identified 27 unknowns for four fentanyl compounds (fentanyl, remifentanil, sufentanil, and alfentanil) in an in vitro enzymatic system and human urine samples. We also applied Fentanyl-Hunter to real influent wastewater samples, identifying a fentanyl compound. The global fentanyl exposure assessment was further conducted using the Mass Spectrometry Search Tool (MASST) and Fentanyl-Hunter. Our results show that Fentanyl-Hunter is effective and suggest that an unknown fentanyl family is as widespread as the legacy fentanyl globally.

RESULTS

Overall workflow for Fentanyl-Hunter

Fentanyl-Hunter is a comprehensive platform designed to identify fentanyl-like molecules on the basis of their MS1 and MS2 spectra. It includes a machine learning model for screening fentanyl compounds (Fentanyl_Finder) and a multilayer molecular network–assisted structure annotation tool (Fentanyl_ID). Figure 1 illustrates the schematic workflow of Fentanyl-Hunter and its applications in samples. Unlike traditional global molecular annotation in bioinformatics platforms, our strategy prioritizes screening for MS features of fentanyl-like compounds in Fentanyl_Finder. For the development of Fentanyl_Finder, both fentanyl and nonfentanyl spectra were processed following spectral purification, data augmentation, and spectral selection. Different spectral feature engineering methods and machine learning models were compared to select the optimal combination and ensure the generalization capability of the model for detecting emerging fentanyl compounds and their complex metabolites. Magnetic solid-phase extraction (MSPE) was used for sample handling in the analysis of low-concentration samples, such as human urine and wastewater.

Fig. 1. Overall workflow constructing the Fentanyl-Hunter method and its application for comprehensive profiling and annotation of the fentanyl family.

The workflow mainly includes (i) the training phase of the MS feature classifier for fentanyl, (ii) importing sample MS data for the fentanyl feature filter, (iii) seed fentanyl annotation, (iv) identifying the structure of neighbor fentanyl assisted by the multilayer network, and (v) confidence level assignment.

Within the workflow, the Fentanyl_Finder module filters nonfentanyl features and provides only fentanyl features for Fentanyl_ID to identify structures. Fentanyl_ID began with a positive MS peak table containing the peaks’ mass/charge ratio (m/z), retention time (RT), intensity, and MS2 spectra. First, the sample MS data were compared to a home-made fentanyl spectral library to annotate the batch of seeds. On the basis of the fundamental understanding that previously unknown fentanyl analogs are derived from specific structural modifications of existing fentanyl compounds, their structures represented by MS2 spectra are similar to those of the original fentanyl (21, 22). The same applies to both the parent compounds and their metabolites (23). The fentanyl structures were annotated using a multilayer molecular network, representing the annotation process from seeds (known fentanyl) to neighbors (suspect transformation products or unknowns). The edges between the fentanyl candidate features (nodes) were developed as reaction pairs on the basis of the spectral similarities and mass distances. A graphical user interface was also developed to enhance usability (fig. S1).

Developing the Fentanyl_Finder component

We developed a fentanyl-only MS feature filter using a self-developed random forest (RF) model that categorizes fentanyl and excludes nonfentanyl compounds. Both fentanyl and nonfentanyl high-resolution MS2 spectra were collected for model training. Fentanyl spectra were obtained from in-house generation and public spectral libraries. We generated 96 spectra with different collision energies for 24 unique fentanyl standards (Fig. 2A and table S1). In addition, 676 MS2 spectra for 279 unique fentanyl molecules were collected from HighResNPS (24), MoNA, and NIST (fig. S2). These spectra were also compiled into a spectral library in the msp format and used to identify seed molecules in the Fentanyl_ID module.

Fig. 2. Development and performance of the Fentanyl_Finder module.

(A) Extracted ion chromatograms for fentanyl standards used in the model training set. The corresponding labels are detailed in table S1. (B) The number of fentanyl compounds and spectra in the model training set varied following spectral acquisition, library collection, data augmentation, and cleaning. (C) Performance comparison results of different feature engineering. Mass binning outperforms other feature extraction strategies. ROC AUC, receiver-operating characteristic area under the curve. (D) Model comparison results of two nested cross-validation experiments. RF outperforms other machine learning models (denoted with an asterisk). (E) Importance of the representative features (top 10) and the SHAP values in the Fentanyl_Finder model. (F) Confusion matrices for the classification of fentanyl and nonfentanyl features in the two fentanyl compounds added to the urine matrix.

Known fentanyl represents a large chemical class, with 534 entries in the home-made fentanyl structure database, including known structures from the Cayman Chemical, EPA, SWGDRUG, and HighResNPS databases. Although accessible databases provide only a limited number of standard spectra, chemical space visualization between the two datasets demonstrated a comprehensive coverage of structural diversity within our library (fig. S3) (25). A data augmentation strategy from a previous study (26) was adjusted to increase the size of the fentanyl spectral pool and ensure more robust model training (Fig. 2B). Four different intensity threshold cutoffs (80, 70, 60, and 50% of the maximum intensity) were applied to remove fragment ions with MS signals lower than the cutoffs (fig. S4). This process generates up to five spectra from a single original spectrum, which are then added to the spectral pool. The number of fragments in each spectrum was then reviewed, and spectra with fewer than three fragments were removed, resulting in a final set of 1873 spectra. The data augmentation and quality check strategies were created and applied to minimize overfitting caused by the limited number of standard fentanyl MS2 spectra. Similarly, two types of nonfentanyl MS2 spectra were collected. The first type included 2954 high-entropy quality spectra (27) from public spectral libraries for urinary compounds in Exposome-Explorer (28) that were not classified as “fentanyl” according to chemical classification using ClassyFire (29). The second type included 1486 in-house generated metabolites’ spectra from the urine samples of healthy volunteers who were not exposed to fentanyl. We added dereplication and fragment number check steps, resulting in 4361 nonfentanyl spectra (fig. S5).
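The augmentation and quality check described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Spectrum` type, the function name `augment`, and the toy peak list in the usage note are assumptions introduced here, and only the four cutoffs and the three-fragment minimum come from the text.

```python
from typing import List, Tuple

# A spectrum is a list of (m/z, intensity) pairs; this representation is an assumption.
Spectrum = List[Tuple[float, float]]

CUTOFFS = (0.8, 0.7, 0.6, 0.5)  # fractions of the maximum intensity, as stated in the text
MIN_FRAGMENTS = 3               # spectra with fewer fragments are discarded


def augment(spectrum: Spectrum) -> List[Spectrum]:
    """Return the original spectrum plus one thresholded copy per cutoff,
    keeping only the variants that still have enough fragments."""
    base_peak = max(intensity for _, intensity in spectrum)
    variants = [list(spectrum)]
    for fraction in CUTOFFS:
        # Remove fragment ions whose signal is lower than the cutoff.
        variants.append(
            [(mz, i) for mz, i in spectrum if i >= fraction * base_peak]
        )
    # Quality check: drop variants with fewer than MIN_FRAGMENTS fragments.
    return [v for v in variants if len(v) >= MIN_FRAGMENTS]
```

For a toy six-peak spectrum, the 80% cutoff may leave only two peaks, in which case that variant is dropped and fewer than five spectra are returned, which matches the "up to five" wording.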

Four spectral feature engineering methods were trained using the same training and testing datasets: mass binning (total intensity of ions per m/z bin), top N (m/z of the N highest intensity fragments), grid 1D (number of ions per m/z bin), and grid 2D [number of ions per axis (m/z and intensity)] (details in Materials and Methods). An overview of the feature engineering of the standard fentanyl spectrum is shown in fig. S6. The performance of feature engineering followed by the RF classifier was assessed via nested cross-validation (30), with five folds in the outer loop for algorithm evaluation and three folds in the inner loop for hyperparameter tuning using Bayesian optimization. Mass binning performed best in terms of accuracy, precision, F1 score, and Matthews correlation coefficient (MCC) (Fig. 2C). We selected a bin width of 0.1 Da and 3500 bins for binning because 99.8% of the fragment ions in the fentanyl dataset have an m/z between 50 and 400. The performance decreased when a larger number of bins and a broader mass range were used to divide the m/z axis (fig. S7). This is understandable given the resolution of the spectra used for training and the many uninformative zero-value features that too many bins introduce (31). The classification results of commonly used machine learning models are shown in Fig. 2D. The performances of the RF, support vector machine (SVM), k-nearest neighbors (KNN), and logistic regression (LR) classifiers were similar, with the RF classifier standing out. It achieved an F1 score of 0.868 ± 0.02 and an MCC of 0.844 ± 0.03 (n = 5). To investigate whether the final model, trained to predict whether a spectrum was from fentanyl, captured characteristic fragments, Shapley additive explanation (SHAP) analysis was conducted. The binning feature with the highest importance score (ranked first) was Bin_1381, which corresponded to the fragment ion m/z 188.1425 [C13H18N]+. The second highest (ranked second) was Bin_550, representing the fragment ion m/z 105.0694 [C8H9]+ in the spectra of the fentanyl standards (Fig. 2E and fig. S8).
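To make the mass binning features concrete, the sketch below maps a spectrum onto 3500 bins of 0.1 Da covering m/z 50 to 400 and sums the intensity per bin. The zero-based indexing from the lower edge of the range is an assumption made for illustration (the text does not state how the Bin_ labels are numbered), and the function names are hypothetical.

```python
import math
from typing import List, Tuple

MZ_MIN = 50.0    # lower edge of the binned range (from the text)
BIN_WIDTH = 0.1  # bin width in Da (from the text)
N_BINS = 3500    # 3500 bins of 0.1 Da cover m/z 50 to 400


def bin_index(mz: float) -> int:
    """Zero-based bin index for an m/z value, or -1 if it falls outside the range.
    Counting bins from MZ_MIN is an assumption of this sketch."""
    idx = math.floor((mz - MZ_MIN) / BIN_WIDTH)
    return idx if 0 <= idx < N_BINS else -1


def mass_binning(spectrum: List[Tuple[float, float]]) -> List[float]:
    """Feature vector holding the total ion intensity in each m/z bin."""
    features = [0.0] * N_BINS
    for mz, intensity in spectrum:
        idx = bin_index(mz)
        if idx >= 0:  # ions outside 50 to 400 are ignored
            features[idx] += intensity
    return features
```

Under this indexing, m/z 105.0694 falls in bin 550 and m/z 188.1425 in bin 1381, so a classifier trained on these vectors can weight the two characteristic fragments directly.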

These ions and the neutral loss of C9H10NO have also been used in the SSA of fentanyl (32). In addition, we examined the spectra that were misclassified as nonfentanyl compounds in the testing dataset. The misclassification of the same fentanyl compound was caused by the spectra obtained at either high or low collision energies (fig. S9). In this study, spectra for Fentanyl_Finder were acquired under moderate conditions at 30 eV to capture more spectral features for model applications. To assess the robustness of the fentanyl spectral extraction from complex matrices, we spiked two fentanyl standards (butyrfentanyl and 3-methylthiofentanyl) into a human urine sample at a concentration of 5 ng/ml. We then collected their MS2 spectra in the information-dependent acquisition mode and applied Fentanyl_Finder for analysis. Both fentanyls were detected, with only 26 false positives in the confusion matrix (Fig. 2F). False positives were further evaluated by applying the model to different biological and environmental matrices, including lake water, tap water, seawater, and urine and serum from healthy volunteers, all of which are known not to contain fentanyl. The false positive rate was found to be <1.5% (fig. S10). We conducted serial dilutions of 24 standard mixtures and measured the limit of detection (LOD) of Fentanyl_Finder. All of them were successfully identified, achieving a 100% recall rate at 5 ng/ml in water. With the assistance of MSPE, the LOD in urine reached <0.5 ng/ml for 23 fentanyls, with the exception of remifentanil, which had an LOD of 5 ng/ml (table S2). This result also suggests that as long as a valid MS2 spectrum is available in the data, Fentanyl_Finder can detect it with good sensitivity.
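The metrics quoted in this section (precision, recall, F1 score, MCC, and false positive rate) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name is hypothetical, and the counts in the usage note are toy values chosen for illustration, not results from this study.

```python
import math
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "fpr": fpr, "mcc": mcc}
```

For example, `classification_metrics(tp=90, fp=10, tn=880, fn=20)` gives a precision of 0.9 and an F1 score of about 0.857. MCC is useful here because the nonfentanyl class is much larger than the fentanyl class, and MCC accounts for all four cells rather than only the positive predictions.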

Developing the Fentanyl_ID component

Compared to the identification of global NTA data, initial screening using the Fentanyl_Finder module reduced false positives and labor costs. However, structural annotation based solely on MS data remains a challenge. The Fentanyl_ID module was developed to assist in the structural annotation of fentanyl candidates (Fig. 3A). Fentanyl_ID follows a seed molecule–guided identification strategy, and the spectral library comprising 772 spectra from 279 fentanyl compounds serves not only for seed annotation in samples but also as virtual seeds added to MS data. This module involves three key steps: It performs exact searches to retrieve the structures of seeds from the fentanyl spectral library, uses a script to construct a multilayer network linking neighboring unknowns with annotated or virtual seeds, and interprets the structures of unknowns on the basis of potential transformations, predicted MS2 data, RT, and other orthogonal information. The first step was implemented automatically using the identification function in NTA data processing tools such as MS-DIAL. In the following steps, we chose the exact mass difference and MS2 spectral similarity of the nodes (results of Fentanyl_Finder) to establish the noninterfering edges. Unlike traditional molecular networks, the network in Fentanyl_ID integrates three-layer networks, including a fentanyl-only network, a spectral similarity network, and a paired mass distance (PMD) network. Our fentanyl spectral library and chemical space datasets were used to develop and validate the proposed module.

Fig. 3. Structural relationships in the fentanyl family and development of the Fentanyl_ID module.

(A) The workflow of the Fentanyl_ID module, assisted by a multilayer network, enables the annotation of fentanyl structures. (B) Proportion of fentanyl compounds in networks constructed from the 279 structures included in the spectral library using different similarity algorithms at various score thresholds. (C) Top 15 modification PMD frequencies used for PMD network construction in the fentanyl library (classical PMDs) and the fentanyl space (recently discovered PMDs). (D) Chemical space visualization was performed using the Tanimoto coefficient matrix between the fentanyl library and fentanyl space datasets (Morgan fingerprint). The matrix was reduced via multidimensional scaling, with each point representing a structure, where distances indicate structural similarity. The central structure was clustered using K-means, with fentanyl (CAS: 437-38-7) labeled nearby. (E) The relationship between the predicted retention index and the retention time was established using fentanyl standards with an R2 value of 0.7089.

Considering their structural similarity, parents and transformation products often exhibit similar spectra (23). Given the rapid development of spectral similarity algorithms, the top four algorithms were selected from a pool of 44 algorithms [via the spectral_entropy Python package (27)] because of their effectiveness in distinguishing between transformation pairs and nonpairs, particularly those with a score difference greater than 0.3 (fig. S11 and table S4). The dataset was extracted from a Human Exposome and Metabolite Mass Spectral Database (HExpMSDB) with 107,514 spectra of exogenous compounds and their phase I reaction metabolite pairs and 165,574 of nonpairs (33). To test the accuracy and coverage of the spectral similarity network, we used fentanyls with spectra available in the library. The MS for the ID distance version 1 algorithm demonstrated the lowest coverage when the similarity score was <0.5. When the score threshold exceeded 0.5, the algorithm achieved a coverage of 81% for fentanyls in the spectral library (Fig. 3B).
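As a rough illustration of how a spectral similarity layer can be built, the sketch below scores spectrum pairs and keeps edges above a 0.5 threshold. It uses plain cosine similarity on coarsely binned peaks as a stand-in; it is not the MS for ID distance algorithm or any other algorithm from the spectral_entropy package, and the 0.01 Da matching width and all function names are assumptions.

```python
import math
from typing import Dict, List, Tuple

Spectrum = List[Tuple[float, float]]  # (m/z, intensity) pairs


def _binned(spectrum: Spectrum, width: float) -> Dict[int, float]:
    """Sum intensities of peaks that round to the same m/z bin."""
    out: Dict[int, float] = {}
    for mz, intensity in spectrum:
        key = round(mz / width)
        out[key] = out.get(key, 0.0) + intensity
    return out


def cosine_similarity(a: Spectrum, b: Spectrum, width: float = 0.01) -> float:
    """Cosine similarity between two spectra (stand-in for the paper's algorithm)."""
    va, vb = _binned(a, width), _binned(b, width)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_edges(spectra: Dict[str, Spectrum],
                     threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Connect every pair of nodes whose similarity exceeds the threshold."""
    ids = sorted(spectra)
    edges = []
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            score = cosine_similarity(spectra[u], spectra[v])
            if score > threshold:
                edges.append((u, v, score))
    return edges
```

Swapping `cosine_similarity` for another scoring function leaves `similarity_edges` unchanged, which is convenient when comparing several algorithms at several thresholds.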

The PMD can reflect chemical reactions or metabolism (34). The early fentanyl derivatives were modified from the core chemical structure of fentanyl (CAS: 437-38-7), which was also evident in their structural clustering, as the centers overlapped with fentanyl (fig. S12). The classical PMD list was derived from the mass differences identified in a spectral library, which included 144 variations between 279 fentanyl derivatives and fentanyl itself (Fig. 3A). The most common modifications are carbon skeleton elongation (14.0157 Da for CH2 and 28.0313 Da for C2H4), hydroxylation (30.0106 Da for H2CO and 15.9949 Da for O), and halogen substitution (17.9906 Da for H─F), among others (table S3). To test the accuracy of the PMD network, we used 279 recently discovered fentanyls in the chemical space that were absent from the spectral library and excluded isomers with the same formula for each test. All recently discovered fentanyls could be inferred from earlier fentanyls in the library, and 87.6% can be covered with the top 15 classical PMDs. However, the recently discovered PMD rankings shifted, with only 6 of the top 15 maintaining their positions (Fig. 3C). These identified modifications suggest the emergence of other compounds with previously unidentified transformation pathways, in addition to fentanyl, as the fentanyl industry evolves. This is further supported by the structural clustering analysis, which shows distances from fentanyl ranging from 0.014 to 0.070 (Fig. 3D). In addition, another 13 PMDs were added to the list corresponding to metabolic pathways (34), including two unique demethylation pathways for fentanyls (table S5 and fig. S13) (13).
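The PMD layer can be illustrated with a short sketch that compares the mass difference between every pair of nodes against a list of known shifts. The five shifts below are those quoted in this section; the labels, the 0.002 Da tolerance, the function names, and the node names in the usage note are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Mass shifts (Da) quoted in the text; the short labels are this sketch's own.
PMD_LIST: Dict[str, float] = {
    "CH2": 14.0157,   # carbon skeleton elongation
    "C2H4": 28.0313,  # carbon skeleton elongation
    "O": 15.9949,     # hydroxylation
    "H2CO": 30.0106,
    "F-H": 17.9906,   # fluorine replacing hydrogen
}


def match_pmd(delta: float, tol: float = 0.002) -> Optional[str]:
    """Label of the PMD closest to |delta| within tol (assumed tolerance), else None."""
    best: Optional[Tuple[str, float]] = None
    for label, pmd in PMD_LIST.items():
        error = abs(abs(delta) - pmd)
        if error <= tol and (best is None or error < best[1]):
            best = (label, error)
    return best[0] if best else None


def pmd_edges(nodes: Dict[str, float],
              tol: float = 0.002) -> List[Tuple[str, str, str]]:
    """Connect node pairs whose m/z difference matches a listed PMD."""
    ids = sorted(nodes)
    edges = []
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            label = match_pmd(nodes[v] - nodes[u], tol)
            if label is not None:
                edges.append((u, v, label))
    return edges
```

As a usage example, a seed at m/z 337.2274 (protonated fentanyl) and hypothetical features at 353.2223 and 351.2431 would be linked to the seed by "O" and "CH2" edges, respectively, while the two features would not be linked to each other. In a multilayer setting, such edges would be kept only where the spectral similarity layer also connects the same pair.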

The final stage of the Fentanyl_ID module involves chemical structure assignment based on the location information of the unknowns in the network and confidence level assignments (details in fig. S14). The structure assignment of each compound was based on a candidate structure list predicted by BioTransformer (35) using seed compounds or known structural substitution sites (fig. S15) (36). The top-ranked structure was determined on the basis of MS2 scores from MetFrag, RT matching scores from FenGNN-RT, and their combined total score. The custom retention index prediction script, FenGNN-RT, modified from GNN-RT (37) was adapted to predict the retention times of the fentanyl compounds in the LC gradient. A total of 23 fentanyls was used to establish a linear relationship with an R2 of 0.7089, and predictions for five fentanyl metabolite standards fell within the 99% confidence interval (Fig. 3E). The confidence level assignment was slightly modified from the Schymanski scale (38), subdividing it into four levels (from level 4 to level 1) on the basis of orthogonal information that could be matched.

Fast annotation of fentanyl metabolites

We used Fentanyl-Hunter to screen for fentanyl metabolites, both in vitro and in vivo. The comprehensive monitoring of fentanyl metabolites is essential for assessing drug abuse, toxicity, and metabolism, with critical applications in preventing overdoses and providing key forensic and toxicological evidence. The spectra of phase I and II metabolites fentanyl-OH and fentanyl-Glu (via the pyCFMID Python package) exhibited shared binning features for positive Fentanyl_Finder detection (fig. S16). MSPE pretreatment with HLB (hydrophilic-lipophilic balance) beads ensured a 90% recovery of the reported metabolites and their parent carfentanil compared to WCX (weak cation exchange) beads (fig. S17).

We incubated four globally high-consumption fentanyls (fentanyl, remifentanil, sufentanil, and alfentanil) with human liver microsomes. Fentanyl_ID identified 8 known metabolites and 27 previously unrecognized metabolites in the networks (Fig. 4A, fig. S18, and table S6). Specifically, 13 fentanyl-like features were structurally identified from 773 MS peaks of in vitro fentanyl metabolites (Fig. 4B). Among them, FM1, FM4, and FM5 were confirmed by coelution with authentic chemical standards (fig. S19), while six other fentanyl metabolites were supported by structural elucidation on the basis of downstream information from the confirmed metabolic pathways of these standards (fig. S20 and Fig. 4C). All the assigned structures were validated as follows: Five metabolites were confirmed using chemical standards (level 1) (fig. S19), 11 were supported by validated metabolic pathways (levels 2a and 2b) (fig. S21), and 19 were classified as level 3 on the basis of LC-HRMS data, including predicted RT and MS2 spectra (Fig. 4C and figs. S22 and S23). Network-based inference may propagate uncertainty, and misclassification at downstream nodes remains a concern. Therefore, only nodes supported by chemical standards or validated downstream pathways were used to anchor propagation, and annotations above level 3 were assigned with caution. The only known metabolite of alfentanil is identical to the primary metabolite of sufentanil, norsufentanil (AM1/SM1; CAS: 61086-18-8) (39). These identified fentanyl metabolites not only serve as more specific exposure biomarkers but also demonstrate distinct kinetic characteristics (figs. S24 and S25).

Fig. 4. Identification of fentanyl metabolites by Fentanyl-Hunter.

(A) Number of nodes representing the in vitro metabolites of different fentanyls in the multilayer network of the Fentanyl_ID module. (B) MS peaks detected in the biotransformation sample of fentanyl. (C) Identification of fentanyl and its metabolites in the multilayer network. The nodes were fentanyl candidates labeled with IDs based on PeakIDs in the peak table and confidence level. The edges represent spectral similarities >0.5 and are colored according to different modifications in PMDs. (D) Profile of fentanyl and its metabolites over a 5-hour incubation with human liver microsomes. Data are presented as the means ± SD of three independent repeats. (E) Precision and FDR comparison of fentanyl and its metabolites across Fentanyl-Hunter and other widely used analytical tools. Data were obtained from the results of three tools evaluated at different threshold values: Fentanyl-Hunter; GNPS-LT (similarity cutoff: 80%) and GNPS-HT (85%), which are two settings of GNPS; and SSA-LT (mass tolerance: 0.5 mDa) and SSA-HT (1 mDa), which are two settings of SSA. (F) Extracted ion chromatograms for fentanyl metabolites detected in human urine samples.

The pharmacokinetic curve of fentanyl metabolites demonstrated a rapid increase within 1 hour, followed by stability over the next 5 hours. The signal intensities of the found metabolites were relatively low, which may explain why they were overlooked (Fig. 4D and fig. S25). On the basis of the identification of fentanyl metabolites and raw MS data, we compared the screening performances of commonly used methods, including GNPS (40) and diagnostic product ion–based SSA, with that of Fentanyl-Hunter. The threshold values were found to influence the tool performance. To ensure a fair comparison, we applied both high and low thresholds to align the accuracy and FDR levels (Fig. 4E). Among the evaluated methods, Fentanyl-Hunter exhibited the lowest FDR and highest accuracy.

To systematically investigate the screening capability of Fentanyl-Hunter, we collected urine samples from a volunteer patient using fentanyl transdermal patches at various time points. Four metabolites were identified 21 hours after treatment. FM1 and F116 were consistent with in vitro findings, while the other two metabolites differed and were presumed to be glucuronidation products of F116 (F116-Glu) and FM5 (FM5-Glu) (table S7 and Fig. 4F). Fentanyl was detectable in the urine 5 hours after application of the transdermal patch, with the highest concentration of fentanyl and its metabolites observed at sampling after 12 hours, followed by a rapid decline (fig. S26). The primary metabolite was norfentanyl, which is consistent with the results of the in vitro pharmacokinetics. These metabolites could be used as biomarkers for fentanyl.

Fentanyl-Hunter detects fentanyl in wastewater

The chemical analysis of biomarkers in composite influent wastewater allows us to back-calculate the amount of fentanyl consumed by a population (3, 41). We used Fentanyl-Hunter to screen for fentanyl and its metabolites in wastewater samples obtained from a municipal wastewater treatment plant (Fig. 5A). The FM1 (norfentanyl) peak, with a peak intensity of 1 × 10^3, was identified by matching its spectrum using the fentanyl library (fig. S27). Norfentanyl is an essential marker for epidemiological studies of opioid-related wastewater (42). In another wastewater sample, the MS feature was assigned the molecular formula C20H30N2O3. It was identified as carboxypentanyl fentanyl (CAS: 526191-19-5) on the basis of the fentanyl structure database or alternatively as an isomeric transformation product of fentanyl because of the poor quality of the MS2 spectrum (fig. S28). In a 2017 study targeting fentanyl in municipal wastewater, the low detection rate aligned with the trend of fentanyl rarity across China (43). Only fentanyl metabolites were detected in this study, with no identification of any fentanyl-related substances beyond those approved as pharmaceuticals in China. However, the detection of metabolites suggests potential use, emphasizing that testing for the fentanyl parent compound alone is inadequate in sewage epidemiology.

Fig. 5. Identification of fentanyl in wastewater and retrospective screening using Fentanyl-Hunter and MASST.

(A) MS peaks detected in a wastewater sample included a fentanyl-like peak labeled as FM1. (B) Global distribution of fentanyl features based on retrospective screening, with total sample numbers and the respective numbers for human (hum) and environment (env) samples. (C) Retrospective screening statistics for detected fentanyl compounds in files or datasets. The number of files and datasets is indicated in parentheses. (D) The workflow includes searching MS2 spectra against all public datasets using MASST and providing raw MS data for in-depth fentanyl screening with Fentanyl-Hunter. (E) The identification process of despropionyl remifentanil was performed on MS data sourced from GNPS/MassIVE.

Screening reveals global fentanyl prevalence

Although Fentanyl-Hunter identified features of fentanyl and its metabolites in the limited samples analyzed, it remains uncertain whether these substances are present in global samples, which would indicate their worldwide prevalence. MASST is a powerful tool based on public MS databases (MTBLS, NMDR, and GNPS/MassIVE) for screening compounds in diverse sample studies worldwide, allowing the linking of compounds by location, type, and time (44–46). Four high-consumption fentanyls and 45 known and previously unidentified metabolites were searched using MASST (Fig. 5B; details in Materials and Methods). The screening returned four fentanyl compounds or metabolites [FM1 (norfentanyl), RM1 (remifentanil acid), sufentanil, and fentanyl] that were identified in 250 MS files across 14 studies (table S8). These datasets can be categorized into four sample types, with the majority found in human samples from 2019 to 2022, highlighting the widespread use of fentanyl-like compounds as analgesics (Fig. 5C). In a biological sample from a cancer research dataset (MSV000088324), Fentanyl-Hunter was used to detect four substances: propyl-norfentanyl, RM1, RM2, and FM1 with confidence level 2a (Fig. 5D and fig. S29). Notably, fentanyl and its major metabolite FM1 were detected in 14 environmental samples from 2018 to 2020, including sewage sludge, surface water, and seawater. Fentanyl detected in sewage sludge during the COVID-19 pandemic is presented in fig. S30. Surface seawater samples from the Ellen Browning Scripps Memorial Pier (47) and its NTA data show MASST matching for FM1. Despropionyl remifentanil, a potential metabolite of remifentanil, was identified in a Luxembourgish surface water sample from MSV000087190 using Fentanyl-Hunter with a confidence level of 3 (Fig. 5E and fig. S31).

Retrospective screening of samples gathered from eight countries further showed that fentanyl substances are found worldwide, including Europe, the US, China, Israel, Indonesia, Australia, and Argentina (Fig. 5B). The US recorded the highest number of detected samples (202) owing to the large proportion of samples in its public dataset. This result does not rule out the possibility of legitimate medical use, as all human samples detected in China were from patients with cancer pain. Given the inherent uncertainty of level 3 annotations, we emphasize that they should not be used as the sole basis for regulatory or forensic decisions without further validation. Furthermore, the results may not accurately reflect the regional prevalence of fentanyl misuse because of variations in sample pretreatment types, most of which follow metabolomic analysis methods. However, these findings validate our methods and suggest that the risks of fentanyl abuse and environmental pollution should be closely scrutinized and carefully reevaluated.

DISCUSSION

The regulation of drug abuse is a challenge for the international community, and the rapid diversification of fentanyl analogs has complicated regulatory efforts. Current NTA platforms that rely on MS and spectral libraries are insufficient for detecting unique fentanyl compounds and their metabolites. We developed an automated fentanyl family screening and identification platform named Fentanyl-Hunter, which combines artificial intelligence (AI) and a multilayer molecular network with MS analysis. This platform could serve as a robust tool for the detection and regulation of fentanyl variants and metabolites in public health, forensic science, environmental monitoring, and law enforcement applications.

The application of Fentanyl-Hunter notably enhanced analysis efficiency and increased the number of annotated fentanyl-related substances in biological and environmental samples compared to conventional tools such as GNPS. This tool consists of two main components: candidate peak extraction (Fentanyl_Finder) and structural identification assistance (Fentanyl_ID). Using the MS2 spectral feature engineering method based on mass binning and an RF classifier, Fentanyl_Finder accurately identified the possible presence of fentanyl-like compounds from thousands of MS features. Using a multilayer network that incorporates candidate compound nodes, spectral similarity edges, transformation PMD network edges, and tools for spectral and retention time prediction, Fentanyl_ID enables rapid and high-confidence identification of fentanyl analogs. In this study, machine learning demonstrated a strong capability for NTA MS data mining. The model effectively learned the structural features of the spectral data, given a sufficient number of positive (1873) and negative (4361) examples, along with appropriate feature extraction methods. Compared with classical fragment ion–based suspect screening, the ML model can capture complex nonlinear features and offer greater robustness. Suspect screening relies on expert-defined thresholds and rules. By contrast, data-driven approaches automatically learn features and patterns, thereby reducing the limitations caused by human bias and enabling adaptability to a wide range of emerging compound types. The RF model used in this study integrates multiple decision trees for classification or regression, allowing it to simultaneously account for multiple feature dimensions of the data rather than relying solely on specific key fragment ions.
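The mass-binning feature engineering mentioned above can be illustrated with a minimal sketch. It is not the Fentanyl_Finder implementation: the bin width, the m/z range, the normalization to the base peak, and the function names `bin_spectrum` and `drop_empty_bins` are all assumptions made for illustration, and the peak lists are toy values.

```python
from typing import List, Sequence, Tuple

Spectrum = Sequence[Tuple[float, float]]  # (m/z, intensity) pairs


def bin_spectrum(spectrum: Spectrum, bin_width: float = 1.0,
                 mz_min: float = 0.0, mz_max: float = 500.0) -> List[float]:
    """Sum intensities into fixed-width m/z bins, then scale so the largest bin is 1."""
    # Assumed settings: the paper selects one of four binning methods (text S3).
    n_bins = int(round((mz_max - mz_min) / bin_width))
    vector = [0.0] * n_bins
    for mz, intensity in spectrum:
        if mz_min <= mz < mz_max:
            index = min(int((mz - mz_min) / bin_width), n_bins - 1)
            vector[index] += intensity
    top = max(vector) if vector else 0.0
    return [v / top for v in vector] if top > 0 else vector


def drop_empty_bins(matrix: List[List[float]]) -> Tuple[List[List[float]], List[int]]:
    """Remove bins that are zero in every spectrum; also return the kept bin indices."""
    kept = [j for j in range(len(matrix[0])) if any(row[j] != 0.0 for row in matrix)]
    return [[row[j] for j in kept] for row in matrix], kept


# Toy spectra (illustrative values only).
spectra = [
    [(105.07, 40.0), (188.14, 100.0)],
    [(105.07, 10.0), (134.10, 50.0)],
]
matrix = [bin_spectrum(s) for s in spectra]
features, kept_bins = drop_empty_bins(matrix)
```

Each row of `features` is a fixed-length vector that a classifier such as a random forest could consume; the kept bin indices must be stored so that new spectra are projected onto the same columns.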

Molecular networking propagates annotations from known compounds to unknowns (i.e., compounds not present in spectral or structural databases). This seed molecule–guided strategy demonstrated high efficiency and confidence in structure identification in a previous report (45). Here, the strategy was further developed and applied to the fentanyl family. The fentanyl spectral library serves not only to annotate seeds in samples but also to supply virtual seeds. Virtual seeds ensure that a detected compound can still be connected to the network even when its upstream or downstream neighbors are missing from the sample. The network concept was also expanded to include an additional layer of PMD links. Common PMDs among fentanyl analogs originate from lead optimization in pharmaceutical development, in which the same structural modifications recur. Therefore, the PMD network can not only uncover unknown compound structures but also highlight emerging process trends, which are crucial for regulating the illicit synthesis of new psychoactive substances.
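The PMD layer can be sketched as follows. This is a simplified illustration, not the Fentanyl_ID code: the two-entry `PMD_TABLE`, the 0.005 Da tolerance, and the function name `pmd_edges` are assumptions, and the m/z values are approximate [M+H]+ masses used only as an example.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Illustrative transformation mass differences in Da (an assumed list, not the paper's).
PMD_TABLE: Dict[str, float] = {
    "hydroxylation (+O)": 15.9949,
    "N-dealkylation (-C8H8)": 104.0626,
}


def pmd_edges(features: Dict[str, float], tol: float = 0.005) -> List[Tuple[str, str, str]]:
    """Link feature pairs whose m/z difference matches a tabulated PMD within tol."""
    edges = []
    ordered = sorted(features.items(), key=lambda item: item[1])  # ascending m/z
    for (light, mz_light), (heavy, mz_heavy) in combinations(ordered, 2):
        delta = mz_heavy - mz_light
        for label, pmd in PMD_TABLE.items():
            if abs(delta - pmd) <= tol:
                edges.append((light, heavy, label))
    return edges


# Approximate [M+H]+ values for three related compounds (illustration only).
example = {"fentanyl": 337.2274, "norfentanyl": 233.1648, "hydroxyfentanyl": 353.2224}
edges = pmd_edges(example)
```

In this toy case the norfentanyl and fentanyl features are linked by the N-dealkylation distance and the fentanyl and hydroxyfentanyl features by the hydroxylation distance, while the third pair stays unconnected. In the platform, such edges complement the spectral similarity edges rather than replace them.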

The detection of fentanyl and its metabolites in any sample is a critical concern. In this study, we identified 27 previously overlooked fentanyl metabolites in the laboratory and detected two of them in the urine of fentanyl users. The identification of two additional metabolites in urine samples underscores the fact that in vitro studies may not capture the full spectrum of in vivo metabolism. Conducting the NTA of test samples rather than relying solely on targeted analysis or suspect screening is essential for ensuring comprehensive regulatory oversight. Fentanyl-Hunter relies heavily on MS2 spectra for compound screening and annotation. The information-dependent acquisition mode ensures high-quality spectra but limits coverage, potentially resulting in false negatives for low-abundance peaks (48). The platform assigns structures through RT and MS2 score ranking and enhances annotation confidence through metabolic pathway–based inference. However, it cannot overcome the inherent limitations of MS in distinguishing stereoisomers or closely related isomers without chemical standards for definitive confirmation.

By combining MASST and Fentanyl-Hunter screening, we detected these fentanyl markers in a wide range of sample types. Unexpectedly, these included environmental samples ranging from seawater in the US to surface waters in Luxembourg. These findings underscore the urgent need for strengthened regulation of the fentanyl family, particularly regarding variants and metabolites.

AI-powered drug discovery is often regarded as a double-edged sword (5, 49). In this study, AI is applied in a safer and more secure manner for drug regulation. Because of the black-box nature of AI, the open-source pretrained platform does not give users direct access to the precise chemical structures or mass spectra of fentanyls, which supports the safety of its widespread adoption.

MATERIALS AND METHODS

Chemicals

The 26 authentic standards used in this study were purchased from Yuansi (Shanghai, China) and are summarized in table S1. All fentanyl standards were used to test the mass accuracy, LOD of Fentanyl-Hunter, recovery of the pretreatment method, metabolism, and identification confirmation. Twenty-four standards were used in the training set for the Fentanyl_Finder model, and two were used to confirm the identification of the Fentanyl_ID metabolite structures. LC-MS–grade solvents, including methanol, acetonitrile, and formic acid, were purchased from Thermo Fisher Scientific (Waltham, MA). Ultrapure water was obtained using a Milli-Q Integral water purification system (Millipore Corporation, Bedford, US).

Fentanyl MS2 spectral library

A total of 772 MS2 spectra of 279 fentanyl compounds were collected from public spectral databases and in-house standards to build the fentanyl MS2 spectral library. After amplification, cleaning, and chemical information supplementation, this library was used to form the training set and served as a reference for identifying seed fentanyls in Fentanyl_ID. Further details are provided in text S1.

Training of the machine learning model

The machine learning models were trained using the fentanyl MS2 spectral library and a nonfentanyl spectral dataset (text S2). Spectra were vectorized using a mass binning algorithm chosen from four available methods, with empty bins (i.e., those containing only zeros) removed. To avoid artificially inflating spectral model performance, chemicals were not shared between training and test sets. An RF classifier was used to predict whether a compound was labeled as fentanyl. The individual macroaveraged F1 score and macroaveraged MCC were calculated for each fold and averaged to assess the cross-validated results. The F1 score is the harmonic mean of the precision and recall, and the MCC score is the correlation coefficient between the observed and predicted binary classifications. A higher F1 or MCC indicated a more accurate model prediction. The calculations for F1 (Eq. 1) and MCC (Eq. 2) are presented below

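$$\mathrm{F1} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} \qquad (1)$$

$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \qquad (2)$$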

where TP, TN, FP, and FN refer to the number of true positives, true negatives, false positives, and false negatives, respectively. The detailed procedural settings are provided in text S3. Furthermore, we evaluated the model’s ability to identify abiotic oxidative transformation products of fentanyl compounds, and its screening results showed high concordance with those obtained from classical NTA on the basis of statistical analysis (text S4 and fig. S32).
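As a worked illustration of the two metrics defined above, the following minimal sketch computes F1 and MCC from confusion-matrix counts. The function names are hypothetical, returning 0.0 for a zero denominator is a convention assumed here, and `macro_f1` reflects one common reading of macroaveraging for a binary task (the mean of the F1 scores of the positive and negative classes), which may differ in detail from the settings in text S3.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when the denominator is zero."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def macro_f1(tp: int, tn: int, fp: int, fn: int) -> float:
    """Mean of the F1 scores obtained by treating each class in turn as positive."""
    f1_positive = f1_score(tp, fp, fn)
    # For the negative class, true negatives play the role of true positives,
    # and false negatives and false positives swap roles.
    f1_negative = f1_score(tn, fn, fp)
    return (f1_positive + f1_negative) / 2


# Hypothetical counts from one cross-validation fold.
fold_f1 = f1_score(tp=8, fp=2, fn=2)
fold_mcc = mcc(tp=8, tn=8, fp=2, fn=2)
```

For cross-validation, these per-fold values would be averaged across folds, as described above.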

Confidence level for annotations of Fentanyl_ID

All fentanyl candidates were identified using our Fentanyl_ID framework, with the assistance of other open-source tools. They were further confirmed by reference standards, spectral libraries, or LC-HRMS diagnostic evidence, such as RT and MS2 spectral comparisons. Confidence level 1 was assigned to compounds confirmed by reference standards or nuclear magnetic resonance analysis. Confidence level 2a was assigned to matches where the MS2 spectrum corresponded to the fentanyl spectral library. These two levels represent seed fentanyl compounds. Confidence level 2b was assigned to inferred metabolite structures on the basis of matching reported metabolites or identifying metabolic pathways. Confidence level 3 was assigned to top-ranked tentative structures of unknowns on the basis of BioTransformer predictions, relative spectral matching scores from MetFrag, and RT matching scores from FenGNN-RT. In addition, spectra predicted by the pyCFMID Python package that exhibited a similarity score greater than 0.5 when compared with experimental spectra were also considered. Compounds classified as level 4 could not be structurally identified.
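The level definitions above can be read as a simple decision cascade. The sketch below is one interpretation, not the Fentanyl_ID code: the `Evidence` fields, the function name, and the rule that the strongest available evidence determines the level are assumptions, as is treating a predicted-spectrum similarity above 0.5 as sufficient support for level 3.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of the evidence gathered for one candidate feature."""
    standard_or_nmr: bool = False                 # reference standard or NMR confirmation
    library_ms2_match: bool = False               # MS2 match to the fentanyl spectral library
    reported_metabolite_or_pathway: bool = False  # reported metabolite or identified pathway
    top_ranked_tentative_structure: bool = False  # ranked by prediction, MS2, and RT scores
    predicted_spectrum_similarity: float = 0.0    # similarity to an in silico spectrum


def assign_confidence_level(evidence: Evidence) -> str:
    """Return the confidence level supported by the strongest available evidence."""
    if evidence.standard_or_nmr:
        return "1"
    if evidence.library_ms2_match:
        return "2a"
    if evidence.reported_metabolite_or_pathway:
        return "2b"
    if evidence.top_ranked_tentative_structure or evidence.predicted_spectrum_similarity > 0.5:
        return "3"
    return "4"  # no structural identification possible


seed = assign_confidence_level(Evidence(library_ms2_match=True))
tentative = assign_confidence_level(Evidence(predicted_spectrum_similarity=0.62))
```

Under this reading, `seed` is "2a" and `tentative` is "3"; a feature with no supporting evidence falls through to level 4.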

Human urine sample collection

The sample collection was approved by the Ethics Committee of Shanghai First Maternity and Infant Hospital, and all the participants signed a written informed consent (KS24475). Approximately 40 ml of urine was collected from 10 healthy volunteers in October 2022 (five males and five females; aged 21 to 45 years). None of the volunteers had a record of using psychotropic substances such as fentanyl or other drugs. These urine samples were used in the nonfentanyl training set for the Fentanyl_Finder model and as substrates in additive experiments to assess the recovery and LOD of our method. In addition, urine samples were collected from a patient with cancer who used fentanyl transdermal patches for pain relief in July 2024 (female, aged 56 years). Urine samples (50 ml) were collected at 5, 16, 21, and 24 hours postmedication and used for fentanyl metabolite screening and kinetic analysis. After sampling, urine samples were immediately transferred to the laboratory and then frozen at −40°C until analysis.

Wastewater sample collection

Influent wastewater was collected from an urban wastewater treatment plant in Jiangsu Province. Because collection was completed in September 2018, the sample had been stored in a polytetrafluoroethylene bottle at −20°C. After thawing at room temperature, it was pretreated and analyzed for the fentanyl family. The procedural blank contained no sample; Milli-Q water was treated identically to control for potential contamination.

Liver microsomal incubation of fentanyls

We selected four widely used fentanyl analogs, fentanyl, remifentanil, sufentanil, and alfentanil, as substrates for the in vitro metabolic assays. In the experiment on fentanyl elimination and biotransformation dynamics, incubation was terminated at intervals of 0, 15, 30, 60, 120, 180, 240, and 300 min. Details of the incubation with human liver microsomes (mixed sex) are provided in text S5.

Sample pretreatment by MSPE

All samples used in this study were pretreated before LC-HRMS analysis using MSPE. The detailed materials and MSPE procedure were modified from a previous study (37) and are shown in text S6.

LC-HRMS analysis

The samples were analyzed using an Exion LC AD Series system (AB Sciex, US) coupled with an AB Sciex Zeno TOF 7600 mass spectrometer. Chromatographic separation was achieved on a T3 column (3.0 by 50 mm, 1.8 μm, Agilent, US) using a mobile phase of 0.1% formic acid in water (A) and acetonitrile (B). Further details are provided in text S7.

MS data preanalysis for Fentanyl-Hunter

Primary data analysis was performed using Fentanyl-Hunter, which requires the importation of an MS peak table with MS2 spectra (in txt format). The MS peak table comprises feature information generated from raw MS files using common peak-picking software, such as MS-DIAL. The features (i.e., unique pairs of m/z and chromatographic RT values) were extracted and grouped, and features detected in the blanks were removed. Each MS2 spectrum was matched to its corresponding MS feature. The detailed settings for the automated preanalysis are provided in text S8.

Global screening using MASST

A single-spectrum search was performed using the MASST pipeline (https://masst.gnps2.org/). The precursor ion mass tolerance and fragment ion tolerance were set to 0.01 Da. “Non-redundant MS/MS” was selected from public databases for screening, including GNPS, Metabolomics Workbench, and MetaboLights. The spectra of fentanyl and its metabolites (table S6) were filtered using the same parameters applied to the input demo data. Matches between the input spectra and the library spectra were retained only if they had a score exceeding 0.6 and a minimum of two matching peaks.
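The retention criteria above (0.01 Da tolerances, score above 0.6, at least two matching peaks) can be illustrated with a minimal sketch. It does not reproduce MASST's own scoring: a plain cosine over greedily paired peaks is swapped in as a stand-in, and the function names and toy peak lists are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def match_peaks(query: Sequence[Peak], reference: Sequence[Peak],
                tol: float = 0.01) -> List[Tuple[float, float]]:
    """Greedily pair each query peak with the closest unused reference peak within tol."""
    used = set()
    pairs = []
    for q_mz, q_int in sorted(query):
        best = None
        for j, (r_mz, _) in enumerate(reference):
            if j in used or abs(q_mz - r_mz) > tol:
                continue
            if best is None or abs(q_mz - r_mz) < abs(q_mz - reference[best][0]):
                best = j
        if best is not None:
            used.add(best)
            pairs.append((q_int, reference[best][1]))
    return pairs


def cosine_score(query: Sequence[Peak], reference: Sequence[Peak],
                 tol: float = 0.01) -> Tuple[float, int]:
    """Return (cosine similarity over paired peaks, number of paired peaks)."""
    pairs = match_peaks(query, reference, tol)
    dot = sum(q * r for q, r in pairs)
    norm = (math.sqrt(sum(i * i for _, i in query))
            * math.sqrt(sum(i * i for _, i in reference)))
    return (dot / norm if norm else 0.0), len(pairs)


def keep_match(query_precursor: float, reference_precursor: float,
               query: Sequence[Peak], reference: Sequence[Peak],
               tol: float = 0.01, min_score: float = 0.6, min_peaks: int = 2) -> bool:
    """Apply the precursor tolerance, score threshold, and minimum matched-peak count."""
    if abs(query_precursor - reference_precursor) > tol:
        return False
    score, n_matched = cosine_score(query, reference, tol)
    return score > min_score and n_matched >= min_peaks


# Toy peak lists (illustrative values only).
query_peaks = [(84.08, 30.0), (150.09, 20.0), (233.16, 100.0)]
library_peaks = [(84.085, 30.0), (150.095, 20.0), (233.165, 100.0)]
retained = keep_match(233.16, 233.165, query_peaks, library_peaks)
```

Requiring both a score threshold and a minimum number of matched peaks guards against hits driven by a single intense fragment.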

Retrospective screening using Fentanyl-Hunter

MASST allows the individual screening of only a limited number of known spectra of specific compounds. To further investigate fentanyl-like compounds potentially missed by MASST, the raw data of the files identified by MASST were downloaded using FileZilla for the Fentanyl-Hunter analysis. The raw data were converted into the mzXML format using MSConvert, then preanalyzed using MS-DIAL, and further analyzed with Fentanyl-Hunter.

Acknowledgments

Funding: M.F. was sponsored by the National Key R&D Program (grant no. 2024YFA0918900), and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant no. XDB0750300). Y.W. was sponsored by the Shanghai Municipal Education Commission’s “Artificial Intelligence Promotes Scientific Research Paradigm Reform and Empowers Discipline Leap Plan” project. X.D. was sponsored by the Shanghai Science and Technology Innovation Action Plan’s “Research on the Standardization of Analytical Methods for Key Functional Components in Sports Nutrition Foods” project (grant no. 23DZ2203800).

Author contributions: Conceptualization: M.F. and X.D.; methodology: C.S. and W.L.; investigation: Y.W., X.C., M.Y., and H.Z.; visualization: C.S. and Z.Y.; funding acquisition: X.D., Y.W., and M.F.; supervision: M.F.; writing—original draft: C.S. and W.L.; writing—review and editing: M.S., M.F., and X.D.

Competing interests: X.D., Y.W., W.L., and C.S. are listed as inventors on patent application CN119324012A, submitted by Shanghai University of Sport, which covers the machine learning–assisted screening method for fentanyl compounds. The other authors declare that they have no competing interests.

Data and materials availability: The Fentanyl-Hunter platform was developed in Python, with a graphical user interface built using Electron and Vue 3. The source code, along with a user manual, help documentation, demonstration data, and example results, is freely available for scientific research at https://github.com/FangLabNTU/Fentanyl-Hunter and https://zenodo.org/records/15233025. The raw MS data in this study associated with DOI 10.5281/zenodo.14204596 is hosted on Zenodo. Because of the strict regulation of fentanyl-related substances, the seized drug chemical database SWGDRUG (www.swgdrug.org/) and spectral databases such as HighResNPS (https://highresnps.com/) and NIST used in this study prohibit publicly sharing their metainformation or spectra under license restrictions. Extensive data for scientific research are available to qualified users, such as forensic toxicologists, upon credential verification with M.F., pending scientific review, and completion of a material transfer agreement.

Supplementary Materials

The PDF file includes:

Supplementary Texts S1 to S8

Figs. S1 to S32

Legends for tables S1 to S8

sciadv.adw2799_sm.pdf (6.7MB, pdf)

Other Supplementary Material for this manuscript includes the following:

Tables S1 to S8

REFERENCES AND NOTES

  • 1. Kuczyńska K., Grzonkowski P., Kacprzak Ł., Zawilska J. B., Abuse of fentanyl: An emerging problem to face. Forensic Sci. Int. 289, 207–214 (2018).
  • 2. World Health Organization, WHO Drug Information (World Health Organization, 2022).
  • 3. Salgueiro-Gonzalez N., Béen F., Bijlsma L., Boogaerts T., Covaci A., Baz-Lomba J. A., Kasprzyk-Hordern B., Matias J., Ort C., Bodík I., Influent wastewater analysis to investigate emerging trends of new psychoactive substances use in Europe. Water Res. 254, 121390 (2024).
  • 4. Duchovny N., Mutter R., “The opioid crisis and recent federal policy responses” (Congressional Budget Office, 2022); www.cbo.gov/publication/58221 [accessed 28 September 2022].
  • 5. Skinnider M. A., Wang F., Pasin D., Greiner R., Foster L. J., Dalsgaard P. W., Wishart D. S., A deep generative model enables automated structure elucidation of novel psychoactive substances. Nat. Mach. Intell. 3, 973–984 (2021).
  • 6. Friedman J., Shover C. L., Charting the fourth wave: Geographic, temporal, race/ethnicity and demographic trends in polysubstance fentanyl overdose deaths in the United States, 2010–2021. Addiction 118, 2477–2485 (2023).
  • 7. Phillips A. L., Williams A. J., Sobus J. R., Ulrich E. M., Gundersen J., Langlois-Miller C., Newton S. R., A framework for utilizing high-resolution mass spectrometry and nontargeted analysis in rapid response and emergency situations. Environ. Toxicol. Chem. 41, 1117–1130 (2022).
  • 8. Swanson K. D., Shaner R. L., Krajewski L. C., Bragg W. A., Johnson R. C., Hamelin E. I., Use of diagnostic ions for the detection of fentanyl analogs in human matrices by LC–QTOF. J. Am. Soc. Mass Spectrom. 32, 2852–2859 (2021).
  • 9. Salomone A., Di Corcia D., Negri P., Kolia M., Amante E., Gerace E., Vincenti M., Targeted and untargeted detection of fentanyl analogues and their metabolites in hair by means of UHPLC-QTOF-HRMS. Anal. Bioanal. Chem. 413, 225–233 (2021).
  • 10. Koppel N., Maini Rekdal V., Balskus E. P., Chemical transformation of xenobiotics by the human gut microbiota. Science 356, eaag2770 (2017).
  • 11. Takahashi M., Izumi Y., Iwahashi F., Nakayama Y., Iwakoshi M., Nakao M., Yamato S., Fukusaki E., Bamba T., Highly accurate detection and identification methodology of xenobiotic metabolites using stable isotope labeling, data mining techniques, and time-dependent profiling based on LC/HRMS/MS. Anal. Chem. 90, 9068–9076 (2018).
  • 12. Bowen T. J., Southam A. D., Hall A. R., Weber R. J., Lloyd G. R., Macdonald R., Wilson A., Pointon A., Viant M. R., Simultaneously discovering the fate and biochemical effects of pharmaceuticals through untargeted metabolomics. Nat. Commun. 14, 4653 (2023).
  • 13. Wilde M., Pichini S., Pacifici R., Tagliabracci A., Busardò F. P., Auwärter V., Solimini R., Metabolic pathways and potencies of new fentanyl analogs. Front. Pharmacol. 10, 238 (2019).
  • 14. Sindelar M., Patti G. J., Chemical discovery in the era of metabolomics. J. Am. Chem. Soc. 142, 9097–9105 (2020).
  • 15. Deng M., Ji X., Peng B., Fang M., In vitro and in vivo biotransformation profiling of 6PPD-quinone toward their detection in human urine. Environ. Sci. Technol. 58, 9113–9124 (2024).
  • 16. Kind T., Tsugawa H., Cajka T., Ma Y., Lai Z., Mehta S. S., Wohlgemuth G., Barupal D. K., Showalter M. R., Arita M., Fiehn O., Identification of small molecules using accurate mass MS/MS search. Mass Spectrom. Rev. 37, 513–532 (2018).
  • 17. Le L., Jun-Bo Z., Hui Y., Research progress on metabolite identification of synthetic cannabinoid new psychoactive substances. J. Forensic Med. 37, 459 (2021).
  • 18. Aron A. T., Gentry E. C., McPhail K. L., Nothias L.-F., Nothias-Esposito M., Bouslimani A., Petras D., Gauglitz J. M., Sikora N., Vargas F., van der Hooft J., Ernst M., Kang K. B., Aceves C. M., Caraballo-Rodríguez A. M., Koester I., Weldon K. C., Bertrand S., Roullier C., Sun K., Tehan R. M., Boya P. C. A., Christian M. H., Gutiérrez M., Ulloa A. M., Tejeda Mora J. A., Mojica-Flores R., Lakey-Beitia J., Vásquez-Chaves V., Zhang Y., Calderón A. I., Tayler N., Keyzers R. A., Tugizimana F., Ndlovu N., Aksenov A. A., Jarmusch A. K., Schmid R., Truman A. W., Bandeira N., Wang M., Dorrestein P. C., Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Nat. Protoc. 15, 1954–1991 (2020).
  • 19. Wu G., Wang X., Zhang X., Ren H., Wang Y., Yu Q., Wei S., Geng J., Nontarget screening based on molecular networking strategy to identify transformation products of citalopram and sertraline in wastewater. Water Res. 232, 119509–119517 (2023).
  • 20. Zhou Z., Luo M., Zhang H., Yin Y., Cai Y., Zhu Z.-J., Metabolite annotation from knowns to unknowns through knowledge-guided multi-layer metabolic networking. Nat. Commun. 13, 6656 (2022).
  • 21. Armenian P., Vo K. T., Barr-Walker J., Lynch K. L., Fentanyl, fentanyl analogs and novel synthetic opioids: A comprehensive review. Neuropharmacology 134, 121–132 (2018).
  • 22. Moorthy A. S., Kearsley A. J., Mallard W. G., Wallace W. E., Mass spectral similarity mapping applied to fentanyl analogs. Forensic Chem. 19, 100237–100245 (2020).
  • 23. Shen X., Wang R., Xiong X., Yin Y., Cai Y., Ma Z., Liu N., Zhu Z.-J., Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics. Nat. Commun. 10, 1516–1529 (2019).
  • 24. Mardal M., Andreasen M. F., Mollerup C. B., Stockham P., Telving R., Thomaidis N. S., Diamanti K. S., Linnet K., Dalsgaard P. W., HighResNPS.com: An online crowd-sourced HR-MS database for suspect and non-targeted screening of new psychoactive substances. J. Anal. Toxicol. 43, 520–527 (2019).
  • 25. Rahu I., Kull M., Kruve A., Predicting the activity of unidentified chemicals in complementary bioassays from the HRMS data to pinpoint potential endocrine disruptors. J. Chem. Inf. Model. 64, 3093–3104 (2024).
  • 26. Xing S., Jiao Y., Salehzadeh M., Soma K. K., Huan T., SteroidXtract: Deep learning-based pattern recognition enables comprehensive and rapid extraction of steroid-like metabolic features for automated biology-driven metabolomics. Anal. Chem. 93, 5735–5743 (2021).
  • 27. Li Y., Kind T., Folz J., Vaniya A., Mehta S. S., Fiehn O., Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification. Nat. Methods 18, 1524–1531 (2021).
  • 28. Neveu V., Moussy A., Rouaix H., Wedekind R., Pon A., Knox C., Wishart D. S., Scalbert A., Exposome-Explorer: A manually-curated database on biomarkers of exposure to dietary and environmental factors. Nucleic Acids Res. 45, D979–D984 (2016).
  • 29. Djoumbou Feunang Y., Eisner R., Knox C., Chepelev L., Hastings J., Owen G., Fahy E., Steinbeck C., Subramanian S., Bolton E., ClassyFire: Automated chemical classification with a comprehensive, computable taxonomy. J. Chem. 8, 61 (2016).
  • 30. Varma S., Simon R., Bias in error estimation when using cross-validation for model selection. BMC Bioinf. 7, 91 (2006).
  • 31. Meekel N., Kruve A., Lamoree M. H., Been F. M., Machine learning-based classification for the prioritization of potentially hazardous chemicals with structural alerts in nontarget screening. Environ. Sci. Technol. 59, 5056–5065 (2025).
  • 32. Guo X., Shang Y., Lv Y., Bai H., Ma Q., Suspect screening of fentanyl analogs using matrix-assisted ionization and a miniature mass spectrometer with a custom expandable mass spectral library. Anal. Chem. 93, 10152–10159 (2021).
  • 33. Zhao F., Li L., Lin P., Chen Y., Xing S., Du H., Wang Z., Yang J., Huan T., Long C., HExpPredict: In vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization. Environ. Health Perspect. 131, 37009–37019 (2023).
  • 34. Shi C., Yang J., You Z., Zhang Z., Fang M., Suspect screening analysis by tandem mass spectra from metabolomics to exposomics. TrAC Trends Anal. Chem. 175, 117699 (2024).
  • 35. Wishart D. S., Tian S., Allen D., Oler E., Peters H., Lui V. W., Gautam V., Djoumbou-Feunang Y., Greiner R., Metz T. O., BioTransformer 3.0—A web server for accurately predicting metabolic transformation products. Nucleic Acids Res. 50, W115–W123 (2022).
  • 36. Weedn V. W., Elizabeth Zaney M., McCord B., Lurie I., Baker A., Fentanyl-related substance scheduling as an effective drug control strategy. J. Forensic Sci. 66, 1186–1200 (2021).
  • 37. Han F., Zhou D.-B., Song W., Hu Y.-Y., Lv Y.-N., Ding L., Zheng P., Jia X.-Y., Zhang L., Deng X.-J., Computational design and synthesis of molecular imprinted polymers for selective solid phase extraction of sulfonylurea herbicides. J. Chromatogr. A 1651, 462321 (2021).
  • 38. Schymanski E. L., Jeon J., Gulde R., Fenner K., Ruff M., Singer H. P., Hollender J., Identifying small molecules via high resolution mass spectrometry: Communicating confidence. Environ. Sci. Technol. 48, 2097–2098 (2014).
  • 39. Valaer A. K., Huber T., Andurkar S. V., Clark C. R., DeRuiter J., Development of a gas chromatographic—Mass spectrometric drug screening method for the n-dealkylated metabolites of fentanyl, sufentanil, and alfentanil. J. Chromatogr. Sci. 35, 461–466 (1997).
  • 40. Petras D., Phelan V. V., Acharya D., Allen A. E., Aron A. T., Bandeira N., Bowen B. P., Belle-Oudry D., Boecker S., Cummings D. A. Jr., GNPS Dashboard: Collaborative exploration of mass spectrometry data in the web browser. Nat. Methods 19, 134–136 (2022).
  • 41. Vogel E. J., Neyra M., Larsen D. A., Zeng T., Target and nontarget screening to support capacity scaling for substance use assessment through a statewide wastewater surveillance network in New York. Environ. Sci. Technol. 58, 8518–8530 (2024).
  • 42. Gushgari A. J., Venkatesan A. K., Chen J., Steele J. C., Halden R. U., Long-term tracking of opioid consumption in two United States cities using wastewater-based epidemiology approach. Water Res. 161, 171–180 (2019).
  • 43. Du P., Zhou Z., Wang Z., Xu Z., Zheng Q., Li X., He J., Li X., Cheng H., Thai P. K., Analysing wastewater to estimate fentanyl and tramadol use in major Chinese cities. Sci. Total Environ. 795, 148838–148844 (2021).
  • 44. Mohanty I., Mannochio-Russo H., Schweer J. V., El Abiead Y., Bittremieux W., Xing S., Schmid R., Zuffa S., Vasquez F., Muti V. B., The underappreciated diversity of bile acid modifications. Cell 187, 1801–1818.e20 (2024).
  • 45. Quinn R. A., Melnik A. V., Vrbanac A., Fu T., Patras K. A., Christy M. P., Bodai Z., Belda-Ferre P., Tripathi A., Chung L. K., Downes M., Welch R. D., Quinn M., Humphrey G., Panitchpakdi M., Weldon K. C., Aksenov A., da Silva R., Avila-Pacheco J., Clish C., Bae S., Mallick H., Franzosa E. A., Lloyd-Price J., Bussell R., Thron T., Nelson A. T., Wang M., Leszczynski E., Vargas F., Gauglitz J. M., Meehan M. J., Gentry E., Arthur T. D., Komor A. C., Poulsen O., Boland B. S., Chang J. T., Sandborn W. J., Lim M., Garg N., Lumeng J. C., Xavier R. J., Kazmierczak B. I., Jain R., Egan M., Rhee K. E., Ferguson D., Raffatellu M., Vlamakis H., Haddad G. G., Siegel D., Huttenhower C., Mazmanian S. K., Evans R. M., Nizet V., Knight R., Dorrestein P. C., Global chemical effects of the microbiome include new bile-acid conjugations. Nature 579, 123–129 (2020).
  • 46. Wang M., Jarmusch A. K., Vargas F., Aksenov A. A., Gauglitz J. M., Weldon K., Petras D., da Silva R., Quinn R., Melnik A. V., van der Hooft J., Caraballo-Rodríguez A. M., Nothias L. F., Aceves C. M., Panitchpakdi M., Brown E., di Ottavio F., Sikora N., Elijah E. O., Labarta-Bajo L., Gentry E. C., Shalapour S., Kyle K. E., Puckett S. P., Watrous J. D., Carpenter C. S., Bouslimani A., Ernst M., Swafford A. D., Zúñiga E. I., Balunas M. J., Klassen J. L., Loomba R., Knight R., Bandeira N., Dorrestein P. C., Mass spectrometry searches using MASST. Nat. Biotechnol. 38, 23–26 (2020).
  • 47. Cancelada L., Torres R. R., Garrafa Luna J., Dorrestein P. C., Aluwihare L. I., Prather K. A., Petras D., Assessment of styrene-divinylbenzene polymer (PPL) solid-phase extraction and non-targeted tandem mass spectrometry for the analysis of xenobiotics in seawater. Limnol. Oceanogr. Methods 20, 89–101 (2022).
  • 48. Guo J., Huan T., Comparison of full-scan, data-dependent, and data-independent acquisition modes in liquid chromatography–mass spectrometry based untargeted metabolomics. Anal. Chem. 92, 8072–8080 (2020).
  • 49. Urbina F., Lentzos F., Invernizzi C., Ekins S., Dual use of artificial-intelligence-powered drug discovery. Nat. Mach. Intell. 4, 189–191 (2022).


Articles from Science Advances are provided here courtesy of American Association for the Advancement of Science
