PreS/MD: Predictor of Sensitization Hazard for Chemical Substances Released From Medical Devices

Vinicius M Alves; Joyce V B Borba; Rodolpho C Braga; Daniel R Korn; Nicole Kleinstreuer; Kevin Causey; Alexander Tropsha; Diego Rua; Eugene N Muratov

doi:10.1093/toxsci/kfac078

2022 Aug 2;189(2):250–259. doi: 10.1093/toxsci/kfac078

PMCID: PMC9516038 PMID: 35916740

Abstract

In the United States, a pre-market regulatory submission for any medical device that comes into contact with either a patient or the clinical practitioner must include an adequate toxicity evaluation of chemical substances that can be released from the device during its intended use. These substances, also referred to as extractables and leachables, must be evaluated for their potential to induce sensitization/allergenicity, which traditionally has been done in animal assays such as the guinea pig maximization test (GPMT). However, advances in basic and applied science are continuously presenting opportunities to employ new approach methodologies, including computational methods which, when qualified, could replace animal testing methods to support regulatory submissions. Herein, we developed a new computational tool for rapid and accurate prediction of the GPMT outcome that we have named PreS/MD (predictor of sensitization for medical devices). To enable model development, we (1) collected, curated, and integrated the largest publicly available dataset for GPMT results; (2) succeeded in developing externally predictive (balanced accuracy of 70%–74% as evaluated by both 5-fold external cross-validation and testing of novel compounds) quantitative structure-activity relationships (QSAR) models for GPMT using machine learning algorithms, including deep learning; and (3) developed a publicly accessible web portal integrating PreS/MD models that can predict GPMT outcomes for any molecule of interest. We expect that PreS/MD will be used by both industry and regulatory scientists in medical device safety assessments and help replace, reduce, or refine the use of animals in toxicity testing. PreS/MD is freely available at https://presmd.mml.unc.edu/.

Keywords: sensitization, GPMT, QSAR, machine learning, deep learning, new approach methods

Sensitization is a toxicological endpoint associated with the ability of an offending chemical to cause or elicit an allergic response in some people following repeated exposures to the allergen (Grundström and Borrebaeck, 2019; ICCVAM, 2018). Traditionally, assessing the sensitization potential for a chemical or material has relied on the use of animal models such as the guinea pig. The guinea pig maximization test (GPMT) (Magnusson and Kligman, 1969) as well as the Buehler test (Buehler, 1965) were historically the predominantly used methods, and they continue to be recommended for certain chemical categories such as medical device materials. Alternative in vivo assays, such as the murine local lymph node assay (LLNA), are also employed to assess skin sensitization. However, more recently, regulatory agencies have been promoting the development of alternative in vitro, in chemico, and in silico methods that could help reduce, refine or replace testing in animals without compromising the acceptable standards for identifying sensitizers (Casati et al., 2018; Daniel et al., 2018; Kleinstreuer et al., 2018). In June 2021, the Organization for Economic Cooperation and Development published the first internationally harmonized guideline on defined approaches that combine multiple alternative methods to predict sensitization (OECD, 2021). The Food and Drug Administration (FDA) has published a Predictive Toxicology Roadmap to outline agency priorities and spur the development as well as evaluation of new approach methods (NAMs) that may enhance the prediction of human responses to substances relevant to FDA-regulated products (FDA, 2020a).

Medical devices encompass a vast array of products intended to treat patients or diagnose diseases or other health-compromising conditions (Bronzino, 2006). FDA’s Center for Devices and Radiological Health (CDRH) is responsible for the regulatory oversight of medical devices (as defined in Section 201(h) of the Food, Drug, and Cosmetic Act [Kramer et al., 2020]) sold in the United States. Medical devices require a premarket biocompatibility assessment described in the Guidance for Industry and FDA Staff on Use of International Standard ISO 10993-1, Biological evaluation of medical devices—Part 1: Evaluation and testing within a risk management process (FDA, 2020b).

Many medical devices, such as implants and glucose meters, contain chemical substances that may leach and cause toxicity (Hansel et al., 2020; Herman et al., 2019, 2020). Depending on the type and the duration of the contact with the body, a device may be required to evaluate certain biocompatibility endpoints, including the potential to produce localized sensitization responses (Reeve and Baldrick, 2017). Premarket submissions for medical devices address sensitization potential with data gathered primarily with the GPMT or Buehler tests as recommended by the International Organization for Standardization (ISO) standard 10993 Part 10 (FDA, 2020b). The GPMT, which includes intradermal injection at induction, is recommended to evaluate implantable medical devices and devices made from novel materials.

In the last several years, both our group (Alves et al., 2018a; Borba et al., 2021, 2022; Braga et al., 2017) and others (Roberts et al., 2017; Toropova and Toropov, 2017) have developed computational models using LLNA data as a reference for predicting the sensitizing activity of chemicals. To support the assessment of the sensitization potential of chemical substances that can be released from medical devices and to help reduce in vivo animal testing, we embarked on the development of a unique open-source computational tool and web app that we named PreS/MD (predictor of sensitization for medical devices). To achieve this goal, we (1) collected, curated, and integrated the largest publicly available dataset for GPMT results; (2) developed and externally validated quantitative structure-activity relationship (QSAR) models to predict GPMT outcomes; and (3) incorporated the GPMT models into the PreS/MD web portal to help evaluate the sensitization potential of medical device extractables and leachables.

MATERIALS AND METHODS

The workflow employed in the study is depicted in Figure 1.

Data Collection and Curation

European Chemical Agency dataset

Experimental animal data from sensitization evaluations using the GPMT as recommended by the OECD Test Guideline No. 406 were retrieved from the European Chemical Agency (ECHA) study results database (https://iuclid6.echa.europa.eu/reach-study-results). The collected raw data had numerous problems. For instance, many numerical data were represented as string variables, the units of measurement were not standardized across the datasets, and there were many “free text” data. Therefore, we extensively cleaned and standardized all the data and converted measurements to the same units in each dataset. We also used regular expressions to find essential features for the database that were described in text format; this was key to classifying endpoints into GHS hazard categories, followed by additional data processing steps. Specifically, after removing data labeled as “unreliable” by ECHA and non-modelable compounds (eg, mixtures, inorganics), 934 of the original molecules were kept. Among 18 replicate chemical pairs in the dataset, biological annotations for 17 of them were concordant and 1 was discordant, that is, the duplicate compounds had different annotated classifications (sensitizer vs non-sensitizer). All the discordant replicates and one of each concordant replicate were removed. The final dataset comprised 920 unique chemical compounds, including 230 sensitizers and 690 non-sensitizers based on GPMT results.
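The kind of free-text cleanup described above can be sketched as follows. This is a minimal illustration, not the authors' scripts: the record wording, the regular expressions, and the unit table (`UNIT_TO_PERCENT`, which assumes w/v concentrations) are hypothetical and do not reflect the actual ECHA field schema.

```python
import re

# Hypothetical conversion factors to percent (w/v); 1 mg/mL = 1 g/L = 0.1 %.
UNIT_TO_PERCENT = {"%": 1.0, "mg/ml": 0.1, "g/l": 0.1}

# Illustrative patterns only; real study records need a much richer vocabulary.
CONC_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(%|mg/ml|g/l)", re.IGNORECASE)
NEGATIVE_RE = re.compile(r"\b(not|non)[\s-]*sensiti[sz]", re.IGNORECASE)
POSITIVE_RE = re.compile(r"\bsensiti[sz]", re.IGNORECASE)


def parse_concentration(text):
    """Return the first concentration found in free text, in percent, or None."""
    match = CONC_RE.search(text)
    if match is None:
        return None
    # Numbers stored as strings may use a decimal comma.
    value = float(match.group(1).replace(",", "."))
    return value * UNIT_TO_PERCENT[match.group(2).lower()]


def parse_outcome(text):
    """Map a free-text conclusion to 1 (sensitizer), 0 (non-sensitizer), or None."""
    # Negative wording is checked first because it also contains "sensiti...".
    if NEGATIVE_RE.search(text):
        return 0
    if POSITIVE_RE.search(text):
        return 1
    return None
```

Records for which either parser returns `None` would be left for manual review rather than guessed.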

Data from the literature

In addition to the data from the ECHA study results database, we also collected GPMT sensitization experimental data from the scientific literature (Devillers, 2000; Fedorowicz et al., 2005; Golla et al., 2009; Tomlinson et al., 2009; Yuan et al., 2009). After removing mixtures, inorganics, and counter ions, 701 out of the original 745 data points were kept. Only one pair of duplicates showed biological annotation disagreement among 221 chemicals with more than one data point in the dataset. Subsequently, the discordant replicates were removed and only one data point for each concordant replicate was kept. Thus, the final dataset had 365 unique chemical compounds, including 198 sensitizers and 167 non-sensitizers.

Combined GPMT data from ECHA and the literature

We merged the curated data from ECHA and the research literature and examined the content of this combined data. There were 41 pairs of duplicates between these 2 datasets, and the sensitization potential of only 6 of these pairs was annotated differently. These discordant records were removed, and only one record for each concordant pair of duplicates was kept. The merged dataset had 1238 unique compounds, including 411 sensitizers and 827 non-sensitizers. The curated data are available in Supplementary Table 1 and via both the PreS/MD portal and the Integrated Chemical Environment (https://ice.ntp.niehs.nih.gov/).

Additional test set

An additional literature search conducted after the model was developed identified 9 new compounds with GPMT data that were not part of the training set used for model development. The chemical structures were standardized and used as an additional test set (Supplementary Table 2).

Extractables and leachables set

We collected 474 compounds from the Extractables and Leachables Safety Information Exchange (ELSIE) Database (https://www.elsiedata.org/) that was publicly available on March 6, 2019. After the removal of inorganics, mixtures, and duplicates, 415 compounds remained. We found that 102 compounds were already present on our GPMT list (Supplementary Table 1). The remaining 313 unique compounds were used for virtual screening (Supplementary Table 3).

Data Curation

Datasets were thoroughly curated following the workflows developed by us earlier (Fourches et al., 2016). First, we performed chemical structure curation and removed mixtures, inorganics, and organometallic compounds, neutralized salts, normalized the specific chemotypes, and applied the special treatment to chemicals with multiple replicated records as follows: (1) when replicated records presented the same binary outcome, only one record was kept; (2) when a majority of replicated records presented the same binary outcome and one had different binary outcome, only one record with the majority binary outcome was kept; and (3) when binary replicated records had different outcomes, all of them were removed. In the ECHA dataset, only one pair of compounds out of 18 duplicate chemicals had discordant annotations. The data collected from the literature had only one pair of duplicates with discordant annotations among 221 chemicals. Finally, there were 41 pairs of replicates between these 2 datasets, and the sensitization potential was different for only 6 of these pairs. All the curated data are available in Supplementary Material.
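The three replicate-handling rules can be written compactly as below. This is a sketch under stated assumptions: records are `(structure_key, outcome)` pairs with a hypothetical standardized structure key, and rule (2) is generalized to any strict majority, which the text does not state explicitly.

```python
from collections import Counter, defaultdict


def resolve_replicates(records):
    """Collapse (structure_key, outcome) records to one outcome per structure.

    outcome is 1 (sensitizer) or 0 (non-sensitizer). Structures whose
    replicates tie, for example a discordant pair, are dropped entirely.
    """
    by_structure = defaultdict(list)
    for key, outcome in records:
        by_structure[key].append(outcome)

    resolved = {}
    for key, outcomes in by_structure.items():
        counts = Counter(outcomes)
        if len(counts) == 1:
            # Rule 1: all replicates agree, keep a single record.
            resolved[key] = outcomes[0]
        else:
            (top, n_top), (_, n_second) = counts.most_common(2)
            if n_top > n_second:
                # Rule 2: keep one record carrying the majority outcome.
                resolved[key] = top
            # Rule 3: a tie means the structure is removed.
    return resolved
```

For example, a structure reported twice as a sensitizer and once as a non-sensitizer is kept as a sensitizer, whereas a structure with one record of each outcome is removed.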

Analysis of Chemical Fragments

We collected a list of the structural alerts historically related to skin sensitization from OCHEM ToxAlerts (Sushko et al., 2012). The list of fragments represented as SMARTS was used to query the curated GPMT dataset to analyze the presence of these fragments in both sensitizers and non-sensitizers.

QSAR Modeling

The modelability index (MODI) (Golbraikh et al., 2014) was calculated to estimate the feasibility of obtaining predictive QSAR models. We developed our models following the best practices of QSAR modeling (Cherkasov et al., 2014). The models were developed using open-source chemical descriptors based on ECFP4-like circular fingerprints with 2048 bits and an atom radius of 2 (Morgan2) calculated in RDKit (http://www.rdkit.org). Machine learning approaches included support vector machine (SVM) (Vapnik, 2000), random forest (RF) (Breiman, 2001), and light gradient boosting machines (lightGBM) algorithms implemented in Scikit-learn (Pedregosa et al., 2012). All models were optimized using a Bayesian approach implemented in Scikit-Optimize v.0.7.4. The details of hyperparameters explored in this work are available in the Supplementary Material. The Bayesian optimization is defined as follows (equation 1):

$$P(f \mid D_{1:t}) \propto P(D_{1:t} \mid f)\, P(f) \tag{1}$$

where $x_i$ is the $i$th sample and $f(x_i)$ is the observation of the objective function at $x_i$. The observations $D_{1:t} = \{x_{1:t}, f(x_{1:t})\}$ are accumulated. The prior distribution $P(f)$ is combined with the likelihood function $P(D_{1:t} \mid f)$ of observing $D_{1:t}$ given the model $f$; their product is proportional to the posterior $P(f \mid D_{1:t})$. In doing so, Bayesian optimization finds hyperparameters that maximize the objective function (G-mean score) by building a surrogate function (probabilistic model) based on past evaluations of the objective (Wu et al., 2019). The geometric (G)-mean was selected as the scorer because it measures the balance between classification performances on both the majority (nontoxic) and minority (toxic) classes.
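To illustrate why the G-mean suits an imbalanced dataset, the sketch below uses its usual binary definition, the square root of sensitivity times specificity. The definition is an assumption here, since the text does not spell out the formula, and the labels are made up.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels (1 = toxic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def accuracy(y_true, y_pred):
    """Plain fraction of correct predictions, shown for comparison."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
```

A classifier that labels every compound as nontoxic in a set with 2 toxic and 6 nontoxic compounds reaches an accuracy of 0.75 but a G-mean of 0, so an optimizer scored on G-mean cannot favor hyperparameters that ignore the minority class.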

The QSAR models employing deep learning were developed using Keras (https://keras.io/), a deep learning library, and Tensorflow (www.tensorflow.org), a flexible architecture that allows the deployment of calculations to desktops or servers, as backend. In addition, the following parameters of the deep learning method were optimized before model training: layer type (dense), hidden layers (3), activation function (ReLU), output layer function (sigmoid), model optimizer (Adam), and loss function (binary cross-entropy). The following hyperparameters were utilized for further deep learning training: epochs (5, 10, 50, 100) and batch size (10, 20, 40, 60, 80, 100).

The predictivity of the models was assessed using the metrics shown in equations 2–7:

Balanced accuracy:

$$\text{Balanced accuracy} = \frac{\text{sensitivity} + \text{specificity}}{2} \tag{2}$$

Sensitivity:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{3}$$

Specificity:

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{4}$$

Positive predictive value (PPV):

$$\text{PPV} = \frac{TP}{TP + FP} \tag{5}$$

Negative predictive value (NPV):

$$\text{NPV} = \frac{TN}{TN + FN} \tag{6}$$

Kappa:

$$\text{Kappa} = \frac{2 \times (TP \times TN - FN \times FP)}{(TP + FP) \times (FP + TN) + (TP + FN) \times (FN + TN)} \tag{7}$$

where TP are the true positives, FP are the false positives, TN are the true negatives, and FN are the false negatives.
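The metrics above can be computed directly from the four confusion-matrix counts. The sketch below uses the standard closed form of Cohen's kappa for a 2×2 table, in which both denominator terms are products, and cross-checks it against the definition based on observed and chance agreement. The function names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Balanced accuracy, sensitivity, specificity, PPV, NPV, and kappa."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    kappa = 2 * (tp * tn - fn * fp) / (
        (tp + fp) * (fp + tn) + (tp + fn) * (fn + tn)
    )
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": kappa,
    }


def cohen_kappa(tp, fp, tn, fn):
    """Kappa from its definition: (observed - chance agreement) / (1 - chance)."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return (observed - chance) / (1 - chance)
```

For any set of counts the two kappa computations should agree to floating-point precision, which is a quick way to catch a mistyped term in the closed form.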

We followed an external 5-fold cross-validation procedure. First, the entire dataset is split into 5 parts of the same size. Then, for each iteration, one of these subsets (20% of compounds) is used as a test set, whereas the other 4 sets (80% of compounds) are used collectively as the training set. We repeat this procedure 5 times until each of the 5 subsets is used once as a test set. In addition, each training set is internally divided into multiple training and validation sets for model training and hyperparameter tuning. The models are generated using only the training set. The true test sets are never employed to generate or select the models. We repeated this procedure for each model. The final statistics are based on the consensus (average prediction) of these models. The consensus model considers the majority rule (at least 2 out of 3) for the final classification. In every case, only the modeling set was used to develop the models, whereas the external sets were used for the evaluation of their predictive power. In addition, 10 rounds of Y-randomization were performed for each dataset to assure that the model performance was not due to chance correlations.
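The external 5-fold procedure and the majority-rule consensus can be sketched as follows. The assumptions are that the split is random and unstratified (the text does not say whether folds were stratified) and that fold sizes may differ by one compound when the dataset size is not divisible by 5. Model training and the internal validation split are omitted.

```python
import random


def five_fold_splits(n_compounds, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) so each compound is tested exactly once."""
    indices = list(range(n_compounds))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled list gives folds whose sizes differ by at most 1.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def majority_vote(predictions):
    """Consensus class from the binary predictions of an odd number of models."""
    return int(sum(predictions) * 2 > len(predictions))
```

With three models, `majority_vote([1, 1, 0])` returns 1 and `majority_vote([0, 1, 0])` returns 0, which corresponds to the "at least 2 out of 3" rule.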

According to OECD rules for QSAR modeling acceptance for regulatory purposes (OECD, 2007) and for the additional estimation of the reliability of predictions, we also characterized the applicability domain (AD) of developed models (Alves et al., 2021). In general, AD reflects whether a new compound is sufficiently similar to the training set to obtain reliable predictions (Tropsha, 2010), although some exceptions with very accurate predictions outside of AD and vice versa are also possible (Capuzzi et al., 2017). Here, AD of the developed models was estimated using the z-cut-off method (Tropsha and Golbraikh, 2007), based on suitability as determined in our previous toxicity studies (Alves et al., 2021; Kuz’min et al., 2008). Following this approach, AD of the models was calculated as D_cut-off = ⟨D⟩ + Zs, where Z is a similarity threshold parameter defined by a user (0.5 in this study) and ⟨D⟩ and s are the average and SD, respectively, of all dice similarities in the multidimensional descriptor space between each compound and its nearest neighbors for all compounds in the training set. The AD defined the coverage of the models, that is, the stricter the AD threshold, the smaller the coverage and the higher the confidence in the accuracy of the prediction for the compounds that remained within AD. In the PreS/MD web app, the user can visualize the similarity distribution of the training set and how far the query compound is from the threshold. If the query compound is below the threshold, then it is outside the model’s AD and vice versa. It is important to emphasize that the AD is purely a characteristic of a computational model and in this case should not be interpreted to refer to the medical device AD.
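A minimal sketch of the threshold computation, following the description above literally: similarities are Dice coefficients, and a query whose similarity falls below the threshold is outside the AD. It assumes that fingerprints are given as sets of on-bit indices, that a single nearest neighbor is used per compound, and that s is the sample SD; none of these details is fixed by the text.

```python
import statistics


def dice_similarity(bits_a, bits_b):
    """Dice similarity between two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


def nearest_neighbor_similarity(query, others):
    """Similarity of the query to its most similar compound in `others`."""
    return max(dice_similarity(query, other) for other in others)


def ad_threshold(training_set, z=0.5):
    """D_cut-off = <D> + Z*s over nearest-neighbor similarities in the training set."""
    nn = [
        nearest_neighbor_similarity(fp, training_set[:i] + training_set[i + 1:])
        for i, fp in enumerate(training_set)
    ]
    return statistics.mean(nn) + z * statistics.stdev(nn)


def inside_ad(query, training_set, z=0.5):
    """True when the query is at least as similar to the training set as the cut-off."""
    similarity = nearest_neighbor_similarity(query, training_set)
    return similarity >= ad_threshold(training_set, z)
```

Raising Z raises the cut-off, so fewer query compounds fall inside the AD, which is the coverage versus confidence trade-off described above.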

Mechanistic Interpretation of QSAR Models

Maps of predicted fragment contribution (Riniker and Landrum, 2013) were generated from the QSAR models to help identify and visualize the substructure(s) predicted to significantly contribute to the sensitization potential. Here, the contribution of an atom is estimated by a contribution difference obtained when the associated bits in the fingerprint corresponding to the atom are removed. Then, the normalized contributions were used to color code the atoms in a topography-like map, in which green indicates a negative contribution to toxicity (ie, sensitization hazard reduces when the atom is absent) and magenta indicates a positive contribution to toxicity (ie, sensitization hazard increases when the atom is present) (Riniker and Landrum, 2013).
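The bit-removal idea can be illustrated with a toy example. Everything here is hypothetical: `toy_predict` is a simple weighted sum standing in for a trained QSAR model, and the atom-to-bit mapping, which in practice would come from the fingerprint generator, is made up.

```python
def atom_contributions(on_bits, atom_to_bits, predict):
    """Contribution of each atom: prediction with all bits minus the
    prediction obtained after removing the bits that the atom sets."""
    baseline = predict(on_bits)
    return {
        atom: baseline - predict(on_bits - bits)
        for atom, bits in atom_to_bits.items()
    }


def normalize(contributions):
    """Scale contributions to [-1, 1] by the largest absolute value."""
    scale = max(abs(v) for v in contributions.values()) or 1.0
    return {atom: v / scale for atom, v in contributions.items()}


# Toy stand-in for a model: a weighted sum over the fingerprint's on-bits.
TOY_WEIGHTS = {3: 0.4, 7: -0.2, 9: 0.1}


def toy_predict(bits):
    return sum(TOY_WEIGHTS.get(b, 0.0) for b in bits)


on_bits = {3, 7, 9}
atom_to_bits = {0: {3}, 1: {7}, 2: {3, 9}}  # hypothetical atom index -> bits it sets
normalized = normalize(atom_contributions(on_bits, atom_to_bits, toy_predict))
```

In this toy case atoms 0 and 2 receive positive normalized values and atom 1 a negative one; in the maps, positive values would be drawn in magenta and negative values in green.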

Model Implementation

The PreS/MD web app was implemented on an Ubuntu Server. The app is coded using Flask (http://flask.pocoo.org), uWSGI (https://uwsgi-docs.readthedocs.org), Nginx (http://nginx.org), Python (https://www.python.org), RDKit (http://www.rdkit.org), scikit-learn (http://scikit-learn.org), and JavaScript (http://www.ecma-international.org). PreS/MD also includes the JSME molecule editor written in JavaScript (Bienfait and Ertl, 2013) supported by most popular web browsers. Java or Flash plugins are not required to use the app.

RESULTS AND DISCUSSIONS

Analysis of Chemical Fragments

Structural alerts are chemical fragments associated with a particular toxicity outcome (Blagg, 2010). Although these alerts may be essential to explain the mechanism of toxicity of a chemical, they do not act in isolation from their chemical environment. For this reason, the blind prediction of chemical toxicity solely based on the presence/absence of structural alerts often leads to a high number of false positives in virtual screens (Alves et al., 2016b). We downloaded 92 structural alerts from OCHEM ToxAlerts and analyzed their distribution in the GPMT dataset. The complete list of structural alerts for the entire dataset and the summary of the presence of alerts in sensitizers and non-sensitizers are available in Supplementary Tables 4 and 5. Table 1 shows the top 10 structural alerts with the highest positive and negative delta (number of sensitizers minus number of non-sensitizers containing the alert). Phenyl esters, activated alkenes (Michael acceptors), and epoxides had the highest positive delta. Among the alerts in Table 1, only iso(thio)cyanates (n = 12) and isocyanates (n = 10) were present uniquely in sensitizers. Most alerts (n = 54) were present in fewer than 10 sensitizers (Supplementary Table 4). Eleven alerts were present only in non-sensitizers (Supplementary Table 4). Nineteen alerts that were present in sensitizers were absent in non-sensitizers. Of these, 15 were present in fewer than 5 sensitizers. Acylating agents, aromatic amines, and ketones were the alerts with the most negative delta, that is, they were much more frequent in non-sensitizers than in sensitizers. This analysis reinforces our previous findings (Alves et al., 2016b) that structural alerts are often disproportionately oversensitive, flagging safe compounds as toxic. They may be important for understanding the mechanism of toxicity of a chemical, but because they do not act independently of the rest of the chemical structure, they should not be used alone for toxicity prediction.

Table 1.

Analysis of Structural Alerts Showing the Top 10 Structural Alerts With Higher Positive and Negative Delta (Difference in Sensitizers vs Non-Sensitizers)

Structural Alert	Non-Sensitizer	Sensitizer	Delta	Total
Phenyl esters	3	25	22	28
Activated alkene (Michael acceptor)	42	62	20	104
Epoxides	1	16	15	17
Aldehydes	17	29	12	46
Iso(thio)cyanates	0	12	12	12
Esters of aromatic alcohols and their thio and aza analogues	37	47	10	84
Isocyanates	0	10	10	10
Ortho-disubstituted benzenes	22	31	9	53
1,2-Dihydroxy aromatic compounds	3	12	9	15
α-Naphthols	9	3	−6	12
α, β-Unsaturated carbonyl and thiocarbonyl compounds	12	5	−7	17
Acid anhydrides and their thio and aza analogues	20	12	−8	32
Aromatic amines precursors	31	22	−9	53
Precursors of aldehydes and ketones	11	2	−9	13
Aldehydes and precursors	35	25	−10	60
Quaternary ammonium cation	16	1	−15	17
Acylating agents	41	20	−21	61
Aromatic amines	92	62	−30	154
Ketones	57	26	−31	83

QSAR Models for Predicting Sensitization Using GPMT Data

At project inception, we also sought Buehler test data. Because the test material is administered differently in the Buehler test than in the GPMT, we presumed that pooling the data would introduce a confounder to the detriment of our model. Therefore, we decided to model the Buehler data separately from the GPMT data. However, we identified only 21 sensitizers among 224 compounds with Buehler data, which was not enough to develop a predictive QSAR model. For this reason, we decided to use only GPMT data (described in the Materials and Methods section).

High values of MODI (≥0.7), a parameter employed to estimate the feasibility of obtaining predictive models, allowed us to expect that robust and predictive QSAR models could be developed for this dataset. The statistical characteristics of the sensitization models built and validated using GPMT data are shown in Table 2. The machine learning models built using RF, SVM, lightGBM, and deep learning predicted the test sets with balanced accuracies of 69%, 71%, 66%, and 72%, respectively. Despite these balanced accuracies, only the deep learning model presented an acceptable PPV (>0.60). The low PPV combined with a high NPV means that these models are more prone to false positives than to false negatives. When compounds are predicted to be non-sensitizers, the confidence in this prediction is as high as 0.80–0.84. Therefore, these models are well suited to identifying safer compounds, and only positive predictions should be suggested for additional testing.

Table 2.

Statistical Characteristics of QSAR Models Developed for GPMT Based on the External Test Sets

Model	Balanced Accuracy	Sensitivity	Specificity	PPV	NPV	Kappa
RF	0.69	0.69	0.69	0.53	0.82	0.35
SVM	0.71	0.73	0.69	0.54	0.84	0.39
LightGBM	0.66	0.66	0.66	0.55	0.80	0.30
Deep learning	0.72	0.63	0.80	0.61	0.81	0.43

PreS/MD Usability

PreS/MD has an intuitive user interface (Figure 2). The user may draw a molecule of interest or directly paste the SMILES string of the query chemical structure into the “molecular editor” box. After hitting the “Predict” button, the user will receive the predicted sensitization potential based on a training set of GPMT results. These predictions are accompanied by several parameters, including the respective confidence value, the AD, and the maps of predicted fragment contributions. The confidence quantifies the challenge in decision-making for each prediction. In addition, it diagnoses when the model becomes less effective because of data distribution shifts. The confidence score is a number between 0 and 100 that expresses the confidence in the predicted class. Here, we used the confidence-aware learning algorithm (Moon et al., 2020), which estimates the probability that a compound is correctly predicted from the frequency with which the internal models predict it correctly. In particular, a higher confidence score means a higher probability that the predicted class is correct.

Case Studies

As an example of a practical application, we used PreS/MD to predict the sensitization potential of 9 chemical substances released from medical devices. We identified these compounds through a literature search after our models were built, so they were not included in the training dataset. The results of prediction are shown in Table 3. PreS/MD correctly predicted 6 out of 9 compounds (balanced accuracy of 65%, sensitivity of 80%, specificity of 50%, PPV of 66%, and NPV of 66%). Although the evaluation of these 9 compounds showed low specificity, the NPV indicates that the probability of a predicted non-sensitizer being a true non-sensitizer is 66%. Three compounds were outside the AD: 1,2-dibromo-2,4-dicyanobutane, 2-methyl-3(2H)-isothiazolone, and 4,5-dichloro-2-methyl-4-isothiazolin-3-one. When considering the AD, 5 out of 6 compounds inside the AD were correctly predicted. Among these, only ethanol, a sensitizer in GPMT, was mispredicted as a non-sensitizer. Indeed, ethanol appears to be a sensitizer in a small fraction of humans (Lachenmeier, 2008), whereas its use as a vehicle in LLNA has been questioned (Betts et al., 2007). Of the 3 compounds outside the AD, 2 were mispredicted. These examples reinforce the importance of taking into consideration the AD output when making sensitization hazard predictions with PreS/MD for compounds not seen by the model.

Table 3.

Experimental Activity and Predictions for Case Study Chemicals

Ingredient	GPMT	PreS/MD	Confidence (%)	AD
Abietic acid	Sensitizer	Sensitizer	99.4	Inside
Ethanol	Sensitizer	Non-sensitizer	99.5	Inside
Eugenol	Sensitizer	Sensitizer	97.8	Inside
Geraniol	Non-sensitizer	Non-sensitizer	98.3	Inside
Methylparaben	Non-sensitizer	Non-sensitizer	96.1	Inside
Sulfanilic acid	Sensitizer	Sensitizer	94.3	Inside
1,2-Dibromo-2,4-dicyanobutane	Non-sensitizer	Sensitizer	98.5	Outside
2-Methyl-3(2H)-isothiazolone	Sensitizer	Sensitizer	99.6	Outside
4,5-Dichloro-2-methyl-4-isothiazolin-3-one	Non-sensitizer	Sensitizer	99.5	Outside
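The summary statistics quoted for the case study can be recomputed from the rows of Table 3. The sketch below transcribes the table, with 1 for sensitizer and 0 for non-sensitizer, and tallies the confusion matrix overall and for the compounds inside the AD. Variable names are illustrative.

```python
# (compound, GPMT outcome, PreS/MD prediction, inside AD) transcribed from Table 3.
CASE_STUDY = [
    ("Abietic acid", 1, 1, True),
    ("Ethanol", 1, 0, True),
    ("Eugenol", 1, 1, True),
    ("Geraniol", 0, 0, True),
    ("Methylparaben", 0, 0, True),
    ("Sulfanilic acid", 1, 1, True),
    ("1,2-Dibromo-2,4-dicyanobutane", 0, 1, False),
    ("2-Methyl-3(2H)-isothiazolone", 1, 1, False),
    ("4,5-Dichloro-2-methyl-4-isothiazolin-3-one", 0, 1, False),
]


def confusion(rows):
    """Return (TP, FP, TN, FN) with the GPMT outcome as the reference."""
    tp = sum(1 for _, y, p, _ in rows if y == 1 and p == 1)
    fp = sum(1 for _, y, p, _ in rows if y == 0 and p == 1)
    tn = sum(1 for _, y, p, _ in rows if y == 0 and p == 0)
    fn = sum(1 for _, y, p, _ in rows if y == 1 and p == 0)
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion(CASE_STUDY)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (sensitivity + specificity) / 2
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

inside = [row for row in CASE_STUDY if row[3]]
correct_inside = sum(1 for _, y, p, _ in inside if y == p)
```

The tally gives TP = 4, FP = 2, TN = 2, and FN = 1, with 5 of the 6 compounds inside the AD predicted correctly.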

In addition to these 9 compounds with GPMT data, we exploited our models to predict the ELSIE database containing 474 chemical substances known to be released from medical devices. After removing inorganics, mixtures, and duplicates, 415 compounds remained, and we found that 102 compounds were present in our curated GPMT list. Out of the 313 remaining compounds, our models predicted 98 compounds as sensitizers in the GPMT assay and 215 as non-sensitizers assuming that they were tested above their corresponding sensitization threshold.

The Use of GPMT to Predict Human Sensitization

Previously, we analyzed the correlation of LLNA and human skin sensitization data to understand how valuable the animal model is for informing risk/hazard assessment (Alves et al., 2016a). As GPMT is still being used to check the sensitization potential of extractables and leachables from medical devices (FDA, 2020b), we decided to conduct a similar analysis to one we previously reported (Alves et al., 2016a) and compare the overlap between our 1238 compounds with GPMT data and 138 compounds with human data (Alves et al., 2018a). As seen in Table 4, 81 compounds were both tested in GPMT and had human clinical data. In total, 46 compounds were sensitizers in both tests and 21 compounds were classified as non-sensitizers in both tests, whereas 14 disagreed in classification (see Supplementary Table 5). Therefore, our analysis has shown that the accuracy of using GPMT to predict human sensitization is estimated to have a balanced accuracy of 82%, sensitivity of 84%, PPV of 90%, specificity of 80%, and NPV of 70%. A previous analysis found that GPMT had a sensitivity of 70% and specificity of 100% (Haneke et al., 2001). However, the dataset analyzed by Haneke et al. was smaller, with 57 chemicals, and imbalanced, with only 3 non-sensitizers.

Table 4.

Comparison of Sensitization Profile of GPMT and Human Clinical Data

	Human
GPMT	Sensitizer	Non-Sensitizer	Total
Sensitizer	46	9	55
Non-sensitizer	5	21	26
Total	51	30	81
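As an arithmetic check, the four ratios that can be formed from the counts in Table 4 are worked out below. Which ratio is called sensitivity or PPV depends on which outcome is taken as the reference. Under equations 3–6 with the human outcome as the reference, the column-wise pair corresponds to sensitivity and specificity and the row-wise pair to PPV and NPV; with the GPMT outcome as the reference, the roles are exchanged. The variable names are illustrative.

```python
# Counts from Table 4 (rows: GPMT outcome, columns: human outcome).
both_pos, gpmt_pos_human_neg = 46, 9
gpmt_neg_human_pos, both_neg = 5, 21
total = both_pos + gpmt_pos_human_neg + gpmt_neg_human_pos + both_neg

# Column-wise fractions (human outcome in the denominator).
human_pos_found = both_pos / (both_pos + gpmt_neg_human_pos)      # 46/51
human_neg_found = both_neg / (both_neg + gpmt_pos_human_neg)      # 21/30

# Row-wise fractions (GPMT outcome in the denominator).
gpmt_pos_confirmed = both_pos / (both_pos + gpmt_pos_human_neg)   # 46/55
gpmt_neg_confirmed = both_neg / (both_neg + gpmt_neg_human_pos)   # 21/26

overall_agreement = (both_pos + both_neg) / total                 # 67/81
```

Rounded to whole percentages, the column-wise fractions are 90% and 70%, the row-wise fractions are 84% and 81%, and the overall agreement is 83%.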

The variability of the GPMT has been documented as dependent on the total number of animals, dosage, and grade patterns of the sensitization response considered in the test (Andersen et al., 1995). Within the extensive data collected and curated in this work, GPMT data showed high reproducibility. In the ECHA dataset, only one pair of compounds out of 18 duplicate chemicals had discordant annotations. The data collected from the literature had only one pair of duplicates with discordant annotations among 221 chemicals. Finally, there were 41 pairs of replicates between these 2 datasets, and the sensitization potential was different for only 6 of these pairs.

In our previous analysis (Alves et al., 2018a), we found the accuracy of LLNA to predict human skin sensitization was estimated to have a balanced accuracy of 68%, sensitivity of 84%, and specificity of 52%, which is consistent with the results from the reference data collected to support the OECD GL 497 project (OECD, 2021). The low specificity means that LLNA is oversensitive to predict human skin sensitization, that is, more compounds tend to be skin sensitizers in mice than in humans. Here, our analysis shows that GPMT results have relatively higher concordance with human data, with specificity as high as 75%, as compared with LLNA.

An Alternative to Animal Testing to Assess Sensitization for Medical Devices

The GPMT was first published in 1969 (Magnusson and Kligman, 1969) and was considered the preferred animal method to assess sensitization caused by chemicals for decades. In 1989, the LLNA was first described (Kimber and Weisenberger, 1989). Since then, it underwent multiple evaluations and refinements, becoming the preferred animal testing for skin sensitization after the publication of OECD Testing Guideline No. 429 (OECD, 2010). However, international standards (ISO 10993 Part 10) (ISO, 2010) still recommend GPMT for the evaluation of chemical substances released from medical devices (Daniel et al., 2018).

Recently, a study evaluated the sensitization potential of chemicals present in medical devices using a combination of in chemico (DPRA) and in vitro (LuSens) methods in comparison with the LLNA method and suggested a testing strategy for the safety assessment of medical device extracts (Svobodová et al., 2021). The authors reported an overall concordance of 63.9%–82.5% between LLNA and DPRA and 80%–85.4% between LLNA and LuSens. Unfortunately, sensitivity and specificity for this analysis were not reported. The results shown in Table 4 reveal a high concordance between GPMT and human data. Previously, we found that LLNA tends to be oversensitive compared with the human response (Alves et al., 2016a, 2018a). Although GPMT shows a higher concordance with human data than the LLNA, it is essential to note that GPMT requires the sacrifice of several animals for each tested chemical (Coleman et al., 2015).

QSAR models developed in this study and implemented in the PreS/MD web app showed 66%–72% balanced accuracy. Although our analysis of replicates identified only 6 discordant results among 41 replicated entries, a previous study has shown that dose, number of animals, and response pattern may influence the outcome of the assay as interpreted by a toxicologist. Therefore, considering the absence of state-of-the-art predictors of GPMT as well as the variability of the assay, models such as ours can help minimize the use of GPMT and/or confirm potential shortcomings of the GPMT in identifying a human sensitization hazard. Moreover, because the GPMT demonstrated higher concordance with human data than the LLNA for this dataset, we suggest that QSAR models based on GPMT data are more appropriate than QSAR models based on LLNA data to predict the human response to chemicals such as medical device extractables and leachables.
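To put the replicate analysis on the same percentage scale as the model performance, the counts quoted above can be converted into rates. This is only an illustrative sketch: it assumes that each of the 41 replicated entries is counted once as either concordant or discordant, and the variable names are hypothetical.

```python
# Counts quoted in the text for the GPMT replicate analysis.
discordant_entries = 6
replicated_entries = 41

# Assumption: every replicated entry is either concordant or discordant.
discordance_rate = discordant_entries / replicated_entries
concordance_rate = 1.0 - discordance_rate

print(f"Discordant replicated entries: {discordance_rate:.1%}")
print(f"Concordant replicated entries: {concordance_rate:.1%}")
```

The concordance among replicates gives a rough sense of the reproducibility of the assay itself, which is useful context when judging the balanced accuracy of a model trained on the same data.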

CONCLUSIONS

Previously, our group developed the first QSAR model for skin sensitization based on human data (Alves et al., 2016a). Later, we employed an innovative approach that combined human data, LLNA data, and data from 3 validated nonanimal assays within a Bayesian model to predict the human response (Alves et al., 2018a). This integrative Bayesian model showed higher accuracy in predicting the human response than the model built using only human data. These models were implemented in a newer version of the Pred-Skin web app (Borba et al., 2021). Since the publication of the OECD Testing Guideline No. 429 (OECD, 2010), LLNA has been regarded as the preferred animal test for evaluating skin sensitization. However, the GPMT is still recommended by ISO standards for the nonclinical study of sensitization from medical devices. For this reason, we decided to develop a separate sensitization web application focusing on the safety evaluation of these devices (Haneke et al., 2001).

The application of predictive computational models such as the one reported here represents a step forward in characterizing the level of concern for sensitization potential among chemical substances released from medical devices. The model presented is essentially a hazard assessment tool, and its predictions do not pre-empt the regulatory conclusions that are made on the basis of experimental data. However, CDRH’s Medical Device Development Tools (MDDT) program (FDA, 2022) provides a systematized pathway to qualify tools, such as PreS/MD, within a given context of use in the evaluation of medical devices, and the achievements of this program are in line with FDA’s Predictive Toxicology Roadmap. We plan to submit PreS/MD for MDDT qualification and show scientific support for the use of this nonanimal method to advance the evaluation approach during device development. For example, a medical device developer could use PreS/MD to screen available chemical options before manufacturing their innovative device and select ones that do not present a patient sensitization hazard. In the early phase of device development, the experimental sensitization screening for any chemical (eg, plasticizers, additives, fillers, and mold release agents) that could potentially leach out from a final finished device could be guided by the open-source PreS/MD.

Because no experimental assay is perfect, the binary determination of sensitization hazard as evaluated by GPMT can certainly benefit from PreS/MD predictions in a weight-of-evidence approach and help protect patients from the mentioned hazard. For this reason, PreS/MD predictions can be combined with Pred-Skin (Borba et al., 2021) as this will incorporate structural alerts for human skin sensitizers. The weight-of-evidence approach offered by these predictive computational models will increase confidence in GPMT outcomes for chemical substances that are not soluble in the standard recommended device extraction vehicle/solvents. Moreover, PreS/MD is a unique tool for consideration in a future battery of in silico, in chemico, and in vitro studies shown to adequately predict human sensitizers with an accuracy similar to or better than in vivo methods currently recommended for medical device biocompatibility evaluations.

Finally, although we have centered our discussion on the use of PreS/MD in sensitization hazard assessment to help speed medical device development and premarket safety evaluations, this easy-to-use in silico tool could similarly support efforts in postmarket surveillance. For example, PreS/MD will help with comparisons of extractable and leachable profiles for devices known to be sensitizing against similar devices established to be nonsensitizing, which in turn helps prioritize the gathering of corresponding analytical targeting data. Incorporating information on sensitization induction thresholds and elicitation thresholds for chemical substances is a viable improvement for a future version of PreS/MD; it would facilitate risk assessment of the data gathered from analytical targeting work that quantifies specific extractables and leachables.

In summary, here we described the development of PreS/MD, a web application that predicts the sensitization potential of chemicals based on GPMT data using molecular descriptors and multiple machine learning algorithms. Although GPMT is older than LLNA, PreS/MD is the first publicly available computational tool based on this assay and, for this reason, we could not compare our models with other skin sensitization computational tools, such as those evaluated in a previous study (Golden et al., 2021). We also mined the literature thoroughly to find additional medical device extractable and leachable data to validate our models. Unfortunately, we found only 9 additional compounds with GPMT data, as well as the ELSIE dataset (containing extractables and leachables), which we screened virtually; the resulting sensitization hazard predictions are available in the Supplementary Materials accompanying this manuscript. We hope that the publication of this manuscript describing PreS/MD will motivate interested stakeholders to engage with the corresponding authors to discuss potential collaborative work on both time-split validation of current models and the development of more predictive ones.

Although nonanimal assays have been explored to evaluate the potential skin sensitization effects of chemical hazards (Grundström and Borrebaeck, 2019), animals are still required by regulatory agencies to assess medical devices. Our results show that GPMT has good concordance with human data, notably higher than the concordance of LLNA with human data for this dataset (ICCVAM and NICEATM, 1999). They also show that the historical and publicly available GPMT data are sufficient to generate predictive and robust in silico models using machine learning approaches. The PreS/MD web application fulfills an unmet need to help modernize the evaluation of sensitization for medical devices. It is especially useful to developers screening available materials (as well as corresponding constituents, manufacturing agents, etc.) for their innovative devices, who can select ones that are unlikely to present a patient sensitization hazard without the need for testing in animals. Moreover, based on our previous findings, we expect that the models developed in this study are suitable for estimating other industrial chemicals’ toxicity in a weight-of-evidence approach (Alves et al., 2018b). The PreS/MD web application is publicly available at https://presmd.mml.unc.edu/.

Supplementary Material

kfac078_Supplementary_Data

ACKNOWLEDGMENTS

J.B. thanks CNPq’s Science without Borders program for the financial support of her visit to the University of North Carolina at Chapel Hill. V.M.A. thanks the Lush Prize. V.M.A. is currently an employee of Takeda Pharmaceuticals.

FUNDING

NIH (1R43ES032371).

DECLARATION OF CONFLICTING INTERESTS

A.T., E.N.M., K.C., and V.M.A. are cofounders of Predictive, LLC, which develops computational methodologies and software for toxicity prediction. R.C.B. is the CTO of InsilicAll. D.R.K. was working on this contract as an unpaid volunteer. All the other authors declare no conflicts.

Contributor Information

Vinicius M Alves, Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA.

Joyce V B Borba, Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA.

Rodolpho C Braga, InsilicAll, Sao Paulo, SP 04571-010, Brazil.

Daniel R Korn, Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA.

Nicole Kleinstreuer, National Toxicology Program, Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27560, USA.

Kevin Causey, Predictive, LLC, Research Triangle Park, North Carolina, USA.

Alexander Tropsha, Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA; Predictive, LLC, Research Triangle Park, North Carolina, USA.

Diego Rua, Division of Biology, Chemistry, and Materials Science, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, USA.

Eugene N Muratov, Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA.

Data Availability

All curated datasets in Excel format and the results for virtual screening of the ELSIE dataset are freely available in the Supplementary Materials that accompany this manuscript.

Disclaimer: The findings and conclusions in this report are those of the authors and do not necessarily represent the positions or policies of the Food and Drug Administration. The mention of commercial products, their sources, or their use in connection with material reported herein is not to be construed as either an actual or implied endorsement of such products by the Department of Health and Human Services.

REFERENCES

Alves V. M., Auerbach S. S., Kleinstreuer N., Rooney J. P., Muratov E. N., Rusyn I., Tropsha A., Schmitt C. (2021). Curated data in—Trustworthy in silico models out: The impact of data quality on the reliability of artificial intelligence models as alternatives to animal testing. Altern. Lab. Anim. 49, 73–82.
Alves V. M., Capuzzi S. J., Braga R. C., Borba J. V. B., Silva A. C., Luechtefeld T., Hartung T., Andrade C. H., Muratov E. N., Tropsha A. (2018a). A perspective and a new integrated computational strategy for skin sensitization assessment. ACS Sustainable Chem. Eng. 6, 2845–2859.
Alves V. M., Capuzzi S. J., Muratov E., Braga R. C., Thornton T., Fourches D., Strickland J., Kleinstreuer N., Andrade C. H., Tropsha A. (2016a). QSAR models of human data can enrich or replace LLNA testing for human skin sensitization. Green Chem. 18, 6501–6515.
Alves V. M., Muratov E. N., Capuzzi S. J., Politi R., Low Y., Braga R. C., Zakharov A. V., Sedykh A., Mokshyna E., Farag S., et al. (2016b). Alarms about structural alerts. Green Chem. 18, 4348–4360.
Alves V. M., Muratov E. N., Zakharov A., Muratov N. N., Andrade C. H., Tropsha A. (2018b). Chemical toxicity prediction for major classes of industrial chemicals: Is it possible to develop universal models covering cosmetics, drugs, and pesticides? Food Chem. Toxicol. 112, 526–534.
Andersen K. E., Vølund A., Frankild S. (1995). The guinea pig maximization test—With a multiple dose design. Acta Derm. Venereol. 75, 463–469.
Betts C. J., Beresford L., Dearman R. J., Lalko J., Api A. P., Kimber I. (2007). The use of ethanol: Diethylphthalate as a vehicle for the local lymph node assay. Contact Dermatitis 56, 70–75.
Bienfait B., Ertl P. (2013). JSME: A free molecule editor in JavaScript. J. Cheminform. 5, 24.
Blagg J. (2010). Structural alerts for toxicity. In Burger’s Medicinal Chemistry and Drug Discovery (D. J. Abraham, Ed.), pp. 301–334. John Wiley & Sons, Inc., Hoboken, NJ.
Borba J. V. B., Alves V. M., Braga R. C., Korn D. R., Overdahl K., Silva A. C., Hall S. U. S., Overdahl E., Kleinstreuer N., Strickland J., et al. (2022). STopTox: An in silico alternative to animal testing for acute systemic and topical toxicity. Environ. Health Perspect. 130, 27012.
Borba J. V. B., Braga R. C., Alves V. M., Muratov E. N., Kleinstreuer N., Tropsha A., Andrade C. H. (2021). Pred-Skin: A web portal for accurate prediction of human skin sensitizers. Chem. Res. Toxicol. 34, 258–267.
Braga R. C., Alves V. M., Muratov E. N., Strickland J., Kleinstreuer N., Tropsha A., Andrade C. H. (2017). Pred-Skin: A fast and reliable web application to assess skin sensitization effect of chemicals. J. Chem. Inf. Model. 57, 1013–1017.
Breiman L. (2001). Random forests. Mach. Learn. 45, 5–32.
Bronzino J. D. (2006). Medical Devices and Systems, 1st ed. CRC Press, Boca Raton, FL. doi:10.1201/9781420003864.
Buehler E. V. (1965). Delayed contact hypersensitivity in the guinea pig. Arch. Dermatol. 91, 171–177.
Capuzzi S. J., Kim I. S.-J., Lam W. I., Thornton T. E., Muratov E. N., Pozefsky D., Tropsha A. (2017). Chembench: A publicly accessible, integrated cheminformatics portal. J. Chem. Inf. Model. 57, 105–108.
Casati S., Aschberger K., Barroso J., Casey W., Delgado I., Kim T. S., Kleinstreuer N., Kojima H., Lee J. K., Lowit A., et al. (2018). Standardisation of defined approaches for skin sensitisation testing to support regulatory use and international adoption: Position of the International Cooperation on Alternative Test Methods. Arch. Toxicol. 92, 611–617.
Cherkasov A., Muratov E. N., Fourches D., Varnek A., Baskin I. I., Cronin M., Dearden J., Gramatica P., Martin Y. C., Todeschini R., et al. (2014). QSAR modeling: Where have you been? Where are you going to? J. Med. Chem. 57, 4977–5010.
Coleman K. P., McNamara L. R., Grailer T. P., Willoughby J. A., Keller D. J., Patel P., Thomas S., Dilworth C. (2015). Evaluation of an in vitro human dermal sensitization test for use with medical device extracts. Appl. In Vitro Toxicol. 1, 118–130.
Daniel A. B., Strickland J., Allen D., Casati S., Zuang V., Barroso J., Whelan M., Régimbald-Krnel M. J., Kojima H., Nishikawa A., et al. (2018). International regulatory requirements for skin sensitization testing. Regul. Toxicol. Pharmacol. 95, 52–65.
Devillers J. (2000). A neural network SAR model for allergic contact dermatitis. Toxicol. Methods 10, 181–193.
FDA. (2020a). FDA’s predictive toxicology roadmap. Accessed January 20, 2022. https://www.fda.gov/science-research/about-science-research-fda/fdas-predictive-toxicology-roadmap.
FDA. (2020b). Use of International Standard ISO 10993-1, “Biological evaluation of medical devices—Part 1: Evaluation and testing within a risk management process.” Accessed September 28, 2021. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/use-international-standard-iso-10993-1-biological-evaluation-medical-devices-part-1-evaluation-and.
FDA. (2022). Medical device development tools (MDDT). Accessed March 30, 2022. https://www.fda.gov/medical-devices/science-and-research-medical-devices/medical-device-development-tools-mddt.
Fedorowicz A., Singh H., Soderholm S., Demchuk E. (2005). Structure-activity models for contact sensitization. Chem. Res. Toxicol. 18, 954–969.
Fourches D., Muratov E., Tropsha A. (2016). Trust, but verify II: A practical guide to chemogenomics data curation. J. Chem. Inf. Model. 56, 1243–1252.
Golbraikh A., Muratov E., Fourches D., Tropsha A. (2014). Data set modelability by QSAR. J. Chem. Inf. Model. 54, 1–4.
Golden E., Macmillan D. S., Dameron G., Kern P., Hartung T., Maertens A. (2021). Evaluation of the global performance of eight in silico skin sensitization models using human data. ALTEX 38, 33–48.
Golla S., Madihally S., Robinson R. L., Gasem K. A. M. (2009). Quantitative structure–property relationship modeling of skin sensitization: A quantitative prediction. Toxicol. In Vitro 23, 454–465.
Grundström G., Borrebaeck C. (2019). Skin sensitization testing—What’s next? Int. J. Med. Sci. 20, 666.
Haneke K. E., Tice R. R., Carson B. L., Margolin B. H., Stokes W. S. (2001). ICCVAM evaluation of the murine local lymph node assay. Data analyses completed by the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods. Regul. Toxicol. Pharmacol. 34, 274–286.
Hansel K., Tramontana M., Bianchi L., Cerulli E., Patruno C., Napolitano M., Stingeni L. (2020). Contact sensitivity to electrocardiogram electrodes due to acrylic acid: A rare cause of medical device allergy. Contact Dermatitis 82, 118–121.
Herman A., Baeck M., de Montjoye L., Bruze M., Giertz E., Goossens A., Mowitz M. (2019). Allergic contact dermatitis caused by isobornyl acrylate in the Enlite glucose sensor and the Paradigm MiniMed Quick-Set insulin infusion set. Contact Dermatitis 81, 432–437.
Herman A., Darrigade A. S., de Montjoye L., Baeck M. (2020). Contact dermatitis caused by glucose sensors in diabetic children. Contact Dermatitis 82, 105–111.
ICCVAM. (2018). A strategic roadmap for establishing new approaches to evaluate the safety of chemicals and medical products in the United States. Accessed October 9, 2021. https://ntp.niehs.nih.gov/pubhealth/evalatm/natl-strategy/index.html.
ICCVAM and NICEATM. (1999). The murine local lymph node assay: A test method for assessing the allergic contact dermatitis potential of chemicals/compounds. Accessed October 4, 2021. http://ntp.niehs.nih.gov/iccvam/docs/immunotox_docs/llna/llnarep.pdf.
ISO. (2010). 10993-10:2010—Biological evaluation of medical devices—Part 10: Tests for irritation and skin sensitization. Accessed January 25, 2022. https://www.iso.org/standard/40884.html.
Kimber I., Weisenberger C. (1989). A murine local lymph node assay for the identification of contact allergens. Arch. Toxicol. 63, 274–282.
Kleinstreuer N. C., Hoffmann S., Alépée N., Allen D., Ashikaga T., Casey W., Clouet E., Cluzel M., Desprez B., Gellatly N., et al. (2018). Non-animal methods to predict skin sensitization (II): An assessment of defined approaches. Crit. Rev. Toxicol. 48, 359–374.
Kramer D. B., Xu S., Kesselheim A. S. (2020). Regulation of medical devices in the United States and European Union. In The Ethical Challenges of Emerging Medical Technologies (A. L. Caplan and B. Parent, Eds.), pp. 41–49. Taylor & Francis, London and New York, NY.
Kuz’min V. E., Muratov E. N., Artemenko A. G., Gorb L., Qasim M., Leszczynski J. (2008). The effects of characteristics of substituents on toxicity of the nitroaromatics: HiT QSAR study. J. Comput. Aided Mol. Des. 22, 747–759.
Lachenmeier D. W. (2008). Safety evaluation of topical applications of ethanol on the skin and inside the oral cavity. J. Occup. Med. Toxicol. 3, 26.
Magnusson B., Kligman A. M. (1969). The identification of contact allergens by animal assay. The guinea pig maximization test. J. Invest. Dermatol. 52, 268–276.
Moon J., Kim J., Shin Y., Hwang S. (2020). Confidence-aware learning for deep neural networks. arXiv:2007.01458.
OECD. (2007, Sep 3). Guidance document on the validation of (quantitative) structure-activity relationship [(Q)SAR] models. Guidance Document No. 69. doi:10.1787/9789264085442-en. Accessed October 29, 2021. https://www.oecd-ilibrary.org/environment/guidance-document-on-the-validation-of-quantitative-structure-activity-relationship-q-sar-models_9789264085442-en.
OECD. (2010, Jul 23). Test No. 429: Skin sensitization: Local lymph node assay. OECD Guidelines for the Testing of Chemicals, Section 4. doi:10.1787/9789264071100-en. Accessed March 15, 2022. https://www.oecd-ilibrary.org/environment/test-no-429-skin-sensitisation_9789264071100-en.
OECD. (2021, Jun 22). Guideline No. 497: Defined approaches on skin sensitisation. doi:10.1787/b92879a4-en. Accessed March 30, 2022.
Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V. (2012). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Reeve L., Baldrick P. (2017). Biocompatibility assessments for medical devices—Evolving regulatory considerations. Expert Rev. Med. Dev. 14, 161–167.
Riniker S., Landrum G. (2013). Similarity maps—A visualization strategy for molecular fingerprints and machine-learning methods. J. Cheminform. 5, 43.
Roberts D. W., Aptula A., Api A. M. (2017). Structure–potency relationships for epoxides in allergic contact dermatitis. Chem. Res. Toxicol. 30, 524–531.
Sushko I., Salmina E., Potemkin V. A., Poda G., Tetko I. V. (2012). ToxAlerts: A web server of structural alerts for toxic chemicals and compounds with potential adverse reactions. J. Chem. Inf. Model. 52, 2310–2316.
Svobodová L., Rucki M., Vlkova A., Kejlova K., Jírová D., Dvorakova M., Kolarova H., Kandárová H., Pôbiš P., Heinonen T., et al. (2021). Sensitization potential of medical devices detected by in vitro and in vivo methods. ALTEX 38, 419–430.
Tomlinson S. M., Malmstrom R. D., Russo A., Mueller N., Pang Y.-P., Watowich S. J. (2009). Structure-based discovery of dengue virus protease inhibitors. Antiviral Res. 82, 110–114.
Toropova A. P., Toropov A. A. (2017). Hybrid optimal descriptors as a tool to predict skin sensitization in accordance to OECD principles. Toxicol. Lett. 275, 57–66.
Tropsha A. (2010). Best practices for QSAR model development, validation, and exploitation. Mol. Inform. 29, 476–488.
Tropsha A., Golbraikh A. (2007). Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Curr. Pharm. Des. 13, 3494–3504.
Vapnik V. (2000). The Nature of Statistical Learning Theory, 2nd ed. Springer, New York.
Wu J., Chen X. Y., Zhang H., Xiong L. D., Lei H., Deng S. H. (2019). Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 17, 26–40.
Yuan H., Huang J., Cao C. (2009). Prediction of skin sensitization with a particle swarm optimized support vector machine. Int. J. Mol. Sci. 10, 3237–3254.


[kfac078-B1] Alves V. M., Auerbach S. S., Kleinstreuer N., Rooney J. P., Muratov E. N., Rusyn I., Tropsha A., Schmitt C. (2021). Curated data in—Trustworthy in silico models out: The impact of data quality on the reliability of artificial intelligence models as alternatives to animal testing. Altern. Lab. Anim. 49, 73–82. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B2] Alves V. M., Capuzzi S. J., Braga R. C., Borba J. V. B., Silva A. C., Luechtefeld T., Hartung T., Andrade C. H., Muratov E. N., Tropsha A. (2018a). A perspective and a new integrated computational strategy for skin sensitization assessment. ACS Sustainable Chem. Eng. 6, 2845–2859. [Google Scholar]

[kfac078-B3] Alves V. M., Capuzzi S. J., Muratov E., Braga R. C., Thornton T., Fourches D., Strickland J., Kleinstreuer N., Andrade C. H., Tropsha A. (2016a). QSAR models of human data can enrich or replace LLNA testing for human skin sensitization. Green Chem. 18, 6501–6515. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B4] Alves V. M., Muratov E. N., Capuzzi S. J., Politi R., Low Y., Braga R. C., Zakharov A. v., Sedykh A., Mokshyna E., Farag S., et al. (2016b). Alarms about structural alerts. Green Chem. 18, 4348–4360. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B5] Alves V. M., Muratov E. N., Zakharov A., Muratov N. N., Andrade C. H., Tropsha A. (2018b). Chemical toxicity prediction for major classes of industrial chemicals: Is it possible to develop universal models covering cosmetics, drugs, and pesticides? Food Chem. Toxicol. 112, 526–534. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B6] Andersen K. E., Vølund A., Frankild S. (1995). The guinea pig maximization test—With a multiple dose design. Acta Derm. Venereol. 75, 463–469. [DOI] [PubMed] [Google Scholar]

[kfac078-B7] Betts C. J., Beresford L., Dearman R. J., Lalko J., Api A. P., Kimber I. (2007). The use of ethanol: Diethylphthalate as a vehicle for the local lymph node assay. Contact Dermatitis. 56, 70–75. [DOI] [PubMed] [Google Scholar]

[kfac078-B8] Bienfait B., Ertl P. (2013). JSME: A free molecule editor in JavaScript. J. Cheminform. 5, 24. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B9] Blagg J. (2010). Structural alerts for toxicity. In Burger’s Medicinal Chemistry and Drug Discovery (D. J. Abraham, Ed.), pp. 301–334. John Wiley & Sons, Inc., Hoboken, NJ. [Google Scholar]

[kfac078-B10] Borba J. V. B., Alves V. M., Braga R. C., Korn D. R., Overdahl K., Silva A. C., Hall S. U. S., Overdahl E., Kleinstreuer N., Strickland J., et al. (2022). STopTox: An in silico alternative to animal testing for acute systemic and topical toxicity. Environ. Health Perspect 130, 27012. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B11] Borba J. V. B., Braga R. C., Alves V. M., Muratov E. N., Kleinstreuer N., Tropsha A., Andrade C. H. (2021). Pred-Skin: A web portal for accurate prediction of human skin sensitizers. Chem. Res. Toxicol. 34, 258–267. [DOI] [PubMed] [Google Scholar]

[kfac078-B12] Braga R. C., Alves V. M., Muratov E. N., Strickland J., Kleinstreuer N., Trospsha A., Andrade C. H. (2017). Pred-Skin: A fast and reliable web application to assess skin sensitization effect of chemicals. J. Chem. Inf. Model. 57, 1013–1017. [DOI] [PubMed] [Google Scholar]

[kfac078-B13] Breiman L. (2001). Random forests. Mach. Learn. 45, 5–32. [Google Scholar]

[kfac078-B14] Bronzino J. D. (2006). Medical Devices and Systems, 1st ed. CRC Press, Boca Raton, FL. 10.1201/9781420003864. [DOI] [Google Scholar]

[kfac078-B15] Buehler E. V. (1965). Delayed contact hypersensitivity in the guinea pig. Arch. Dermatol. 91, 171–177. [DOI] [PubMed] [Google Scholar]

[kfac078-B16] Capuzzi S. J., Kim I. S.-J., Lam W. I., Thornton T. E., Muratov E. N., Pozefsky D., Tropsha A. (2017). Chembench: A publicly accessible, integrated cheminformatics portal. J. Chem. Inf. Model. 57, 105–108. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B17] Casati S., Aschberger K., Barroso J., Casey W., Delgado I., Kim T. S., Kleinstreuer N., Kojima H., Lee J. K., Lowit A., et al. (2018). Standardisation of defined approaches for skin sensitisation testing to support regulatory use and international adoption: Position of the International Cooperation on Alternative Test Methods. Arch. Toxicol. 92, 611–617. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B18] Cherkasov A., Muratov E. N., Fourches D., Varnek A., Baskin I. I., Cronin M., Dearden J., Gramatica P., Martin Y. C., Todeschini R., et al. (2014). QSAR modeling: Where have you been? Where are you going to? J. Med. Chem. 57, 4977–5010. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B19] Coleman K. P., McNamara L. R., Grailer T. P., Willoughby J. A., Keller D. J., Patel P., Thomas S., Dilworth C. (2015). Evaluation of an in vitro human dermal sensitization test for use with medical device extracts . Appl. In Vitro Toxicol. 1, 118–130. [Google Scholar]

[kfac078-B20] Daniel A. B., Strickland J., Allen D., Casati S., Zuang V., Barroso J., Whelan M., Régimbald-Krnel M. J., Kojima H., Nishikawa A., et al. (2018). International regulatory requirements for skin sensitization testing. Regul. Toxicol. Pharmacol. 95, 52–65. [DOI] [PMC free article] [PubMed] [Google Scholar]

[kfac078-B21] Devillers J. (2000). A neural network SAR model for allergic contact dermatitis. Toxicol. Methods 10, 181–193. [Google Scholar]

[kfac078-B22] FDA. (2020a). FDA’s predictive toxicology roadmap. Accessed January 20, 2022. https://www.fda.gov/science-research/about-science-research-fda/fdas-predictive-toxicology-roadmap.

[kfac078-B23] FDA. (2020b). Use of International Standard ISO 10993-1, “Biological evaluation of medical devices—Part 1: Evaluation and testing within a risk management process.” Accessed September 28, 2021. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/use-international-standard-iso-10993-1-biological-evaluation-medical-devices-part-1-evaluation-and.

[kfac078-B24] FDA. (2022). Medical device development tools (MDDT). Accessed March 30, 2022. https://www.fda.gov/medical-devices/science-and-research-medical-devices/medical-device-development-tools-mddt.

[kfac078-B25] Fedorowicz A., Singh H., Soderholm S., Demchuk E. (2005). Structure-activity models for contact sensitization. Chem. Res. Toxicol. 18, 954–969.

[kfac078-B26] Fourches D., Muratov E., Tropsha A. (2016). Trust, but verify II: A practical guide to chemogenomics data curation. J. Chem. Inf. Model. 56, 1243–1252.

[kfac078-B27] Golbraikh A., Muratov E., Fourches D., Tropsha A. (2014). Data set modelability by QSAR. J. Chem. Inf. Model. 54, 1–4.

[kfac078-B28] Golden E., Macmillan D. S., Dameron G., Kern P., Hartung T., Maertens A. (2021). Evaluation of the global performance of eight in silico skin sensitization models using human data. ALTEX 38, 33–48.

[kfac078-B29] Golla S., Madihally S., Robinson R. L., Gasem K. A. M. (2009). Quantitative structure–property relationship modeling of skin sensitization: A quantitative prediction. Toxicol. In Vitro 23, 454–465.

[kfac078-B30] Grundström G., Borrebaeck C. (2019). Skin sensitization testing—What’s next? Int. J. Mol. Sci. 20, 666.

[kfac078-B31] Haneke K. E., Tice R. R., Carson B. L., Margolin B. H., Stokes W. S. (2001). ICCVAM evaluation of the murine local lymph node assay. Data analyses completed by the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods. Regul. Toxicol. Pharmacol. 34, 274–286.

[kfac078-B32] Hansel K., Tramontana M., Bianchi L., Cerulli E., Patruno C., Napolitano M., Stingeni L. (2020). Contact sensitivity to electrocardiogram electrodes due to acrylic acid: A rare cause of medical device allergy. Contact Dermatitis 82, 118–121.

[kfac078-B33] Herman A., Baeck M., de Montjoye L., Bruze M., Giertz E., Goossens A., Mowitz M. (2019). Allergic contact dermatitis caused by isobornyl acrylate in the Enlite glucose sensor and the Paradigm MiniMed Quick-Set insulin infusion set. Contact Dermatitis 81, 432–437.

[kfac078-B34] Herman A., Darrigade A. S., de Montjoye L., Baeck M. (2020). Contact dermatitis caused by glucose sensors in diabetic children. Contact Dermatitis 82, 105–111.

[kfac078-B35] ICCVAM. (2018). A strategic roadmap for establishing new approaches to evaluate the safety of chemicals and medical products in the United States. Accessed October 9, 2021. https://ntp.niehs.nih.gov/pubhealth/evalatm/natl-strategy/index.html.

[kfac078-B36] ICCVAM and NICEATM. (1999). The murine local lymph node assay: A test method for assessing the allergic contact dermatitis potential of chemicals/compounds. Accessed October 4, 2021. http://ntp.niehs.nih.gov/iccvam/docs/immunotox_docs/llna/llnarep.pdf.

[kfac078-B37] ISO. (2010). 10993-10:2010—Biological evaluation of medical devices—Part 10: Tests for irritation and skin sensitization. Accessed January 25, 2022. https://www.iso.org/standard/40884.html.

[kfac078-B38] Kimber I., Weisenberger C. (1989). A murine local lymph node assay for the identification of contact allergens. Arch. Toxicol. 63, 274–282.

[kfac078-B39] Kleinstreuer N. C., Hoffmann S., Alépée N., Allen D., Ashikaga T., Casey W., Clouet E., Cluzel M., Desprez B., Gellatly N., et al. (2018). Non-animal methods to predict skin sensitization (II): An assessment of defined approaches. Crit. Rev. Toxicol. 48, 359–374.

[kfac078-B40] Kramer D. B., Xu S., Kesselheim A. S. (2020). Regulation of medical devices in the United States and European Union. In The Ethical Challenges of Emerging Medical Technologies (A. L. Caplan and B. Parent, Eds.), pp. 41–49. Taylor & Francis, London and New York, NY.

[kfac078-B41] Kuz’min V. E., Muratov E. N., Artemenko A. G., Gorb L., Qasim M., Leszczynski J. (2008). The effects of characteristics of substituents on toxicity of the nitroaromatics: HiT QSAR study. J. Comput. Aided Mol. Des. 22, 747–759.

[kfac078-B42] Lachenmeier D. W. (2008). Safety evaluation of topical applications of ethanol on the skin and inside the oral cavity. J. Occup. Med. Toxicol. 3, 26.

[kfac078-B43] Magnusson B., Kligman A. M. (1969). The identification of contact allergens by animal assay. The guinea pig maximization test. J. Invest. Dermatol. 52, 268–276.

[kfac078-B44] Moon J., Kim J., Shin Y., Hwang S. (2020). Confidence-aware learning for deep neural networks. arXiv:2007.01458.

[kfac078-B45] OECD. (2007, Sep 3). Guidance document on the validation of (quantitative) structure-activity relationship [(Q)SAR] models. Guidance Document No. 69. doi:10.1787/9789264085442-en. Accessed October 29, 2021. https://www.oecd-ilibrary.org/environment/guidance-document-on-the-validation-of-quantitative-structure-activity-relationship-q-sar-models_9789264085442-en.

[kfac078-B46] OECD. (2010, Jul 23). Test No. 429: Skin sensitization: Local lymph node assay. OECD Guidelines for the Testing of Chemicals, Section 4. doi:10.1787/9789264071100-en. Accessed March 15, 2022. https://www.oecd-ilibrary.org/environment/test-no-429-skin-sensitisation_9789264071100-en.

[kfac078-B47] OECD. (2021, Jun 22). Guideline No. 497: Defined approaches on skin sensitisation. doi:10.1787/b92879a4-en. Accessed March 30, 2022.

[kfac078-B48] Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V. (2012). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.

[kfac078-B49] Reeve L., Baldrick P. (2017). Biocompatibility assessments for medical devices—Evolving regulatory considerations. Expert Rev. Med. Dev. 14, 161–167.

[kfac078-B50] Riniker S., Landrum G. (2013). Similarity maps—A visualization strategy for molecular fingerprints and machine-learning methods. J. Cheminform. 5, 43.

[kfac078-B51] Roberts D. W., Aptula A., Api A. M. (2017). Structure–potency relationships for epoxides in allergic contact dermatitis. Chem. Res. Toxicol. 30, 524–531.

[kfac078-B52] Sushko I., Salmina E., Potemkin V. A., Poda G., Tetko I. V. (2012). ToxAlerts: A web server of structural alerts for toxic chemicals and compounds with potential adverse reactions. J. Chem. Inf. Model. 52, 2310–2316.

[kfac078-B53] Svobodová L., Rucki M., Vlkova A., Kejlova K., Jírová D., Dvorakova M., Kolarova H., Kandárová H., Pôbiš P., Heinonen T., et al. (2021). Sensitization potential of medical devices detected by in vitro and in vivo methods. ALTEX 38, 419–430.

[kfac078-B54] Tomlinson S. M., Malmstrom R. D., Russo A., Mueller N., Pang Y.-P., Watowich S. J. (2009). Structure-based discovery of dengue virus protease inhibitors. Antiviral Res. 82, 110–114.

[kfac078-B55] Toropova A. P., Toropov A. A. (2017). Hybrid optimal descriptors as a tool to predict skin sensitization in accordance to OECD principles. Toxicol. Lett. 275, 57–66.

[kfac078-B56] Tropsha A. (2010). Best practices for QSAR model development, validation, and exploitation. Mol. Inform. 29, 476–488.

[kfac078-B57] Tropsha A., Golbraikh A. (2007). Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Curr. Pharm. Des. 13, 3494–3504.

[kfac078-B58] Vapnik V. (2000). The Nature of Statistical Learning Theory, 2nd ed. Springer, New York.

[kfac078-B59] Wu J., Chen X. Y., Zhang H., Xiong L. D., Lei H., Deng S. H. (2019). Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 17, 26–40.

[kfac078-B60] Yuan H., Huang J., Cao C. (2009). Prediction of skin sensitization with a particle swarm optimized support vector machine. Int. J. Mol. Sci. 10, 3237–3254.
