Abstract
Clinical trials can fail to detect rare adverse events (AEs). We assessed the ability of pharmacological target adverse‐event (TAE) profiles to predict AEs on US Food and Drug Administration (FDA) drug labels at least 4 years after approval. TAE profiles were generated by aggregating AEs from FDA Adverse Event Reporting System (FAERS) reports and FDA drug labels for drugs that hit a common target. A genetic algorithm (GA) was used to choose the AE case count (N), disproportionality score in FAERS (proportional reporting ratio (PRR)), and percent of comparator drug labels with an AE to maximize F‐measure. With FAERS data alone, precision, recall, and specificity were 0.57, 0.78, and 0.61, respectively. After including FDA drug label data, precision, recall, and specificity improved to 0.67, 0.81, and 0.71, respectively. Eighteen of 23 (78%) postmarket label changes were identified correctly. TAE analysis shows promise as a method to predict AEs at the time of drug approval.
Study Highlights.
WHAT IS THE CURRENT KNOWLEDGE ON THE TOPIC?
To date, predictive safety in the postmarket setting at the FDA has relied upon expert review of available evidence from case reports, medical records, FAERS, and the literature. Systematic, quantitative methods are being evaluated.
WHAT QUESTION DID THIS STUDY ADDRESS?
TAE profiles aggregate drug AEs by shared pharmacological targets. This study assesses the use of TAE profiles in anticipating significant postmarket drug AEs of interest.
WHAT DOES THIS STUDY ADD TO OUR KNOWLEDGE?
This study confirms that aggregating AEs by pharmacological target is predictive of postmarket AEs.
HOW MIGHT THIS CHANGE DRUG DISCOVERY, DEVELOPMENT, AND/OR THERAPEUTICS?
In addition to assisting with postmarket pharmacovigilance, this approach may also be used to anticipate AEs that may occur during drug development.
In 2016, the US Food and Drug Administration (FDA) received over 1.6 million adverse event (AE) reports and the number of reports has increased yearly.1 Many of these AEs are serious, including fatalities.2 Thus, drug AE prediction would serve a critical public health need. Although clinical trials may be a gold standard for detecting more common AEs, these trials are often not of a sufficient size or duration to detect rare or time‐dependent AEs that emerge when the drug is used in clinical practice. Indeed, a recent review of therapeutics approved by the FDA between 2001 and 2010 found that 32% of drugs experienced a postmarket safety event, including withdrawal from the market, addition of a boxed warning, or an FDA‐issued safety communication.3 Additionally, the studied population may be highly selective in a clinical trial. Because of exclusion criteria, many concomitant medications and comorbidities may be eliminated from a trial, leading to many potentially important drug interactions and AEs being missed. Furthermore, with trials becoming smaller and more selective, increasing emphasis and importance is placed on postmarket pharmacovigilance.
Traditional pharmacovigilance relies on data mining systems, such as the FDA Adverse Event Reporting System (FAERS)4 and the Sentinel Initiative,4 to obtain information about safety events once the drug is in the marketplace. However, these methods are reactive rather than predictive or proactive. A recent example is the FDA safety communication regarding increased risk for serotonin syndrome and adrenal insufficiency with opioid use.5 To overcome these weaknesses, the Center for Drug Evaluation and Research (CDER) has a strong interest in developing predictive methods to assist in postmarket surveillance of AEs. To date, there have been many efforts to predict AEs using a variety of data, including FAERS reports,6, 7 literature reports,7, 8 pathway/signaling,8, 9 cheminformatics,6, 8, 9 and chemogenomics data.8, 9 Although many of these models are promising, several are limited in drug or AE scope, accuracy, or usage of proprietary data.
We have developed a model to predict AEs based on pharmacological target adverse event (TAE) analysis. TAE analysis aggregates AE reports from drugs that share molecular targets with a drug of interest. This model represents a blend of approaches; it applies a mechanistic target analysis to an observational database. Here, we describe a pilot study with six drugs of interest to assess the ability of TAE analysis to predict postmarket AEs. We focus on a set of 43 AE categories of interest to regulators performing pharmacovigilance, referred to as designated medical events.
Materials and methods
Study overview
We performed a study to predict which AEs are listed on the FDA label current as of January 2017 using only data that were available at the time of approval for the drug of interest. This was performed with a multilabel classification method. Predictions are generated by aggregating historical AEs from comparator drugs that share receptor pharmacology with a drug of interest. See Figure 1 for an overview of the target analysis workflow.
Drugs chosen and generation of TAE profiles
Six drugs with at least 4 years postmarket experience were chosen to represent a variety of therapeutic areas: certolizumab pegol, desvenlafaxine, etravirine, liraglutide, pazopanib, and rivaroxaban. The selected comparator drugs with shared pharmacologic targets are listed below.
Certolizumab: adalimumab
Desvenlafaxine: duloxetine, venlafaxine
Etravirine: delavirdine, didanosine, lamivudine, zalcitabine, zidovudine
Liraglutide: exenatide
Pazopanib: imatinib, palifermin, sorafenib, sunitinib
Rivaroxaban: ardeparin, fondaparinux, heparin
See Table 1 for additional details of the TAE profiles that were generated, which consist of the set of AEs associated with a pharmacological target. The TAE profiles were generated separately by using data from the FAERS and the FDA drug labels.
Table 1.
Drug | Approval | Indication | Targets | Comparators |
---|---|---|---|---|
Certolizumab pegol | Apr 2008 | Crohn disease, rheumatoid arthritis, psoriatic arthritis, ankylosing spondylitis | TNF | Adalimumab |
Desvenlafaxine | Feb 2008 | Major depressive disorder | NET, SERT | Duloxetine, venlafaxine |
Etravirine | Mar 2012 | HIV‐1 infection in conjunction with other antiretrovirals | HIV‐1 RT | Delavirdine, didanosine, lamivudine, zalcitabine, zidovudine |
Liraglutide | Jan 2010 | Improve glycemic control in adults with type 2 diabetes mellitus | GLP1R | Exenatide |
Pazopanib | Oct 2009 | Advanced renal cell carcinoma, advanced soft tissue sarcoma | VEGFR1, VEGFR2, VEGFR3, KIT, PDGFRA, PDGFRB, FGFR3, ITK/TSK, FGFR1 | Imatinib, palifermin, sorafenib, sunitinib |
Rivaroxaban | Jul 2011 | DVT, pulmonary embolism, risk reduction of DVT and PE, prophylaxis of DVT following hip or knee replacement surgery | F10 (Factor Xa) | Ardeparin, fondaparinux, heparin |
DVT, deep vein thrombosis; FAERS, US Food and Drug Administration Adverse Event Reporting System; FGFR, fibroblast growth factor receptor; GLP1R, glucagon‐like peptide‐1 receptor; HIV, human immunodeficiency virus; ITK, tyrosine‐protein kinase ITK/TSK; KIT, KIT proto‐oncogene receptor tyrosine kinase; NET, sodium‐dependent noradrenaline transporter; PDGFR, platelet‐derived growth factor receptor; PE, pulmonary embolism; RT, reverse transcriptase; SERT, sodium‐dependent serotonin transporter; TNF, tumor necrosis factor; VEGFR, vascular endothelial growth factor receptor.
Six drugs were chosen for this study. The Targets column lists the pharmacological targets used to generate target‐adverse event profiles from the FAERS reports. The Comparators column lists drugs used to generate target‐adverse event profiles from historical product labels. Comparators share pharmacological targets with the six study drugs and have prior time on the US market.
TAEs from the FAERS reports
TAE profiles from the FAERS reports were generated using a bioinformatics tool, EFFECT.10 EFFECT aggregates the FAERS reports by mapping the active ingredients recorded in each case report to their respective pharmacological targets. The EFFECT knowledgebase can then be queried by target or a set of targets or a set of comparator drugs with shared targets to capture the subset of case reports, which can then be used to generate TAE profiles.
The publicly available FAERS data used in this study were mostly from 2004Q1 to 2015Q4. Within the data integration process, the FAERS medication synonyms are mapped to drugs and compounds in DrugBank11 and PubChem.12 Based on this medication‐drug mapping, the link to biomolecules and molecular mechanisms involved in pharmacodynamics and pharmacokinetics is established via UniProt13 and the pathway resources NCI Nature,14 Reactome,15 and BioCarta.16 Literature data are extracted based on co‐occurrence of EFFECT entity names and synonyms in PubMed17 abstracts. Drugs are classified according to the Anatomical Therapeutic Chemical classification system.18 Indications and reactions are classified using the MedDRA dictionary. Proportional reporting ratios (PRRs) are calculated using the approach described by van Puijenbroek et al.19 In a manner analogous to the computation of PRRs for drug AE pairs,19 2 × 2 contingency tables are generated and disproportionality scores computed for TAE pairs. When multiple targets or comparator drugs are used, disproportionality is computed for subset‐AE pairs in the same way. For a more detailed description, see Schotland et al.20 and Racz et al.21
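The disproportionality calculation described above can be illustrated with a minimal Python sketch. It assumes the standard PRR definition for a 2 × 2 contingency table and a normal approximation on the log scale for the 95% confidence interval; the function name, argument names, and the example counts are hypothetical and are not taken from EFFECT or from the study data.

```python
import math


def prr_with_lower_bound(a, b, c, d, z=1.96):
    """Return (PRR, lower 95% bound) for a 2 x 2 contingency table.

    a: reports with the target (or comparator subset) and the AE
    b: reports with the target, without the AE
    c: reports without the target, with the AE
    d: reports without the target, without the AE
    """
    if a <= 0 or c <= 0:
        raise ValueError("a and c must be positive to compute a PRR")
    prr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(PRR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - z * se_log)
    return prr, lower


# Hypothetical counts, for illustration only.
prr, prr025 = prr_with_lower_bound(a=20, b=980, c=100, d=98900)
```

The lower bound returned here plays the role of PRR025 in the tables that follow; a value above 1 indicates that the AE is reported disproportionately often for the target.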
The resulting profile is a list of AEs coded as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms,22 each with an associated case count (N) and disproportionality score (PRR with 95% confidence interval). The MedDRA terms were then mapped to a list of designated medical events. Designated medical events are groups of MedDRA Preferred Terms that capture similar AEs as mechanistically related safety events. For example, the MedDRA Preferred Terms “cerebral artery occlusion” and “cerebral artery thrombosis” may be used by different reporters to refer to the same AE. Thus, the combining of MedDRA Preferred Terms into designated medical events was designed to allow the aggregation of the FAERS reports to capture relevant medical events with similar etiologies and likely target‐related mechanisms. Unlabeled designated medical events represent key AEs that are followed by the FDA Office of Surveillance and Epidemiology in the postmarket setting. The existing list was expanded to a list of 43 categories (see Supporting Information for the full list). Figure 2 contains a list of the 43 designated medical events and their prevalence in the FAERS database and the FDA product labels. Roughly 900 (4%) of MedDRA Preferred Terms were used and no term was used more than once. The presence of one MedDRA Preferred Term was sufficient to assign the designated medical event to the TAE profile. See Table 2 for an example TAE profile generated from the FAERS reports. The specific queries for the six drugs of interest can be found in Data S1.
Table 2.
Adverse event (MedDRA preferred term) | N | PRR | PRR025 | Designated medical event |
---|---|---|---|---|
Drug exposure during pregnancy | 1,869 | 19.91 | 19.05 | |
Pyrexia | 1,113 | 3.29 | 3.11 | |
Anemia | 892 | 4.48 | 4.20 | Anemia |
Nausea | 806 | 1.12 | 1.05 | |
Vomiting | 805 | 1.81 | 1.70 | |
Diarrhea | 780 | 1.67 | 1.56 | |
Alanine aminotransferase increased | 692 | 8.62 | 8.00 | |
Aspartate aminotransferase increased | 617 | 8.41 | 7.78 | |
Pregnancy | 597 | 16.18 | 14.93 | |
Immune reconstitution syndrome | 555 | 214.99 | 193.92 | |
Renal failure acute | 520 | 4.42 | 4.06 | Renal toxicity |
Asthenia | 484 | 1.37 | 1.25 | |
Drug interaction | 482 | 3.11 | 2.85 | |
Rash | 468 | 1.44 | 1.32 | |
Weight decreased | 466 | 1.93 | 1.76 | |
Drug ineffective | 465 | 0.53 | 0.49 | |
Neutropenia | 462 | 5.15 | 4.70 | Neutropenia |
Dyspnea | 455 | 0.88 | 0.81 | |
Abdominal pain | 454 | 2.02 | 1.84 | |
Lactic acidosis | 436 | 21.37 | 19.42 | |
Headache | 433 | 0.79 | 0.72 | |
Abortion spontaneous | 431 | 9.93 | 9.03 | |
Renal failure | 428 | 3.20 | 2.91 | Renal toxicity |
Death | 407 | 0.58 | 0.53 | |
Fatigue | 394 | 0.69 | 0.62 | |
Premature baby | 388 | 15.04 | 13.61 | |
Jaundice | 369 | 10.08 | 9.10 | Hepatic toxicity |
Malaise | 369 | 0.96 | 0.86 | |
Blood bilirubin increased | 360 | 10.03 | 9.04 | |
Caesarean section | 360 | 17.75 | 15.99 | |
Blood creatinine increased | 354 | 4.76 | 4.29 | Renal toxicity |
Hepatic failure | 339 | 9.45 | 8.49 | Hepatic toxicity |
Pneumonia | 335 | 1.27 | 1.14 | |
Hepatitis | 330 | 12.54 | 11.25 | Hepatic toxicity |
Blood alkaline phosphatase increased | 318 | 8.39 | 7.52 | |
Drug resistance | 318 | 30.49 | 27.23 | |
Gamma‐glutamyltransferase increased | 313 | 9.99 | 8.94 | |
Pancreatitis | 304 | 4.62 | 4.13 | Acute and chronic pancreatitis |
Thrombocytopenia | 298 | 2.89 | 2.58 | Anemia |
Dizziness | 290 | 0.61 | 0.55 | |
FAERS, US Food and Drug Administration Adverse Event Reporting System; HIV, human immunodeficiency virus; N, case count; PRR, proportional reporting ratio; PRR025, lower bound of PRR 95% confidence interval.
FAERS was queried for all case reports with drugs targeting HIV‐1 Reverse Transcriptase and dated prior to the approval of etravirine in March 2012. The subset of FAERS generated contains 4,935 MedDRA preferred terms with at least one report. The 40 most frequent preferred terms are shown here. Mapping of MedDRA preferred terms to designated medical events is shown in the last column.
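As an illustration of how preferred terms roll up into designated medical events, the following Python sketch groups a few rows of Table 2 by the mapping shown in its last column. The dictionary covers only a subset of the roughly 900 mapped terms, and the function and variable names are hypothetical.

```python
# Subset of the preferred-term -> designated medical event mapping
# that is visible in Table 2 (the full mapping is much larger).
PT_TO_DME = {
    "Anemia": "Anemia",
    "Renal failure acute": "Renal toxicity",
    "Renal failure": "Renal toxicity",
    "Blood creatinine increased": "Renal toxicity",
    "Jaundice": "Hepatic toxicity",
    "Hepatic failure": "Hepatic toxicity",
    "Hepatitis": "Hepatic toxicity",
    "Neutropenia": "Neutropenia",
    "Pancreatitis": "Acute and chronic pancreatitis",
}

# (preferred term, N, PRR025) rows copied from Table 2.
faers_rows = [
    ("Pyrexia", 1113, 3.11),
    ("Anemia", 892, 4.20),
    ("Renal failure acute", 520, 4.06),
    ("Renal failure", 428, 2.91),
    ("Jaundice", 369, 9.10),
    ("Dizziness", 290, 0.55),
]


def group_by_dme(rows, pt_to_dme):
    """Collect FAERS rows under the designated medical event they map to."""
    grouped = {}
    for term, n, prr025 in rows:
        dme = pt_to_dme.get(term)
        if dme is None:
            continue  # term does not belong to any designated medical event
        grouped.setdefault(dme, []).append((term, n, prr025))
    return grouped


profile = group_by_dme(faers_rows, PT_TO_DME)
```

Because each preferred term maps to at most one designated medical event, a plain dictionary lookup is sufficient; unmapped terms such as pyrexia simply drop out of the profile.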
TAEs from the FDA drug labels
For each comparator drug, AEs were manually curated from the most recent drug label published prior to the approval of the drug of interest and mapped to the MedDRA vocabulary. Similarly to the profile from FAERS, MedDRA preferred terms were mapped to the designated medical event list to create drug label TAE profiles. The mapping was performed such that the presence of only one MedDRA preferred term was sufficient to assign the designated medical event to the TAE profile. Finally, for each designated medical event, the proportion of comparator drug labels reporting that designated medical event was computed. See Table 3 for an example TAE profile generated from the FDA drug labels. Historical product labels were obtained from the National Library of Medicine DailyMed website.4, 13
Table 3.
DME | Delavirdine | Didanosine | Lamivudine | Zalcitabine | Zidovudine | Label score |
---|---|---|---|---|---|---|
Abnormal bleeding | • | • | 0.4 | |||
Accidents and injuries | • | • | 0.4 | |||
Acute and chronic pancreatitis | • | • | • | • | • | 1 |
Amyotrophic lateral sclerosis | 0 | |||||
Anemia | • | • | • | • | • | 1 |
Arterial thrombotic event | 0 | |||||
Cardiac arrhythmia | • | • | 0.4 | |||
Coagulopathies | 0 | |||||
Colitis (excl infective) | • | • | 0.4 | |||
Congenital disorders NEC | 0 | |||||
Deliria | • | • | • | 0.6 | ||
Encephalopathies | 0 | |||||
Edema | • | • | • | • | 0.8 | |
Extrapyramidal symptoms | • | • | 0.4 | |||
Hemolytic anemia | • | • | • | 0.6 | ||
Heart failure | • | • | • | 0.6 | ||
Hepatic toxicity | • | • | • | • | • | 1 |
Hypersensitivity | • | • | • | • | • | 1 |
Hypertension | • | • | 0.4 | |||
Impaired wound healing | • | • | • | • | 0.8 | |
Infection and infestation | • | • | • | 0.6 | ||
Interstitial lung disease | 0 | |||||
Malignancy | 0 | |||||
Metabolism | • | • | • | • | 0.8 | |
Myopathy | • | • | • | • | • | 1 |
Neuroleptic malignant syndrome | 0 | |||||
Neutropenia | • | • | • | • | • | 1 |
Peripheral neuropathy | • | • | • | • | • | 1 |
PML | 0 | |||||
Pulmonary hypertension | 0 | |||||
Renal toxicity | • | • | • | 0.6 | ||
Respiratory failure | 0 | |||||
Seizures | • | • | 0.4 | |||
Sepsis | 0 | |||||
Serotonin syndrome | 0 | |||||
Sleep disturbance | • | • | • | 0.6 | ||
Special senses impairment | • | • | • | 0.6 | ||
SJS/TEN | • | • | • | 0.6 | ||
Sudden death | 0 | |||||
Suicide | 0 | |||||
Thrombotic event, vessel unspecified | • | 0.2 | ||||
Torsade de pointes | 0 | |||||
Venous thrombotic event | 0 | |||||
DME percent | 0.53 | 0.21 | 0.26 | 0.60 | 0.40 |
DME, designated medical event; FDA, US Food and Drug Administration; HIV, human immunodeficiency virus; NEC, not elsewhere classified; PML, progressive multifocal leukoencephalopathy; SJS/TEN, Stevens‐Johnson Syndrome/Toxic Epidermal Necrolysis.
The five HIV‐1 Reverse Transcriptase inhibitors on the US market prior to the approval of etravirine are shown. MedDRA preferred terms were manually curated from the most recent label published prior to the approval of etravirine and mapped to DMEs. The percentage of events on each label is recorded on the bottom row and the percentage of labels containing each DME is recorded in the last column (label score).
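The label score can be sketched in a few lines of Python. The comparator names and their DME sets below are hypothetical placeholders rather than the curated label data; only the calculation (the fraction of comparator labels that list a given DME) follows the description above.

```python
def label_scores(comparator_labels, all_dmes):
    """Fraction of comparator labels that list each designated medical event."""
    n_labels = len(comparator_labels)
    if n_labels == 0:
        raise ValueError("at least one comparator label is required")
    return {
        dme: sum(dme in dmes for dmes in comparator_labels.values()) / n_labels
        for dme in all_dmes
    }


# Hypothetical comparator labels, for illustration only.
labels = {
    "comparator_1": {"Anemia", "Seizures", "Edema"},
    "comparator_2": {"Anemia", "Edema"},
    "comparator_3": {"Anemia", "Seizures", "Edema"},
    "comparator_4": {"Anemia", "Edema"},
    "comparator_5": {"Anemia"},
}
scores = label_scores(labels, ["Anemia", "Edema", "Seizures", "Sepsis"])
```

With five comparators the score can only take the values 0, 0.2, 0.4, 0.6, 0.8, and 1, which matches the granularity of the last column of Table 3.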
Classification and decision tree analysis
We used classification and decision tree analysis23 to construct a multilabel decision tree such that TAE profiles (independent variables) were used to predict the approved product label of a drug of interest (dependent variable). The dependent variables to be predicted consisted of the 43 designated medical events described earlier. There was no restriction on the number or combination of designated medical events to be predicted. Three features were used to construct the decision tree: N (FAERS case count), PRR025 (the lower bound of the PRR 95% confidence interval), and the proportion of comparator drug labels with the AE (label score). For TAE profiles generated from the FAERS data, a designated medical event was considered a prediction if N and PRR025 were both greater than their specified threshold (split) values. For profiles generated from drug labels, a designated medical event was considered a prediction if the label score was greater than a specified threshold value.
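The threshold rules above can be sketched as follows. Two points are assumptions of this sketch and are not stated in the text: a designated medical event is flagged from FAERS if any one of its mapped preferred terms exceeds both thresholds, and the combined analysis takes the union of the FAERS and label predictions. The function name and the example profile values are hypothetical.

```python
def predict_dmes(faers_profile, label_score,
                 n_min=None, prr025_min=None, label_min=None):
    """Return the set of predicted designated medical events.

    faers_profile: {dme: [(N, PRR025), ...]}, one pair per mapped preferred term
    label_score:   {dme: fraction of comparator labels listing the DME}
    Pass None for a threshold to switch that data source off.
    """
    predicted = set()
    if n_min is not None and prr025_min is not None:
        for dme, rows in faers_profile.items():
            # Assumption: one preferred term passing both splits is enough.
            if any(n > n_min and prr025 > prr025_min for n, prr025 in rows):
                predicted.add(dme)
    if label_min is not None:
        # Assumption: sources are combined as a union (logical OR).
        predicted |= {dme for dme, s in label_score.items() if s > label_min}
    return predicted


# Hypothetical profile values, for illustration only.
faers = {
    "Renal toxicity": [(520, 4.06)],
    "Sepsis": [(150, 2.0)],
    "Seizures": [(300, 1.2)],
}
scores = {"Renal toxicity": 0.6, "Sepsis": 0.0, "Seizures": 0.4, "Edema": 0.8}

faers_only = predict_dmes(faers, scores, n_min=170, prr025_min=1.52)
labels_only = predict_dmes(faers, scores, label_min=0.45)
combined = predict_dmes(faers, scores, n_min=170, prr025_min=1.52, label_min=0.45)
```

Leaving a threshold as None reproduces the FAERS-only and label-only analyses with the same function, so the three analyses differ only in which splits are active.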
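Predicted designated medical events were compared to designated medical events on the current FDA drug label. Metrics evaluated include precision (positive predictive value), recall (sensitivity or true positive rate), specificity (true negative rate), and F1 (harmonic mean of precision and recall). Precision, recall, and F1 were computed as follows and macro‐averaged across the six study drugs, where DME = designated medical event:

$$\text{Precision} = \frac{|\text{DMEs predicted} \cap \text{DMEs on label}|}{|\text{DMEs predicted}|}$$

$$\text{Recall} = \frac{|\text{DMEs predicted} \cap \text{DMEs on label}|}{|\text{DMEs on label}|}$$

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$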
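A minimal Python sketch of the per-drug metrics and their macro-average is given below. The handling of empty denominators (returning 0) is an assumption of this sketch, and the DME names and sets are toy values rather than study data.

```python
from statistics import mean


def drug_metrics(predicted, on_label, all_dmes):
    """Precision, recall, specificity, and F1 for one drug (sets of DMEs)."""
    tp = len(predicted & on_label)
    fp = len(predicted - on_label)
    fn = len(on_label - predicted)
    tn = len(all_dmes - predicted - on_label)
    # Assumption: an empty denominator yields 0 rather than an error.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


def macro_average(per_drug):
    """Unweighted mean of each metric across drugs."""
    return {k: mean(m[k] for m in per_drug) for k in per_drug[0]}


# Toy example with five DME categories and two drugs.
ALL = {"A", "B", "C", "D", "E"}
m1 = drug_metrics({"A", "B", "C"}, {"A", "B", "D"}, ALL)
m2 = drug_metrics({"A"}, {"A", "B"}, ALL)
macro = macro_average([m1, m2])
```

Macro-averaging gives each drug equal weight regardless of how many designated medical events appear on its label, which is why F1 is computed per drug before averaging.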
A genetic algorithm (GA) was used to choose threshold (split) values for N, PRR025, and label score to maximize macro‐averaged F1 across study drugs. The GA parameters were: mutation rate, 0.2; crossover rate, 0.8; population size, 100; elitism, 20; and maximum iterations, 1000. The R package GA was used to perform the calculations.24 Full details are provided as R code in Data S2 (see TA_pilot.Rmd or TA_pilot.Rproj).
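To make the search procedure concrete, the following is a generic real-valued genetic algorithm in Python that uses the parameter values listed above as defaults. It is not the R GA package and does not reproduce its operators: tournament selection, arithmetic crossover, and uniform mutation are assumptions chosen for brevity, and the toy fitness function stands in for macro-averaged F1 evaluated at a candidate set of thresholds.

```python
import random


def genetic_search(fitness, bounds, pop_size=100, elitism=20,
                   p_crossover=0.8, p_mutation=0.2, max_iter=1000, seed=0):
    """Maximize fitness over box-bounded real vectors; returns best, score, history."""
    rng = random.Random(seed)

    def tournament(scored):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(max_iter):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        # Elitism: the top individuals are copied unchanged.
        next_gen = [list(ind) for _, ind in scored[:elitism]]
        while len(next_gen) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < p_crossover:
                w = rng.random()  # arithmetic (blend) crossover
                child = [w * x + (1 - w) * y for x, y in zip(p1, p2)]
            else:
                child = list(p1)
            if rng.random() < p_mutation:
                i = rng.randrange(len(bounds))  # resample one coordinate
                child[i] = rng.uniform(*bounds[i])
            next_gen.append(child)
        population = next_gen
    best_fit, best = max(((fitness(ind), ind) for ind in population),
                         key=lambda pair: pair[0])
    return best, best_fit, history


def toy_fitness(x):
    """Toy objective; in practice this would be macro-averaged F1."""
    return -sum((xi - 0.5) ** 2 for xi in x)


best, best_fit, history = genetic_search(toy_fitness, [(0.0, 1.0)] * 3,
                                         max_iter=50)
```

Because macro-averaged F1 is a step function of the thresholds, a derivative-free search such as this is a natural fit; elitism guarantees that the best score found never decreases from one generation to the next.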
In total, three classification analyses were performed: (i) predictions were made from TAE profiles generated from the FAERS data only; (ii) predictions were made from TAE profiles generated from the FDA product label data only; (iii) predictions were made from TAE profiles generated from both FAERS data and the FDA product labels.
Safety label changes
For each of the six drugs in the study, the original drug label was compared to the current label to identify label changes. Twenty‐three new designated medical event label changes were identified across the six drugs. Label changes were compared to classification and decision tree predictions made at maximum F1 and the percentage identified correctly was computed.
Results
Classification performance
Classification and decision tree analysis was performed for three data sets: (i) TAE profiles generated from the FAERS data only; (ii) TAE profiles generated from label data only; (iii) TAEs generated from a combination of the FAERS and the FDA drug labels (Table 4). For FAERS‐only TAE profiles, the algorithm chose N = 70 and PRR025 = 1.06 to maximize F1 to 0.64 (Table 4, row 1). Precision, recall, and specificity were 0.57, 0.78, and 0.61, respectively. For label‐only TAE profiles, the algorithm chose label score = 0.18 to maximize F1 to 0.68 (Table 4, row 2). Precision, recall, and specificity were 0.67, 0.75, and 0.72, respectively. Combining the FAERS and label data, the algorithm chose N = 170, PRR025 = 1.52, and label score = 0.45 to maximize F1 to 0.71 (Table 4, row 3). Precision, recall, and specificity were 0.67, 0.81, and 0.71, respectively. Overall, there was improvement in performance when combining TAE profiles from the FDA product labels and the FAERS data.
Table 4.
TAE | Precision (SD) | Recall (SD) | F1 (SD) | Specificity (SD) | Accuracy (SD) | N | PRR025 | Lab | Max |
---|---|---|---|---|---|---|---|---|---|
FAERS | 0.57 (0.14) | 0.78 (0.2) | 0.64 (0.13) | 0.61 (0.18) | 0.69 (0.10) | 78 | 1.06 | NA | F1 |
Labels | 0.67 (0.17) | 0.75 (0.18) | 0.68 (0.12) | 0.72 (0.16) | 0.74 (0.11) | NA | NA | 0.18 | F1 |
FAERS + labels | 0.67 (0.15) | 0.81 (0.15) | 0.71 (0.10) | 0.71 (0.19) | 0.76 (0.09) | 170 | 1.52 | 0.45 | F1 |
F1, harmonic mean of precision and recall; FAERS, US Food and Drug Administration Adverse Event Reporting System; FDA, US Food and Drug Administration; Lab, label score; Max, the performance metric maximized by the genetic algorithm; N, case count; NA, not applicable; PRR025, proportional reporting ratio lower bound of 95% confidence interval; TAE, target adverse‐event profile(s) used.
Performance is compared for three sets of predictions: (i) target‐adverse events profiles generated from FAERS data only; (ii) target‐adverse events profiles generated from the FDA label data only; (iii) target‐adverse events generated from a combination of the FAERS and the FDA labels. A genetic algorithm was used to specify N, PRR025, and label score to maximize F1.
Postmarket safety label changes
After approval of the six drugs of interest, 23 label changes occurred in the postmarket setting (Table 5). These safety label changes were compared with predictions made at maximum F1 using TAE profiles from both the FAERS and product labels. Eighteen of 23 (78%) label changes were retrieved correctly.
Table 5.
Drug | DME change (original to current label) | Predicted |
---|---|---|
Certolizumab pegol | Arterial thrombotic event | N |
Certolizumab pegol | Deliria | Y |
Certolizumab pegol | Hypertension | Y |
Desvenlafaxine | Arterial thrombotic event | Y |
Desvenlafaxine | Neuroleptic malignant syndrome | N |
Desvenlafaxine | Respiratory failure | Y |
Etravirine | Edema | Y |
Etravirine | Hemolytic anemia | Y |
Etravirine | Peripheral neuropathy | Y |
Etravirine | Thrombotic event, vessel unspecified | N |
Liraglutide | Hepatic toxicity | N |
Liraglutide | Renal toxicity | Y |
Pazopanib | Acute and chronic pancreatitis | Y |
Pazopanib | Cardiac arrhythmia | Y |
Pazopanib | Coagulopathies | Y |
Pazopanib | Colitis (excl infective) | Y |
Pazopanib | Impaired wound healing | Y |
Pazopanib | SJS/TEN | Y |
Pazopanib | Sudden death | N |
Pazopanib | Venous thrombotic event | Y |
Rivaroxaban | Anemia | Y |
Rivaroxaban | Infection and infestation | Y |
Rivaroxaban | Metabolism | Y |
Percent predicted | | 0.78 |
DME, designated medical event; SJS/TEN, Stevens‐Johnson Syndrome/Toxic Epidermal Necrolysis.
For each drug of interest, the original drug label was compared to the current label. Twenty‐three new DMEs were identified across the six drugs. These label changes were then compared to classification predictions made at maximum F1. The percentage of label changes identified is shown on the bottom row.
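The bottom row of Table 5 can be checked with a short tally. The per-drug counts below are read from the Predicted column of Table 5; the variable names are hypothetical.

```python
# (label changes predicted, total label changes) per drug, from Table 5.
table5_counts = {
    "Certolizumab pegol": (2, 3),
    "Desvenlafaxine": (2, 3),
    "Etravirine": (3, 4),
    "Liraglutide": (1, 2),
    "Pazopanib": (7, 8),
    "Rivaroxaban": (3, 3),
}

recovered = sum(hit for hit, _ in table5_counts.values())
total = sum(n for _, n in table5_counts.values())
fraction = recovered / total  # share of postmarket label changes predicted

per_drug = {drug: hit / n for drug, (hit, n) in table5_counts.items()}
```

The pooled fraction is 18 of 23 label changes; the per-drug fractions show that the misses are spread across four of the six drugs rather than concentrated in one.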
Discussion
We have presented a method for predicting labeled AEs of FDA‐approved drugs with emphasis on designated medical events, a selection of AEs of high interest to postmarket safety reviewers at the FDA. The method is mechanistic, grouping AEs by shared pharmacological targets and combining them with observational data from the FAERS as well as the FDA drug labels. This method predicted the 2017 FDA drug labels for our drugs of interest with precision, recall, and specificity of 0.67, 0.81, and 0.71, respectively. Notably, 78% of postmarket safety label changes were identified correctly.
Although there has been great interest in applying predictive methods to the problem of drug safety,25 our method has unique aspects that make it relevant to AE predictions at the time of drug approval. Kuhn et al.26 and Wang et al.9 also predict AEs on drug labels by integrating drug‐target data with the FAERS data9, 26, 27; however, they do not focus on predicting designated medical events. Additionally, the approach of Wang et al.9 relies heavily on LINCS L1000 gene expression data, which may not be available for new molecular entities. Finally, several methodologies focus on one class of AEs or specific drug‐AE associations7, 28, 29; our methodology is distinctive in that we can predict a wide range of clinically significant AEs for any drug that has a comparator with similar target activities at the time of approval.
Predictions were made at maximum F1, which is the harmonic mean of precision and recall. The method, however, can be tuned to maximize recall (sensitivity) at the expense of specificity or precision (positive predictive value). The tunability and granularity of the method can be considered a strength. In fact, tuning to greater recall (sensitivity) may be useful to identify AEs for augmented pharmacovigilance activities. In practice, further mechanistic evaluations of the target's relationship to an AE and literature reports for the association are evaluated to support or lessen the strength of a prediction. Unlabeled signals for comparator drugs have been identified in the process, as an additional benefit of this methodology.
As with any method using voluntary postmarket reporting data, such as FAERS, under‐reporting or over‐reporting biases and stimulated reporting occur.30 However, FAERS and other postmarketing databases have successfully predicted AEs in several other models.31, 32 The strength of FAERS is the enhanced reporting of rare events as captured by our designated medical event list that are not identified in trials. Additionally, we have addressed potential biases from FAERS by adding molecular TAE profiles generated from comparator FDA drug labels. This reflects other methodologies, such as those developed by Gurulingappa et al.32 and Liu,32, 33 in which multiple sources were used as features to make one prediction for a drug‐AE association. The addition of the FDA drug labels to our methodology allowed the algorithm to choose higher values of N and PRR025, thereby reducing false‐positive predictions from the FAERS‐generated TAE profiles and improving specificity and precision. An additional limitation is the sample size of six drugs, which is too small to perform cross‐validation for estimation of generalization error; the resulting model must, therefore, be understood as hypothesis generating. A larger validation study is underway.
Several future enhancements and analyses are planned to further strengthen and determine overall performance. First, false‐positive predictions, or predictions made that were not on the label of the drug of interest, will be systematically and mechanistically analyzed using multiple sources. This will allow us to determine if this methodology identified AEs that may be of concern, but have not yet been recognized using current pharmacovigilance methods. Moreover, we are currently investigating additional features for prediction, including the likelihood‐ratio test34 and a third source, text‐mined AEs from literature. We are also investigating the possibility of developing a database for common indications and comorbidities to reduce confounding and false‐positive predictions from the FAERS data. Additionally, we are presently developing improved methodology incorporating multiple machine learning approaches to enhance the applicability, accuracy, and reliability of our model. We also plan to include drug structure and target similarity measures as features for machine learning. Last, we are planning to increase the sample size to further validate our model. A larger study will allow us to cross‐validate with independent test data, as well as assess model performance by subgroup (e.g., AE and number of comparator drugs).
TAE analysis shows promise as a predictive method to augment pharmacovigilance. With this approach, using the FAERS data and the FDA drug labels for comparator drugs that share pharmacological targets with a drug of interest, we can tune our classification performance metrics based on three predictors: number of FAERS cases, PRR025 in FAERS, and percent of comparator drug labels with the AE of interest. This allows us to choose metrics best suited for safety reviewers, such as increasing precision to allow for better decision making. With several additional enhancements and analyses to better quantify performance of this methodology on the horizon, this pilot study demonstrates promise for this approach. In summary, this informatics approach using real‐world data shows applicability to provide mechanistic data for drug safety evaluations for unlabeled AEs.
Funding
P.S. US Food and Drug Administration; R.R. US Food and Drug Administration; D.J. Molecular Health, GmbH; D.G.S. US Food and Drug Administration; K.B. US Food and Drug Administration.
Conflicts of Interest
P.S. None; R.R. None; D.J. is an employee of Molecular Health, GmbH, a shareholder in Molecular Health, GmbH, and the inventor of the EFFECT technology; R.L. None; D.G.S. None; K.B. None.
Author Contributions
P.S., R.R., R.L., D.G.S., and K.B. wrote the manuscript. P.S., R.R., K.B., and R.L. designed the research. P.S. and R.R. performed the research. P.S., R.R., D.G.S., and K.B. analyzed the data. D.J. contributed new reagents/analytical tools.
Disclaimer
This study reflects the views of the authors and should not be construed to represent the views or policies of the FDA.
Supporting information
Acknowledgments
We would like to acknowledge the following contributions: FDA Office of Surveillance and Epidemiology for help developing the designated medical event list. Darrell Abernethy, Ram Tiwari, and Ted Guo of the FDA for helpful discussions.
References
- 1. Food and Drug Administration . FDA Adverse Event Reporting System (FAERS) Public Dashboard. <https://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Surveillance/AdverseDrugEffects/ucm070093.htm> (2018). Accessed 10 May, 2018.
- 2. Lazarou, J. , Pomeranz, B.H. & Corey, P.N. Incidence of adverse drug reactions in hospitalized patients: a meta‐analysis of prospective studies. JAMA 279, 1200–1205 (1998). [DOI] [PubMed] [Google Scholar]
- 3. Downing, N.S. et al Postmarket safety events among novel therapeutics approved by the US Food and Drug Administration between 2001 and 2010. JAMA 317, 1854–1863 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. US Food and Drug Administration . FDA Adverse Event Reporting System (FAERS). <https://open.fda.gov/data/faers/> (2017).
- 5. US Food and Drug Administration . FDA Drug Safety Communication: FDA warns about several safety issues with opioid pain medicines; requires label changes. <https://www.fda.gov/Drugs/DrugSafety/ucm489676.htm> (2016).
- 6. Cheng, F. et al Adverse drug events: database construction and in silico prediction. J. Chem. Inf. Model. 53, 744–752 (2013). [DOI] [PubMed] [Google Scholar]
- 7. Xu, R. & Wang, Q. Automatic signal extraction, prioritizing and filtering approaches in detecting post‐marketing cardiovascular events associated with targeted cancer drugs from the FDA Adverse Event Reporting System (FAERS). J. Biomed. Inform. 47, 171–177 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Abernethy, D.R. , Bai, J.P. , Burkhart, K. , Xie, H.G. & Zhichkin, P. Integration of diverse data sources for prediction of adverse drug events. Clin. Pharmacol. Ther. 90, 645–646 (2011). [DOI] [PubMed] [Google Scholar]
- 9. Wang, Z. , Clark, N.R. & Ma'ayan, A. Drug‐induced adverse events prediction with the LINCS L1000 data. Bioinformatics 32, 2338–2345 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Molecular Health, Inc .Molecular Health MH EFFECT™. Vol. 2016 (Heidelberg, Germany, 2016).
- 11. Wishart, D.S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34, D668–D672 (2006).
- 12. PubChem. <http://pubchem.ncbi.nlm.nih.gov/>.
- 13. UniProt. <http://www.uniprot.org/>.
- 14. NCI. Nature NCI Pathway Interaction Database. <http://www.ndexbio.org/>.
- 15. Reactome Inc. <http://www.reactome.org>.
- 16. Biocarta Inc. <www.biocarta.com>.
- 17. PubMed. <http://www.ncbi.nlm.nih.gov/pubmed/>.
- 18. World Health Organization. World Health Organization Anatomical Therapeutic Chemical classification system. <http://www.whocc.no/atc/structure_and_principles/>.
- 19. van Puijenbroek, E.P., Bate, A., Leufkens, H.G.M., Lindquist, M., Orre, R. & Egberts, A.C.G. A comparison of measures of disproportionality for signal detection in spontaneous reporting systems for adverse drug reactions. Pharmacoepidemiol. Drug Saf. 11, 3–10 (2002).
- 20. Schotland, P., Bojunga, N., Zien, A., Trame, M.N. & Lesko, L.J. Improving drug safety with a systems pharmacology approach. Eur. J. Pharm. Sci. 94, 84–92 (2016).
- 21. Racz, R., Soldatos, T.G., Jackson, D. & Burkhart, K. Association between serotonin syndrome and second‐generation antipsychotics via pharmacological target‐adverse event analysis. Clin. Transl. Sci. 11, 322–329 (2018).
- 22. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). MedDRA: Medical Dictionary for Regulatory Activities. <http://www.meddra.org/> (2017).
- 23. Breiman, L., Friedman, J.H., Olshen, R.A. & Stone, C.J. Classification and Regression Trees (Wadsworth & Brooks/Cole Advanced Books & Software: Monterey, CA, 1984).
- 24. Scrucca, L. GA: a package for genetic algorithms in R. J. Stat. Softw. 53, 1–37 (2013).
- 25. Bai, J.P. & Abernethy, D.R. Systems pharmacology to predict drug toxicity: integration across levels of biological organization. Annu. Rev. Pharmacol. Toxicol. 53, 451–473 (2013).
- 26. Kuhn, M. et al. Systematic identification of proteins that elicit drug side effects. Mol. Syst. Biol. 9, 663 (2013).
- 27. Shaked, I., Oberhardt, M.A., Atias, N., Sharan, R. & Ruppin, E. Metabolic network prediction of drug side effects. Cell Syst. 2, 209–213 (2016).
- 28. Jamal, S., Goyal, S., Shanker, A. & Grover, A. Predicting neurological adverse drug reactions based on biological, chemical and phenotypic properties of drugs using machine learning models. Sci. Rep. 7, 872 (2017).
- 29. Huang, L.C., Wu, X. & Chen, J.Y. Predicting adverse side effects of drugs. BMC Genom. 12(suppl. 5), S11 (2011).
- 30. US Food and Drug Administration. Guidance for Industry Good Pharmacovigilance Practices and Pharmacoepidemiologic Assessment (FDA, Rockville, MD, 2005).
- 31. Voss, E.A., Boyce, R.D., Ryan, P.B., van der Lei, J., Rijnbeek, P.R. & Schuemie, M.J. Accuracy of an automated knowledge base for identifying drug adverse reactions. J. Biomed. Inform. 66, 72–81 (2017).
- 32. Gurulingappa, H., Toldo, L., Rajput, A.M., Kors, J.A., Taweel, A. & Tayrouz, Y. Automatic detection of adverse events to predict drug label changes using text and data mining techniques. Pharmacoepidemiol. Drug Saf. 22, 1189–1194 (2013).
- 33. Liu, M. et al. Large‐scale prediction of adverse drug reactions using chemical, biological, and phenotypic properties of drugs. J. Am. Med. Inform. Assoc. 19, e28–e35 (2012).
- 34. Huang, L., Zalkikar, J. & Tiwari, R.C. Likelihood ratio test‐based method for signal detection in drug classes using FDA's AERS database. J. Biopharm. Stat. 23, 178–200 (2013).