Abstract
Background
Adverse drug reactions (ADRs) are a major burden for patients and the healthcare industry. Early and accurate detection of potential ADRs can help to improve drug safety and reduce financial costs. Post-market spontaneous reports of ADRs remain a cornerstone of pharmacovigilance, and a series of drug safety signal detection methods play an important role in providing drug safety insights. However, existing methods require sufficient case reports to generate signals, limiting their use for newly approved drugs with few (or even no) reports.
Methods
In this study, we propose a label propagation framework to enhance drug safety signals by combining drug chemical structures with the FDA Adverse Event Reporting System (FAERS). First, we compute original drug safety signals via common signal detection algorithms. Then, we construct a drug similarity network based on chemical structures. Finally, we generate enhanced drug safety signals by propagating the original signals on the drug similarity network. Our proposed framework enriches post-market safety reports with a pre-clinical drug similarity network, alleviating the problem of insufficient cases for newly approved drugs.
Results
We apply the label propagation framework to four popular signal detection algorithms (PRR, ROR, MGPS, BCPNN) and find that our proposed framework generates more accurate drug safety signals than the corresponding baselines. In addition, our framework identifies potential ADRs for newly approved drugs, thus paving the way for early detection of ADRs.
Conclusions
The proposed label propagation framework combines pre-clinical drug structures with post-market safety reports, generates enhanced drug safety signals, and can potentially help to accurately detect ADRs ahead of time.
Availability
The source code for this paper is available at: https://github.com/ruoqi-liu/LP-SDA.
Keywords: Adverse drug reactions, Signal Detection, FDA Adverse Event Reporting System, Drug similarity
Background
Adverse drug reactions (ADRs), defined as harmful and unintended reactions resulting from drug treatments, have become a major public health issue. Delayed detection of ADRs can cause major damage to public health [1, 2] (e.g., accounting for a significant amount of mortality and morbidity each year). It is estimated that over 2,000,000 serious ADRs occur among all hospitalized patients in the United States, causing more than 100,000 deaths per year [2]. In addition, ADRs are the fourth leading cause of death in the United States, ahead of serious medical events such as pulmonary disease, diabetes, AIDS and pneumonia [3]. Therefore, early detection of potential ADRs or drug safety signals can significantly reduce the health risk for patients and reduce additional hospital costs.
Though ADRs can be detected in both pre-marketing clinical trials and post-marketing surveillance, most knowledge of ADRs emerges after drugs are on the market. Compared to clinical trials, the post-marketing stage allows a larger population and extended follow-up. Real-world evidence, such as Spontaneous Reporting Systems (SRS) [4], Electronic Health Records (EHRs) [5], medical claims [6], social media and web search [7, 8], has become important for detecting ADRs. Among those data sources, SRS remains a cornerstone of pharmacovigilance; its reports are collected from a variety of sources, including healthcare providers, national authorities, pharmaceutical companies, medical literature and, more recently, directly from patients. SRS collects case reports such that each sample contains ADR status (Yes/No) and drug status (Yes/No). Such a structure allows SRS to be mined without an epidemiological design.
Due to the rich and valuable information offered by SRS data, a series of signal detection algorithms have been developed to detect drug safety signals from SRS. Proportional Reporting Ratio (PRR) [9] and Reporting Odds Ratio (ROR) [10, 11] are the most commonly used methods and are based on frequentist statistical analysis. Multi-item Gamma Poisson Shrinker (MGPS) [12] and Bayesian Confidence Propagation Neural Network (BCPNN) [13] are two Bayesian approaches that are widely used for signal detection. Recently, another approach has emerged that combines pre-clinical drug structures with SRS to improve the original safety signals. Vilar et al. [14, 15] improve the original signals generated from health-care databases by incorporating biological and chemical information of drugs. Their methods first achieved improved performance in the analysis of two representative ADRs: rhabdomyolysis and pancreatitis. Vilar et al. [16] further demonstrate that other types of cheminformatic similarity (e.g., 2D drug chemical structural similarity, adverse event profile similarity and target profile similarity) can also improve the detection of drug safety signals. Moreover, Vilar et al. [17] present a 3D drug-ADR predictor, which incorporates 3D molecular structure similarity and a drug-ADR standard reference, to improve ADR identification and generate enriched drug-ADR signals. They apply the 3D drug-ADR predictor to SRS resources and find that the proposed predictor identifies more accurate signals than baseline methods. The underlying principle behind these approaches is that drugs with similar chemical structures are more likely to exhibit similar ADRs [18]. In general, existing methods are developed to generate signals and/or re-rank original signals for drugs with enough reports in SRS, but few methods can be used to generate signals for newly approved drugs with few or even no safety reports in SRS.
Some approaches use machine learning techniques and pre-clinical information from large public drug databases to predict ADRs [19–24]. Most of these methods use chemical, biological and phenotypic properties of drugs to build predictive models. In [19], for example, a computational approach is presented to predict the side effects of a given drug by incorporating information on other drugs and their side effects. That approach uses drug-ADR pairs obtained from public drug databases in both the training process and the performance evaluation. In contrast, we use these drug-ADR pairs only as external evaluation resources, and they take no part in the training process (a comparison of [19] and our framework can be found in Fig. S1 of Additional file 1). To the best of our knowledge, ours is the first signal detection framework that combines pre-clinical drug structures and post-market safety reports.
In this paper, we propose a label propagation framework to enhance drug safety signals by combining drug chemical structures with the FDA Adverse Event Reporting System (FAERS) [25]. First, we compute original drug safety signals from FAERS via common signal detection algorithms. Then, we construct a drug-drug similarity network based on chemical structures. Finally, we generate enhanced drug safety signals by propagating the original signals on the drug-drug similarity network. We apply the label propagation framework to four popular signal detection algorithms (PRR, ROR, MGPS, BCPNN) and find that our proposed framework can generate more accurate drug safety signals than the corresponding baseline methods. In addition, the proposed framework can identify potential ADRs for newly approved drugs, thus showing promise for early detection of ADRs.
In general, the contributions of this paper are threefold:
- We propose a label propagation framework to generate enhanced drug safety signals, which combines pre-clinical drug structures with post-market safety reports.
- We compare the proposed framework with four different state-of-the-art signal detection algorithms and evaluate the performance in detecting ADRs.
- We also apply our framework to newly approved drugs (with few cases in SRS) and assess whether pre-clinical drug structures can help to detect safety signals before an FDA safety label change.
Methods
Datasets
FAERS database
The SRS data used in this work is FAERS. We adopt a curated and standardized version of FAERS data from 2004 to 2014 [26]. After removing duplicate case records and mapping drug names to RxNorm concepts and ADR outcomes to Medical Dictionary for Regulatory Activities (MedDRA) codes [27], we obtain 4245 unique drugs, 17,671 ADRs and a total of 4,928,413 reports. We plot the frequencies of ADRs and drugs in FAERS in Fig. 1 to show the distribution of this dataset. The number of drugs associated with each ADR varies widely, with an average of 213 (Fig. 1a), and the number of ADRs associated with each drug averages 887 (Fig. 1b).
PubChem database
The PubChem Compound database [28] provides unique chemical structure information of drugs. We map the concept IDs of drugs in FAERS to PubChem IDs using the exact drug names and then extract the drug chemical substructures from PubChem. Among the 4245 unique drugs in FAERS, 2708 drugs are mapped and their chemical features are extracted from PubChem.
SIDER ground truth data
The Side Effect Resource (SIDER) database [29] contains approved drugs and their recorded ADRs, which are collected from package inserts (i.e., drug labels). SIDER version 4.1 contains a total of 1430 drugs, 5868 ADRs and 139,756 drug-ADR pairs. We use drug-ADR pairs extracted from SIDER version 4.1 as positive controls for evaluation. Of the 2708 drugs with chemical features, 843 drugs are mapped to SIDER by converting PubChem IDs to STITCH IDs in SIDER. ADRs in SIDER are recorded as both Lowest Level Terms (LLT) and Preferred Terms (PT) of MedDRA. We select PT for ADRs in our evaluation dataset. Thus, we end up with 843 drugs, 842 ADRs and 65,636 drug-ADR pairs as the ground truth data in the experiment. As further validation of the approach, we also use OFFSIDES [30], a post-marketing dataset, to test the performance (see Table S4 in Additional file 1).
Overall framework
The overall framework of this paper is outlined in Fig. 2. It consists of three main steps: computing original drug safety signals from FAERS reports, constructing a drug-drug similarity network from pre-clinical drug structures, and generating enhanced drug safety signals through a label propagation process.
Computing drug safety signals
Our study covers four commonly used signal detection algorithms. Table 1 lists the main properties of each algorithm. The proportional reporting ratio (PRR) [9] and the reporting odds ratio (ROR) [10, 11] are two popular frequentist statistical measures. For each drug-ADR pair, we construct a 2×2 contingency table (Table 2) and compute the signal scores as follows:
$$\mathrm{PRR} = \frac{a/(a+b)}{c/(c+d)} \tag{1}$$
Table 1.
Category | Method | Description | Signal score computation |
---|---|---|---|
Frequentist statistical methods | Proportional Reporting Ratio (PRR) | Statistical method to calculate the relative risk in order to measure the association strength for a drug-ADR pair | PRR05: lower bound of the 95% confidence interval of relative risk reporting ratio distribution |
Frequentist statistical methods | Reporting Odds Ratio (ROR) | Statistical method to calculate the odds ratio in order to measure the association strength for a drug-ADR pair | ROR05: lower bound of the 95% confidence interval of odds ratio distribution |
Bayesian-based methods | Multi-item Gamma Poisson Shrinker (MGPS) | Bayesian-based method to prevent false-positive signals from multiple comparisons. Generate an adjusted value based on Reporting Ratio (RR) | EB05: lower bound of the 95% of the posterior distribution for RR |
Bayesian-based methods | Bayesian Confidence Propagation Neural Network (BCPNN) | Bayesian-based method to prevent false-positive signals from multiple comparisons. Generate an adjusted value based on Information Component (IC) | BCPNN25: lower bound of the 2.5% of the posterior distribution for IC |
Table 2.
Report group | Reports with ADR | Reports without ADR | Total |
---|---|---|---|
Reports with drug | a | b | a+b |
Reports without drug | c | d | c+d |
Total | a+c | b+d | a+b+c+d |
$$\mathrm{ROR} = \frac{a/c}{b/d} = \frac{ad}{bc} \tag{2}$$
In this paper, we use PRR05 (referred to as PRR) and ROR05 (referred to as ROR) as baseline methods in the experiments. The Multi-item Gamma Poisson Shrinker (MGPS) [12, 31] and the Bayesian Confidence Propagation Neural Network (BCPNN) [13] are widely used Bayesian approaches for signal detection. We adopt EB05 of MGPS and BCPNN25 of BCPNN as our baseline methods.
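As an illustration of the two frequentist scores, the sketch below counts the cells of Table 2 from a list of reports and computes PRR, ROR and their lower confidence bounds. The report layout (a pair of drug and ADR sets), the function names, and the log-normal approximation with z = 1.96 for the lower bound are assumptions made for illustration, not the released implementation; all four cells are assumed to be non-zero.

```python
import math


def contingency_counts(reports, drug, adr):
    """Count the cells (a, b, c, d) of the 2x2 table for one drug-ADR pair.

    `reports` is an iterable of (drugs, adrs) pairs, each a set of names.
    This layout is a hypothetical simplification of FAERS case reports.
    """
    a = b = c = d = 0
    for drugs, adrs in reports:
        has_drug, has_adr = drug in drugs, adr in adrs
        if has_drug and has_adr:
            a += 1
        elif has_drug:
            b += 1
        elif has_adr:
            c += 1
        else:
            d += 1
    return a, b, c, d


def prr(a, b, c, d):
    """Proportional reporting ratio."""
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio."""
    return (a / b) / (c / d)


def prr_lower(a, b, c, d, z=1.96):
    """Lower confidence bound of PRR (log-normal approximation, assumed)."""
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return math.exp(math.log(prr(a, b, c, d)) - z * se)


def ror_lower(a, b, c, d, z=1.96):
    """Lower confidence bound of ROR (log-normal approximation, assumed)."""
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(math.log(ror(a, b, c, d)) - z * se)
```

For a = 20, b = 80, c = 100 and d = 9800, this gives PRR = 19.8 and ROR = 24.5, with both lower bounds below the point estimates.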
Constructing drug similarity network
We construct a drug similarity network based on chemical structures. Specifically, we treat drugs as nodes of the network and use drug chemical structure similarities as edge weights. The similarity is based on a chemical structure fingerprint corresponding to the 881 chemical substructures [32] defined in PubChem. Each drug is represented by an 881-dimensional binary profile whose elements indicate the presence (1) or absence (0) of the corresponding PubChem substructures. The Jaccard similarity between two drugs is calculated by:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{3}$$
where A and B denote the profiles of two drugs, $|A \cap B|$ is the number of substructures present in both profiles and $|A \cup B|$ is the number present in at least one of them.
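The Jaccard similarity of two binary fingerprints can be sketched as follows. The function names are hypothetical, the fingerprints are plain 0/1 lists of equal length (881 entries in the setting described here), and returning 0 for two all-zero profiles is a convention assumed for the sketch.

```python
def jaccard(a, b):
    """Jaccard similarity of two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two empty profiles share nothing; 0.0 is an assumed convention.
    return both / either if either else 0.0


def similarity_matrix(fingerprints):
    """Pairwise Jaccard similarities for a list of fingerprints."""
    n = len(fingerprints)
    return [
        [jaccard(fingerprints[i], fingerprints[j]) for j in range(n)]
        for i in range(n)
    ]
```

For example, the profiles `[1, 1, 0, 0]` and `[1, 0, 1, 0]` share one substructure out of three present in either, so their similarity is 1/3.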
Generating enhanced drug safety signals
Label propagation algorithms are widely adopted for analyzing weighted graphs with N nodes to discover latent information [33] and have been applied to biomedical problems [34]. At the start of such an algorithm, a small portion of the nodes have labels, and these labels are propagated to the previously unlabeled nodes.
In our method, we generate enhanced drug safety signals by propagating the original signals on the drug similarity network. The weighted graph with N nodes is constructed from the N×N drug similarity matrix A, where $A_{i,j} \ge 0$ represents the similarity between drug i and drug j. Drugs are treated as nodes in the graph and the edge weights are given by the drug similarities. The signal score matrix S of drug-ADR pairs, where $S_{i,j}$ denotes the signal score of the combination of drug i and ADR j, is used as the initial labels of the nodes. For drug $D_i$, the initial labels are the i-th row of the signal score matrix S, denoted $S_i$. The label information of the initially labeled drug nodes is propagated to the other nodes through the weighted edges of the graph by an iterative approach. To guarantee the convergence of the updates, the original drug similarity matrix A needs to be normalized so that each row sums to one. We denote the normalized matrix as W.
Using W, we propagate labels from the labeled drug nodes to the unlabeled nodes. In every iteration, each node updates its label information by absorbing labels from its neighbors with probability γ and retaining its initial labels with probability (1−γ). The update for drug node i from step t−1 to step t is
$$Y_i^{t} = \gamma \sum_{j=1}^{N} W_{i,j} Y_j^{t-1} + (1-\gamma) S_i \tag{4}$$
In this formula, $Y_i^{t}$ represents the updated label information of drug node i in the t-th iteration (with $Y_i^{0} = S_i$), and 0<γ<1 is the absorbing probability that determines how much label information is absorbed from neighbors. Considering all drug nodes at the same time, the update (4) can be written in matrix form,
$$Y^{t} = \gamma W Y^{t-1} + (1-\gamma) S \tag{5}$$
After t iterations, (5) can be written as,
$$Y^{t} = (\gamma W)^{t} S + (1-\gamma) \sum_{k=0}^{t-1} (\gamma W)^{k} S \tag{6}$$
Since W is row-normalized with non-negative entries, its spectral radius satisfies ρ(W)≤1. Because 0<γ<1, we have $\lim_{t \to \infty} (\gamma W)^{t} = 0$ and $\lim_{t \to \infty} \sum_{k=0}^{t-1} (\gamma W)^{k} = (I - \gamma W)^{-1}$, where I is the identity matrix of order N. Therefore, the iteration converges to (the proof of convergence can be found in [33]),
$$Y = (1-\gamma)(I - \gamma W)^{-1} S \tag{7}$$
where Y is the final label information for the N drug nodes and S is the matrix of initial label information.
To generate signals for a new drug, we set the signals of the drug for all ADRs to 0. We then calculate the similarities between the new drug and the other drugs. Based on the resulting similarity network, we can generate safety signals via label propagation even when there is no existing report.
In general, the original signal scores computed by common signal detection algorithms are further improved through the label propagation on the drug similarity network. The final labels (scores) can be regarded as the improved signals for drug-ADR pairs.
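The propagation step can be sketched in a few lines. The code assumes the standard update in which each drug mixes its neighbors' current labels (weight γ) with its own initial signal scores (weight 1−γ) on the row-normalized similarity matrix; the function names, the stopping tolerance, and the toy matrices (with zero self-similarity) are assumptions for illustration rather than the released implementation.

```python
def row_normalize(A):
    """Scale each row of A to sum to one (all-zero rows stay zero)."""
    W = []
    for row in A:
        total = sum(row)
        W.append([x / total if total else 0.0 for x in row])
    return W


def propagate(A, S, gamma=0.5, tol=1e-10, max_iter=1000):
    """Iterate Y <- gamma * W Y + (1 - gamma) * S until Y stops changing.

    A: N x N similarity matrix, S: N x M initial signal scores.
    """
    W = row_normalize(A)
    n, m = len(S), len(S[0])
    Y = [row[:] for row in S]  # start from the initial labels
    for _ in range(max_iter):
        Y_next = [
            [
                gamma * sum(W[i][k] * Y[k][j] for k in range(n))
                + (1 - gamma) * S[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
        change = max(
            abs(Y_next[i][j] - Y[i][j]) for i in range(n) for j in range(m)
        )
        Y = Y_next
        if change < tol:
            break
    return Y


# Toy example: drug 2 is "new" (all-zero signals) but similar to drug 0.
A = [[0.0, 0.2, 0.9], [0.2, 0.0, 0.1], [0.9, 0.1, 0.0]]
S = [[2.0], [0.0], [0.0]]
Y = propagate(A, S, gamma=0.5)
```

In the toy example, the new drug (row 2) starts with a zero score and ends with a positive one, because it absorbs signal from the drug it resembles. For large N, the closed-form solution with a linear solver would be the more practical choice than this pure-Python loop.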
Results
Experiment setup
The known drug-ADR pairs extracted from SIDER are treated as positive controls, and the unknown drug-ADR pairs are treated as negative controls. Since positive samples are far fewer than negative ones, we randomly sample a subset of negative controls from all unknown pairs, so that the number of negative samples is twice the number of positive controls. To fully demonstrate the performance of our methods, we also compile an evaluation dataset with all drug-ADR pairs from SIDER as reference positives and the complement set of SIDER drug-ADR pairs as reference negatives (i.e., without any sub-sampling of negatives). We conduct the experiments on this alternative dataset and report the results in Table S2 of Additional file 1.
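The negative sub-sampling described above can be sketched as follows. The function name, the fixed random seed, and the choice of the full drug × ADR grid as the candidate set are assumptions made for illustration.

```python
import random


def sample_negatives(drugs, adrs, positives, ratio=2, seed=0):
    """Sample `ratio` times as many unknown pairs as there are positives."""
    positives = set(positives)
    # Every drug-ADR pair not listed as a positive counts as unknown.
    unknown = [(d, a) for d in drugs for a in adrs if (d, a) not in positives]
    k = min(len(unknown), ratio * len(positives))
    return random.Random(seed).sample(unknown, k)


drugs = ["d1", "d2", "d3"]
adrs = ["a1", "a2", "a3"]
positives = {("d1", "a1"), ("d2", "a2")}
negatives = sample_negatives(drugs, adrs, positives)
```

With the 65,636 positive pairs used here, a ratio of two corresponds to 131,272 sampled negative pairs.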
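For the performance comparison, we use the Area Under the Curve (AUC) score, the Area Under the Precision-Recall Curve (AUPR) score, precision, recall, accuracy and F1-score (F1). AUC is the area under the receiver operating characteristic (ROC) curve, which plots the true positive rate (TPR) against the false positive rate (FPR) as the threshold on the output scores varies. TPR and FPR are defined as:
$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN} \tag{8}$$
where TP, FP, TN and FN denote the numbers of true positives, false positives, true negatives and false negatives, respectively.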
Similarly, AUPR is the area under the curve obtained by plotting precision against recall as the threshold varies. Precision measures the probability that an identified safety signal is correct. Recall measures the probability that a true safety signal is identified. Precision and recall are defined in (9):
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \tag{9}$$
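Accuracy measures the proportion of drug-ADR pairs whose ground truth labels are estimated correctly. F1 is defined as the harmonic mean of precision and recall:
$$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{10}$$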
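The threshold-based metrics follow directly from the four confusion counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical, and the denominators are assumed to be non-zero.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is the same quantity as TPR
    return {
        "tpr": recall,
        "fpr": fp / (fp + tn),
        "precision": precision,
        "recall": recall,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = confusion_metrics(tp=40, fp=10, tn=90, fn=60)
```

With these example counts, precision is 0.8, recall is 0.4, FPR is 0.1, accuracy is 0.65 and F1 is about 0.533.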
The proposed method has one parameter: the absorbing probability (γ) of label propagation. We consider γ in {0.1,0.2,0.3,...,0.9} and build the model with the γ that yields the maximum AUC score. We evaluate the performance of the models for different parameter values and show the results in Fig. S2 of Additional file 1. The optimal values of γ for each signal detection algorithm are shown in Table S3 of Supplementary Materials.
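Selecting γ by maximum AUC can be sketched as a simple grid search. Here `score_fn` is a hypothetical callable that returns the propagated scores of the evaluation pairs for a given γ, and AUC is computed with the pairwise (Mann-Whitney) identity, which is quadratic in the number of pairs and therefore only suited to small examples.

```python
def auc(scores, labels):
    """ROC AUC as the fraction of positive-negative pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def select_gamma(score_fn, labels, grid=None):
    """Return the gamma in the grid with the highest AUC (first on ties)."""
    if grid is None:
        grid = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1 ... 0.9
    return max(grid, key=lambda g: auc(score_fn(g), labels))
```

For instance, `auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` is 0.75, since three of the four positive-negative pairs are ordered correctly.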
Performance evaluation on all ADRs
We compare the proposed methods with four baselines (PRR, ROR, MGPS, BCPNN) using data from all years and report the six metrics in Table 3. “LP-Method name” denotes the proposed method together with the signal detection algorithm used to generate the original signals. From Table 3, we observe that among the four signal detection algorithms, MGPS outperforms the other baseline methods, with the best AUC and AUPR scores. Our methods are better than all the corresponding baseline methods in terms of AUC scores, AUPR scores and precision. The results demonstrate that drug-drug similarities can help to enhance the safety signals, since similar drugs may induce the same ADRs. In this way, the original drug safety signals are improved by incorporating information from similar drugs.
Table 3.
Method | AUC | AUPR | Precision | Recall | Accuracy | F1 |
---|---|---|---|---|---|---|
PRR | 0.716 | 0.517 | 0.786 | 0.466 | 0.629 | 0.586 |
LP-PRR | 0.728 | 0.534 | 0.801 | 0.478 | 0.644 | 0.588 |
ROR | 0.716 | 0.518 | 0.786 | 0.466 | 0.629 | 0.585 |
LP-ROR | 0.728 | 0.534 | 0.801 | 0.477 | 0.643 | 0.588 |
MGPS | 0.727 | 0.544 | 0.746 | 0.483 | 0.649 | 0.586 |
LP-MGPS | 0.751 | 0.574 | 0.770 | 0.498 | 0.665 | 0.601 |
BCPNN | 0.670 | 0.445 | 0.867 | 0.428 | 0.570 | 0.573 |
LP-BCPNN | 0.671 | 0.449 | 0.911 | 0.428 | 0.574 | 0.573 |
Evaluation metrics at fixed levels of sensitivity and specificity can be found in Table S1 of Additional file 1. Bold in the table marks the maximum value of each evaluation metric across methods.
We also plot the yearly change in AUC and AUPR scores for LP-MGPS and MGPS in Fig. 3. Here, the values 2004, 2005, ..., 2014 on the horizontal axis indicate that the reports used to generate signals are accumulated from 2004 to that year (i.e., 2008 denotes that reports from 2004 to 2008 are used to generate signals). According to Fig. 3, our method LP-MGPS outperforms its corresponding baseline MGPS for every cumulative year. In addition, the advantage of the proposed method is especially pronounced when only reports from the early years are available.
Performance evaluation on representative ADRs
To further characterize the performance of the proposed method, we select ADRs from the Designated Medical Event (DME) list [35] for additional comparisons. DME, released by the European Medicines Agency (EMA), is a list of standardized medical concept terms for inherently serious ADRs. We map the ADRs of DME to our datasets and remove the ADRs associated with fewer than 10 drugs. This leaves 31 ADRs for performance evaluation, and Table 4 shows the comparison of the proposed LP-MGPS and the original MGPS algorithm on the top 15 ADRs ranked by AUPR scores. “Number of positive drugs” denotes the number of drugs associated with each ADR. Here, we use MGPS as our base signal detection algorithm since it yields the highest AUC and AUPR scores for this task. According to the results, the proposed method is better than the corresponding baseline method on all 15 ADRs in terms of AUPR scores, and it outperforms the baseline in most cases for AUC scores. (More experiments on these representative ADRs can be found in Table S5 and Table S6 of Additional file 1.)
Table 4.
ADR concept ID | ADR name | Number of positive drugs | AUPR (MGPS) | AUPR (LP-MGPS) | AUC (MGPS) | AUC (LP-MGPS) |
---|---|---|---|---|---|---|
36009756 | Anaphylactic reaction | 373 | 0.968 | 0.973 | 0.779 | 0.798 |
35104877 | Febrile neutropenia | 52 | 0.968 | 0.972 | 0.955 | 0.962 |
35707713 | Pancreatitis | 197 | 0.956 | 0.959 | 0.862 | 0.865 |
36009762 | Angioedema | 328 | 0.949 | 0.955 | 0.794 | 0.807 |
35406359 | Deafness | 123 | 0.932 | 0.940 | 0.819 | 0.832 |
37019318 | Renal failure | 207 | 0.937 | 0.939 | 0.824 | 0.828 |
36009760 | Anaphylactoid shock | 151 | 0.869 | 0.928 | 0.681 | 0.756 |
35104879 | Granulocytopenia | 224 | 0.901 | 0.925 | 0.756 | 0.789 |
36009724 | Stevens-Johnson syndrome | 209 | 0.917 | 0.922 | 0.815 | 0.825 |
36516888 | Rhabdomyolysis | 90 | 0.914 | 0.920 | 0.866 | 0.868 |
35104103 | Bone marrow failure | 195 | 0.914 | 0.920 | 0.758 | 0.756 |
36009707 | Erythema multiforme | 252 | 0.911 | 0.918 | 0.777 | 0.782 |
35104281 | Haemolytic anaemia | 128 | 0.901 | 0.916 | 0.788 | 0.785 |
35909518 | Hepatic failure | 136 | 0.910 | 0.915 | 0.813 | 0.820 |
35104101 | Aplastic anaemia | 109 | 0.885 | 0.913 | 0.748 | 0.802 |
Bold in the table marks the maximum value of each evaluation metric.
Discussion
In this study, we build a label propagation framework that enriches post-market safety reports with a pre-clinical drug similarity network to generate enhanced safety signals. The proposed method improves on the corresponding baseline for all four signal detection algorithms in AUC, AUPR and precision (Table 3), improves AUPR on all 15 top-ranked serious ADRs (Table 4), and achieves its best performance with MGPS as the base algorithm.
We further demonstrate the performance of the proposed method on newly approved drugs that have few (or even no) reports in SRS. The FDA releases safety-related labels for a drug from its approval onward, and ADRs are recorded in the drug's labeling information. The labeling information may be revised quarterly based on post-marketing surveillance. Here, we report the performance of ADR detection for two recently approved drugs, “liraglutide” and “pazopanib”, in Fig. 4. We use the MGPS-based method to generate the original signals since we obtain the best performance with MGPS. We compute the yearly ranking of the drug for the ADR and the number of drug-ADR cases in SRS. The horizontal axis represents the cumulative years from 2004 to the current year. The rank on the vertical axis denotes the percentile of the drug ranking, which is computed after sorting the entire drug list in descending order.
Liraglutide is a medication used to treat diabetes or obesity [36]; it was approved for medical use in the United States in 2010 [37] and in Europe in 2009 [38]. In 2011, renal failure was added to the labeling information of liraglutide [39]. According to Fig. 4a, the liraglutide-renal failure pair first appeared in SRS in 2010 and accumulated to 11 cases by 2014. Thus, the baseline, which relies entirely on sufficient cases, can only generate signals for this pair after 2010. The ranking of liraglutide gradually increases as more years of data accumulate. The proposed method performs better than the baseline after 2010. More importantly, the proposed method is able to generate signals before 2010 and can predict that liraglutide causes renal failure as early as 2005 by taking the case reports of drugs similar to liraglutide into consideration. Therefore, the proposed method can detect this safety signal earlier than the FDA labeling revision.
Pazopanib is a medicine used for the treatment of advanced renal cell carcinoma (RCC) and advanced soft tissue sarcoma (STS) [40]. It was approved for medical use in the United States in 2009 [41] and in Europe in 2010 [42]. Impaired wound healing was included as one of the syndromes in the labeling information of pazopanib in 2014 [43]. As shown in Fig. 4b, the pazopanib-impaired wound healing pair was first reported in SRS in 2009 and accumulated to 77 cases by 2014. The baseline cannot generate signals for this pair without any cases. However, the proposed method is able to identify potential safety signals before 2009, and the yearly rankings of pazopanib confirm that our method can detect the safety signal before the FDA safety label change.
These two cases illustrate that the framework can generate drug safety signals before approval, when no reports exist, and that it outperforms the baseline in early detection, ahead of the drug label change which every pharmacy is trying to avoid.
Conclusions
In this paper, we present a label propagation framework that integrates drug chemical information with post-market safety reports to generate enhanced drug safety signals. The drug safety signals are enhanced by label propagation over the drug similarities computed from the chemical information. We compare the performance of our methods with four different state-of-the-art signal detection algorithms (PRR, ROR, MGPS, BCPNN) using safety reports from SRS. The results demonstrate that the proposed methods outperform their corresponding baselines in generating accurate drug safety signals. Experiments show that our methods are able to detect potential ADRs for newly approved drugs with few safety reports, which paves the way for early detection of ADRs.
This study can be extended in multiple directions in the future in terms of both drug features and post-market real-world evidence. Other types of available data sources of drugs such as chemical-protein binding and therapeutic indication data can be leveraged for the construction of drug similarity networks. Furthermore, the label propagation framework can be applied to enhance drug safety signals generated by other real-world evidence such as EHRs and medical claims.
Supplementary information
Acknowledgements
Not applicable.
Abbreviations
- ADRs: Adverse drug reactions
- FAERS: FDA Adverse Event Reporting System
- PRR: Proportional Reporting Ratio
- ROR: Reporting Odds Ratio
- MGPS: Multi-item Gamma Poisson Shrinker
- BCPNN: Bayesian Confidence Propagation Neural Network
Authors’ contributions
PZ conceived the project. RL and PZ developed the method. RL conducted the experiments. RL and PZ wrote the manuscript. All authors read and approved the final manuscript.
Funding
This work was funded in part by the National Center for Advancing Translational Research of the National Institutes of Health under award number CTSA Grant UL1TR002733. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Availability of data and materials
The datasets used and analyzed during the current study are available from the curated FAERS [26] and SIDER [29]. The code is available at https://github.com/ruoqi-liu/LP-SDA. All datasets and software used in this study are fully accessible, free of charge.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
PZ is a member of the editorial board of BMC Medical Informatics and Decision Making. The authors declare that they have no other competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Ruoqi Liu, Email: liu.7324@osu.edu.
Ping Zhang, Email: zhang.10631@osu.edu.
Supplementary information
Supplementary information accompanies this paper at 10.1186/s12911-019-0999-1.
References
- 1.Edwards IR, Aronson JK. Adverse drug reactions: definitions, diagnosis, and management. Lancet. 2000;356(9237):1255–9. doi: 10.1016/S0140-6736(00)02799-9. [DOI] [PubMed] [Google Scholar]
- 2.Lazarou J, Pomeranz BH, Corey PN. Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies. Jama. 1998;279(15):1200–5. doi: 10.1001/jama.279.15.1200. [DOI] [PubMed] [Google Scholar]
- 3.Giacomini KM, Krauss RM, Roden DM, Eichelbaum M, Hayden MR, Nakamura Y. When good drugs go bad. Nature. 2007;446(7139):975. doi: 10.1038/446975a. [DOI] [PubMed] [Google Scholar]
- 4.Harpaz R, DuMouchel W, LePendu P, Bauer-Mehren A, Ryan P, Shah NH. Performance of pharmacovigilance signal-detection algorithms for the fda adverse event reporting system. Clin Pharmacol Ther. 2013;93(6):539–46. doi: 10.1038/clpt.2013.24. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Harpaz R, Vilar S, DuMouchel W, Salmasian H, Haerian K, Shah NH, Chase HS, Friedman C. Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions. J Am Med Inform Assoc. 2012;20(3):413–9. doi: 10.1136/amiajnl-2012-000930. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Li Y, Ryan PB, Wei Y, Friedman C. A method to combine signals from spontaneous reporting systems and observational healthcare data to detect adverse drug reactions. Drug Saf. 2015;38(10):895–908. doi: 10.1007/s40264-015-0314-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Leaman R, Wojtulewicz L, Sullivan R, Skariah A, Yang J, Gonzalez G. Towards internet-age pharmacovigilance: extracting adverse drug reactions from user posts to health-related social networks. In: Proceedings of the 2010 Workshop on Biomedical Natural Language Processing. Association for Computational Linguistics: 2010. p. 117–125.
- 8.Nikfarjam A, Sarker A, O’connor K, Ginn R., Gonzalez G. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features. J Am Med Inform Assoc. 2015;22(3):671–81. doi: 10.1093/jamia/ocu041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Evans S, Waller PC, Davis S. Use of proportional reporting ratios (prrs) for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiol Drug Saf. 2001;10(6):483–6. doi: 10.1002/pds.677. [DOI] [PubMed] [Google Scholar]
- 10.Rothman KJ, Lanes S, Sacks ST. The reporting odds ratio and its advantages over the proportional reporting ratio. Pharmacoepidemiol Drug Saf. 2004;13(8):519–23. doi: 10.1002/pds.1001. [DOI] [PubMed] [Google Scholar]
- 11.Waller P, Van Puijenbroek E, Egberts A, Evans S. The reporting odds ratio versus the proportional reporting ratio:’deuce’. Pharmacoepidemiol Drug Saf. 2004;13(8):525–6. doi: 10.1002/pds.1002. [DOI] [PubMed] [Google Scholar]
- 12.DuMouchel W. Bayesian data mining in large frequency tables, with an application to the fda spontaneous reporting system. Am Stat. 1999;53(3):177–90. [Google Scholar]
- 13.Bate A, Lindquist M, Edwards IR, Olsson S, Orre R, Lansner A, De Freitas RM. A bayesian neural network method for adverse drug reaction signal generation. Eur J Clin Pharmacol. 1998;54(4):315–21. doi: 10.1007/s002280050466. [DOI] [PubMed] [Google Scholar]
- 14.Vilar S, Harpaz R, Chase HS, Costanzi S, Rabadan R, Friedman C. Facilitating adverse drug event detection in pharmacovigilance databases using molecular structure similarity: application to rhabdomyolysis. J Am Med Inform Assoc. 2011;18(Supplement_1):73–80. doi: 10.1136/amiajnl-2011-000417.
- 15.Vilar S, Harpaz R, Santana L, Uriarte E, Friedman C. Enhancing adverse drug event detection in electronic health records using molecular structure similarity: application to pancreatitis. PloS One. 2012;7(7):41471. doi: 10.1371/journal.pone.0041471.
- 16.Vilar S, Ryan P, Madigan D, Stang P, Schuemie M, Friedman C, Tatonetti N, Hripcsak G. Similarity-based modeling applied to signal detection in pharmacovigilance. CPT: Pharmacometrics Syst Pharmacol. 2014;3(9):1–9. doi: 10.1038/psp.2014.35.
- 17.Vilar S, Tatonetti NP, Hripcsak G. 3D pharmacophoric similarity improves multi adverse drug event identification in pharmacovigilance. Sci Rep. 2015;5:8809. doi: 10.1038/srep08809.
- 18.Fliri AF, Loging WT, Thadeio PF, Volkmann RA. Analysis of drug-induced effect patterns to link structure and side effects of medicines. Nat Chem Biol. 2005;1(7):389. doi: 10.1038/nchembio747.
- 19.Atias N, Sharan R. An algorithmic framework for predicting side effects of drugs. J Comput Biol. 2011;18(3):207–218. doi: 10.1089/cmb.2010.0255.
- 20.Pauwels E, Stoven V, Yamanishi Y. Predicting drug side-effect profiles: a chemical fragment-based approach. BMC Bioinformatics. 2011;12(1):169. doi: 10.1186/1471-2105-12-169.
- 21.Liu M, Wu Y, Chen Y, Sun J, Zhao Z, Chen X-w, Matheny ME, Xu H. Large-scale prediction of adverse drug reactions using chemical, biological, and phenotypic properties of drugs. J Am Med Inform Assoc. 2012;19(e1):28–35. doi: 10.1136/amiajnl-2011-000699.
- 22.Zhang W, Yue X, Liu F, Chen Y, Tu S, Zhang X. A unified frame of predicting side effects of drugs by using linear neighborhood similarity. BMC Syst Biol. 2017;11(6):101. doi: 10.1186/s12918-017-0477-2.
- 23.Dey S, Luo H, Fokoue A, Hu J, Zhang P. Predicting adverse drug reactions through interpretable deep learning framework. BMC Bioinformatics. 2018;19(21):476. doi: 10.1186/s12859-018-2544-0.
- 24.Luo H, Fokoue-Nkoutche A, Singh N, Yang L, Hu J, Zhang P. Molecular docking for prediction and interpretation of adverse drug reactions. Comb Chem High Throughput Screen. 2018;21(5):314–22. doi: 10.2174/1386207321666180524110013.
- 25.FDA’s Adverse Event Reporting System (FAERS). https://open.fda.gov/data/faers/. Accessed 30 June 2019.
- 26.Banda J, Evans L, Vanguri R, Tatonetti N, Ryan P, Shah N. Data from: A curated and standardized adverse drug event resource to accelerate drug safety research. Dryad Digital Repository. 2016. doi: 10.5061/dryad.8q0s4.
- 27.Brown EG, Wood L, Wood S. The medical dictionary for regulatory activities (MedDRA). Drug Saf. 1999;20(2):109–17. doi: 10.2165/00002018-199920020-00002.
- 28.Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, et al. PubChem substance and compound databases. Nucleic Acids Res. 2015;44(D1):1202–13. doi: 10.1093/nar/gkv951.
- 29.Kuhn M, Campillos M, Letunic I, Jensen LJ, Bork P. A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol. 2010;6(1):343. doi: 10.1038/msb.2009.98.
- 30.Tatonetti NP, Ye PP, Daneshjou R, Altman RB. Data-driven prediction of drug effects and interactions. Sci Transl Med. 2012;4(125):125ra31. doi: 10.1126/scitranslmed.3003377.
- 31.Szarfman A, Machado SG, O’Neill RT. Use of screening algorithms and computer systems to efficiently signal higher-than-expected combinations of drugs and events in the US FDA’s spontaneous reports database. Drug Saf. 2002;25(6):381–92. doi: 10.2165/00002018-200225060-00001.
- 32.PubChem Substructure Fingerprint V1.3. ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_fingerprints.txt. Accessed 30 June 2019.
- 33.Zhou D, Bousquet O, Lal TN, Weston J, Schölkopf B. Learning with local and global consistency. In: Advances in Neural Information Processing Systems: 2004. p. 321–328.
- 34.Zhang W, Yue X, Huang F, Liu R, Chen Y, Ruan C. Predicting drug-disease associations and their therapeutic function based on the drug-disease association bipartite network. Methods. 2018;145:51–59. doi: 10.1016/j.ymeth.2018.06.001.
- 35.Designated Medical Event (DME). https://www.ema.europa.eu/en/human-regulatory/post-authorisation/pharmacovigilance/signal-management. Accessed 30 June 2019.
- 36.Liraglutide: Monograph for Professionals. https://www.drugs.com/monograph/liraglutide.html. Accessed 30 June 2019.
- 37.Liraglutide: FDA Approved Drug Products. https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overviewprocess&ApplNo=022341. Accessed 30 June 2019.
- 38.Liraglutide: European Medicines Agency. https://www.ema.europa.eu/en/medicines/human/EPAR/victoza. Accessed 30 June 2019.
- 39.Liraglutide: FDA Approved Drug Products Safety Label. https://www.accessdata.fda.gov/drugsatfda_docs/label/2011/022341s004lbl.pdf. Accessed 30 June 2019.
- 40.Pazopanib: Uses, Side Effects and Warnings. https://www.drugs.com/mtm/pazopanib.html. Accessed 30 June 2019.
- 41.Pazopanib: FDA Approved Drug Products. https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overview.process&varApplNo=022465. Accessed 30 June 2019.
- 42.Pazopanib: European Medicines Agency. https://www.ema.europa.eu/en/medicines/human/EPAR/votrient. Accessed 30 June 2019.
- 43.Pazopanib: FDA Approved Drug Products Safety Label. https://www.accessdata.fda.gov/drugsatfda_docs/label/2014/022465s018lbl.pdf. Accessed 30 June 2019.
Associated Data
Data Availability Statement
The datasets used and analyzed during the current study are available from the curated FAERS [26] and SIDER [29]. The code is available at https://github.com/ruoqi-liu/LP-SDA. All datasets and software used in this study are freely accessible at no charge.