Table 1.
Authors | Data Source | Sample Size | Horizon of Data Collection | Software Used | Techniques and Classifiers Used | Outcome | Result | Description of Result |
---|---|---|---|---|---|---|---|---|
[31] | Social media | Twitter.com: 1642 tweets | 3 years | Toolkit for Multivariate Analysis | Artificial Neural Networks (ANN), Boosted Decision Trees with AdaBoost (BDT), Boosted Decision Trees with Bagging (BDTG), Sentiment Analysis, Support Vector Machines (SVM) | Reported ADRs for HIV treatment | Positive | Reported adverse effects are consistent with well-recognized toxicities. |
[32] | Forums | DepressionForums.org: 7726 posts | 10 years | General Architecture for Text Engineering (GATE), NLTK Toolkit within MATLAB, RapidMiner | Hyperlink-Induced Topic Search (HITS), k-Means Clustering, Network Analysis, Term Frequency-Inverse Document Frequency (TF-IDF) | User sentiment on depression drugs | Positive | Natural language processing is suitable to extract information on ADRs concerning depression. |
[66] | Social media | Twitter.com: 2,102,176,189 tweets | 1 year | Apache Lucene | MetaMap, Support Vector Machines (SVM) | Reported ADRs for cancer | Neutral | Classification models had limited performance. Adverse events related to cancer drugs can potentially be extracted from tweets. |
[33] | Social media | Twitter.com: 6528 tweets | Unknown | GENIA tagger, Hunspell, Snowball stemmer, Stanford Topic Modelling Toolbox, Twokenizer | Backward/Forward Sequential Feature Selection (BSFS/FSFS) Algorithm, k-Means Clustering, Sentiment Analysis, Support Vector Machines (SVM) | Reported ADRs | Positive | ADRs were identified reasonably well. |
[34] | Social media | Twitter.com: 32,670 tweets | Unknown | Hunspell, Twitter tokenizer | Term Frequency-Inverse Document Frequency (TF-IDF) | Reported ADRs | Neutral | ADRs were not identified very well. |
[67] | Social media | Twitter.com: 10,822 tweets | Unknown | Unknown | Naive Bayes (NB), Natural Language Processing (NLP), Support Vector Machines (SVM) | Reported ADRs | Positive | ADRs were identified well. |
[35] | Drug reviews | Drugs.com, Drugslib.com: 218,614 reviews | Unknown | BeautifulSoup | Logistic Regression, Sentiment Analysis | Patient satisfaction with drugs, Reported ADRs, Reported effectiveness of drugs | Positive | Classification results were very good. |
[36] | Social media | Twitter.com: 172,800 tweets | 1 year | Twitter4J | Decision Trees, Medical Profile Graph, Natural Language Processing (NLP) | Reported ADRs | Positive | Building a medical profile of users enables the accurate detection of adverse drug events. |
[37] | Social media | Twitter.com: 1245 tweets | Unknown | CRF++ Toolkit, GENIA tagger, Hunspell, Twitter REST API, Twokenizer | Natural Language Processing (NLP) | Reported ADRs | Positive | ADRs were identified reasonably well. |
[38] | Drug reviews | WebMD.com: Unknown | Unknown | SentiWordNet, WordNet | Sentiment Analysis, Support Vector Machines (SVM), Term-Document Matrix (TDM) | User sentiment on cancer drugs | Positive | Sentiment on ADRs was identified reasonably well. |
[39] | Drug reviews, Social media | DailyStrength.org: 6279 reviews, Twitter.com: 1784 tweets | Unknown | Unknown | ADRMine, Lexicon-based, MetaMap, Support Vector Machines (SVM) | Reported ADRs | Positive | ADRs were identified very well. |
[40] | Drug reviews, Social media | PatientsLikeMe.com: 796 reviews, Twitter.com: 39,127 tweets, WebMD.com: 2567 reviews, YouTube.com: 42,544 comments | Not applicable | Deeply Moving | Unknown | Patient-reported medication outcomes | Positive | Social media serves as a new data source to extract patient-reported medication outcomes. |
[68] | Forums | Medications.com: 8065 posts, SteadyHealth.com: 11,878 posts | Not applicable | Java Hidden Markov Model library, jsoup | Hidden Markov Model (HMM), Natural Language Processing (NLP) | Reported ADRs | Positive | Reported adverse effects are consistent with well-recognized side effects. |
[69] | Electronic Health Record (EHR) | 25,074 discharge summaries | Not applicable | MedLEE | Unknown | Reported ADRs | Positive | Reported adverse effects are consistent with well-recognized toxicities (recall: 75%; precision: 31%). |
[41] | Social media | Twitter.com: 3251 tweets | Not applicable | AFINN, Bing Liu sentiment words, Multi-Perspective Question Answering (MPQA), SentiWordNet, TextBlob, Tweepy, WEKA | MetaMap, Naive Bayes (NB), Natural Language Processing (NLP), Sentiment Analysis, Support Vector Machines (SVM) | Reported ADRs | Positive | Several well-known ADRs were identified. |
[70] | Forums | MedHelp.org: 6244 discussion threads | Unknown | Unknown | Association Mining | Reported ADRs | Positive | ADRs were identified. |