Author manuscript; available in PMC: 2016 Apr 1.
Published in final edited form as: J Biomed Inform. 2015 Feb 23;54:202–212. doi: 10.1016/j.jbi.2015.02.004

Table 2.

A summary table showing primary ADR detection approaches and evaluation methodologies.

| Study | Research Aim | Primary Approach(es) | Evaluation Methodology |
|---|---|---|---|
| Leaman et al. (2010) [17] | Concept/relation extraction | Lexicon-based (450 comments for system development). | Quantitative. Against manually annotated data (3,150 instances). |
| Nikfarjam and Gonzalez (2011) [34] | Concept/relation extraction | Lexical pattern-matching (2,400 comments for pattern building). Association rule mining to identify patterns. | Quantitative. Against manually annotated data (1,200 instances). |
| Chee et al. (2011) [40] | Drug classification | Ensemble classification using drug categories as classes. | Mixed. Classification results are combined to generate drug scores for 3 drugs, which are compared against scores for drugs (12) with known adverse effects. |
| Benton et al. (2011) [42] | Concept/relation extraction | Lexicon-based. Association rule mining to identify drug-reaction pairs. | Quantitative. Adverse reactions associated with drugs obtained from product labels and compared against system reported adverse events. |
| Hadzi-puric and Grmusa (2012) [43] | Concept/relation extraction | Lexicon-based approach for ADR detection. Statistical scoring for identifying drug relation associations. | Mixed. Qualitative analysis of identified ADRs against known ADRs. Recall, precision and F-score computed for evaluation against annotated data. |
| Yang et al. (2012) [44] | Concept/relation extraction | Lexicon-based. Association rule mining to identify drug-reaction pairs. | Quantitative. FDA AERS used as the gold standard. Lift, Leverage, and Proportional Reporting Ratio used as metrics. |
| Bian et al. (2012) [45] | ADR classification | Classification of tweets using Support Vector Machine (SVM) classifiers. Two classifiers built: one to predict if a user has used a drug (based on the tweets), and the second to classify if a post contains an adverse effect. | Mixed. Evaluation and training is performed on the same data. Only classification accuracies reported. Analysis describes the limitations introduced by noise in Twitter. |
| Liu and Chen (2013) [46] | Concept/relation extraction | Lexicon-based approach for ADR and drug detection. Shortest dependency path based machine learning algorithm for relation extraction. | Quantitative. Separate evaluations for entity extraction, ADR detection and classification of patient experiences using 200 manually annotated comments. |
| Yang et al. (2013) [48] | ADR classification | A combination of supervised and unsupervised approaches for training binary classifiers. A mixture of syntactic, semantic, and sentiment features are used to train SVM and Naïve Bayes classifiers. | Quantitative. Evaluation performed on 1,600 annotated instances. Evaluation demonstrates that the combination of supervised and unsupervised training performs significantly better than using supervised training only. |
| Jiang and Zheng (2013) [49] | Concept/relation extraction and classification | Supervised classification of tweets using a Maximum Entropy classifier trained on a data set of 600 tweets only. MetaMap [67] to identify drug and ADR categories. | Mixed. 285 tweets for testing the classification accuracy. ADR extraction accuracy is evaluated against known adverse reactions. |
| Yates and Goharian (2013) [50] | Concept/relation extraction | Pattern-based. 7 patterns used for extracting ADRs from approximately 125 manually annotated comments. | Quantitative. Against manually annotated data (125 instances). |
| Yeleswarapu et al. (2014) [54] | Concept/relation extraction | Lexicon-based. Prepared lexicon used for drug and ADR detection. Association rule mining and BCPNN used for identifying drug-symptom and drug-disease pairs. | Qualitative. Evaluation is performed via comparative analysis with findings from previous studies. Primary conclusion of evaluation is that combining social media data with other sources such as medical literature and ADR databases can improve ADR detection performance. |
| Freifeld et al. (2014) [57] | Concept/relation extraction | Lexicon-based. A prepared lexicon is used to detect ADRs. Aggregated frequencies are used to compare drug-reaction pairs. | Quantitative. Aggregated frequency of identified product-event pairs compared with data from AERS. Correlation between the two sources computed to assess the effectiveness of social media as a resource for ADR monitoring. |
| Segura et al. (2014) [58] | Concept/relation extraction | Lexicon-based. A prepared lexicon was used in a multilingual text analysis engine to detect drugs and ADRs in text. | Quantitative. Against manually annotated data (400 instances). Drug and ADR detection evaluated separately. |
| Ginn et al. (2014) [38] | Corpus presentation/description. Supervised learning experiments to illustrate utility of corpus. | Supervised classification of ADR assertive tweets using 10-fold cross validation over a large annotated data set of 10,822 tweets. Data set artificially balanced to lower ADR-noADR class imbalance. | Quantitative. Evaluated against annotated data on the artificially balanced data set. |
| Liu et al. (2014) [60] | Medical entity extraction, adverse event extraction, report source classification. | Lexicon-based approach for entity extraction and ADR extraction. Rule-based approach for relation classification. | Quantitative. Against manually annotated data (600). Same set of instances used for the tasks of events and treatments recognition, ADR identification, and patient report extraction. |
| Patki et al. (2014) [39] | ADR/drug classification | Supervised classification of ADR assertive comments using SVMs and a rich set of features extracted via NLP techniques. Probabilities of all comments associated with each drug combined to predict if drug should be categorized as normal or blackbox. | Mixed. Annotated data used for evaluating the classification task. Accuracy values used for evaluating drug categorization strategy. |
| O’Connor et al. (2014) [35] | Concept/relation extraction | Lexicon-based approach for detecting ADR mentions in Twitter data. Lexicon created by combining several existing ADR lexicons. | Quantitative. Against manually annotated data (1,873 instances). |
| Yang et al. (2014) [61] | Drug-ADR relation extraction | Lexicon-based approach for detecting ADR mentions. Association rule mining to identify relationships between drugs and ADRs. | Quantitative. Lift and Proportional Reporting Ratio for scoring association of ADRs with drugs. Recall, precision and F-measure used to compare the performance against three publicly available systems. |
| Sampathkumar et al. (2014) [62] | Concept/relation extraction and relationship (causal) identification. | Lexicon-based approach for detecting mentions of ADRs. Hidden Markov Model applied to detect relationship between drug-ADR pairs. | Mixed. 10-fold cross validation against manually annotated data (2,000 instances). Extracted ADRs compared against drug package labels to verify performance and to identify unknown ADRs. |
| Sarker and Gonzalez (2014) [41] | ADR classification | Supervised classification to detect ADR assertive texts. Features incorporated from distinct research areas such as sentiment analysis, polarity classification and topic modeling. Multiple corpora combined to boost classification performance. | Quantitative. F-score for the ADR class is computed against gold standard annotations. |
| Nikfarjam et al. (2014) [65] | Concept/relation extraction | Concept extraction is performed using supervised learning via conditional random fields (CRF). Word clusters, learnt from large unlabeled data, are used as features. | Quantitative. Against manually annotated data (1,559 and 444 instances for two data sources). |
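
Several of the studies summarized above score drug-reaction pairs with Lift, Leverage, and the Proportional Reporting Ratio (PRR). The following is a minimal sketch of how these three metrics can be computed from a 2x2 contingency table of post counts. It uses the common textbook definitions, which may differ in detail from what any individual study implemented. The `PairCounts` class, the function names, and the example counts are hypothetical and are not taken from the reviewed papers; the sketch also assumes all marginal counts are non-zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCounts:
    """2x2 contingency counts for one drug-reaction pair (hypothetical helper)."""
    both: int           # posts mentioning the drug and the reaction
    drug_only: int      # posts mentioning the drug but not the reaction
    reaction_only: int  # posts mentioning the reaction but not the drug
    neither: int        # posts mentioning neither

    @property
    def total(self) -> int:
        return self.both + self.drug_only + self.reaction_only + self.neither


def lift(c: PairCounts) -> float:
    """P(drug, reaction) / (P(drug) * P(reaction)); values above 1 suggest association."""
    drug = c.both + c.drug_only
    reaction = c.both + c.reaction_only
    return (c.both * c.total) / (drug * reaction)


def leverage(c: PairCounts) -> float:
    """P(drug, reaction) - P(drug) * P(reaction); 0 under independence."""
    n = c.total
    p_both = c.both / n
    p_drug = (c.both + c.drug_only) / n
    p_reaction = (c.both + c.reaction_only) / n
    return p_both - p_drug * p_reaction


def prr(c: PairCounts) -> float:
    """Reaction rate among posts with the drug divided by the rate among posts without it."""
    rate_with_drug = c.both / (c.both + c.drug_only)
    rate_without_drug = c.reaction_only / (c.reaction_only + c.neither)
    return rate_with_drug / rate_without_drug


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    counts = PairCounts(both=20, drug_only=80, reaction_only=30, neither=870)
    print(round(lift(counts), 3))      # 4.0
    print(round(leverage(counts), 3))  # 0.015
    print(round(prr(counts), 3))       # 6.0
```

In this made-up example the reaction is mentioned in 20% of posts about the drug but in only about 3.3% of the remaining posts, so all three metrics point to a positive association. Whether such a signal reflects a true adverse reaction still has to be checked against a reference source such as AERS or product labels, as the evaluation column of the table indicates.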