Abstract
Objective
Apply natural language processing (NLP) to Amazon consumer reviews to identify adverse events (AEs) associated with unapproved over-the-counter (OTC) homeopathic drugs and compare findings with reports to the US Food and Drug Administration Adverse Event Reporting System (FAERS).
Materials and methods
Data were extracted from publicly available Amazon reviews and analyzed using JMP 16 Pro Text Explorer. Topic modeling identified themes. Sentiment analysis (SA) explored consumer perceptions. A machine learning model optimized prediction of AEs in reviews. Reports for the same time interval and product class were obtained from the FAERS public dashboard and analyzed.
Results
Homeopathic cough/cold products were the largest category common to both data sources (Amazon = 616, FAERS = 445) and were analyzed further. Oral symptoms and unpleasant taste were described in both datasets. Amazon reviews describing an AE had lower Amazon ratings (χ2 = 224.28, P < .0001). The optimal model for predicting AEs was a Neural Boosted model with 5-fold cross-validation combining topic modeling and Amazon ratings as predictors (mean AUC = 0.927).
Discussion
Topic modeling and SA of Amazon reviews provided information about consumers’ perceptions and opinions of homeopathic OTC cough and cold products. Amazon ratings appear to be a good indicator of the presence or absence of AEs, and identified events were similar to FAERS.
Conclusion
Amazon reviews may complement traditional data sources to identify AEs associated with unapproved OTC homeopathic products. This study is the first to use NLP in this context and lays the groundwork for future larger scale efforts.
Keywords: natural language processing, drug safety, adverse drug event, homeopathic remedies, OTC drugs, consumer preferences
Introduction
Objective
This study explored whether applying natural language processing (NLP) techniques to Amazon consumer reviews may help to identify and understand safety concerns associated with mass-marketed over-the-counter (OTC) homeopathic drug products. Findings were analyzed and compared with data extracted from the US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) public dashboard.
Background and significance
All medications can potentially cause adverse events (AEs), some of which may be serious. The FDA defines an AE as “any undesirable experience associated with the use of a medical product in a patient.”1 In this study, to learn more about consumers’ undesirable experiences with OTC homeopathic drug products, we considered an AE to include a side effect, a lack of effectiveness, or a product quality issue. Pharmacovigilance is the science of detecting, understanding, and preventing AEs, and it plays a crucial role in protecting global public health.2
Pharmacovigilance experts use several data sources to identify safety concerns. These include case reports spontaneously submitted to drug manufacturers and regulatory agencies, case studies published in medical journals, findings from clinical trials in which drugs are evaluated for safety and effectiveness, and surveillance tools like the National Poison Data System and the Drug-Induced Liver Injury Network.3 The FAERS and Uppsala Monitoring Centre’s VigiBase are examples of databases that contain spontaneously submitted case reports.3,4
There is no pharmacovigilance data source capable of detecting all AEs, because AEs are not always reported.5 This creates a challenge for drug safety surveillance systems around the globe.6 Underreporting of AEs associated with unapproved drugs may be particularly problematic.7
According to the FDA website, unapproved drugs have not been reviewed by the FDA for safety, effectiveness, or quality.8 Homeopathic products are a subset of unapproved drugs that are often mass marketed, readily available OTC, and used for self-treatment of various maladies.8,9 Their use is based on homeopathy, an alternative approach to health care developed in Germany more than 2 centuries ago.8,10,11
Identifying AEs associated with homeopathic and other unapproved drugs can be difficult, and, for reasons mentioned, the usual pharmacovigilance data sources may not be sufficient. With the evolution of powerful computing resources and access to large datasets, non-traditional pharmacovigilance data sources like online consumer reviews are being investigated.12–14 NLP methods available for exploring such large datasets include sentiment analysis (SA) and topic modeling.15,16
Several publications have utilized SA and topic modeling to explore online consumer reviews for safety surveillance. Sullivan et al.13 used NLP to explore Amazon reviews for AEs associated with nutritional supplements. They employed Latent Dirichlet Allocation (LDA) and a dictionary of adverse drug reaction terms to categorize products into high, average, and low potential danger categories. The output was about 70% accurate compared with 2 human annotators. Torii et al. mined Amazon grocery product reviews from a publicly available dataset of Amazon product reviews.14,17–19 They extracted health-related information and categorized it using a machine learning (ML) classifier. They found descriptions of health-related benefits and AEs related to grocery products, such as ginger candy helping with an upset stomach or an energy drink causing a headache. Goldberg et al. used the same Amazon dataset to develop a text mining approach for postmarket food safety surveillance.12,17–19 They complemented these data with reports of food poisoning collected by the consumer website https://iwaspoisoned.com and identified text associated with consumers’ use of hazardous food. Another study explored the potential of social media data to complement FDA adverse event data and identified areas of similarity and divergence between Twitter and FAERS.20
These articles are part of an emerging body of literature that demonstrates the utility of mining online consumer product reviews to identify product safety concerns. However, none of the literature has studied unapproved OTC homeopathic products. This gap is addressed in the present study.
Additional background information is available in the Supplementary Material.
Methods
Data extraction and cleaning
We extracted data from 2 publicly available datasets. The first is a set of Amazon consumer reviews encompassing a range of products.17–19 The other dataset was extracted from the FAERS public dashboard.21 To allow for comparison in our study, we extracted FAERS data for the same date range as the Amazon review data and limited this dataset to reports from consumers with single product exposures.
The Amazon product review datasets provided deduplicated data from May 1996 to July 2014.18 “Review” and “metadata” subsets were extracted from the product category, “Health and Personal Care,” which includes 2 982 326 ratings and 346 355 reviews for 263 032 products.21 The fields in the “review” dataset are reviewer ID, product ID, reviewer name, helpfulness rating of the review, text of the review, overall product rating, summary of the review, and date/time of the review. The “metadata” fields are product ID, product name, price, URL of product image, related products, sales rank, brand name, and categories to which the product belongs. Most of the unstructured narrative information is in the summary and review columns of the dataset, and these were concatenated to provide a single unstructured data field.
The Amazon review data were downloaded as JavaScript Object Notation (JSON) files and imported into a Jupyter notebook using the pandas library in the Python programming language.22–24 Pandas was then used to join the Amazon review and metadata datasets on the product ID to create a new working dataset that contained the most pertinent columns for identifying potential AEs (ie, “asin,” “reviewText,” “overall,” “summary,” “reviewTime,” “description,” “title,” “related,” “salesRank,” and “brand”). Reviews about homeopathic products were extracted by searching all review text, summary, description, and title fields for the string “homeo.”
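A minimal pandas sketch of this extraction step is shown below. It is illustrative, not the authors' exact code: the join on `asin`, the concatenation of the unstructured fields, and the `"homeo"` string filter follow the description above, but file handling and helper names are assumptions.

```python
import json
import pandas as pd

def load_json_lines(path):
    """Read an Amazon review dump (one JSON object per line) into a DataFrame."""
    with open(path) as f:
        return pd.DataFrame(json.loads(line) for line in f)

def build_working_set(reviews: pd.DataFrame, metadata: pd.DataFrame) -> pd.DataFrame:
    """Join reviews to product metadata on the product ID ('asin') and keep
    rows whose review text, summary, description, or title mentions 'homeo'."""
    merged = reviews.merge(metadata, on="asin", how="inner")
    # Concatenate the unstructured fields into a single searchable column.
    text_cols = ["reviewText", "summary", "description", "title"]
    merged["allText"] = merged[text_cols].fillna("").agg(" ".join, axis=1)
    return merged[merged["allText"].str.contains("homeo", case=False)]
```

The inner join mirrors combining the "review" and "metadata" subsets on product ID before filtering for homeopathic products.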
These homeopathic datasets were then imported into Excel, combined into one set and deduplicated. All product listings were reviewed by the physician author (K.K.) to verify that the product was homeopathic. Similar homeopathic products were then grouped by the product’s intended use. The most common intended use, in 622 reviews, was cough, cold, flu, sore throat, and/or chest congestion.
FAERS data were downloaded from the public dashboard on February 11, 2022, for all cases reporting the term “homeopathic.”21 The fields are case ID, suspect product name, suspect product active ingredient, reason for use, reaction (adverse event, which is coded in Medical Dictionary for Regulatory Activities [MedDRA] terminology), seriousness, outcome, patient sex, patient age, patient weight, event date, case priority, report sender, reporter type, report source, concomitant products, initial and latest FDA received dates, country where the event occurred, and whether the event was reported to the manufacturer. The FAERS data have no narratives.
There were 3765 records in the FAERS raw dataset. These were filtered for reports between May 1996 and July 2014, to match the time range of the Amazon data, and then deduplicated, resulting in 1834 records. These were further reduced to 639 FAERS reports that were submitted by a consumer and described a single-suspect homeopathic product. Of these, the most common intended use, in 445 reports, was for cough, cold, flu, sore throat, and/or chest congestion. Because this “cough and cold” category was the most frequently described in both datasets, we focused our analysis on this subset of data.
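The FAERS filtering steps (date window, deduplication, consumer reports with a single suspect product) might be sketched as follows; the column names here are hypothetical stand-ins, not the dashboard's exact export headers.

```python
import pandas as pd

def filter_faers(df: pd.DataFrame) -> pd.DataFrame:
    """Restrict FAERS records to the Amazon review window (May 1996-July 2014),
    drop duplicate cases, and keep consumer reports with one suspect product.
    Column names ('event_date', 'case_id', etc.) are illustrative only."""
    df = df.copy()
    df["event_date"] = pd.to_datetime(df["event_date"], errors="coerce")
    in_window = df["event_date"].between("1996-05-01", "2014-07-31")
    df = df[in_window].drop_duplicates(subset="case_id")
    is_consumer = df["reporter_type"].eq("Consumer")
    single_suspect = df["n_suspect_products"].eq(1)
    return df[is_consumer & single_suspect]
```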
Next, the physician author (K.K.) annotated all 616 concatenated Amazon reviews of cough and cold products for presence or absence of an apparent AE based on descriptions of a side effect or a concern about product quality. When an AE was present, in order to facilitate comparison to FAERS, a MedDRA term was applied after selecting it from the FAERS dataset or the publicly available SIDER database, which is based on version 16.1 of the MedDRA dictionary.25,26 Similar terms were then grouped.
To evaluate possible bias, the pharmacist author (S.C.J.) also independently annotated a random sample of 100 of the 616 concatenated Amazon reviews for the presence or absence of an AE, and when one was present, he applied a MedDRA term. A Kappa statistic was then calculated.
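The Kappa statistic can be computed directly from the two annotators' labels; a minimal implementation for lists of binary AE annotations:

```python
def cohens_kappa(ann_a: list, ann_b: list) -> float:
    """Cohen's kappa for two annotators' labels:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is agreement expected by chance from each annotator's label rates."""
    n = len(ann_a)
    p_o = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    labels = set(ann_a) | set(ann_b)
    p_e = sum((ann_a.count(lab) / n) * (ann_b.count(lab) / n) for lab in labels)
    return (p_o - p_e) / (1 - p_e)
```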
Characteristics of both datasets were described, including which products were most reported or reviewed, which had AEs associated with them, and how the extracted information compared between the 2 datasets. This comparison is qualitative in nature and describes findings, benefits, and limitations of each data source.
Natural language processing
Each concatenated and annotated Amazon cough and cold product review constituted a document, and all the documents together established the corpus for Latent Semantic Analysis. We used JMP 16 Pro Text Explorer software to process text in the corpus.27 Techniques included tokenization (breaking the text into its smallest units), removal of stop words (common words whose removal does not affect meaning), stemming (reducing words to their root form), and identification of pertinent phrases and terms that suggest an AE. A document-term matrix (DTM) was generated, with documents in rows, terms in columns, and weighted values in cells.
Next, the term frequency-inverse document frequency (TF-IDF) was calculated, providing an estimate of the relative importance of words in the documents and corpus, as opposed to simple frequency, which only provides a raw count of the number of times a term occurs, or binary labeling, which simply identifies whether a term occurs.
The SA and topic modeling features available in the software were used to reveal latent topics and opinions in the concatenated Amazon data. Using the embedded dictionaries in JMP Pro Text Explorer, SA was conducted to learn more about consumers’ overall feelings about products.28 Topic modeling was used to explore possible AEs and other aspects of consumers’ perceptions of products. AEs in Amazon reviews were compared to product–AE pairs in FAERS.
Term clusters were generated to depict potential topics, which were reviewed by the physician author (K.K.), named according to apparent themes, and evaluated for possible AEs. These were secondarily reviewed independently by the pharmacist author (S.C.J.), and consensus was reached on topic names. Table 1 shows the partial DTM after topic names were assigned. The SA provided an overall positive or negative sentiment rating for each review. These ratings are derived from negation, intensifier, and sentiment terms.26 Examples of negation terms are “can’t,” “don’t,” or “didn’t.” Intensifier terms are descriptive and are weighted based on the significance of their connotation. Examples are “amazingly” (1.90), “I’m not sure” (1.0), or “rarely” (0.10). Finally, sentiment terms express an emotion and, in some instances, might be represented by an emoji. For instance, ☹ has a score of −70, ☺ has a score of +70, “awesome” is +90, “adequate” is +25, and “abysmal” is −90. The weights are pre-specified by the software but can be customized as needed. We used the pre-specified weights. Figure 1 depicts the NLP methodology. Some of these methods are based on work by Zengul et al.29,30
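A toy version of this weighted-lexicon scoring, using only the illustrative weights quoted above, can clarify how negation, intensifier, and sentiment terms combine; JMP Pro's embedded dictionaries are far larger and its combination rules more involved, so this is a sketch, not its algorithm.

```python
# Illustrative weights only, taken from the examples in the text.
SENTIMENT = {"awesome": 90, "adequate": 25, "abysmal": -90}
INTENSIFIER = {"amazingly": 1.90, "rarely": 0.10}
NEGATION = {"not", "can't", "don't", "didn't"}

def score(tokens: list) -> float:
    """Score a tokenized review: each sentiment term contributes its weight,
    scaled by a preceding intensifier and flipped by a nearby negation."""
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok in SENTIMENT:
            weight = float(SENTIMENT[tok])
            prev = tokens[i - 1] if i > 0 else ""
            if prev in INTENSIFIER:
                weight *= INTENSIFIER[prev]
            if prev in NEGATION or (i > 1 and tokens[i - 2] in NEGATION):
                weight = -weight
            total += weight
    return total
```

For example, `score(["amazingly", "awesome"])` yields roughly 171 (90 × 1.90), while `score(["not", "awesome"])` yields −90.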
Table 1.
Example of partial document term matrix with named topics to illustrate methodology.
| Concatenated review | Topic 1 | Topic 2 | Topic 3 | Topic 4 |
|---|---|---|---|---|
| | Descriptive comments | Treating a cold/bronchitis with product(s) to shorten symptoms | Stay healthy by avoiding germs and using Cold-Eeze/Zicam products | Sweetness, flavor, taste |
| | TF-IDF | TF-IDF | TF-IDF | TF-IDF |
| Similar to the hard lozenges, but with a few extra ingredients… | 11.05 | 0.144 | 8.93 | 5.50 |
| I’ve used them for years… | 6.44 | 0.81 | 4.81 | 11.31 |
| Don’t think these should be labeled as a “cold remedy.”… | 6.43 | −1.39 | −1.73 | 3.62 |
| The Original and Still the Best. Must Have Product Cold-Eze is a… | 6.24 | 0.71 | 2.01 | 2.55 |
| Cold-Eeze Daytime/Nighttime Quick Melt Tablets Zinc is the first thing I use… | 5.99 | −0.36 | 2.37 | 7.96 |
| It works, and Honey Lemon flavor is not too Lemon or too Honey… | 5.61 | −0.86 | 1.65 | 0.11 |
Figure 1.
Natural language processing methodology.
Predictive modeling
We split the data into training (75%) and testing (25%) sets. Model predictors consisted of topic modeling loadings, SA scores, and Amazon star ratings (1 star is low and 5 stars is high). Multiple ML models were screened for the ability to predict an AE using the model screening feature in JMP 16 Pro Text Explorer. Models included Neural Boosted, Generalized Regression Pruned Forward Selection, Generalized Regression Forward Selection, Fit Stepwise, Generalized Regression Lasso, Generalized Regression Elastic-Net, Bootstrap Forest, Generalized Regression Ridge, Support Vector Machines, Nominal Logistic, Boosted Tree, XGBoost, Decision Tree, and Naive Bayes.31 We also employed 5-fold cross-validation in the training set and a random seed, which allows replicability of the model results. Our performance metric of interest was the mean area under the curve (AUC) of the receiver operating characteristic curve. We considered the model with the highest mean AUC to be optimal. The best-performing models were then evaluated on the testing set, and their performance metrics were determined.
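The screening workflow (fixed seed, 75/25 split, 5-fold cross-validated mean AUC per candidate, best model selected) can be sketched with scikit-learn stand-ins for the JMP models; the synthetic data and the three candidate models below are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the real predictors (topic loadings, sentiment
# scores, star ratings); the study used JMP 16 Pro's model screening feature.
X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42  # 75/25 split with a fixed seed
)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=42),
    "boosted_tree": GradientBoostingClassifier(random_state=42),
}

# 5-fold cross-validated mean AUC on the training set for each candidate;
# the model with the highest mean AUC is taken as optimal.
mean_auc = {
    name: cross_val_score(m, X_train, y_train, cv=5, scoring="roc_auc").mean()
    for name, m in models.items()
}
best = max(mean_auc, key=mean_auc.get)
print(best, round(mean_auc[best], 3))
```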
Results
There were 2887 cases in the dataset, including 2248 Amazon reviews and 639 FAERS reports, and the available data spanned from 2000 to 2014. There were 190 unique product names, including a category for “not reported”: 173 names in Amazon and 17 in FAERS. Although the reported names were unique, the products appeared to be similar. For instance, the FAERS dataset included “Zicam Cold Remedy Rapidmelts” and the Amazon dataset included “Zicam Cold Remedy RapidMelts, Cherry Quick Dissolve Tablets.”
Cough/cold/flu was the most frequent reason for use common to both datasets, so products in this group were the focus of analysis. There were 1061 total reviews or reports, including 616 from Amazon and 445 from FAERS. Zicam and Cold-Eeze were the most frequently reported products. Zicam products were primarily reported in FAERS versus Amazon reviews (428 vs 51), whereas Cold-Eeze products were only described in Amazon reviews (310).
The volume of reviews and reports was sparse until an uptick in 2009, the year in which FDA issued a warning about loss of smell associated with homeopathic Zicam cold products.32 It is unclear if reports triggered the warning, the warning stimulated reporting, or both. A search of the Amazon review narratives for “FDA” or “food and drug” identified no commentary about the Zicam warning.
With respect to annotation of the Amazon reviews, there was 83% concordance between the clinician authors for presence or absence of an AE (Kappa: 0.6508). There was 82% concordance for applying a MedDRA term based on 66 term annotations that were an exact match plus 16 that had at least one identical or similar term (Kappa = 0.7549). Eighteen annotations did not match.
Table 2 shows the most common AEs found in Amazon reviews and FAERS reports about homeopathic cough/cold products. Most FAERS reports were about alterations in the senses of smell (anosmia, hyposmia, parosmia) and taste (ageusia, dysgeusia, hypogeusia). There were fewer but similar descriptions in Amazon reviews about the sense of taste. There were also small but similar numbers of reports in FAERS and Amazon reviews of mouth numbness or tingling and tongue irritation. The most described AE in Amazon reviews was a complaint about how the product tasted and whether it left an unpleasant aftertaste.
Table 2.
Ten most common adverse events associated with homeopathic cough/cold products in combined Amazon reviews and FAERS.
| Adverse event | Zicam | Cold-Eeze | Sinol nasal spray with capsaicin | Dr. King’s Natural Medicine | Nature’s Way Umcka | Not reported | Total |
|---|---|---|---|---|---|---|---|
| Ageusia/dysgeusia/hypogeusia | 184 (91.1%) | 17 (8.4%) | | | | 1 (0.4%) | 202 |
| Anosmia/hyposmia/parosmia | 191 (100%) | | | | | | 191 |
| Product after taste/taste abnormal | 7 (6.2%) | 104 (92%) | | 1 (0.9%) | 1 (0.9%) | | 113 |
| Paresthesia/hypoesthesia/dysesthesia oral | 14 (56%) | 11 (44%) | | | | | 25 |
| Nasal discomfort/inflammation/rhinalgia | 16 (88.9%) | | 2 (11.1%) | | | | 18 |
| Dry mouth/throat/lips/nose | 1 (9.1%) | 9 (81.8%) | | | | 1 (9.1%) | 11 |
| Product complaint | | 8 (100%) | | | | | 8 |
| Sinus disorder/headache/pain/sinusitis | 7 (100%) | | | | | | 7 |
| Taste and smell disorder | 6 (100%) | | | | | | 6 |
| Nausea | 1 (16.7%) | 5 (83.3%) | | | | | 6 |
| Glossitis/glossodynia/tongue disorder | 2 (100%) | | | | | | 2 |
| Grand total | 429 (72.8%) | 154 (26%) | 2 (0.4%) | 1 (0.2%) | 1 (0.2%) | 2 (0.4%) | 589 |
Because FAERS had no unstructured data, we could only apply topic modeling to the Amazon data to identify themes in text and phrases. Using an elbow graph of the eigenvalues for the 100 singular vectors in the DTM, the elbow point on the 2 curves was between about 14 and 17 (see Figure 2). Using trial and error, between 14 and 17 clusters were evaluated, and 15 topics appeared to provide the clearest representation of the terms and phrases. Table 3 shows the assigned topic name and the terms in each of the 15 topics. These names were determined by subject matter expert consensus.
Figure 2.
Elbow graph of eigenvalue versus vector number.
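The elbow inspection rests on the leading singular values of the document-term matrix (the plotted eigenvalues correspond to squared singular values of the DTM); a minimal sketch of extracting them for a scree/elbow plot:

```python
import numpy as np

def scree_values(dtm: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the k largest singular values of the document-term matrix.
    Plotted against their index, these form the elbow (scree) curve used
    to choose the number of topics: the curve flattens past the elbow."""
    s = np.linalg.svd(dtm, compute_uv=False)  # sorted in descending order
    return s[:k]
```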
Table 3.
Fifteen groups of identified terms and their assigned topic names.
| | Groups of terms identified by softwarea | Topic names by consensus |
|---|---|---|
| 1 | eez, natur· orange · flavor·, miser·, factor·, clinic·, one· thing, glycin·, control·, efficaci·, many year·, make· me feel·, expir· date·, quickmelt·, quickmelt· tablet·, mix· berri·, wonder· | Descriptive comments |
| 2 | two· day·, recoveri·, deal·, hard·, common· cold·, version·, quick·, real, bronchiti·, pop·, supplement·, sever· year·, orang· flavor·, avoid·, otherwis·, luck· | Treating a cold/bronchitis with product(s) to shorten symptoms |
| 3 | healthy·, germ·, line·, keep·, sick·, meant·, cover·, long·, someone·, kept·, hand·, stay·, job·, winter· | Stay healthy by avoiding germs and using ColdEeze/Zicam products |
| 4 | sugar·, note·, affect·, problems, may·, artifici·, known·, possibl·, contain·, per· day·, caus·, next· time·, health·, half· | Sweetness, flavor, taste |
| 5 | cough·, 24· hour·, fit·, woke·, week·, night·, treat·, later·, dri·, wake·, last·, might·, sleep·, initi·, pay· | Coughing, nighttime |
| 6 | sooner·, fight·, base·, virus·, bodi·, cold· virus, extra·, progress·, zinc·, ad·, food·, color·, follow· the direct·, formula·, drink·, 15· minut· | Start product early |
| 7 | Skeptic·, heal·, now·, stuff·, store·, bought·, today·, call·, pain, next·, morn·, know·, herb·, something, so, free·, went· | Belief in product |
| 8 | Tast· bud·, coupl· of day, come· down with a cold·, prevent· a cold·, empti· stomach·, product· like·, seem· to work·, quicker·, first· thing·, feel· better, feel· like·, reach·, long· time·, understand· first· sign· of a cold· | Helps cold but affects tastebuds |
| 9 | individu·, quick· melt, swallow·, dissolv·, tast·, melt·, mouth·, chalki·, unpleasant, question·, orang· flavor·, convinc·, leav·, tast· good, flavor· | Taste and texture |
| 10 | nyquil·, bug·, remedy·, airborn·, flu·, drug·, combin·, oscillococcinum·, similar·, always·, call·, around·, also·, origin·, report·, ill· | Various cold remedies |
| 11 | research·, mg·, vitamin· c, signific·, durat·, daili·, reduc·, consid·, combin·, intens·, accord·, antibiot·, glycin· | Evidence about medicinal effects on cold symptoms |
| 12 | packag·, advantag·, box·, boiron·, open·, life·, pack·, stick·, enough·, activ· ingredi·, oral· spray·, go·, way·, air | Containers and packaging |
| 13 | babi·, give·, otc·, oscillococcinum·, drug·, mind·, natur· remedi·, old·, little bit·, every, review·, gave·, wasn’t, heart·, stock·, quick· melt· | No common theme identified |
| 14 | chest·, knew·, serious·, infect·, gone·, bronchiti·, breath·, medic·, follow· the direct·, treatment·, began·, unfortunately, lung·, inhal· | Respiratory system |
| 15 | handl·, cold· eze·, cure·, prevent·, don·, ok·, typic·, box·, friend·, common· cold·, favorit·, hand·, peopl·, expir· date·, cherri· flavor· | Features of OTC medicinal products |
aTerms are listed in descending order of term loading.
The most frequent topic was about product taste and texture, and it was the only one that conveyed information about AEs. The remaining categories depict consumers’ perceptions of product characteristics like packaging, their experiences of the common cold and related remedies, and the ways in which they use or perceive the utility of the products.
Next, we conducted SA to determine the sentiment expressed in a review. Overall, the sentiments expressed in Amazon reviews were skewed toward being positive. In some reviews, complaints about product taste were tempered by positive commentary about effectiveness. In some instances, consumers mentioned that they were reviewing the product after receiving a complimentary sample. The dataset did not provide any information about the veracity of the reviews.
Figure 3 is a box plot that depicts the relationship between overall sentiment score and the presence or absence of an AE in the document. Reviews with an AE tended to have a lower median sentiment score than those without. The overall sentiment score differences by AE are statistically significant (F = 49.565, P < .0001).
Figure 3.
Relationship between overall sentiment score and presence or absence of an adverse event in the document.
It seemed reasonable to hypothesize that Amazon consumer star ratings would be lower when a review described an AE. This hypothesis is supported by Figure 4, in which the size of each box is proportional to the number of reviews with a particular rating and the proportion of those reviews that include an AE. For instance, there are 283 5-star reviews, of which 33 (∼12%) include an AE. Conversely, there are only 37 1-star reviews, of which about 78% include an AE. The χ2 statistic (χ2 = 224.28) has a P-value <.0001, indicating statistical significance. In sum, Amazon ratings tended to be lower when the review was annotated as describing an AE, although this was not uniformly true, and some reviews describing an AE had 4- or 5-star ratings.
Figure 4.
Proportion of adverse events based on Amazon rating.
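The chi-square test of independence between star rating and AE presence can be sketched with scipy; the counts below are the two ratings quoted in the text (the study's statistic used the full 5-rating table, so this 2×2 illustration will not reproduce χ2 = 224.28).

```python
from scipy.stats import chi2_contingency

# 5-star: 283 reviews, 33 with an AE; 1-star: 37 reviews, ~29 (78%) with an AE.
table = [
    [33, 283 - 33],  # 5-star: AE, no AE
    [29, 37 - 29],   # 1-star: AE, no AE
]
chi2, p, dof, expected = chi2_contingency(table)
print(round(chi2, 1), p < 0.0001)
```

Even this reduced table shows a strongly significant association between low ratings and AE descriptions.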
We conducted model screening in JMP 16 Pro Text Explorer to determine how well topic modeling, SA, or Amazon rating, alone or in various combinations, could predict the presence of an AE in an Amazon review. Table 4 shows the performance characteristics of each tested predictor combination under its best-performing method, including the mean AUC. Of the individual predictors, Amazon ratings using the support vector machine method had the highest mean AUC (0.835), and SA using the neural boosted algorithm had the lowest (0.660). The optimal mean AUC of 0.927 was achieved by combining topic modeling and Amazon rating using the neural boosted method.
Table 4.
Models’ test results by the best performing methods.
| Model | Method | TP | FN | TN | FP | Sensitivity | Specificity | Precision | Accuracy | F1 | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Topic Modeling | Neural Boosted | 29 | 31 | 83 | 11 | 0.483 | 0.883 | 0.725 | 0.727 | 0.580 | 0.787 |
| Sentiment Analysis | Neural Boosted | 25 | 35 | 77 | 17 | 0.417 | 0.819 | 0.595 | 0.662 | 0.490 | 0.660 |
| Amazon Rating | Support Vector Machines | 33 | 27 | 87 | 7 | 0.550 | 0.926 | 0.825 | 0.779 | 0.660 | 0.835 |
| Topic Modeling+Sentiment Analysis | Neural Boosted | 43 | 17 | 76 | 18 | 0.717 | 0.809 | 0.705 | 0.773 | 0.711 | 0.820 |
| Topic Modeling+Amazon Rating | Neural Boosted | 50 | 10 | 82 | 12 | 0.833 | 0.872 | 0.807 | 0.857 | 0.820 | 0.927 |
| Sentiment Analysis+Amazon Rating | Neural Boosted | 33 | 27 | 87 | 7 | 0.550 | 0.926 | 0.825 | 0.779 | 0.660 | 0.828 |
| Topic Modeling+Sentiment Analysis+Amazon Rating | Neural Boosted | 46 | 14 | 81 | 13 | 0.767 | 0.862 | 0.780 | 0.825 | 0.773 | 0.917 |
Abbreviations: TP, true positives; FN, false negatives; TN, true negatives; FP, false positives; Sensitivity, TP/(TP+FN); Specificity, TN/(TN+FP); Precision, TP/(TP+FP); Accuracy, (TP+TN)/(TP+TN+FP+FN); F1, 2×TP/(2TP+FP+FN).
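The footnote formulas can be applied directly to the confusion counts to reproduce the tabulated values; for example, for the best-performing row (Topic Modeling + Amazon Rating: TP = 50, FN = 10, TN = 82, FP = 12):

```python
def metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Performance metrics from confusion-matrix counts, matching the
    formulas given in the Table 4 footnote."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

m = metrics(50, 10, 82, 12)
print({k: round(v, 3) for k, v in m.items()})
```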
Discussion
This is the first study to explore the potential utility of text mining Amazon homeopathic drug product reviews to identify AEs and to compare the findings with FAERS data. It yielded several key takeaways. The Amazon reviews appeared to complement and support FAERS data. Topic modeling helped to identify issues with taste and texture and an effect on tastebuds, and these issues were similar to AEs reported to FAERS. Topic modeling and SA of Amazon reviews provided information about consumers’ perceptions and opinions of homeopathic OTC cough and cold products. Amazon ratings appear to be a good indicator of the presence or absence of AEs, although some positively rated reviews describe AEs and not all low ratings include an AE. A Neural Boosted model with 5-fold cross-validation combining topic modeling and Amazon ratings as predictors had a mean AUC of 0.927 and was most accurate at predicting the presence of an AE in Amazon reviews of unapproved OTC homeopathic products.
There are limitations to this research that should be noted. Even though this is the first study exploring text mining of consumer reviews of homeopathic drugs, the datasets and subsets that were used in this study were relatively small, and the Amazon dataset that was publicly available for research use is outdated. Online consumer reviews may be subject to social, economic, and technological factors that influence the content and rating.33,34 To address potential biases emerging from these limitations, future studies should utilize a larger and more current dataset and identify methods to minimize inclusion of incentivized, fraudulent, or bot-generated reviews. The initial annotation of AEs was performed by only one clinician, making results subject to bias. However, independent review and annotation of a random subset by a second clinician showed reasonable consistency between annotators. Similarly, the clinician authors independently named the topic clusters and then reached a consensus. Finally, despite deduplication efforts, it is possible that some duplicates in both datasets were missed and not removed. These limitations may restrict the performance of the optimal predictive model in other Amazon review datasets.
Additional research should include identifying ways to obtain larger and more recent datasets, exploring other online consumer review sources, enlisting the expertise of additional AE and topic annotators to reduce bias, identifying additional approaches to deduplication, and conducting further predictive modeling after generating a more refined Amazon review dataset. The methods used in this study provide a model for future analyses.
Conclusion
This research adds to an evolving body of literature about the role of text mining of online consumer reviews for pharmacovigilance, and we believe it is the first to consider homeopathic products. We found that Amazon reviews provide complementary information to the FAERS database, and there may be potential for extracting AE data about unapproved homeopathic drugs from larger and more current consumer review datasets to complement more traditional sources of information. While this information may be useful to regulators, it may also have utility for commercial enterprises, providing insight into consumer perceptions about products and identifying potential product-related safety and quality concerns.
Supplementary Material
Contributor Information
Karen Konkel, Division of Pharmacovigilance, Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20993, United States; Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States.
Nurettin Oner, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States.
Abdulaziz Ahmed, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States.
S Christopher Jones, Division of Pharmacovigilance, Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20993, United States.
Eta S Berner, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States; Informatics Institute, The University of Alabama at Birmingham, Birmingham, AL 35294, United States.
Ferhat D Zengul, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States; Informatics Institute, The University of Alabama at Birmingham, Birmingham, AL 35294, United States; Electrical & Computer Engineering, The Center for Integrated Systems, The University of Alabama at Birmingham, Birmingham, AL 35294, United States.
Author contributions
K.K. developed and completed the study with guidance from all co-authors; conducted data extraction, cleaning, design, coding, and analysis; provided clinical expertise in document annotation; drafted the manuscript; and contributed to manuscript revisions. N.O. provided guidance on and conducted data cleaning, design, and analysis; provided expertise in JMP Pro Text Explorer, text mining, and topic modeling; conducted prediction modeling; and contributed to manuscript revisions. A.A. provided guidance on study development and expertise in Python coding and natural language processing, conducted coding and data analysis, and contributed to manuscript revisions. S.C.J. provided guidance on study development and clinical expertise in document annotation and contributed to manuscript revisions. E.S.B. provided guidance on study development and design and contributed to manuscript revisions. F.D.Z. provided guidance on all aspects of the study including development, design, and analysis, with particular expertise in JMP Pro Text Explorer, text mining, and topic modeling; conducted prediction modeling; and contributed to manuscript revisions.
Supplementary material
Supplementary material is available at Journal of the American Medical Informatics Association online.
Funding
None declared.
Conflicts of interest
None declared.
Data availability
The publicly available datasets can be accessed at the Amazon review data repository hosted by the University of California San Diego (https://cseweb.ucsd.edu/~jmcauley/datasets/amazon/links.html) and at the FAERS public dashboard (https://fis.fda.gov/sense/app/95239e26-e0be-42d9-a960-9a5f7f1c25ee/sheet/7a47a261-d58b-4203-a8aa-6d3021737452/state/analysis). Additional data underlying this article will be shared on reasonable request to the corresponding author.
Other required statements
The University of Alabama at Birmingham Institutional Review Board reviewed this study and determined that it is not human subjects research. The views expressed are those of the authors and do not necessarily represent the position of, nor do they imply endorsement from, the US Food and Drug Administration or the US Government.
References
- 1. What Is a Serious Adverse Event? US Food and Drug Administration. Accessed June 24, 2023. https://www.fda.gov/safety/reporting-serious-problems-fda/what-serious-adverse-event
- 2. What Is Pharmacovigilance? World Health Organization. Accessed June 24, 2023. https://www.who.int/teams/regulation-prequalification/pharmacovigilance
- 3. Swank K. FDA Drug Topics: An Overview of Pharmacovigilance in the Center for Drug Evaluation and Research (CDER). US Food and Drug Administration. March 26, 2019. Accessed June 24, 2023. https://www.fda.gov/media/122835/download
- 4. Ensuring Medicines Work Safely for Everyone. World Health Organization. Accessed June 24, 2023. https://www.who.int/news/item/02-11-2020-ensuring-medicines-work-safely-for-everyone
- 5. Questions and Answers on FDA’s Adverse Event Reporting System (FAERS). US Food and Drug Administration. Accessed June 24, 2023. https://www.fda.gov/drugs/surveillance/questions-and-answers-fdas-adverse-event-reporting-system-faers
- 6. Tricco AC, Zarin W, Lillie E, et al. Utility of social media and crowd-intelligence data for pharmacovigilance: a scoping review. BMC Med Inform Decis Mak. 2018;18(1):1-14. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/29898743/
- 7. Ashley DD. Clarifying misconceptions about US Food and Drug Administration unapproved drugs program. Anesth Analg. 2018;127(6):1292-1294. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/30433920/
- 8. Homeopathic Products. US Food and Drug Administration. Accessed June 24, 2023. https://www.fda.gov/drugs/information-drug-class/homeopathic-products
- 9. Drug Products Labeled as Homeopathic: Guidance for FDA Staff and Industry. US Food and Drug Administration; 2019. Accessed June 24, 2023. https://www.fda.gov/media/131978/download
- 10. Homeopathy: What You Need to Know. National Institutes of Health, National Center for Complementary and Integrative Health. Accessed June 24, 2023. https://www.nccih.nih.gov/health/homeopathy
- 11. Jonas WB, Kaptchuk TJ, Linde K. A critical overview of homeopathy. Ann Intern Med. 2003;138(5):393-399. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/12614092/
- 12. Goldberg DM, Khan S, Zaman N, et al. Text mining approaches for postmarket food safety surveillance using online media. Risk Anal. 2020;42(8):1749-1768. Accessed June 24, 2023. https://onlinelibrary.wiley.com/doi/abs/10.1111/risa.13651
- 13. Sullivan R, Sarker A, O’Connor K, et al. Finding potentially unsafe nutritional supplements from user reviews with topic modeling. Pac Symp Biocomput. 2016;21:528-539. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/26776215/
- 14. Torii M, Tilak SS, Doan S, et al. Mining health-related issues in consumer product reviews by using scalable text analytics. Biomed Inform Insights. 2016;8(Suppl 1):1. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/27375358/
- 15. Brett MR. Topic modeling: a basic introduction. J Digit Human. 2012;2(1). Accessed June 24, 2023. https://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett/
- 16. Gonçalves P, Araújo M, Benevenuto F, et al. Comparing and combining sentiment analysis methods. In: Proceedings of the First ACM Conference on Online Social Networks. Association for Computing Machinery; 2013. Accessed June 24, 2023. https://dl.acm.org/doi/10.1145/2512938.2512951
- 17. He R, McAuley J. Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: WWW ’16: Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conference Committee (IW3C2); 2016. Accessed June 24, 2023. https://arxiv.org/abs/1602.01585
- 18. McAuley J. Amazon Product Data. University of California San Diego; 2014. Accessed June 24, 2023. https://cseweb.ucsd.edu/~jmcauley/datasets/amazon/links.html
- 19. McAuley J, Targett C, Shi Q, et al. Image-based recommendations on styles and substitutes. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM; 2015:43-52. Accessed June 24, 2023. https://arxiv.org/abs/1506.04757
- 20. Zhou Z, Hultgren KE. Complementing the US Food and Drug Administration adverse event reporting system with adverse drug reaction reporting from social media: comparative analysis. JMIR Public Health Surveill. 2020;6(3):e19266. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/32996889/
- 21. US Food and Drug Administration. FDA Adverse Events Reporting System Public Dashboard. US Food and Drug Administration; 2023. Accessed June 24, 2023. https://fis.fda.gov/sense/app/95239e26-e0be-42d9-a960-9a5f7f1c25ee/sheet/7a47a261-d58b-4203-a8aa-6d3021737452/state/analysis
- 22. Project Jupyter Home. Accessed June 24, 2023. https://jupyter.org/
- 23. pandas – Python Data Analysis Library. Accessed June 24, 2023. https://pandas.pydata.org/
- 24. Python. Accessed June 24, 2023. https://www.python.org/
- 25. MedDRA | Medical Dictionary for Regulatory Activities. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Accessed June 24, 2023. https://www.meddra.org/
- 26. Kuhn M, Letunic I, Jensen LJ, Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016;44(D1):D1075-D1079. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/26481350/
- 27. JMP® Pro Predictive Analytics Software for Scientists and Engineers. Accessed June 24, 2023. https://www.jmp.com/en_us/software/predictive-analytics-software.html
- 28. Fang X, Zhan J. Sentiment analysis using product review data. J Big Data. 2015;2(1):5. Accessed June 24, 2023. https://journalofbigdata.springeropen.com/articles/10.1186/s40537-015-0015-2
- 29. Zengul AG, Zengul FD, Ozaydin B, et al. Identifying research themes and trends in the top 20 cancer journals through textual analysis. J Cancer Policy. 2021;30:100313. Accessed June 24, 2023. https://pubmed.ncbi.nlm.nih.gov/35559806/
- 30. Zengul FD, Oner N, Byrd JD, et al. Revealing research themes and trends in 30 top-ranking accounting journals: a text-mining approach. Abacus. 2021;57(3):468-501. Accessed June 24, 2023. https://onlinelibrary.wiley.com/doi/10.1111/abac.12214
- 31. Model Screening – New in JMP Pro 16. JMP Statistical Discovery LLC. Accessed June 24, 2023. https://community.jmp.com/t5/Learning-Center/Model-Screening-New-in-JMP-Pro-16/ta-p/417323
- 32. FDA Advises Consumers Not to Use Certain Zicam Cold Remedies. Intranasal Zinc Product Linked to Loss of Sense of Smell. US Food and Drug Administration. June 16, 2009. Accessed June 24, 2023. https://wayback.archive-it.org/7993/20170113083934/http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/2009/ucm167065.htm
- 33. Klein N, Marinescu I, Chamberlain A, Smart M. Online reviews are biased. Here’s how to fix them. Harv Bus Rev. Updated March 6, 2018. Accessed September 7, 2023. https://hbr.org/2018/03/online-reviews-are-biased-heres-how-to-fix-them
- 34. Salminen J, Kandpal C, Kamel AM, Jung S-G, Jansen BJ. Creating and detecting fake reviews of online products. J Retail Consum Serv. 2022;64:102771. doi:10.1016/j.jretconser.2021.102771