2020 May 16;122:103770. doi: 10.1016/j.compbiomed.2020.103770

Table 2.

Summary of statistical and machine learning methods and data sources for surveillance using Twitter data.

| Public Health Issue | Method | Comparative Data Source |
| --- | --- | --- |
| Cancer | Simple Statistical Analysis [23] | CDC |
| Hepatitis A | Support Vector Machine [24] | |
| Gastrointestinal Illnesses | Correlation Analysis [25] | Government of Ontario; Kingston, Frontenac and Lennox & Addington Public Health |
| Suicide | ARIMA (Autoregressive Integrated Moving Average) [26] | |
| HIV | Graph Modelling [27], Word2Vec [28], Doc2Vec [28], Dynamic Topic Modeling [28] | |
| Allergies | K-Nearest Neighbour [20], Bayesian Inference [20], Support Vector Machine [20] | |
| Heat Wave | Near Regression [18], ARIMA (Autoregressive Integrated Moving Average) [18] | The US National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Information (NCEI) |
| Heat Related Illnesses | Correlation Analysis [25] | Government of Ontario; Kingston, Frontenac and Lennox & Addington Public Health |
| Depression | ARIMA (Autoregressive Integrated Moving Average) [26] | |
| Syphilis | Binomial Regressions [29] | CDC |
| Ebola | Bayesian Inference [30], Lexicon Analysis [30] | |
| Respiratory Illness | Correlation Analysis [25] | Government of Ontario; Kingston, Frontenac and Lennox & Addington Public Health |
| E. coli | Latent Dirichlet Allocation [31], Lexicon Analysis [31] | Robert Koch Institute |
| Measles | Support Vector Machine [24] | |
| Influenza-like Illnesses (Hemophilus) | Bayesian Inference [15] | GenBank |
| Vomiting | TSVM [22], ARIMA (Autoregressive Integrated Moving Average) [22] | Public Health England |
| Gastroenteritis | TSVM [22], Latent Dirichlet Allocation [31], Lexicon Analysis [31], ARIMA (Autoregressive Integrated Moving Average) [22] | Public Health England; Robert Koch Institute |
| Salmonella | Support Vector Machine [24] | |
| Food Borne Illness | Support Vector Machine [32] | Southern Nevada Health District (SNHD) |
| Earthquake | Clustering [19], Bayesian Inference [19] | |
| Stress | Ordinal Regression [33] | |
| Air Pollution | Self-Organizing Map (Clustering) [34], Cross-Correlation [17] | The European Centre for Medium-Range Weather Forecasts (ECMWF); London Air Quality Network |
| Influenza-like Illnesses (ILI) | Lexicon Analysis [35], Deep Learning (CNN) [36], FP-Growth [37], Bayesian Inference [35,38,39,41], Correlation Analysis [25], Deep Learning (RNN) [36], Deep Learning (MLP) [40], fastText [36], ARIMA (Autoregressive Integrated Moving Average) [22,42], Simple Statistical Analysis [23], Support Vector Machine [37,43], GloVe [36], Maximum Entropy [41], TSVM [22], Partial Differential Equation [44], Autoregressive Moving Average (ARMA) [45], Outlier Detection [46], Topic Model [47], Temporal Topic Model [14], Logistic Regression [42], Count Correlation [16] | Public Health England; Government of Ontario; Kingston, Frontenac and Lennox & Addington Public Health; Chinese CDC; Pan American Health Organization (PAHO); CDC; HHS data; FluWatch |
| General Health^a | Topic Model (Ailment Topic Aspect Model (ATAM)) [48], Lexicon Analysis [49], Regression [49], Simple Statistical Analysis [50], Temporal Ailment Topic Aspect Model (TM-ATAM) [51] | CDC; U.S. Census' State-Based Counties Gazetteer |
| Dengue | DBSCAN (Clustering) [21], Deep Learning (RNN) [52], Word Embeddings (GloVe) [52], Simple Statistical Analysis [53] | Brazilian Health Ministry; Philippines' Department of Health; Brazilian Official Dengue case data |
| Diarrhoea | TSVM [22], ARIMA (Autoregressive Integrated Moving Average) [22] | Public Health England |
| Obesity | DBSCAN (Clustering) [54] | |
a Note that the information shown for 2019 is not comparable to that for other years because 2019 had not yet ended when the graph was plotted.