News Sentiment Informed Time-series Analyzing AI (SITALA) to curb the spread of COVID-19 in Houston

Prathamesh S Desai

doi:10.1016/j.eswa.2021.115104

Expert Systems with Applications. 2021 Apr 29;180:115104. doi: 10.1016/j.eswa.2021.115104


PMCID: PMC8081574 PMID: 33942002

Abstract

Coronavirus disease (COVID-19) has evolved into a pandemic with many unknowns. Houston, located in Harris County, Texas, is becoming the next hotspot of this pandemic. With a severe decline in international and inter-state travel, a model at the county level is needed as opposed to the state or country level. Existing approaches have a few drawbacks. Firstly, the data used is the number of COVID-19 positive cases instead of positivity. The former is a function of the number of tests carried out, whereas the latter is normalized by the number of tests. Positivity gives a better picture of the spread of this pandemic because more tests are being administered over time. Positivity under 5% has been desired for the reopening of businesses to almost 100% capacity. Secondly, the data used by models like SEIRD (Susceptible, Exposed, Infectious, Recovered, and Deceased) lacks information about the sentiment of people concerning coronavirus. Thirdly, models that make use of social media posts might have too much noise and misinformation. On the other hand, news sentiment can capture long-term effects of hidden variables like public policy, opinions of local doctors, and disobedience of state-wide mandates. The present study introduces a new artificial intelligence (AI) model, viz., Sentiment Informed Time-series Analyzing AI (SITALA), trained on COVID-19 test positivity data and news sentiment from over 2750 news articles for Harris County. The news sentiment was obtained using IBM Watson Discovery News. SITALA is inspired by Google-Wavenet architecture and makes use of TensorFlow. The mean absolute error for the training dataset of 66 consecutive days is 2.76, and that for the test dataset of 22 consecutive days is 9.6. A cone of uncertainty is provided within which future COVID-19 test positivity has been shown to fall with high accuracy. The model predictions fare better than a published Bayesian-based SEIRD model. The model forecasts that in order to curb the spread of coronavirus in Houston, a sustained negative news sentiment (e.g., death count for COVID-19 will grow at an alarming rate in Houston if mask orders are not followed) will be desirable. Public policymakers may use SITALA to set the tone of the local policies and mandates.

Keywords: COVID-19 model, News sentiment, Public policy, Deep learning, Artificial intelligence, Pandemic forecast

1. Introduction

In the present-day USA, pandemic models of COVID-19 at the level of state or country (Giuliani et al., 2020, Yang et al., 2020, Benvenuto et al., 2020, Dandekar and Barbastathis, 2020, Chimmula and Zhang, 2020, Mohamadou et al., 2020) are not of much use due to a severe decline in air travel (USDoT, 2020, Chinazzi et al., 2020). The spread of the virus is predominantly restricted to the geography of a county or a couple of neighboring counties. National, and more so local, news articles provide a fairly accurate picture of the ongoing situation in a crisis-stricken county, Harris County in the case of the present study. Local public policymakers can thus play an important role in setting the tone or sentiment of the news at the county level (Adolph, Amano, Bang-Jensen, Fullman, & Wilkerson, 2020). The spread of coronavirus in the short-term future is a function of test positivity data; however, news sentiment starts playing an important role over longer periods. For example, a stay-at-home order with a strong negative sentiment issued today may only start showing a decline in test positivity after 10–14 days due to the coronavirus’s incubation period. Existing approaches have a few drawbacks (Yang et al., 2020, Shinde et al., 2020, Karisani and Karisani, 2020, Ayyoubzadeh et al., 2020, Alamoodi et al., 2020, Jha et al., 2020). Firstly, the data used is the number of COVID-19 positive cases instead of positivity. Secondly, the data used by models like SEIRD (Susceptible, Exposed, Infectious, Recovered, and Deceased) lacks information about the sentiment of people concerning coronavirus. Thirdly, models that make use of social media posts might have too much noise.

This first-of-a-kind study attempts to develop a multivariate artificial intelligence (AI) model to analyze the time series of COVID-19 positivity and news sentiment. The AI model is inspired by Google-Wavenet (Oord et al., 2016) architecture and uses IBM Watson Discovery News (High, 2012) to mine COVID-19 sentiment in the news articles. To the best of the author’s knowledge, this is the first AI study to combine spatial information via news sentiment at the county level with COVID-19 positivity data (Nguyen, 2020). The methodology is described in Section 2. Predictions of the model are presented in Section 3 and compared with a published SEIRD temporal model from the literature. The current model fares better than the published Bayesian-based SEIRD model (Jha et al., 2020). Discussion of the modeling results is presented in Section 4.

2. Methodology

2.1. Data

The COVID-19 test positivity data for Harris County was obtained from the website of the Texas Department of State Health Services. A couple of instances of wrong or missing data were filled using linear interpolation. IBM Watson Discovery was used to mine the news sentiment in 2867 news articles over three months. The tool provides 200 free queries per user per month. Discovery News employs natural language processing to return answers to the queries. It also analyzes the sentiment of the news articles. The query used in this study was:

Is the spread of coronavirus or covid-19 or 2019-nCoV under control?

Include analysis of your results.

average(enriched_text.sentiment.document.score)

Filter which documents you query.

publication_date::"2020-05-29",url:"houston",(enriched_text.keywords.text:"coronavirus"|enriched_text.keywords.text:"COVID-19"|enriched_text.keywords.text:"2019-nCoV")
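Since the filter has to be repeated for every publication date, a small helper can assemble it. The following is a minimal sketch, not the author's tooling: `build_filter`, `KEYWORDS`, and `AGGREGATION` are hypothetical names, the quotes are written as plain straight quotes, and the code only builds the strings shown above without contacting IBM Watson Discovery.

```python
from datetime import date

# Keywords taken from the filter shown above.
KEYWORDS = ("coronavirus", "COVID-19", "2019-nCoV")

# Aggregation string from the query shown above.
AGGREGATION = "average(enriched_text.sentiment.document.score)"


def build_filter(day, url_term="houston", keywords=KEYWORDS):
    """Assemble the filter string for one publication date (hypothetical helper)."""
    keyword_clause = "|".join(
        f'enriched_text.keywords.text:"{kw}"' for kw in keywords
    )
    return (
        f'publication_date::"{day.isoformat()}",'
        f'url:"{url_term}",({keyword_clause})'
    )


example_filter = build_filter(date(2020, 5, 29))
```

Calling `build_filter` once per day of the study period would give one filter per daily sentiment value; how the queries were actually issued is not described in the paper.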

A sample query along with the output from Watson Discovery News is shown in Fig. 1. The entire dataset is also provided in Appendix A.
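The cleaning step mentioned at the start of this subsection (positivity as positives normalized by tests, with missing days filled by linear interpolation) can be sketched as follows. This is an illustration under stated assumptions: the counts are made-up placeholders rather than the Texas DSHS data, and `positivity_percent` and `fill_linear` are hypothetical helpers, since the paper does not say which tool performed the interpolation.

```python
def positivity_percent(positives, tests):
    """Percent of administered tests that came back positive."""
    return 100.0 * positives / tests


def fill_linear(values):
    """Fill interior None entries by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical (positives, tests) counts; None marks a day with missing data.
daily_counts = [(10, 200), (12, 200), None, (30, 300)]
raw = [positivity_percent(*c) if c is not None else None for c in daily_counts]
clean = fill_linear(raw)
```

In this toy series the missing third day sits between 6% and 10%, so it is filled with 8%.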

2.2. Model (Oord et al., 2016, High, 2012, Géron, 2019, Ting et al., 2020, Wan et al., 2019, Borovykh et al., 2017)

Daily COVID-19 positivity rate and news sentiment are passed to a Wavenet-inspired multivariate convolutional neural network (CNN) to predict the future COVID-19 positivity rate. The AI is named SITALA, or Sentiment Informed Time-series Analyzing AI. A 16-day window, based on the coronavirus incubation period of 11–16 days (Lauer et al., 2020), with a stride of 1 was used to generate training and test datasets. The architecture of SITALA is shown in Fig. 1. Dilated causal convolutions help with the transmission of long-term effects. Dilation rates of 1, 2, 4, and 8 days were used. The output is a single-point prediction of COVID-19 positivity in the future. 10% of the training data was reserved for validation. Jupyter notebook code for the neural network architecture and the chosen hyper-parameters are provided in Appendix B.
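The windowing described above can be sketched as follows. This is a minimal illustration, not the author's notebook code: `make_windows` is a hypothetical helper, the series are toy placeholders, and the choice of the next day's positivity as the target is an assumption (it is consistent with Appendix A, where predictions begin on the 17th day).

```python
def make_windows(positivity, sentiment, window=16):
    """Slide a fixed-length window with stride 1 over two aligned daily series.

    Each sample holds `window` consecutive days of (positivity, sentiment);
    its target is the positivity on the day right after the window.
    """
    if len(positivity) != len(sentiment):
        raise ValueError("series must have the same length")
    samples, targets = [], []
    for start in range(len(positivity) - window):
        stop = start + window
        samples.append(list(zip(positivity[start:stop], sentiment[start:stop])))
        targets.append(positivity[stop])
    return samples, targets


# Toy series of 20 days (placeholders, not the Harris County data).
toy_positivity = [float(day) for day in range(20)]
toy_sentiment = [0.0] * 20
X, y = make_windows(toy_positivity, toy_sentiment)
```

With 20 days and a 16-day window, this yields 4 samples of shape 16 × 2. The nested lists would still need converting to arrays before being passed to a Keras model.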

Cross-validation for time series data may result in data leakage if necessary precautions are not taken (Bergmeir, Hyndman, & Koo, 2018). Additionally, cross-validation, in general, may result in overfitting to the training data (Rao, Fung, & Rosales, 2008). Thus, cross-validation was not performed considering the limited amount of data and the unknown final desired MAE.

3. Results

The data used in the present study, together with the predictions of SITALA, is shown in Fig. 2 and Fig. 3. Data for Harris County from 04/21 to 07/18 has been used in this study. The number of news articles returned by IBM Watson Discovery News is shown with bars that are multiplied by the sign of the average daily news sentiment. The average daily news sentiment (connected blue squares) can be +1.0 for maximum positive sentiment and −1.0 for maximum negative sentiment. Overall, on most of the days, the news sentiment about the spread of coronavirus in Houston, Harris County, Texas, has been negative. A significant positive spike in news sentiment is seen around the time of social unrest in Houston (05/30 to 06/02). The focus of news might have shifted away from COVID-19 during this time-frame. Overall, an upward trend is visible in the COVID-19 positivity data (connected red dots).

As the window size based on the virus’s incubation period is 16 days (Lauer et al., 2020), the test dataset should have at least 16 days’ worth of data. The model is expected to predict at least a week into the future beyond these 16 days. Thus, the last 22 days’ worth of data was reserved for testing. These 22 days amounted to about 25% of the data, and the remaining 75% was used for training. 10% of the training data was used for validation. The continuous black line with shadow shows the predictions of trained SITALA over the entire dataset. SITALA can capture the COVID-19 positivity data response with a mean absolute error (MAE) of 2.76 for the training dataset and 9.6 for the test dataset. SITALA is unable to capture the highest spikes encountered in both the datasets of COVID-19 positivity. This prediction error may have been due to the small number of observations in the total dataset (a total of 88 days’ worth of observations is not at the level of big data requirements) and the smoothing effect introduced by the time window of 16 days.
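The split and the error metric described above can be sketched as follows. This is an illustrative sketch only: `chronological_split` and `mean_absolute_error` are hypothetical helpers, and the values passed to the metric are toy placeholders rather than the study's data.

```python
def chronological_split(series, n_test):
    """Keep time order: the last n_test points form the test set."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


def mean_absolute_error(truth, predicted):
    """Average of |truth - prediction| over all paired points."""
    if len(truth) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum(abs(t - p) for t, p in zip(truth, predicted)) / len(truth)


days = list(range(88))                  # 88 daily observations, as in the study
train, test = chronological_split(days, n_test=22)
test_fraction = len(test) / len(days)   # 22 / 88 = 0.25

# Toy values (placeholders) to show the metric itself.
mae = mean_absolute_error([10.0, 20.0, 30.0], [12.0, 17.0, 30.0])
```

Splitting by position rather than at random keeps every test day later than every training day, which avoids leaking future information into training.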

The theoretical limits of news sentiment are +1.0 and −1.0. However, it is not feasible to expect that all the news outlets would maintain such a sustained positive or negative news sentiment. The observed minimum news sentiment in this study was about −0.62 (refer to Table A.1) and was also comparable to another study on news sentiment during COVID-19 (Buckman, Shapiro, Sudhof, & Wilson, 2020). Thus, the news sentiment limits were assumed to be +0.7 and −0.7 for future news sentiment. With these as the inputs to SITALA, two extreme forecasts were obtained until 08/07. These are also shown in Fig. 3 with dotted black and dotted green lines with shadow. SITALA forecasts the positivity of COVID-19 in Houston to lie within this cone of uncertainty. The model forecasts that a sustained positive sentiment, e.g., “masks are optional”, may prove disastrous for the spread of coronavirus in Houston. COVID-19 test positivity can reach 60% in this case. On the contrary, a sustained negative sentiment, e.g., “death count for COVID-19 will grow at an alarming rate in Houston if mask orders are not followed”, may help to discourage social gatherings and to keep the COVID-19 positivity in check. COVID-19 positivity, as forecasted by SITALA, will stay under 20% in this case.
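One way to obtain such a pair of extreme forecasts is a recursive rollout, in which each predicted positivity is fed back as input while the future sentiment is held at a constant value. The paper does not spell out the rollout procedure, so the sketch below is an assumption; `rollout` is a hypothetical helper and `toy_model` is a placeholder predictor, not SITALA.

```python
def rollout(predict_next, positivity, sentiment, future_sentiment, horizon, window=16):
    """Recursively forecast `horizon` days under a constant future sentiment.

    `predict_next` stands in for a trained model: it maps a list of
    `window` (positivity, sentiment) pairs to the next day's positivity.
    """
    pos, sent = list(positivity), list(sentiment)
    forecast = []
    for _ in range(horizon):
        recent = list(zip(pos[-window:], sent[-window:]))
        nxt = predict_next(recent)
        forecast.append(nxt)
        pos.append(nxt)                # feed the prediction back in
        sent.append(future_sentiment)  # assumed sentiment for that day
    return forecast


def toy_model(recent):
    """Placeholder predictor (NOT SITALA): last positivity nudged by mean sentiment."""
    mean_sentiment = sum(s for _, s in recent) / len(recent)
    return recent[-1][0] + mean_sentiment


history_pos = [10.0] * 16
history_sent = [0.0] * 16
upper = rollout(toy_model, history_pos, history_sent, +0.7, horizon=5)
lower = rollout(toy_model, history_pos, history_sent, -0.7, horizon=5)
```

In this sketch the two scenarios share the same first forecast, because the first window contains only observed data, and they diverge afterwards. The first forecast row of Table A.1 is likewise identical for both sentiment scenarios.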

COVID-19 test positivity (truth) from 07/19 to 08/05 has also been shown in Fig. 3. The ground truth in the forecast falls perfectly on or inside the predicted boundaries except for 07/21, where SITALA under-predicts, and 08/02 through 08/04, where SITALA over-predicts. The former error of under-prediction may be problematic, but the latter error of over-prediction is not harmful. Note that SITALA was only trained on the data until 06/26. Also shown in Fig. 3 is the prediction by another model published in the literature (Jha et al., 2020). The model developed by Jha et al. is a SEIRD (i.e., Susceptible, Exposed, Infectious, Recovered, and Deceased) temporal model. Bayesian learning was used to find the optimal model parameters. The model was calibrated for the data of the entire state of Texas. The positivity predictions of this model for Harris County are on the low side. This prediction error highlights the drawback of SEIRD models that do not account for any spatial data. Local news articles capture the spatial information of the COVID-19 spread, and SITALA, which made use of this additional spatial information, could better predict the COVID-19 positivity.

4. Discussion

This study highlights the multivariate nature of COVID-19 positivity. The unknowns about the disease have not yet been thoroughly understood. However, public policymakers can benefit from models like SITALA, which add a news sentiment dimension to the COVID-19 test positivity data to make forecasts. The long-term effect of sentiment due to the virus incubation period of 14 days can be captured using an AI that makes use of dilated causal convolutions. In the coming weeks or even months, news publishers would have a more significant role in curbing the spread of coronavirus in Houston. SITALA is a continually evolving AI and should be enhanced with newer data, as and when available, using transfer learning. SITALA may be deployed in other similar crisis-stricken counties in New York, Florida, and California.

4.1. Limitations

Restricting the query to articles having ‘houston’ in the URL may have caused the omission of a few relevant articles that did not have Houston in the URL. During the initial few days of the training dataset, there were hardly any articles relevant to the IBM Watson query, and thus the sentiment during this period was assumed to be neutral, i.e., a value of 0.

4.2. Ethics (Cutler et al., 2019, Ienca and Vayena, 2020)

SITALA may only be used as an ethical guide by public policymakers to set policies. The author does not support any unethical use of SITALA.

Disclaimers

Funding: No specific funding was received for this work; Data and materials availability: All data is available in Appendix A.

CRediT authorship contribution statement

Prathamesh S. Desai: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Visualization.

Declaration of Competing Interest

The author declares that he has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Biography

Prathamesh is a research scientist in Mechanical Engineering at W. M. Rice University. With a Ph.D. in mechanical engineering from Carnegie Mellon University (CMU), he develops experimentally-validated computational models (high fidelity physics-based and surrogate AI-enabled) of systems involving particles, fluids, and solids.

Appendix A. Entire dataset

The entire dataset for Harris County that was used in the present study is shown in Table A.1. Also included are the predictions and forecast of SITALA. Because a window size of 16 was used in this study, the predictions of SITALA start from the $17^{th}$ day (i.e., 5/7).

Table A.1.

		Inputs		Output	SITALA predictions
	Date	Percent cleaned test positivity	Percent COVID news sentiment	Percent cleaned test positivity	Positivity predictions	Forecast with positive sentiment (70)	Forecast with negative sentiment (−70)
Training dataset with 10% reserved for validation	4/23	2.24	0.00	2.24
	4/24	3.56	0.00	3.56
	4/25	4.38	0.00	4.38
	4/26	21.19	0.00	21.19
	4/27	3.93	−61.81	3.93
	4/28	14.83	0.00	14.83
	4/29	4.24	0.00	4.24
	4/30	7.58	0.00	7.58
	5/1	6.67	0.00	6.67
	5/2	5.76	0.00	5.76
	5/3	4.89	0.00	4.89
	5/4	4.29	−52.66	4.29
	5/5	3.73	0.00	3.73
	5/6	3.18	0.00	3.18
	5/7	2.62	0.00	2.62	2.60
	5/8	2.36	0.00	2.36	2.49
	5/9	6.29	0.00	6.29	6.08
	5/10	4.20	0.00	4.20	4.16
	5/11	2.23	0.00	2.23	2.06
	5/12	3.93	0.00	3.93	3.67
	5/13	4.50	0.00	4.50	4.32
	5/14	4.29	0.00	4.29	4.17
	5/15	9.17	5.43	9.17	8.85
	5/16	4.78	1.36	4.78	4.59
	5/17	1.81	−26.88	1.81	1.76
	5/18	7.66	13.96	7.66	7.32
	5/19	2.38	−10.04	2.38	2.19
	5/20	5.00	7.82	5.00	4.88
	5/21	6.64	1.89	6.64	6.49
	5/22	8.48	11.79	8.48	8.32
	5/23	6.74	−54.95	6.74	6.56
	5/24	4.99	−11.66	4.99	4.92
	5/25	3.25	−30.84	3.25	3.19
	5/26	1.46	6.23	1.46	1.52
	5/27	14.83	11.26	14.83	14.78
	5/28	4.27	11.29	4.27	4.35
	5/29	6.20	5.52	6.20	5.92
	5/30	6.06	39.18	6.06	6.00
	5/31	21.91	25.26	21.91	21.21
	6/1	0.96	16.38	0.96	0.68
	6/2	8.34	0.33	8.34	7.94
	6/3	12.92	−3.21	12.92	12.45
	6/4	17.55	15.25	17.55	17.02
	6/5	12.49	9.44	12.49	11.88
	6/6	7.89	−26.27	7.89	6.94
	6/7	5.57	−15.18	5.57	5.21
	6/8	11.63	−19.69	11.63	11.01
	6/9	16.62	−14.70	16.62	16.26
	6/10	15.01	2.63	15.01	14.11
	6/11	13.40	−12.09	13.40	13.22
	6/12	11.79	−18.90	11.79	11.28
	6/13	10.19	−28.92	10.19	9.99
	6/14	7.46	2.87	7.46	7.58
	6/15	7.16	−4.26	7.16	7.27
	6/16	5.53	2.51	5.53	5.57
	6/17	10.04	14.69	10.04	10.10
	6/18	7.08	−0.39	7.08	6.87
	6/19	5.75	15.41	5.75	5.68
	6/20	17.55	6.68	17.55	16.89
	6/21	27.71	−27.42	27.71	6.39
	6/22	3.71	4.48	3.71	6.63
	6/23	60.35	−17.23	60.35	12.43
	6/24	45.44	−10.17	45.44	9.54
	6/25	30.53	−0.26	30.53	13.65
	6/26	15.62	−16.69	15.62	11.36

Test dataset	6/27	9.05	−10.33	9.05	14.38
	6/28	9.18	−48.70	9.18	14.08
	6/29	1.64	−20.83	1.64	17.87
	6/30	19.83	−15.70	19.83	11.95
	7/1	14.38	−20.46	14.38	9.54
	7/2	20.69	−30.37	20.69	17.72
	7/3	23.57	−11.68	23.57	8.66
	7/4	15.27	−24.72	15.27	12.23
	7/5	6.32	−35.68	6.32	16.23
	7/6	7.38	1.79	7.38	11.14
	7/7	13.42	3.53	13.42	12.01
	7/8	15.12	−41.36	15.12	11.32
	7/9	13.59	−50.11	13.59	10.55
	7/10	11.44	−27.35	11.44	12.04
	7/11	10.81	−21.80	10.81	5.45
	7/12	25.92	−51.09	25.92	11.06
	7/13	18.76	−4.39	18.76	6.50
	7/14	49.02	0.49	49.02	6.91
	7/15	28.24	−11.31	28.24	7.88
	7/16	27.44	1.59	27.44	16.13
	7/17	26.65	−9.15	26.65	9.96
	7/18	20.28	−31.79	20.28	20.28

Forecast	7/19	12.27				9.23	9.23
	7/20	14.15				17.31	13.78
	7/21	20.52				26.54	17.34
	7/22	27.54				14.89	8.86
	7/23	24.79				22.31	11.32
	7/24	19.83				32.09	16.35
	7/25	16.16				26.15	8.79
	7/26	21.30				29.99	10.60
	7/27	21.00				37.56	14.11
	7/28	30.01				33.28	12.61
	7/29	39.18				41.10	9.85
	7/30	23.33				44.90	15.87
	7/31	43.18				45.41	14.24
	8/1	37.56				49.85	16.57
	8/2	12.00				57.47	18.08
	8/3	5.75				56.75	19.36
	8/4	15.04				55.62	16.81
	8/5	21.09				56.75	17.65
	8/6					59.65	17.22
	8/7					60.82	17.17


Appendix B. Jupyter notebook

The part of the Jupyter notebook that describes the architecture of SITALA is reproduced below. Causal padding and relu activation were used along with 4 dilated convolutional hidden layers. Contact PSD (pratnsai@gmail.com) to obtain the trained SITALA for transfer learning.

# SITALA: Sentiment Informed Time-series Analyzing AI
# Location: Harris County, TX
# Purpose: Forecast spread of coronavirus
# Author: Prathamesh S. Desai

import numpy as np
import tensorflow as tf

# Input dimensions taken from Section 2.2: a 16-day window of two daily
# features (test positivity and news sentiment).
window = 16
n_features = 2

# X_train (samples, window, n_features) and Y_train (samples,) are assumed
# to have been prepared beforehand from the dataset in Appendix A.

tf.keras.backend.clear_session()
tf.random.set_seed(40)
np.random.seed(40)

# Model definition: four dilated causal convolutions followed by dense layers
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(window, n_features)))
for dilation_rate in (1, 2, 4, 8):
    model.add(tf.keras.layers.Conv1D(filters=64, kernel_size=2, strides=1, dilation_rate=dilation_rate, padding="causal", activation="relu"))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation="relu"))
model.add(tf.keras.layers.Dense(1))

# Model compilation (learning_rate replaces the deprecated lr argument)
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-4)
model.compile(loss=tf.keras.losses.Huber(), optimizer=optimizer, metrics=["mae"])
early_stopping = tf.keras.callbacks.EarlyStopping(patience=200)
model.summary()

# Training, with 10% of the training data held out for validation
history = model.fit(X_train, Y_train, epochs=500, verbose=1, validation_split=0.1, callbacks=[early_stopping])
print("SITALA has been trained")
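As a quick check on the architecture in the listing, the span of past days seen by the stacked dilated causal convolutions can be computed with the standard receptive-field formula for stride-1 convolutions. The helper name `receptive_field` is hypothetical; the kernel size and dilation rates are those in the listing.

```python
def receptive_field(kernel_size, dilation_rates):
    """Input steps seen by one output of stacked stride-1 dilated causal convolutions."""
    return 1 + sum((kernel_size - 1) * rate for rate in dilation_rates)


# Kernel size 2 with dilation rates 1, 2, 4, and 8, as in the listing.
field = receptive_field(kernel_size=2, dilation_rates=(1, 2, 4, 8))
```

The result is 16 time steps, so the last position of the convolutional stack covers the full 16-day input window.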


References

Adolph, C., Amano, K., Bang-Jensen, B., Fullman, N., & Wilkerson, J. (2020). Pandemic politics: Timing state-level social distancing responses to COVID-19. medRxiv.
Alamoodi, A., Zaidan, B., Zaidan, A., Albahri, O., Mohammed, K., Malik, R., Almahdi, E., Chyad, M., Tareq, Z., Albahri, A. et al. (2020). Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: A systematic review. Expert systems with applications, (p. 114155).
Ayyoubzadeh S.M., Ayyoubzadeh S.M., Zahedi H., Ahmadi M., Kalhori S.R.N. Predicting COVID-19 incidence through analysis of Google trends data in Iran: Data mining and deep learning pilot study. JMIR Public Health and Surveillance. 2020;6 doi: 10.2196/18828.
Benvenuto, D., Giovanetti, M., Vassallo, L., Angeletti, S., & Ciccozzi, M. (2020). Application of the ARIMA model on the COVID-2019 epidemic dataset. Data in brief, (p. 105340).
Bergmeir C., Hyndman R.J., Koo B. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Computational Statistics & Data Analysis. 2018;120:70–83.
Borovykh, A., Bohte, S., & Oosterlee, C.W. (2017). Conditional time series forecasting with convolutional neural networks. arXiv preprint arXiv:1703.04691.
Buckman S.R., Shapiro A.H., Sudhof M., Wilson D.J., et al. News sentiment in the time of COVID-19. FRBSF Economic Letter. 2020;8:1–05.
Chimmula V.K.R., Zhang L. Time series forecasting of COVID-19 transmission in Canada using LSTM networks. Chaos, Solitons & Fractals. 2020:109864. doi: 10.1016/j.chaos.2020.109864.
Chinazzi, M., Davis, J.T., Ajelli, M., Gioannini, C., Litvinova, M., Merler, S., y Piontti, A.P., Mu, K., Rossi, L., Sun, K. et al. (2020). The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science, 368, 395–400.
Cutler A., Pribić M., Humphrey L. IBM Corporation; 2019. Everyday ethics for artificial intelligence. PDF.
Dandekar, R., & Barbastathis, G. (2020). Neural network aided quarantine control model estimation of global COVID-19 spread. arXiv preprint arXiv:2004.02752.
Géron, A. (2019). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. O’Reilly Media.
Giuliani, D., Dickson, M. M., Espa, G., & Santi, F. (2020). Modelling and predicting the spatio-temporal spread of coronavirus disease 2019 (COVID-19) in Italy. Available at SSRN 3559569.
High R. The era of cognitive systems: An inside look at IBM Watson and how it works. IBM Corporation, Redbooks. 2012:1–16.
Ienca M., Vayena E. On the responsible use of digital data to tackle the COVID-19 pandemic. Nature Medicine. 2020;26:463–464. doi: 10.1038/s41591-020-0832-5.
Jha P.K., Cao L., Oden J.T. Bayesian-based predictions of COVID-19 evolution in texas using multispecies mixture-theoretic continuum models. Computational Mechanics. 2020;66:1055–1068. doi: 10.1007/s00466-020-01889-z.
Karisani, N., & Karisani, P. (2020). Mining coronavirus (COVID-19) posts in social media. arXiv preprint arXiv:2004.06778.
Lauer S.A., Grantz K.H., Bi Q., Jones F.K., Zheng Q., Meredith H.R., Azman A.S., Reich N.G., Lessler J. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Annals of Internal Medicine. 2020;172:577–582. doi: 10.7326/M20-0504.
Mohamadou, Y., Halidou, A., & Kapen, P. T. (2020). A review of mathematical modeling, artificial intelligence and datasets used in the study, prediction and management of COVID-19. Applied Intelligence, (pp. 1–13).
Nguyen, T. T. (2020). Artificial intelligence in the battle against coronavirus (COVID-19): A survey and future research directions. arXiv preprint arXiv:2008.07343.
Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., & Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.
Rao, R.B., Fung, G., & Rosales, R. (2008). On the dangers of cross-validation. an experimental evaluation. In Proceedings of the 2008 SIAM international conference on data mining (pp. 588–596). SIAM.
Shinde G.R., Kalamkar A.B., Mahalle P.N., Dey N., Chaki J., Hassanien A.E. Forecasting models for coronavirus disease (COVID-19): A survey of the state-of-the-art. SN Computer Science. 2020;1:1–15. doi: 10.1007/s42979-020-00209-9.
Ting D.S.W., Carin L., Dzau V., Wong T.Y. Digital technology and COVID-19. Nature Medicine. 2020;26:459–461. doi: 10.1038/s41591-020-0824-5.
USDoT (2020). Air traffic data, May 2020: 89% reduction in U.S. airline passengers from May 2019 (preliminary) [the easiest access to this source is via the URL]. URL:https://tinyurl.com/y699a4oz.
Wan R., Mei S., Wang J., Liu M., Yang F. Multivariate temporal convolutional network: A deep neural networks approach for multivariate time series forecasting. Electronics. 2019;8:876.
Yang Z., Zeng Z., Wang K., Wong S.-S., Liang W., Zanin M., Liu P., Cao X., Gao Z., Mai Z., et al. Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions. Journal of Thoracic Disease. 2020;12:165. doi: 10.21037/jtd.2020.02.64.

