2021 Apr 26;23(6):1467–1497. doi: 10.1007/s10796-021-10131-x

Table 3.

Summary of selected research papers about COVID-19 and data used within those studies

Reference	Dataset	COVID-19 data	Time interval	AI/ML method	Performance	Relevance	Shortcoming
Car et al. (2020)	JHU CSSE	time series. Infected, recovered, and deceased patients. 20,706 data points for 406 locations	Jan 22 – Mar 12, 2020	MLP regressor. Limited-memory BFGS (Broyden–Fletcher–Goldfarb–Shanno) algorithm	R² (confirmed): 0.94; R² (recovered): 0.781; R² (deceased): 0.986, on 5-fold cross-validation	Model of novel viral infections with geographical and time data as inputs. Average training time 2357 min on 16 48-thread HPC nodes, for 5-fold cross-validation and grid search of 5376 items.	Models can be compared with various infectious diseases. Other approaches should be applied to gain explainability.
Zhu et al. (2020)	Huami Wearable Device	time series. Physiological data. 1.3 million users (with or without COVID-19)	Jul 1, 2017 – Apr 8, 2020	Regression model combining sparse categorical features and dense numerical features (CDNet), that concatenates 2 subnetworks: CatNN and DenNN	Pearson’s coefficient ρ: 0.68	Prediction using dynamic physiological data may have an advantage in recognition of the outbreak of infection.	The validity of the statistical description depends on both the user scale and diversity.
Ghamizi et al. (2020)	Google’s Mobility Reports	time series. 32 features (mobility trends over time and demographic features) for 97 different countries. 4,625 inputs of 32 features each	Jan 3 – Apr 29, 2020	Feed-Forward Neural Network (FFNN)	R²: 0.97 vs R² (LSTM): 0.95	FFNN provides accurate and interpretable predictions.	Better feature engineering or neural architecture search (with CNN or RNN) is needed.
Mackey et al. (2020)	Twitter and Instagram	text. Sales of COVID-19 related products. 1,042 unique tweets and 596 Instagram posts	Feb 5 – May 7, 2020	NLP & RNN and LSTM	AUC: 94–99 (based on Li et al. (2019))	Identified over 1000 suspect selling posts	Multimodal methods that could analyze and distinguish both text and image have not been used.
Murphy et al. (2020)	Netherlands Hospitals	images. Chest X-rays. 994 images including 512 images from COVID-19 positive subjects	Mar 4 – Apr 6, 2020	CAD4COVID-Xray, based on CAD4TB v6 - a commercial deep learning system	AUC: 0.81 Spec: 78%; Sens: 75%	Performance compared against 6 independent readers	Need to take into account related patient details.
Ls et al. (2020)	2 hospitals in China	images. CT scans. 408 COVID-19 patients	Jan 1 – Mar 18, 2020	ResNet34 as the backbone model of a multiple instance learning (MIL) framework	(ROC) AUC: 0.987; ACC: 97.4% on 5-fold cross-validation	Model can be employed as a tool for prognosis prediction. Validated a MIL-based predictive model using CT imaging.	i) Sample size was relatively small; ii) Lack of transparency and interpretability (like all DL models)
Zhang et al. (2020)	Wuhan and Ecuador centers; Radiopaedia dataset	images. CT images. 2,246 patients including 752 COVID-19 patients used for training	Jan 25 – Mar 25, 2020	(i) Segmentation networks: U-net, DRUNET, FCN, SegNet, and DeepLabv3. (ii) Classification network: ResNet-18	ACC: 90.71%; Sens: 92.50%; Spec: 90.00%	Performance comparable to that of practicing radiologists.	The clinical prognostic model should be refined with varying risk thresholds associated with different clinical prognoses.
Abdel-Basset et al. (2021a)	Italian Society of Medical and Interventional Radiology	images. CT images. 80 COVID-19 patients used for image segmentation	before Apr 11, 2020	Few-shot segmentation (FSS) with four encoder blocks based on pre-trained Res2Net-50	DSC: 0.798; Sens: 0.803; Spec: 0.986	Model could outperform all compared approaches on multiple evaluation metrics.	i) Comprehensive parameter tuning is needed to attain the best results; ii) Predictions lack thorough uncertainty quantification and cannot achieve a very precise segmentation; iii) Accountability and interpretability need to be improved.
Roy et al. (2020)	ICLUS-DB	video. Lung ultrasound (LUS) videos. 35 patients (including 17 COVID-19 patients) generating 277 videos	Mar – Apr, 2020	ConvNet similar to van Sloun and Demi (2020), B-line, STN and CNN are jointly trained using the Adam optimizer	ACC: 96%; binary Dice score: 0.75	i) Fully annotated dataset of LUS images; ii) Predicts the disease severity score associated with an input frame.	i) The temporal structure between frames should be leveraged in a sequential model; ii) The dataset should be larger and more balanced.
Banerjee et al. (2020)	Hospital in Brazil	time series. Laboratory test clinical data: age, outcome from SARS-CoV-2 test and standard full blood count (15 features). individual patients, including 81 COVID-19 patients	Mar 28 – Apr 3, 2020	i) ANN; ii) random forest (RF) and Lasso-elastic-net regularized generalized linear model (glmnet); iii) simple logistic regression (LR)	(i) (ROC) AUC: 0.95 ± 0.08; (ii) (ROC) AUC: 94%; (iii) (ROC) AUC: 81%	Improves initial screening for patients where PCR-based diagnostic tools are limited.	Random forests and glmnet offer a clearer overview of the most relevant factors than ANN, as well as a better indication of how a decision has been reached.
Pan et al. (2021)	2 isolation centers of Huazhong University of Science and Technology in Wuhan	multimedia. Chest CT scans. 931 confirmed COVID-19 vs 1340 healthy persons	Until Mar 31, 2020	COVID-Lesion Net based on a combination of U-net and fully convolutional networks	Dice coefficient: 82.08% (85.00% for the training)	Deep learning-based quantification for COVID-19: quantification of the lung volume and the percent of the lung involvement.	i) Performance was measured against no standard for the lesion area quantification for viral pneumonia; ii) No multi-center training
Ismael and Şengür (2021)	three different sources (Cohen, Kaggle, Radiology Assistant)	multimedia. Chest X-ray images. 180 COVID-19 and 200 normal (healthy) chest X-ray images	Mar 10, 2020	Deep features model (ResNet50) and SVM with linear kernel	Accuracy: 94.7%; other: 89.1%–90.3%	Three deep CNN methods have been applied. In addition to different kernel functions, the deep features have been classified through SVM.	More testing needed.
Lopez-Rincon et al. (2021)	NCBI database of genetic variation and NGDC (National Genomics Data Center)	sequences. 583 sequences (*.fasta files) from the NGDC	Mar 15, 2020	CNN	Accuracy of 98.73	The network was able to systematically discover significant sequences to isolate the various virus classes.	Further testing is necessary

During the first peak of the COVID-19 pandemic (stage 3), the most affected countries were in Europe and America, and therefore the databases generally come from these areas. The first examples of social network analysis are reported in this stage, although with a limited number of instances. The time windows during which the data were gathered extend until April 2020. The interval reported for Mackey et al. (2020) runs to May 7, 2020, but the study is still assigned to stage 3 because, at that time, the pandemic in the USA was still in its early stages.
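
The Performance column of Table 3 mixes several metrics (accuracy, sensitivity, specificity, Dice score, R²), some reported as fractions and others as percentages. As a reading aid, the following minimal Python sketch shows the standard definitions of these metrics for binary labels and for regression outputs. The function names and the toy data are illustrative assumptions; this is not the evaluation code of any of the cited studies.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: tp / (tp + fn)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True negative rate: tn / (tn + fp)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of correct predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def dice_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Dice similarity coefficient for binary masks: 2*tp / (2*tp + fp + fn)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy labels, made up for illustration only.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    print("sensitivity:", sensitivity(labels, preds))
    print("specificity:", specificity(labels, preds))
    print("accuracy:", accuracy(labels, preds))
    print("dice:", dice_score(labels, preds))
```

Multiplying any of these fractions by 100 gives the percentage form used in some rows of the table, so a Dice score of 0.798 and a Dice coefficient of 82.08% are on the same scale.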