2021 May 21;13(3):2013–2025. doi: 10.1007/s13204-021-01868-7

Table 2.

Articles applying supervised and unsupervised machine learning to the analysis of COVID-19, with details of the dataset, author name, country of publication, year of publication, the method used in the study, and the results

n Author Year Country Dataset Method Tasks and Algorithms Result
1 Khanday et al. (2020) 2020 India

GitHub

212 reports

Supervised learning

Classification

Logistic Regression and Naive Bayes

The findings showed that logistic regression and multinomial Naive Bayes performed better than the other commonly used algorithms, with an accuracy of 96%
2 Burdick et al. (2020a) 2020 USA

United States health systems

197 patients

Supervised learning

Classification

Logistic Regression

Their results showed that the algorithm achieved a higher diagnostic odds ratio (12.58) for predicting the need for ventilation and for triaging patients than a comparator early warning system, the Modified Early Warning Score (MEWS). MEWS showed a sensitivity of 0.78, whereas the algorithm showed a sensitivity of 0.90 together with higher specificity (p < 0.05). The algorithm also accurately identified 16% more patients than a commonly used scoring system, which reduced false-positive results
3 Varun et al. (2020) 2020 USA 184,319 reported cases Supervised learning

Classifications

Convolutional Neural Networks (CNN)

In response to this crisis, the medical and academic centers in New York City issued a call to action to artificial intelligence researchers to leverage their electronic medical record (EMR) data to better understand SARS-CoV-2 patients. Due to the scarcity of ventilators and a reported need for a quick and accurate method of triaging patients at risk for respiratory failure, our purpose was to develop a machine-learning algorithm for frontline physicians in the emergency department and the inpatient floors to better risk-assess patients and predict who would require intubation and mechanical ventilation
4 Luca et al. (2020) 2020 Italy 85 chest X-rays Supervised Learning

Classification

K-nearest neighbors classifier (k-NN)

In the paper, we propose a method to automatically detect the COVID-19 disease by analyzing medical images. We exploit supervised machine-learning techniques to build a model from a data set of 85 chest X-rays that is freely available for research purposes. The experiment shows the effectiveness of the proposed method in discriminating between the COVID-19 disease and other pulmonary diseases
5 Constantin et al. (2020) 2020 Germany 152 datasets of COVID-19 patients, 500 chest CTs Supervised learning

Classifications

Convolutional Neural Network (CNN)

The findings showed that combining machine learning with a clinically embedded software development platform allowed time-efficient development, immediate deployment, and fast adoption in medical routine. The algorithm for fully automated segmentation of the lung and opacity quantification was ready for medical use within just 10 days and achieved human-level performance even for complex cases
6 Lamiaa et al. (2020) 2020 Egypt COVID-19 5000 cases Supervised learning

Regression

Linear Regression model

The results showed that the selected models, namely the exponential, fourth-degree, fifth-degree, and sixth-degree polynomial regression models, performed very well, especially the fourth-degree model, which can help the government prepare its procedures for 1 month. Furthermore, they applied a well-known log-based growth regression model to estimate the epidemic peak and the end date of the epidemic in 2020, as well as the final total size of COVID-19 cases
7 Dan et al. (2020) 2020 Israel 6995 patients in Sheba Medical Center Supervised learning

Classifications

Artificial Neural Network (ANN)

The variables that contributed most to the models were APACHE II score, white blood cell count, time from symptoms to admission, oxygen saturation, and blood lymphocyte count. Machine-learning models demonstrated high efficacy in predicting critical COVID-19 compared to the most efficacious tools available. Hence, artificial intelligence may be applied for accurate risk prediction of patients with COVID-19, to optimize patient triage and in-hospital allocation, better prioritize medical resources, and improve overall management of the COVID-19 pandemic
8 Joep et al. (2020) 2020 Netherlands 319 patients Supervised learning

Classification

Logistic regression

Chest CT, using the CO-RADS scoring system, is a sensitive and specific method that can aid in the diagnosis of COVID-19, especially if RT–PCR tests are scarce during an outbreak. Combining it with a predictive machine-learning model could further improve the accuracy of diagnostic chest CT for COVID-19. Further candidate predictors should be analyzed to improve our model. However, RT–PCR should remain the primary standard of testing, as up to 9% of RT–PCR positive patients are not diagnosed by chest CT or our machine-learning model
9 Christopher et al. (2020) 2020 Germany 368 independent variables Supervised learning

Classifications

Naive Bayes

They focused on the variables and factors that increase COVID-19 incidence in Germany, using a multi-method ESDA approach that provides unique insight into the spatial non-stationarities of COVID-19 occurrence. Variables such as built environment densities, infrastructure, and socioeconomic characteristics all showed an association with the incidence of COVID-19 in Germany when assessed at the county scale

Their findings suggest that implementing social distancing and reducing needless travel can be important methods for reducing contamination

10 Hoyt et al. (2020b) 2020 U.S 290 patients Supervised learning Classification Logistic Regression The findings showed no association between treatment and mortality in the entire population. Among the patients indicated by the algorithm, however, hydroxychloroquine was associated with a statistically significant (p = 0.011) rise in survival, with an adjusted hazard ratio of 0.29 (95% confidence interval (CI) 0.11–0.75); adjusted survival was 82.6% in the treated group and 51.2% in the untreated group. After applying machine learning, the algorithm thus detected a 31% improvement among the COVID-19 population, which shows the important role of machine-learning applications in medicine
11 María et al. (2020) 2020 International Food for each of the 170 countries Unsupervised learning

Clustering

K-means clustering

The research findings stated that countries with the highest death rates were those with a high consumption of fats, while countries with lower death rates had a higher level of cereal consumption accompanied by a lower total average intake of kilocalories
12 Shinwoo et al. (2020) 2020 U.S.A 790 Korean immigrants Supervised learning

Classifications

Artificial Neural Network (ANN)

An Artificial Neural Network (ANN) analysis, a statistical model able to examine complex non-linear interactions of variables, was applied. The algorithm perfectly predicted the person's flexibility, familiarities of everyday discernments, and racist actions toward Asians in the U.S. since the beginning of the COVID-19 pandemic, which provides important suggestions for public health practitioners (Zhang 2020b)
13 Yigrem et al. (2020) 2020 Southern Ethiopia 244 samples Supervised learning Classification Logistic Regression Results showed that more than half of the research participants reported perceived stress related to coronavirus disease, which indicates a strong association between health care staff and perceived stress of COVID-19
14 Abolfazl et al. (2020) 2020 USA US Centers for Disease Control and Prevention and Johns Hopkins University. Database of 57 candidate Supervised learning Classification Artificial Neural Networks (ANN) Results showed that, in the presented model (logistic regression), these factors and variables describe the presence/absence of COVID-19 incidence hotspots, which were identified by Getis-Ord Gi (p < 0.05) in a geographic information system. As a result, the findings provided valuable insights for public health decision makers in categorizing the effect of the potential risk factors associated with COVID-19 incidence level