2021 Feb 3;10(4):570. doi: 10.3390/jcm10040570

© 2021 by the authors.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Modelling framework for the analysis of symptom associations and COVID-19 prediction. Data from 777 patients were obtained from different hospitals in the South of Spain.

(a) To analyze the association between a COVID-19 diagnosis and the intensity reported for loss of smell and taste, along with other symptoms, a first model was derived using step-wise logistic regression (LR) with a holdout validation scheme, splitting the sample into a training dataset (75%) and a testing dataset (25%). Model performance was assessed through ROC analysis, with the AUC, SE, PPV and NPV calculated on the holdout testing dataset (25%).

(b) To analyze the discrimination ability and predictive value of different symptom variable datasets, including categorical variables (D1), continuous visual analog scales, VAS (D2), dichotomized VAS (D3), and simplified predictor datasets with a reduced number of symptoms (D4 and D5), a comprehensive 50-fold cross-validation scheme was designed to assess three different ML algorithms (LR, RF, and SVM). The performance of each model was calculated as the mean AUC, SE, SP, PPV and NPV over the 50 cross-validated estimates obtained for that model.

LR = logistic regression; RF = random forest; SVM = support vector machine; ML = machine learning; ROC = receiver operating characteristic; AUC = area under the curve; SE = sensitivity; SP = specificity; PPV = positive predictive value; NPV = negative predictive value.
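The validation bookkeeping named in the caption can be sketched in plain Python, as an illustration only. The helper names (`holdout_split`, `k_folds`, `confusion_metrics`, `auc`) are hypothetical and do not come from the article. Reading "50-fold" as a single k-fold partition with k = 50 is an assumption, as are the shuffling seeds and the absence of stratification. No model is fitted here; the sketch only shows how the 75%/25% split, the folds, and the reported metrics (SE, SP, PPV, NPV, AUC) could be computed from labels and predictions.

```python
import random


def holdout_split(n, train_frac=0.75, seed=0):
    """Shuffle indices 0..n-1 and split them into train/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * train_frac))
    return idx[:cut], idx[cut:]


def k_folds(n, k=50, seed=0):
    """Partition shuffled indices into k disjoint folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "SE": ratio(tp, tp + fn),
        "SP": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as one half (the Mann-Whitney formulation of the ROC area).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mean_over_folds(per_fold_values):
    """Average a metric over folds, skipping folds where it is undefined."""
    vals = [v for v in per_fold_values if v == v]  # NaN != NaN
    return sum(vals) / len(vals) if vals else float("nan")


# Index bookkeeping for a sample of 777 patients, as in the caption.
train_idx, test_idx = holdout_split(777)
folds = k_folds(777, k=50)
```

In a cross-validation loop, each fold in turn would serve as the test set while the remaining folds train the model, and `mean_over_folds` would summarize each metric across the 50 estimates. With 777 patients and 50 folds, each test fold holds only 15 or 16 cases, so per-fold estimates are noisy and some may be undefined, which is why the averaging helper skips NaN values.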