Modelling framework for the analysis of symptom associations and COVID-19 prediction. Data from 777 patients were obtained from different hospitals in the south of Spain. (a) To examine the association between a COVID-19 diagnosis and the reported intensity of loss of smell and taste, along with other symptoms, a first model was derived using stepwise logistic regression (LR) with a holdout validation scheme, in which the sample was split into a training dataset (75%) and a testing dataset (25%). Model performance was assessed through ROC analysis, with AUC, SE, PPV and NPV calculated on the holdout testing dataset (25%). (b) To examine the discrimination ability and predictive value of different symptom variable datasets, including categorical variables (D1), continuous visual analog scale (VAS) scores (D2), dichotomized VAS scores (D3), and simplified predictor datasets with a reduced number of symptoms (D4 and D5), a 50-fold cross-validation scheme was designed to assess three ML algorithms (LR, RF and SVM). The performance of each model was calculated as the mean AUC, SE, SP, PPV and NPV over the 50 cross-validated estimates obtained for that model. LR = logistic regression, RF = random forest, SVM = support vector machine, ML = machine learning, ROC = receiver operating characteristic, AUC = area under the curve, SE = sensitivity, SP = specificity, PPV = positive predictive value, NPV = negative predictive value.
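As an illustration of how the per-fold metrics named in the caption could be computed and averaged, the following is a minimal Python sketch, not the authors' implementation. The function names `diagnostic_metrics` and `mean_over_folds` and the toy labels are hypothetical, and AUC is left out because it requires predicted scores rather than class labels.

```python
from statistics import mean


def diagnostic_metrics(y_true, y_pred):
    """Return SE, SP, PPV and NPV for binary labels (1 = positive diagnosis)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # An empty denominator makes the metric undefined for that fold.
        return num / den if den else float("nan")

    return {
        "SE": ratio(tp, tp + fn),   # sensitivity
        "SP": ratio(tn, tn + fp),   # specificity
        "PPV": ratio(tp, tp + fp),  # positive predictive value
        "NPV": ratio(tn, tn + fn),  # negative predictive value
    }


def mean_over_folds(fold_results):
    """Average each metric over the per-fold estimates."""
    return {k: mean(r[k] for r in fold_results) for k in fold_results[0]}


# Hypothetical toy data: two folds of (true labels, predicted labels).
# The scheme in the caption would yield 50 such folds per model.
folds = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 1, 0]),
]
per_fold = [diagnostic_metrics(t, p) for t, p in folds]
summary = mean_over_folds(per_fold)
print(summary)
```

In practice the predicted labels for each fold would come from a model (LR, RF or SVM) fitted on the remaining folds; the sketch covers only the metric and averaging step.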