A risk scoring model of COVID-19 at hospital admission

João José Ferreira Gomes; António Ferreira; Afonso Alves; Beatriz Nogueira Sequeira

doi:10.1371/journal.pone.0288460

PLoS One. 2023 Jul 20;18(7):e0288460. doi: 10.1371/journal.pone.0288460

Editor: Robert Jeenchen Chen

PMCID: PMC10358923 PMID: 37471332

Abstract

Background

The COVID-19 pandemic has been the most serious public health crisis in recent times, a pandemic whose impact was felt across the globe in various groups and populations. Confronted with an urgent problem, people and governments were forced to make decisions without fully understanding the disease. The present work aims to reinforce our ever-growing knowledge of the illness, particularly in modelling the risk of death of a patient admitted to a hospital with a positive COVID-19 test.

Methods

Given the simplicity of using and programming logistic regression in any national healthcare unit and the ease of interpreting the results, we chose this technique over several others. Using scoring techniques, it is possible to associate the various diagnoses with a numerical value (score), and therefore to integrate the patient’s multiple medical conditions into the model as a single continuous variable.

Results

It is possible to establish with good discriminatory capacity (test ROC AUC = 0.8) which COVID patients are at higher risk when admitted to the healthcare unit: people of advanced age with pre-existing conditions, such as diabetes and high blood pressure, or newly acquired conditions, such as pneumonia. Moreover, males and patients whose episodes occur in healthcare units with few available beds (high healthcare unit occupancy) are also at higher risk. The relative importance of each variable in predicting the target is: age (47%), sum of comorbidity scores (28%), healthcare unit score (12%), gender score (7%) and healthcare unit occupancy (6%).

Conclusions

Using a dataset with more than 52,000 people, it was possible to differentiate the likelihood of death from COVID using age, comorbidity information, healthcare unit, healthcare unit occupancy and gender. Age and the comorbidities associated with each patient had a joint contribution of about 75% in explaining COVID-related mortality in Portuguese public hospitals between March 2020 and May 2021.

Introduction

The coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), raised the biggest global alarm since the Spanish Flu pandemic of 1918. The images that first came from China, followed by those from the Bergamo area in Italy, quickly led to a generalized state of fear, resulting in the implementation of lockdowns by many governments (especially in developed countries) that affected millions of people [1].

The consequences were serious both mentally and economically [2, 3], and the measures taken did not always seem to have the intended effect [4]. Panic and a growing number of cases made it difficult to access the reliable information needed to understand the dynamics of the disease and the clinical determinants of its severity.

It is challenging to determine from the existing data the precise risk level of a patient admitted to the hospital with SARS-CoV-2. It is important to understand how serious the risk of death from SARS-CoV-2 was for people who came to health services every day with symptoms of this disease. There are several articles on this topic [5, 6].

The database we had access to comes from the Shared Services of the Ministry of Health (SPMS, Portugal) and is organized by episodes (an episode corresponds to a hospital admission, and the same person can be associated with more than one episode). Patient privacy is assured through reference codes that we transformed into IDs (1, 2, 3, …). Each episode is described by a referral healthcare unit (cluster of hospitals), date of birth, gender, start date, end date (with the outcome categorized as death in the hospital or recovery) and the various clinical diagnoses, in addition to SARS-CoV-2, associated with that episode (comorbidities). Each clinical diagnosis corresponds to a row in the database.

Methods

Thanks to SPMS we were able to access data that, up to May 2021, are the most reliable and complete available in Portugal. The data provided by SPMS include patients who were admitted to a Portuguese public hospital between 1 March 2020 and 10 May 2021 and, at some point during that period, had a positive diagnosis of COVID.

This paper will present a model that allows us to evaluate the probability of survival for a patient who was hospitalized in Portugal, based on variables including age, healthcare unit, average healthcare unit occupancy over the patient’s stay period, gender, and other clinical diagnoses. This information could enable doctors to decide, for instance, whether a patient should enter the ICU (intensive care unit).

Some small healthcare units and patients who periodically went to the hospital for haemodialysis were not considered. In addition, all non-adult patients were excluded, since it was found that SARS-CoV-2 does not damage the immune system of this type of patient except in very specific cases without a defined pattern [7] (Fig 1).

Regarding the clinical diagnoses, we were able, with the help of a medical practitioner, to exclude comorbidities that were not significant to our study and to consider only those that would contribute to a worse prognosis when associated with a respiratory disease.

We were unable to rule out patients who tested positive for COVID in the hospital but were admitted for other causes, as the data do not contain this information. Furthermore, we were unable to filter out cases where a patient had COVID but died of other causes. Therefore, this study focuses on patients admitted to public hospitals who tested positive for COVID at some point during the episode.

There follows a summary of the variables available to help us understand the target variable, namely “Death”.

Data

The variables available were: Code identification by episode (id), Healthcare unit (anonymized) (hcu), Capacity (number of beds in the healthcare unit) (hcu_dim), Date of admission (in_date), Age by year (age), Gender, Outcome after discharge (Survival (0) or Death (1)), Clinical Diagnoses with description, Length of Stay by episode in days (los).

We refer to the target variable as “death”: the dataset has 52,403 episodes, of which 12,546 (24%) resulted in the patient’s death.

Initially we have four predictor variables–age, gender, clinical diagnoses and hcu.

Moreover, considering the entry and exit dates of each patient and the healthcare unit capacity, it was possible to assess the daily healthcare unit prevalence and, from there, the level of healthcare unit occupancy (average healthcare unit occupancy rate) to which each patient was subject during their hospitalization period: number of patients in the database / number of beds (bed counts from the National Health Service [8]). The hcu occupancy will be our 5th predictor variable.
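The occupancy calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' R code: the function names, the toy episodes and the bed count are hypothetical, and counting both the admission day and the discharge day as occupied is an assumption the text does not settle.

```python
from datetime import date, timedelta

def daily_census(episodes):
    """Count, for each calendar day, how many episodes are in the unit."""
    census = {}
    for start, end in episodes:
        day = start
        while day <= end:  # assumption: admission and discharge days both count
            census[day] = census.get(day, 0) + 1
            day += timedelta(days=1)
    return census

def mean_occupancy(stay, census, beds):
    """Average of (patients in database / beds) over one patient's stay."""
    start, end = stay
    n_days = (end - start).days + 1
    total = 0.0
    day = start
    while day <= end:
        total += census.get(day, 0) / beds
        day += timedelta(days=1)
    return total / n_days

# Hypothetical healthcare unit with 10 beds and three overlapping episodes.
episodes = [
    (date(2020, 3, 1), date(2020, 3, 4)),
    (date(2020, 3, 2), date(2020, 3, 3)),
    (date(2020, 3, 3), date(2020, 3, 6)),
]
census = daily_census(episodes)
occ_first = mean_occupancy(episodes[0], census, beds=10)
print(round(occ_first, 3))
```

Because only patients present in the database are counted, the resulting rate is a proxy for COVID-related pressure rather than the true bed occupancy of the unit.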

Let’s now consider the two hcu-related variables, represented in Figs 2 and 3.

We note that, although we know the number of beds available for each health unit, it is unknown how many of these beds can actually be occupied by COVID patients.

Besides the two healthcare unit-related variables presented, we have three patient-related variables–age (continuous), clinical diagnoses and gender (categorical). We will be describing these variables in the Results section.

Statistical modelling

In our preliminary steps, we transform the categorical variables into scores, a technique that is widely used to model the probability of default in bank loans (see for example [9–11]). Thus, we will work with 5 continuous variables.

This method follows a scorecard model approach, which is meant to be simpler to explain and apply. Two commonly cited references in the literature are [10, 11]. Take the variable hcu, for example, which contains 45 categories. Since one of the dummy variables always serves as the reference category, encoding hcu with dummies would require estimating 44 parameters (not counting the intercept, the other categorical variables and any numerical variables). In our case, we are dealing with 3 categorical variables: hcu (45 categories), clinical diagnoses (27 comorbidities) and gender (2 categories), besides the 2 continuous variables age and hcu occupancy. For each categorical variable, one can apply a monotonic transformation related to the target variable to obtain a new numerical variable with a single corresponding coefficient. This transformation can be either monotonically increasing or decreasing, resulting in a positive or negative value of the beta parameter, respectively. Replacing the categorical variable with a numerical one that has a single coefficient results in a more parsimonious model, improving interpretability.
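The text does not state which transformation was used to turn categories into scores. One common choice in the credit-scoring literature is the weight of evidence, and the sketch below assumes it purely for illustration; the function name and the toy counts are hypothetical, and every category is assumed to contain both deaths and survivors so that the logarithm is defined.

```python
import math

def woe_scores(counts):
    """Map each category to ln(share of deaths / share of survivors).

    counts: {category: (n_episodes, n_deaths)}, taken from the training set.
    Higher scores mean the category is over-represented among deaths.
    """
    total_deaths = sum(d for _, d in counts.values())
    total_surv = sum(n - d for n, d in counts.values())
    scores = {}
    for cat, (n, d) in counts.items():
        share_deaths = d / total_deaths
        share_surv = (n - d) / total_surv
        scores[cat] = math.log(share_deaths / share_surv)
    return scores

# Made-up counts for three hypothetical healthcare units.
toy_counts = {"hcu_A": (1000, 150), "hcu_B": (1000, 250), "hcu_C": (1000, 350)}
scores = woe_scores(toy_counts)
for cat, s in scores.items():
    print(cat, round(s, 3))
```

A unit whose death rate equals the overall rate receives a score of zero, and the score increases monotonically with the category's death rate, which is what lets a single regression coefficient stand in for the 44 dummy parameters.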

In the special case of the comorbidity variable (clinical diagnoses), we started by calculating a score for each diagnosis. Following the score estimation, the scores were rescaled to the range [0, 1] so that they no longer contained negative values (a typical score transformation generates both positive and negative scores). Thus, having more comorbidities can only increase the modelled risk of death. Finally, we added the scores by episode to obtain episode-specific information (for example, for a patient/episode with 5 comorbidities we would sum the corresponding 5 scores). The score transformation therefore enables us to account for patient profiles with multiple comorbidities, a feature not easily achievable using just the original categorical information. Although this is an innovative approach, we highlight two potential problems with this variable: (i) the different comorbidities are combined additively per episode, but this is not the only approach (e.g. a multiplicative model would be a valid alternative); (ii) the score per comorbidity is created univariately, which is an expedient approach, but two comorbidities of low severity may, when combined, have high severity (whereas adding two low scores always gives a low total score).
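The rescale-then-sum step can be sketched as below. The text says only that scores were adjusted to [0, 1]; min-max rescaling is assumed here as one way to do that, and the raw score values and function names are hypothetical.

```python
def rescale_unit_interval(raw_scores):
    """Min-max rescale per-diagnosis scores to [0, 1] (assumed method)."""
    lo = min(raw_scores.values())
    hi = max(raw_scores.values())
    return {code: (s - lo) / (hi - lo) for code, s in raw_scores.items()}

def episode_score(diagnoses, scaled):
    """Additive combination: sum the rescaled scores of one episode."""
    return sum(scaled[code] for code in diagnoses)

# Hypothetical raw scores for three of the diagnosis codes in Table 1.
raw = {"OB": -0.4, "DM": 0.1, "SEPSIS": 1.6}
scaled = rescale_unit_interval(raw)

# One episode tagged with diabetes mellitus and septicemia.
total = episode_score(["DM", "SEPSIS"], scaled)
print(scaled, total)
```

Since every rescaled score is non-negative, adding a diagnosis can never lower the episode total, which is the property the paragraph relies on. The sketch also makes limitation (ii) concrete: two diagnoses with small scores always sum to a small total, whatever their clinical interaction.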

We split the dataset into training (70%) and testing (30%) sets. The scores for each healthcare unit, gender and comorbidity were computed on the training partition only, to avoid information leakage from the test partition.

With 5 continuous variables in hand, standardization was applied: each variable was transformed to have zero mean and unit variance (the standardization parameters, the mean and variance, were computed on the training partition). Standardization makes it possible to compare the model coefficients and infer the relative importance of the different variables.
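A minimal sketch of the split and of leakage-free standardization follows. The random seed, the toy ages and the helper names are hypothetical, and the population standard deviation is an assumed choice (the text does not say which estimator was used).

```python
import random
import statistics

def train_test_split(rows, train_frac=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def fit_standardizer(values):
    """Estimate mean and standard deviation from the training values only."""
    return statistics.fmean(values), statistics.pstdev(values)

def standardize(values, mean, sd):
    """Apply previously fitted parameters to any partition."""
    return [(v - mean) / sd for v in values]

ages = [float(a) for a in range(20, 100)]  # 80 toy ages
train, test = train_test_split(ages)
mean, sd = fit_standardizer(train)         # fitted on training data only
train_z = standardize(train, mean, sd)
test_z = standardize(test, mean, sd)       # reuses the training parameters
print(len(train), len(test))
```

The test partition is transformed with the training mean and standard deviation, so its standardized values will not have exactly zero mean and unit variance; that is expected and is what keeps the evaluation honest.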

Results

After being filtered, the database represents 45 Portuguese public healthcare units containing 52,403 episodes. Besides the variables related to hcu, we are also considering variables related to the patients–age, gender and clinical diagnoses–as we can observe in Table 1.

Table 1. Descriptive statistics for four of the variables used in this study.

Variables	Categories	Total	Death	Rate
Age (mean = 70.7, sd = 17.1, median = 74.0)	[18_40]	3535	51	0.01
	[40_50]	3617	161	0.04
	[50_60]	6033	457	0.08
	[60_70]	9323	1372	0.15
	[70_80]	11628	3025	0.26
	[80_85]	7296	2540	0.35
	[85_90]	6535	2724	0.42
	[90_95]	3415	1656	0.48
	[95_100]	940	517	0.55
	[100_105]	81	43	0.53
Gender	Male	24777	5651	0.23
Gender	Female	27626	6895	0.25
Clinical Diagnoses	Smoking (TABACO)	2261	443	0.20
	Obesity (OB)	9234	1832	0.20
	Arterial hypertension (HTA)	22119	5004	0.23
	COVID-19 (COVID)	54384	12607	0.23
	Viral pneumonia (PV)	12589	2946	0.23
	Pulmonary thromboembolism (TEP)	1565	399	0.25
	Hyponatremia (HIPONA)	3100	821	0.26
	Diabetes mellitus (DM)	19787	5280	0.27
	Pneumonia to SARSCoV (PCOV)	22409	6399	0.29
	Non-pulmonary localized infection (ILNP)	6619	1928	0.29
	Anemia (ANE)	7096	2174	0.31
	Acute abdominal disease (DAA)	2074	642	0.31
	Chronic respiratory disease (DPCO)	6285	2012	0.32
	Acute cerebrovascular disease (AVC)	5368	1798	0.33
	Ischemic heart disease (EM)	2650	929	0.35
	Chronic kidney failure (IRC)	14506	5156	0.36
	Bacterial pneumonia (PB)	10751	3887	0.36
	Acute breathing insufficiency (IRESPA)	14800	5531	0.37
	Pulmonary hypertension (HTP)	715	269	0.38
	Cardiac insufficiency (IC)	13017	4936	0.38
	Atrial fibrillation (FA)	7901	3015	0.38
	Acute kidney failure (IRA)	8974	3784	0.42
	Coagulation changes (AC)	377	165	0.44
	Neoplastic disease (cancer) (NEO)	6846	3070	0.45
	Liver failure (IH)	840	377	0.45
	Fungal pneumonia (PF)	90	49	0.54
	Septicemia (SEPSIS)	3517	2255	0.64
Occupancy (%) (mean = 22.7, sd = 15.1, median = 20.7)	[0_10]	12232	2294	0.19
	[10_20]	13768	3242	0.24
	[20_30]	13887	3615	0.26
	[30_80]	12516	3395	0.27

After exploring the variables, we proceed to the construction of our model, starting with univariate regression.

Univariate regression

Unlike many newer data science tools, logistic regression readily provides both a prediction and a corresponding confidence interval for new patients. Table 2 presents, still in a univariate way, the variables available to estimate the probability of death at hcu admission:

Table 2. Results of fitting univariate logistic regression model.

Variable	Estimated Coeff	Standard Error	z	p-value	OR (95% CI)
age	1.143	0.018	62.52	<0.001	3.13 (3.02–3.25)
gender	0.066	0.012	5.54	<0.001	1.07 (1.04–1.09)
hcu_occupancy	0.168	0.011	14.71	<0.001	1.18 (1.16–1.21)
healthcare unit	0.311	0.012	25.34	<0.001	1.37 (1.33–1.4)
Clinical diagnoses	0.783	0.013	62.12	<0.001	2.19 (2.13–2.24)

All variables have a very low p-value, which suggests that each is potentially useful for predicting death in a logistic regression model.
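The odds ratio column of Table 2 follows directly from the coefficient and its standard error: OR = exp(coefficient), with a 95% interval of exp(coefficient ± 1.96 × SE). The sketch below applies this standard Wald formula to the age row; the function name is hypothetical, and small differences from the printed values are expected because the table shows rounded coefficients.

```python
import math

def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )

# Age row of Table 2: coefficient 1.143, standard error 0.018.
or_age, lo_age, hi_age = odds_ratio_ci(1.143, 0.018)
print(f"{or_age:.2f} ({lo_age:.2f}-{hi_age:.2f})")
```

Because the predictors are standardized, each odds ratio describes the change in the odds of death for an increase of one standard deviation in that variable.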

In the following section we will go through the process of building the Logistic Regression model.

Multiple regression

We will start with the 4 variables that contain a single piece of information per episode, and then introduce the 5th variable, clinical diagnoses, for which the same episode can have several associated values, one for each comorbidity (Table 3).

Table 3. Model built from the principal variables and with the training dataset.

Variable	Estimated Coeff	Standard Error	z	p-value	OR (95% CI)
(Intercept)	-1.64117	0.01752	-93.68	<2e-16
age	1.106	0.021	53.63	<2e-16	3.02 (2.9–3.15)
gender	0.175	0.014	12.67	<2e-16	1.19 (1.16–1.22)
hcu_occupancy	0.151	0.013	11.38	<2e-16	1.16 (1.13–1.19)
healthcare unit	0.280	0.014	20.02	<2e-16	1.32 (1.29–1.36)
Clinical diagnoses	0.664	0.013	49.81	<2e-16	1.94 (1.89–1.99)

Since all variables have the same scale (due to standardization), we can compare and rank the contribution of each variable through its regression coefficient [12] (a variable with higher absolute coefficient value has higher model impact). If we take one step further and assume that the sum of the absolute coefficients represents the total modelled effects, we can relativize the importance of each variable j by dividing each coefficient by this sum (Table 4). This practice is common in a scorecard model approach [10, 11]:

Table 4. Relative importance of each variable for the model.

$\mathrm{coef}_j \,/\, \sum_i |\mathrm{coef}_i|$
age	gender	hcu_occupancy	healthcare unit	Clinical diagnoses
46.5%	7.4%	6.3%	11.8%	28.0%
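The percentages in Table 4 can be reproduced from the Table 3 coefficients with a few lines of arithmetic. This is a sketch of that calculation only; the dictionary layout is an illustrative choice, and because the printed coefficients are rounded, the last digit can differ slightly from the table.

```python
# Standardized coefficients from Table 3 (intercept excluded).
coefs = {
    "age": 1.106,
    "gender": 0.175,
    "hcu_occupancy": 0.151,
    "healthcare unit": 0.280,
    "Clinical diagnoses": 0.664,
}

# Sum of absolute coefficients, taken as the total modelled effect.
total = sum(abs(c) for c in coefs.values())

# Share of each variable in that total.
importance = {name: abs(c) / total for name, c in coefs.items()}

for name, share in importance.items():
    print(f"{name}: {100 * share:.1f}%")
```

This ranking is only meaningful because all five predictors were standardized first; with variables on their original scales, coefficient size would mostly reflect units of measurement.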

Thus, almost 47% of the contribution to the model comes from age, 28% from comorbidities and the remaining 25% from the other three variables.

Subsequently, we tried to understand whether there were any relevant interactions between the variables, and we found two that proved somewhat important, as they led to a slight predictive improvement.

These interactions express that gender and comorbidities may have a different impact depending on the patient’s age. Thus, we obtained the model in Table 5:

Table 5. Model built from all significant effects and with the training dataset.

Variable	Estimated Coeff	Standard Error	z	p-value	OR (95% CI)
(Intercept)	-1.673	0.019	-86.69	<0.001
age	1.206	0.023	53.18	<0.001	3.34 (3.19–3.49)
gender	0.134	0.018	7.38	<0.001	1.14 (1.1–1.19)
hcu_occupancy	0.154	0.013	11.59	<0.001	1.17 (1.14–1.2)
health care unit	0.279	0.014	19.93	<0.001	1.32 (1.29–1.36)
Clinical diagnoses	0.802	0.016	51.60	<0.001	2.23 (2.16–2.3)
age:gender	0.074	0.022	3.44	<0.001	1.08 (1.03–1.12)
age: Clinical diagnoses	-0.391	0.020	-19.99	<0.001	0.68 (0.65–0.7)
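To make the role of the two interaction terms concrete, the sketch below turns the Table 5 coefficients into a predicted probability of death. All inputs are assumed to be the standardized scores described in the Methods (so 0 means the training average); the function and dictionary names are hypothetical.

```python
import math

# Coefficients from Table 5 (training partition).
COEF = {
    "intercept": -1.673,
    "age": 1.206,
    "gender": 0.134,
    "occupancy": 0.154,
    "hcu": 0.279,
    "diagnoses": 0.802,
    "age_gender": 0.074,
    "age_diagnoses": -0.391,
}

def predict_death_probability(age, gender, occupancy, hcu, diagnoses):
    """Logistic prediction; every argument is a standardized score."""
    eta = (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["gender"] * gender
        + COEF["occupancy"] * occupancy
        + COEF["hcu"] * hcu
        + COEF["diagnoses"] * diagnoses
        + COEF["age_gender"] * age * gender
        + COEF["age_diagnoses"] * age * diagnoses
    )
    return 1.0 / (1.0 + math.exp(-eta))

# Patient at the training average on every variable.
p_avg = predict_death_probability(0, 0, 0, 0, 0)
# One standard deviation older and one standard deviation more comorbidity.
p_high = predict_death_probability(1, 0, 0, 0, 1)
print(round(p_avg, 3), round(p_high, 3))
```

The negative age-by-diagnoses coefficient means that the effect of one extra standard deviation of comorbidity score is 0.802 − 0.391 × age on the logit scale, so comorbidities add less to the risk of older patients than to that of younger ones.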

Model assessment

ROC curve

To assess general model performance, we can compute the ROC AUC. In addition, it’s possible to establish a cut-off probability and predict death or survival if the model probability is higher or lower than this reference value, respectively. Out of multiple possible approaches, we chose to define this value as the probability for which the sensitivity is equal to the specificity (Fig 4).

Fig 4. The optimal cut-off is 0.287, and a training ROC AUC of 0.811 was obtained.

To avoid test leakage, the cut-off was calculated using training data. Computing the ROC AUC on the test partition, we obtained a value of 0.8. This is similar to the value obtained on the training partition, suggesting that the model generalizes well. Let’s now see how the model performs for each variable, using test data.
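The two quantities used in this assessment can be sketched without any library. The example computes the ROC AUC as the fraction of (death, survivor) pairs ranked correctly, and picks the cut-off among the observed scores where sensitivity and specificity are closest. The toy labels and scores are made up, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, scores):
    """Share of (death, survivor) pairs where the death has the higher score."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))

def sens_spec(y_true, scores, cutoff):
    """Sensitivity and specificity when predicting death for score >= cutoff."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cutoff)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cutoff)
    return tp / n_pos, tn / n_neg

def balanced_cutoff(y_true, scores):
    """Observed score minimising |sensitivity - specificity|."""
    def gap(c):
        sens, spec = sens_spec(y_true, scores, c)
        return abs(sens - spec)
    return min(sorted(set(scores)), key=gap)

# Made-up training outcomes (1 = death) and model probabilities.
y = [0, 0, 0, 0, 1, 0, 1, 1]
p = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
auc = roc_auc(y, p)
cutoff = balanced_cutoff(y, p)
print(round(auc, 3), cutoff)
```

As in the text, the cut-off would be chosen on training data and then applied unchanged to the test partition.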

To better understand how well the predictions fit the observed risk, we can compute a confidence interval (CI) for each prediction. The intervals are computed on the probability scale at the 99% confidence level. Moreover, the lower and upper CI bounds in the following graphs are the averages of the lower and upper limits of the individuals in each variable group (Figs 5–7).

With almost 47% of model contribution, age is the most important of all studied factors in predicting death by COVID-19.

In general, the model predictions track the observed risk well, even on the test data. Comorbidities are the 2nd most important variable. Instead of showing the effect of the clinical diagnoses variable (the sum of all comorbidity scores in an episode), a useful but cryptic variable, we present the results per comorbidity. We highlight septicemia as the most severe.

Is medical care better for hospitals with lower hcu occupancy? And if so, is it enough to have any influence on the mortality? Although not by much, as we can attest by the scale of variations in probability, we can in fact observe a risk increase associated with a higher hcu occupancy.

In summary, the model generally managed to capture the diversity of the data, and can be a useful tool to help determine, for example, whether a patient should be admitted to the ICU (intensive care unit).

Plotting multiple variables

To visualize the optimal cut-off in the variable space, we built a graph of the sum of comorbidity scores vs age, the two most important variables in the model (Fig 8). Given that the model uses 5 variables to generate the probability of death, the 3 remaining variables (gender, hcu occupancy and hcu score) must be fixed at some value. In this case, we chose the 25th, 50th and 75th quantiles of the hcu score and hcu occupancy distributions. The 50th quantile values were used to build the main threshold line, while the 25th and 75th were used for a pseudo confidence interval (favourable and unfavourable values of these two variables). Threshold lines were built separately for each gender. Fig 8 shows the effect of gender on the threshold: the male boundary line is situated at younger ages than the female boundary, an indication of the higher risk present in males.

Fig 8. The left, center and right band limits correspond to the 25th, 50th and 75th quantile values of the hcu occupancy and hcu scores. Some test data are also plotted, along with the outcome, death (1) or survival (0).
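A threshold line of this kind can be derived by setting the model's linear predictor equal to the logit of the cut-off and solving for the comorbidity score at each age. The sketch below does this with the Table 5 coefficients and the 0.287 cut-off; it assumes standardized inputs throughout, and the function names and example values are hypothetical rather than taken from the authors' scripts.

```python
import math

# Table 5 coefficients and the training cut-off reported with Fig 4.
B0, B_AGE, B_GENDER, B_OCC, B_HCU, B_DIAG = -1.673, 1.206, 0.134, 0.154, 0.279, 0.802
B_AGE_GENDER, B_AGE_DIAG = 0.074, -0.391
CUTOFF = 0.287

def predict(age, gender, occupancy, hcu, diagnoses):
    """Predicted probability of death for standardized inputs."""
    eta = (B0 + B_AGE * age + B_GENDER * gender + B_OCC * occupancy
           + B_HCU * hcu + B_DIAG * diagnoses
           + B_AGE_GENDER * age * gender + B_AGE_DIAG * age * diagnoses)
    return 1.0 / (1.0 + math.exp(-eta))

def boundary_diagnosis_score(age, gender, occupancy=0.0, hcu=0.0):
    """Comorbidity score at which the predicted risk equals CUTOFF."""
    target = math.log(CUTOFF / (1.0 - CUTOFF))        # logit of the cut-off
    fixed = (B0 + B_AGE * age + B_GENDER * gender + B_OCC * occupancy
             + B_HCU * hcu + B_AGE_GENDER * age * gender)
    slope = B_DIAG + B_AGE_DIAG * age                  # effect of diagnoses at this age
    if slope == 0:
        raise ValueError("boundary undefined: diagnoses have no effect at this age")
    return (target - fixed) / slope

# Boundary at the average age and one standard deviation above it.
b_mean_age = boundary_diagnosis_score(age=0.0, gender=0.0)
b_older = boundary_diagnosis_score(age=1.0, gender=0.0)
print(round(b_mean_age, 3), round(b_older, 3))
```

Evaluating the function over a grid of ages, once per gender value and once per choice of occupancy and hcu score, gives one line per combination. The slope term stays positive for standardized ages below 0.802 / 0.391 (about 2.05), so the formula is well behaved over most of the age range.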

Risk of death by comorbidity profile

Another way to explore the effect of the comorbidities in the model is to create disease profiles (Fig 9). Thus, we establish 6 disease profiles and compute the corresponding sum of the comorbidity scores. Besides the disease profile, we can vary the age and gender in the analysis and use the mean hcu occupancy and mean healthcare unit score of the training partition. The profiles and predicted risk are the following:

Profile 1: Just covid;
Profile 2: Covid and Chronic respiratory disease (DPCO);
Profile 3: Covid, diabetes mellitus (DM), arterial hypertension (HTA) and obesity (OB);
Profile 4: Covid, bacterial pneumonia (PB) and pneumonia to SARSCoV (PCOV);
Profile 5: Covid, pulmonary thromboembolism (TEP), arterial hypertension (HTA) and pneumonia to SARSCoV (PCOV);
Profile 6: Covid, acute breathing insufficiency (IRESPA), cardiac insufficiency (IC) and pneumonia to SARSCoV (PCOV).

In Fig 9, like in Fig 8, we can see a clear difference between the genders risk-wise. For example, at the age of 90, the worst female profile has a lower risk than the best male profile.

Discussion and conclusions

This type of work belongs to the field of risk prediction models, the subject of hundreds of existing papers. In 2011, a single systematic review screened almost 8000 citations [13]. With this many studies involving risk prediction, what are our new contributions to the subject? The main contribution of this study relates to the quantity and originality of the data. Using a simple linear model, we managed to integrate patient comorbidities, hcu occupancy and inter-hcu variability, using score techniques, with commonly used variables (age, gender). This methodology allows us to weigh the relative importance of each variable in predicting death by COVID.

This study covers a period before vaccination was available, making it a useful baseline for understanding the efficacy of the vaccine. As soon as more recent data can be obtained, we will be able to analyse the results for a vaccinated versus an unvaccinated population.

Despite being a Portuguese database, this study is relevant to other countries (for instance, we expect the contributions of the 5 variables to the model to be similar using other countries’ data, although we still recommend using, if possible, country specific data).

Given that age accounts for almost 47% of the importance in our model, it is interesting to consider whether Portugal (and other countries) should have applied their restrictive measures, such as lockdowns, with more emphasis on age stratification (especially given their catastrophic economic and social effects [2, 3]).

The second most important variable (28%) is the sum of the comorbidities scores per episode. The analysis by patient profile also reveals the effect of the variable, with a substantial increase in the model probability for certain clinical profiles.

Several studies (for example [14]) indicate obesity as an extremely relevant factor. However, our Portuguese data do not allow us to draw this conclusion (in fact, in this database patients tagged with obesity have a lower death rate than those who are not).

In terms of future work, it would be interesting to see whether the model performance is preserved with data from May 2021 onwards. If not, we could then assess the causes, such as changes in the mortality of the disease (due to new virus variants, for instance), which would alter the pattern associated with age, comorbidity and possibly gender, or changes in hcu logistics, which would alter the effect of hcu occupancy and inter-hcu variability.

Acknowledgments

The authors would like to thank

Manuel do Carmo Gomes for providing the data without which this study would not have been possible.
Sérgio Daniel das Neves Pereira Naito for his help and consulting on using a scorecard model approach for the construction of our model.
John McDermott for his help with wording and English grammar. He received his Ph.D. in history at the University of Toronto and taught modern European history at the University of Winnipeg and McMaster University. He is now living in Portugal doing some editing when called upon.

Data Availability

We made the data and R scripts available on GitHub: https://github.com/jjfgomes/A-risk-scoring-model.

Funding Statement

The first author was supported by FCT – Fundação para a Ciência e a Tecnologia, under the project: UIDB/04561/2020.

References

1. Koh D. COVID-19 lockdowns throughout the world. Occupational Medicine. 2020;70(5):322. doi: 10.1093/occmed/kqaa073
2. Bonaccorsi G, Pierri F, Cinelli M, Flori A, Galeazzi A, Porcelli F, et al. Economic and social consequences of human mobility restrictions under COVID-19. Proceedings of the National Academy of Sciences. 2020;117(27):15530–5. doi: 10.1073/pnas.2007658117
3. Kumar A, Nayar KR. COVID 19 and its mental health consequences. Journal of Mental Health. 2021;30(1):1–2. doi: 10.1080/09638237.2020.1757052
4. Miles DK, Stedman M, Heald AH. “Stay at Home, Protect the National Health Service, Save Lives”: A cost benefit analysis of the lockdown in the United Kingdom. International Journal of Clinical Practice. 2021;75(3):e13674. doi: 10.1111/ijcp.13674
5. Bertsimas D, Lukin G, Mingardi L, Nohadani O, Orfanoudaki A, Stellato B, et al. COVID-19 mortality risk assessment: An international multi-center study. PloS one. 2020;15(12):e0243262. doi: 10.1371/journal.pone.0243262
6. Pourhomayoun M, Shakibi M. Predicting mortality risk in patients with COVID-19 using artificial intelligence to help medical decision-making. MedRxiv. 2020.
7. Ioannidis JPA. Infection fatality rate of COVID-19 inferred from seroprevalence data. Bull World Health Organ. 2021;99(1):19–33f. Epub 20201014. doi: 10.2471/BLT.20.265892
8. SNS. https://transparencia.sns.gov.pt/explore/dataset/ocupacao-do-internamento/table/?disjunctive.regiao&disjunctive.instituicao&sort=tempo.
9. Desai VS, Crook JN, Overstreet GA. A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research. 1996;95(1):24–37. doi: 10.1016/0377-2217(95)00246-4
10. Thomas L, Crook J, Edelman D. Credit scoring and its applications: SIAM; 2017.
11. Anderson R. The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management and Decision Automation: Oxford University Press; 2007.
12. Gelman A. Scaling Regression Inputs by Dividing by Two Standard Deviations. Statistics in medicine. 2008;27:2865–73. doi: 10.1002/sim.3107
13. Kansagara D, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, et al. Risk Prediction Models for Hospital Readmission: A Systematic Review. JAMA. 2011;306(15):1688–98. doi: 10.1001/jama.2011.1515
14. Yang J, Hu J, Zhu C. Obesity aggravates COVID‐19: a systematic review and meta‐analysis. Journal of medical virology. 2021;93(1):257–61. doi: 10.1002/jmv.26237

PLoS One. doi: 10.1371/journal.pone.0288460.r001

Decision Letter 0

Jiaxu Zeng

11 Aug 2022

PONE-D-22-14646

A risk scoring model of COVID-19 at hospital admission

PLOS ONE

Dear Dr. Gomes,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we have decided that your manuscript does not meet our criteria for publication and must therefore be rejected.

Specifically:

1. The manuscript isn't organised scientifically, and the tables and figures aren't presented in an academic style suitable for publication.

2. Lack of details re the setting and conduction of study and poor interpretation of results.

I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision.

Kind regards,

Jiaxu Zeng, Ph.D

Academic Editor

PLOS ONE

PLoS One. 2023 Jul 20;18(7):e0288460. doi: 10.1371/journal.pone.0288460.r002

Author response to Decision Letter 0

24 Aug 2022

"1. The manuscript is not scientifically organized, and the tables and figures are not presented in an academic style that is suitable for publication."

The article is organized in the standard way for many scientific articles: Introduction, Methods, Results, Discussion and Conclusions.

Tables and figures were attached according to the guidelines suggested by PlosOne. The tables were incorporated in the text in the simplest possible way so that they could be adapted by the editor if necessary and the figures were attached in separate files as the submission manual suggests. In the new proposed manuscript, we altered 3 tables that were in the format of an R output image.

"2. Lack of details on the setting and conduction of study and poor interpretation of results."

This study was conducted from a database of over 50,000 people, of which over 12,000 individuals have died.

It contains the following information: Date of birth, date of entry and exit from the hospital, sex, health unit where the patient was admitted (which allowed us to, based on the information from the Portuguese Ministry of Health regarding hospital capacity, infer the pressure in terms of average hospital capacity per hospitalization and use this information in the model) and the various comorbidities associated with each patient (through a sophisticated process it was also possible to assign a risk score to each patient) – all this is described in detail in the manuscript.

Based on this material, we built a regression model that allows us to estimate the probability of death for each patient diagnosed with COVID and associate this probability to its respective confidence interval. In addition to that, we built profiles for the most frequent patterns with an associated probability of death (survival) to each, allowing us to have an idea of the increase in risk for a given factor. Furthermore, we retrospectively simulated some of these profiles to understand the model's ability to discriminate between survivals and deaths. It was also possible to objectively quantify the importance of age in terms of COVID mortality - about 47% compared to the 5 factors available.

We hope to have been able to highlight the importance of this study and the results obtained, especially for an in-depth analysis of the strategy that was followed by most countries during the pandemic outbreak.

Attachment

Submitted filename: Response to Reviewers.docx

PLoS One. doi: 10.1371/journal.pone.0288460.r003

Decision Letter 1

Yuyan Wang

22 Sep 2022

PONE-D-22-14646R1

A risk scoring model of COVID-19 at hospital admission

PLOS ONE

Dear Dr. Gomes,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by 11/15/2022. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Yuyan Wang, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

3. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

4. Your abstract cannot contain citations. Please remove any existing citations from the abstract section. You may only include citations in the body text of the manuscript, and please ensure that they remain in ascending numerical order on first mention.

Additional Editor Comments (if provided):

Please refer to some published papers, revise the manuscript in an academic way and make it more rigorous.

Please note that Review 2 was provided by the Academic Editor, Yuyan Wang


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This is an interesting well written manuscript on an important topic.

In my opinion, all comments have been answered, so the paper can be published.

Reviewer #2: The authors used logistic regression to predict the death probability for inpatients diagnosed with COVID. The topic is interesting, however, the manuscript is not easy to read and the authors should refer to published articles to make the paper more rigorous not only in content but also in the presented results.

Major:

1. I would suggest the authors look for some professional English editorial help to avoid grammar errors and make the manuscript fluent and easier to read.

2. Abstract: the Conclusion part is missing in Abstract.

3. The way the authors created the diagnoses score is unclear. More details (or a specific example of how the scores are created) are needed.

4. The context in the manuscript is too messy. Please refer to published papers (from PLOS ONE or any other academic journal) to clean up the Methods and Results sections. Specifically,

a. Row 76 to 79: it will be good to have a data flowchart to reflect the subject number change by inclusion or exclusion criteria.

b. Row 90-99: it’s unnecessary to show the variable names from the original dataset.

c. Row 101: Table 1 is unnecessary and can be moved to the supplementary tables if the authors insist on keeping it.

d. Row 108: Table 2, there are too many facilities and it would be better to use a bar plot to show the number of patients and death rate.

e. Tables 3, 4 and 5 can be combined into one descriptive table and should be placed in the Results section.

f. Row 125: Table 6 can be merged into Table 2 and presented as a plot as well.

g. Row 152: Table 7 isn’t presented in academic style. ORs with CIs are better.

h. Why does the content of the presented results differ between Table 8 (Df/Deviance/Resid.Df…) and Table 9 (Estimate/Std./Error)? These results are not presented in a standard way. No one should directly present formulas and output from analytical software. Please refer to published papers to make it more rigorous.

i. Row 169: could you cite a reference for using “percentage by dividing each coefficient by the sum of all coefficients” to describe the variable importance?

j. Row 179: Table 11 needs to be revised like Table 9, and the interpretation needs improvement.

k. Row 223: Figure 5 and Figure 6 can be merged into one plot to show the difference in the boundary line between male and female.

l. Row 242: Table 12 can be transformed into a plot using Age as x-axis and probability as y-axis. Then it will be easier to compare among different profile and gender groups.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jul 20;18(7):e0288460. doi: 10.1371/journal.pone.0288460.r004

Author response to Decision Letter 1

25 Nov 2022

Review Comments

1. “I would suggest the authors look for some professional English editorial help to avoid grammar errors and make the manuscript fluent and easier to read”

The writing has been carefully revised, as suggested by the reviewer.

2. “Abstract: the Conclusion part is missing in Abstract”

Conclusions were added to the Abstract section of the paper.

3. “The way the authors created the diagnoses score is unclear. More details (or specific example how the scores are created) are needed.”

We followed a scorecard modelling approach, which is meant to be simpler to explain and apply. Two of the most commonly cited references in the literature are, for instance: [Thomas, Edelman, Crook; Credit Scoring and Its Applications Society for Industrial and Applied Mathematics (1983)] [Anderson, R.; Credit Scoring Toolkit - Theory and Practice for Retail Credit Risk Management and Decision Automation, Oxford University Press (2007)].

Further explanation on this topic was added to the manuscript in the Methods section.
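As a rough illustration of the weights-of-evidence (WOE) step that commonly underlies a scorecard, the following Python sketch computes a WOE value per category from outcome counts. It is not taken from the study's R scripts: the function name `weight_of_evidence`, the counts, and the sign convention (survivors in the numerator) are assumptions made for this example, and some texts use the opposite sign.

```python
import math

def weight_of_evidence(survivors: dict, deaths: dict) -> dict:
    """WOE per category: ln(share of survivors / share of deaths).

    Assumes both dicts have the same keys and that every category has
    non-zero counts in both outcomes (otherwise the log is undefined).
    """
    total_surv = sum(survivors.values())
    total_dead = sum(deaths.values())
    return {
        cat: math.log((survivors[cat] / total_surv) / (deaths[cat] / total_dead))
        for cat in survivors
    }

# Hypothetical counts for one diagnosis flag (not data from the study).
survivors = {"absent": 900, "present": 100}
deaths = {"absent": 60, "present": 40}

woe = weight_of_evidence(survivors, deaths)
for category, value in woe.items():
    print(category, round(value, 3))
```

With this convention, a negative WOE marks a category that is over-represented among deaths, so the per-diagnosis values can be rescaled into scores and summed into a single continuous variable for the regression.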

4. “The context in the manuscript is too messy. Please refer to the published paper (from PLOS ONE or any other academic journals) to clean up the Methods and Results section. Specifically:”

a) “Row 76 to 79: it will be good to have a data flowchart to reflect the subject number change by inclusion or exclusion criteria”

A flowchart was added to the manuscript, explaining the transformations made to the initial database.

b) “Row 90-99: it’s unnecessary to show the variable name from original dataset”

The names presented correspond to the variable names used in our model and a brief description clarifying what each variable means. However, presentation has been simplified.

c) “Row 101: Table 1 are unnecessary and can be moved to supplementary tables if the authors insist to keep them.”

Table 1 was removed according to suggestion.

d) “Row 108: Table 2, there are too many facilities and it would be better to use a bar plot to show the number of patients and death rate.”

Information previously presented in Table 2 was converted to a bar plot (Fig. 2), per suggestion.

e) “Tables 3, 4 and 5 can be combined into one descriptive table and should be placed in the Results section.”

The suggested changes were made: Tables 3, 4 and 5 were combined into a single table (Table 1) and moved to the Results section.

f) “Row 125: Table 6 can be merged into Table 2 and presented as a plot as well”

Tables 2 and 6 have been combined as suggested and were converted into a bar plot (Fig. 3).

g) “Row 152: Table 7 isn’t presented in academic style. ORs with CIs are better”

h) “Why the presented results content are different in Table 8 (Df/Deviance/Resid.Df…) and Table 9 (Estimate/Std./Error)? These results are not presented in standard way. No one should directly present formula and output results from analytical software. Please refer to the published paper to make it more rigorous.”

Tables 7 and 8 have been combined (Table 2) and the results of the univariate analysis are now presented in a standard way, per suggestion.

i) “Row 169: could you cite a reference for using “percentage by dividing each coefficient by the sum of all coefficients” to describe the variable importance?”

Further explanation as to why this approach was followed was added to the manuscript, together with a reference (Multiple Regression section in Results), as suggested.

j) “Row 179: Table 11 needs to be revised like Table 9, and the interpretation needs improvement.”

Tables 9 and 11 were revised and the results are now presented in a standard way (they now correspond to Tables 3 and 5, respectively). Interpretation of the results has also been improved.

k) “Row 223: Figure 5 and Figure 6 can be merged into one plot to show the difference in the boundary line between male and female.”

Figures 5 and 6 have been merged into a single plot (Fig. 8).

l) “Row 242: Table 12 can be transformed as a plot using Age as x-axis and probability as y-axis. Then it will be easier to compare among different profile and gender groups.”

Table 12 has been transformed into a plot, per suggestion.
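Regarding point i) above, the coefficient-share calculation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used for the manuscript: the function name `relative_importance` and the coefficient values are hypothetical, the use of absolute values is an assumption, and the comparison is only meaningful when the predictors are on a comparable scale.

```python
def relative_importance(coefs: dict) -> dict:
    """Percentage share of each coefficient in the sum of absolute coefficients."""
    total = sum(abs(b) for b in coefs.values())
    return {name: 100.0 * abs(b) / total for name, b in coefs.items()}

# Hypothetical coefficients on comparably scaled predictors
# (illustrative only, not the published estimates).
coefs = {
    "age": 2.0,
    "comorbidity_score_sum": 1.2,
    "hcu_score": 0.5,
    "gender_score": 0.3,
}

shares = relative_importance(coefs)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

Because the shares are normalised by the total, they always sum to 100%, which makes them easy to report but also means each share depends on which other variables are included in the model.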

Attachment

Submitted filename: Response to Reviewers.docx (19.9KB, docx)

PLoS One. doi: 10.1371/journal.pone.0288460.r005

Decision Letter 2

Robert Jeenchen Chen

8 Mar 2023

PONE-D-22-14646R2

A risk scoring model of COVID-19 at hospital admission

PLOS ONE

Dear Dr. Gomes,

Please revise.

Please submit your revised manuscript by Apr 22 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Robert Jeenchen Chen, MD, MPH

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #3: (No Response)

Reviewer #4: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #3: Partly

Reviewer #4: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #3: No

Reviewer #4: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #3: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #3: No

Reviewer #4: Yes

**********

6. Review Comments to the Author

Reviewer #3: A risk scoring model of COVID-19 at hospital admission. This paper analyzed how patient baseline characteristics and healthcare units’ characteristics could impact the mortality of covid in Portugal. A scoring method was utilized before the logistic regression.

Overall, this paper needs to be better written in English. I could only understand ~80% of this paper. Hence, the authors should seek professional help in the wording, grammar, and correcting the typos.

The paper models the probability of survival on both patient baseline characteristics/ and the characteristics of healthcare institutes. A score transformation was used on the predictive variables. However, is it necessary? Logistic regression can model both categorical and continuous predictive variables. And with an l-1 penalty, it can further handle variable selection. Authors should explain the motivation and benefit of using this scoring pre-process method.

Method section where variables were transformed into continuous scores, please add some explanations in the context. Despite citing two articles, I believe these methods are less well-known than the other statistical methods. Thus more details could help readers understand and even replicate this research.

Method: the hcu beds were excluded; however, such exclusion was not well justified.

Method: was age modeled as categorical? In table 1, the author chose ten levels for age. Suggest keeping the level of categorical beyond 4. Otherwise, it might be more beneficial to model it as a continuous variable directly. If it was modeled as continuous, suggest reporting the mean/sd and Median(Range) for this variable in Table 1.

Data- 52,400 patients but 52,390 episodes? What’s the difference between the ID and patients?

Data - hcu pressure(average hcu occupancy rate); what is the definition or math formula for hcu occupancy rate? Is it derived from stay duration and the number of beds?

P5-R13: please define the weights-of-evidence transformation (WOE) method. It would be helpful to add more details.

Table 1- only three variables were reported; what about the HCU-related two variables? It would be more straightforward to report the variable by event, i.e., two columns in the table as death or survival; cell value could be either mean/sd + median/range or percentage.

Figure 1- add numbers being excluded from each step.

Figure 3- The title does not match the legend. Is the blue bar the occupancy rate or the healthcare unit pressure? Please reconcile; only one name is needed for one variable.

Minor comments:

1. the manuscript needs to be better formatted; for example, rows 81, 85, 102, 144, and 271 are in different fonts from the rest of the paper. Please reconcile it.

3. Variables were bolded in font but not needed.

4. Variable names: hcu pressure could be replaced by a better/clear variable name.

5. It would be interesting to see how the covariates would predict the time from admission to death.

Reviewer #4: This article addresses an important public health issue which continues to impact healthcare around the world, despite > 3 years of experience in dealing with the COVID-19 pandemic. The pre-vaccination period of this study shows the factors leading to death at a population level. It addresses the unique features of public sector healthcare units and the impact of their size and bed occupancy on a pandemic of this magnitude. It also gives important data for public health authorities in terms of preparedness for future pandemics. There are some important shortcomings in the manuscript that need addressing.

1. The overall size of the manuscript is too long and the proportion of the different sections isn't uniform, e.g. the Discussion section is relatively small compared with the methodology, which is very detailed and "wordy". I would recommend cutting down the methodology section.

2. The information within each section isn't appropriate for the respective section, e.g. sentences 68-70 should be in methodology, points 72-74 should be in the discussion section, and points 95-96 should be in the results section.

3. The exclusion of so-called "rearguard hospital" oncology units and patients receiving hemodialysis would have led to the exclusion of an important vulnerable group with high expected mortality.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #3: No

Reviewer #4: Yes: msalbur

**********

PLoS One. 2023 Jul 20;18(7):e0288460. doi: 10.1371/journal.pone.0288460.r006

Author response to Decision Letter 2

18 Apr 2023

Please see the enclosed response to the reviewers.

Attachment

Submitted filename: WOE AND SCORING.xlsx (223.6KB, xlsx)

PLoS One. doi: 10.1371/journal.pone.0288460.r007

Decision Letter 3

Robert Jeenchen Chen

29 Jun 2023

A risk scoring model of COVID-19 at hospital admission

PONE-D-22-14646R3

Dear Dr. Gomes,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Robert Jeenchen Chen, MD, MPH

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #4: All comments have been addressed

Reviewer #5: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #4: Yes

Reviewer #5: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #4: Yes

Reviewer #5: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #4: Yes

Reviewer #5: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #4: Yes

Reviewer #5: Yes

**********

6. Review Comments to the Author

Reviewer #4: I am very pleased to see that you have implemented all of my suggestions and recommendations. You have also addressed some of the general restructuring of the manuscript.

Reviewer #5: The manuscript is completely revised and ready to be published. All the subjects in this article are addressed appropriately and need no further revision.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #4: Yes: M S Albur

Reviewer #5: No

**********

PLoS One. doi: 10.1371/journal.pone.0288460.r008

Acceptance letter

Robert Jeenchen Chen

12 Jul 2023

PONE-D-22-14646R3

A risk scoring model of COVID-19 at hospital admission

Dear Dr. Gomes:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Robert Jeenchen Chen

Academic Editor

PLOS ONE

Associated Data


Data Availability Statement

We made the data and R scripts available at this GitHub link: https://github.com/jjfgomes/A-risk-scoring-model.

1. Koh D. COVID-19 lockdowns throughout the world. Occupational Medicine. 2020;70(5):322-. doi: 10.1093/occmed/kqaa073

2. Bonaccorsi G, Pierri F, Cinelli M, Flori A, Galeazzi A, Porcelli F, et al. Economic and social consequences of human mobility restrictions under COVID-19. Proceedings of the National Academy of Sciences. 2020;117(27):15530–5. doi: 10.1073/pnas.2007658117

3. Kumar A, Nayar KR. COVID 19 and its mental health consequences. Journal of Mental Health. 2021;30(1):1–2. doi: 10.1080/09638237.2020.1757052

4. Miles DK, Stedman M, Heald AH. “Stay at Home, Protect the National Health Service, Save Lives”: A cost benefit analysis of the lockdown in the United Kingdom. International Journal of Clinical Practice. 2021;75(3):e13674. doi: 10.1111/ijcp.13674

5. Bertsimas D, Lukin G, Mingardi L, Nohadani O, Orfanoudaki A, Stellato B, et al. COVID-19 mortality risk assessment: An international multi-center study. PloS one. 2020;15(12):e0243262. doi: 10.1371/journal.pone.0243262

6. Pourhomayoun M, Shakibi M. Predicting mortality risk in patients with COVID-19 using artificial intelligence to help medical decision-making. MedRxiv. 2020.

7. Ioannidis JPA. Infection fatality rate of COVID-19 inferred from seroprevalence data. Bull World Health Organ. 2021;99(1):19–33f. Epub 20201014. doi: 10.2471/BLT.20.265892

8. SNS. https://transparencia.sns.gov.pt/explore/dataset/ocupacao-do-internamento/table/?disjunctive.regiao&disjunctive.instituicao&sort=tempo.

9. Desai VS, Crook JN, Overstreet GA. A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research. 1996;95(1):24–37. doi: 10.1016/0377-2217(95)00246-4

10. Thomas L, Crook J, Edelman D. Credit scoring and its applications: SIAM; 2017.

11. Anderson R. The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management and Decision Automation: Oxford University Press; 2007.

12. Gelman A. Scaling Regression Inputs by Dividing by Two Standard Deviations. Statistics in medicine. 2008;27:2865–73. doi: 10.1002/sim.3107

13. Kansagara D, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, et al. Risk Prediction Models for Hospital Readmission: A Systematic Review. JAMA. 2011;306(15):1688–98. doi: 10.1001/jama.2011.1515

14. Yang J, Hu J, Zhu C. Obesity aggravates COVID‐19: a systematic review and meta‐analysis. Journal of medical virology. 2021;93(1):257–61. doi: 10.1002/jmv.26237
