PLOS One. 2021 Feb 16;16(2):e0247205. doi: 10.1371/journal.pone.0247205

An integrated clinical and genetic model for predicting risk of severe COVID-19: A population-based case–control study

Gillian S Dite 1,*,#, Nicholas M Murphy 1,#, Richard Allman 1
Editor: Giuseppe Novelli 2
PMCID: PMC7886160  PMID: 33592063

Abstract

Up to 30% of people who test positive to SARS-CoV-2 will develop severe COVID-19 and require hospitalisation. Age, gender, and comorbidities are known to be risk factors for severe COVID-19 but are generally considered independently without accurate knowledge of the magnitude of their effect on risk, potentially resulting in incorrect risk estimation. There is an urgent need for accurate prediction of the risk of severe COVID-19 for use in workplaces and healthcare settings, and for individual risk management. Clinical risk factors and a panel of 64 single-nucleotide polymorphisms were identified from published data. We used logistic regression to develop a model for severe COVID-19 in 1,582 UK Biobank participants aged 50 years and over who tested positive for the SARS-CoV-2 virus: 1,018 with severe disease and 564 without severe disease. Model discrimination was assessed using the area under the receiver operating characteristic curve (AUC). A model incorporating the SNP score and clinical risk factors (AUC = 0.786; 95% confidence interval = 0.763 to 0.808) had 111% better discrimination of disease severity than a model with just age and gender (AUC = 0.635; 95% confidence interval = 0.607 to 0.662). The effects of age and gender are attenuated by the other risk factors, suggesting that it is those risk factors–not age and gender–that confer risk of severe disease. In the whole UK Biobank, most are at low or only slightly elevated risk, but one-third are at two-fold or more increased risk. We have developed a model that enables accurate prediction of severe COVID-19. Continuing to rely on age and gender alone (or only clinical factors) to determine risk of severe COVID-19 will unnecessarily classify healthy older people as being at high risk and will fail to accurately quantify the increased risk for younger people with comorbidities.

Introduction

The current COVID-19 pandemic is a dominating and urgent threat to public health and the global economy. While COVID-19 can be a mild disease in many individuals, with cough and fever the most commonly reported symptoms, up to 30% of those affected may require hospitalisation, and some will require intensive intervention for acute respiratory distress syndrome [1, 2].

Globally, public health responses have been aimed at limiting new cases by preventing community transmission through mask wearing, social distancing, curtailing non-essential services and broad travel restrictions. The economic and social impacts of these interventions have been devastating, with foundational damage to local economies [3] and unprecedented increases in mental health diagnoses being reported [4].

As the protracted strain of the pandemic increases pressure to re-open economies, there is an urgent need for tests to predict an individual’s risk of severe COVID-19. In the community, a risk prediction test could enable workplaces to confidently manage employees who are at increased risk of severe disease and should work from home or avoid client-facing roles. In the healthcare setting, a risk prediction test could inform patient triage when hospital resources are limited and be useful in prioritising pathology tests and vaccination (when one becomes available). On a personal level, knowledge of individual risk can empower individuals to make informed choices about day-to-day activities.

Age, gender, and comorbidities are frequently cited as risk factors for severe COVID-19 [5], but these have generally been considered independently without accurate knowledge of the magnitude of their effect on risk, potentially resulting in incorrect risk estimation. Early epidemiological analyses of the factors associated with COVID-19 severity and death have now appeared, including an analysis of a cohort of 17 million people by Williamson et al. [6] and a prospective cohort study of 5,279 people in New York [7], both based on the analysis of electronic health records.

The analysis of human genetic variation that may affect response to viral infection has been slower, largely due to the lack of available data. Nevertheless, the COVID-19 Host Genetics Initiative has undertaken meta-analyses of the genetic determinants of COVID-19 severity and has made the summary statistics publicly available [8, 9]. In addition, Ellinghaus et al. [10] identified two loci (3p21.31 and 9q34.2) that are strongly associated with severe disease.

We used the UK Biobank to develop a comprehensive model to predict risk of severe COVID-19 by integrating demographic information, comorbidity risk factors, and a panel of genetic markers.

Methods

UK Biobank data

The UK Biobank is a population-based prospective cohort of over 500,000 participants from England, Wales, and Scotland who were aged 40 to 69 years when recruited from 2006 to 2010 [11]. The UK Biobank has extensive genotyping [12] and phenotypic data obtained from baseline assessment and from linkage to hospital and primary care databases and to cancer and death registries [11]. The UK Biobank has Research Tissue Bank approval (REC #11/NW/0382) that covers analysis of data by approved researchers. All participants provided written informed consent to the UK Biobank before data collection began. This research has been conducted using the UK Biobank resource under Application Number 47401.

In response to the COVID-19 pandemic, the UK Biobank made available up-to-date SARS-CoV-2 testing, hospital, primary care, and death data for use in COVID-19 research by approved researchers [13]. We extracted testing and hospital records from the UK Biobank COVID-19 data portal on 15 September 2020. We extracted single-nucleotide polymorphism (SNP) and baseline assessment data from files previously downloaded as part of our approved project. At the time of data extraction, primary care administrative data (general practitioner records relating to diagnoses, symptoms, referrals, laboratory test results and prescriptions for medication) was only available for just over half of the identified participants and was therefore not used in these analyses.

Eligibility

Eligible participants were those who had tested positive for SARS-CoV-2 and for whom SNP genotyping data and linked hospital records were available. Of the 18,221 participants with SARS-CoV-2 test results, 1,713 had tested positive and 1,582 of those had both SNP and hospital data available.

COVID-19 severity

We used source of test result as a proxy for severity of disease: outpatient representing non-severe disease and inpatient representing severe disease. For participants with multiple test results, we considered the disease to be severe if at least one result came from an inpatient setting.
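The severity rule above can be sketched in a few lines of Python. This is an illustration only: the record layout `(participant_id, is_positive, is_inpatient)` and the function name are hypothetical, and the sketch assumes that only positive results are considered when looking for an inpatient test, which the text does not state explicitly.

```python
def classify_severity(test_records):
    """Map participant id -> 'severe' or 'non-severe' for participants with a positive test.

    test_records: iterable of (participant_id, is_positive, is_inpatient) tuples
    (a hypothetical layout, not the UK Biobank file format).
    """
    any_inpatient = {}
    for pid, is_positive, is_inpatient in test_records:
        if not is_positive:
            continue  # assumption: negative results do not contribute to severity
        # Severe if at least one positive result came from an inpatient setting.
        any_inpatient[pid] = any_inpatient.get(pid, False) or is_inpatient
    return {pid: "severe" if flag else "non-severe"
            for pid, flag in any_inpatient.items()}


records = [(1, True, False), (1, True, True), (2, True, False), (3, False, True)]
severity = classify_severity(records)
```

Participant 1 has one outpatient and one inpatient positive result and is therefore classed as severe; participant 3 never tested positive and is not classified.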

Selection of SNPs for risk of severe COVID-19

We identified 62 SNPs from the results of the ANA2 meta-analysis (release 2) of SARS-CoV-2-positive non-hospitalised versus hospitalised cases of COVID-19 conducted by the COVID-19 Host Genetics Initiative consortium [8, 9]. Because of the limited amount of data available at the time of release, we used P<0.0001 as the threshold for loci selection. We then removed variants that were associated with hospitalisation in only one of the five studies in the meta-analysis. We pruned for linkage disequilibrium using an r² threshold of 0.5 against the 1000 Genomes European populations (CEU, TSI, FIN, GBR and IBS) representing the ethnicities of the submitted populations [14]. Variants that had a minor allele frequency of ≥0.01 and beta coefficients from −1 to 1 were then retained [15]. All of the variants had a Cochran’s Q heterogeneity test P>0.001. Where possible, SNP variants were chosen over insertion–deletion variants to facilitate laboratory validation testing. We also included the two lead SNPs from the loci found by Ellinghaus et al. [10] that reached genome-wide significance. Therefore, we used a panel of 64 SNPs for severe COVID-19 in this study (S1 Table).
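The per-variant thresholds described above can be expressed as a simple filter. The sketch below is not the authors' pipeline: the `Variant` fields and the example identifiers are hypothetical, linkage disequilibrium pruning is omitted, and the heterogeneity P value is included as a check even though the text reports it as a property of the retained variants rather than as an explicit filter.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    rsid: str          # hypothetical identifier
    p_value: float     # meta-analysis P value
    n_studies: int     # studies in which the variant was associated with hospitalisation
    maf: float         # minor allele frequency
    beta: float        # meta-analysis beta coefficient
    q_het_p: float     # Cochran's Q heterogeneity P value


def passes_filters(v):
    """Apply the thresholds quoted in the text (LD pruning not included)."""
    return (v.p_value < 1e-4
            and v.n_studies > 1
            and v.maf >= 0.01
            and -1.0 <= v.beta <= 1.0
            and v.q_het_p > 0.001)


candidates = [
    Variant("rs_hypothetical_1", 5e-5, 3, 0.12, 0.40, 0.30),
    Variant("rs_hypothetical_2", 5e-5, 1, 0.12, 0.40, 0.30),   # single study only
    Variant("rs_hypothetical_3", 5e-5, 3, 0.005, 0.40, 0.30),  # too rare
    Variant("rs_hypothetical_4", 2e-3, 3, 0.12, 0.40, 0.30),   # P value too large
]
retained = [v.rsid for v in candidates if passes_filters(v)]
```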

SNP score

While we would normally construct a SNP relative risk score using published data to calculate population-averaged risk values for each SNP and then multiply the risks for each SNP [16], the size of the odds ratios for the 64 SNPs meant that this approach could result in relative risks of several orders of magnitude. Therefore, for this study, we calculated the percentage of risk alleles present in the genotyped SNPs for each participant. We used the percentage rather than a count because some of the eligible participants had missing data for some SNPs (9% had all SNPs genotyped, 21% were missing 1 SNP, 26% were missing 2 SNPs, 18% were missing 3 SNPs, 10% were missing 4 SNPs, 7% were missing 5 SNPs, 8% were missing 6–10 SNPs and 1% were missing 11–15 SNPs).
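A minimal sketch of the percentage-based score follows. It assumes each genotyped SNP contributes 0, 1 or 2 risk alleles, so the denominator is twice the number of genotyped SNPs; the text does not spell out the denominator, and the function name and input encoding (`None` for a missing genotype) are illustrative.

```python
def snp_score(risk_allele_counts):
    """Percentage of risk alleles among genotyped SNPs.

    risk_allele_counts: list with 0, 1 or 2 for each genotyped SNP and
    None for a SNP with a missing genotype (hypothetical encoding).
    """
    genotyped = [c for c in risk_allele_counts if c is not None]
    if not genotyped:
        raise ValueError("no genotyped SNPs")
    # Assumption: two alleles per genotyped SNP form the denominator.
    return 100.0 * sum(genotyped) / (2 * len(genotyped))


example_score = snp_score([2, 1, None, 0, 2])  # 5 risk alleles out of 8 genotyped alleles
```

Because missing SNPs drop out of both the numerator and the denominator, participants with different numbers of genotyped SNPs remain on the same 0 to 100 scale.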

Imputation of ABO genotype

Blood type was imputed for genotyped UK Biobank participants using three SNPs (rs505922, rs8176719 and rs8176746) in the ABO gene on chromosome 9q34.2. A rs8176719 deletion (or for those with no result for rs8176719, a T allele at rs505922) indicated haplotype O. At rs8176746, haplotype A was indicated by the presence of the G allele and haplotype B was indicated by the presence of the T allele [17, 18].
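One way to turn these rules into a blood type is to count haplotypes, as in the sketch below. This is an interpretation, not the authors' code: it assumes unphased allele counts, that an O haplotype never also carries the T allele at rs8176746, and that any haplotype that is neither O nor B is A. The function and parameter names are hypothetical.

```python
def impute_abo(n_del_rs8176719, n_t_rs505922, n_t_rs8176746):
    """Return 'O', 'A', 'B', 'AB', or None when the call is not possible.

    Each argument is an allele count (0, 1 or 2) or None if not genotyped.
    """
    # O haplotypes: rs8176719 deletion, falling back to the rs505922 T allele.
    n_o = n_del_rs8176719 if n_del_rs8176719 is not None else n_t_rs505922
    if n_o is None or n_t_rs8176746 is None:
        return None
    n_b = n_t_rs8176746           # B haplotypes: T allele at rs8176746
    n_a = 2 - n_o - n_b           # assumption: remaining haplotypes are A
    if n_a < 0:
        return None               # genotypes inconsistent with the assumptions
    if n_o == 2:
        return "O"
    if n_a > 0 and n_b > 0:
        return "AB"
    return "A" if n_a > 0 else "B"
```

For example, one deletion at rs8176719 with no T allele at rs8176746 gives haplotypes O and A, and therefore blood type A.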

Clinical risk factors

Risk factors for severe COVID-19 were identified from large epidemiological studies of electronic health records [6, 7] and advice posted on the Centers for Disease Control and Prevention website [19]. Rare monogenic diseases (thalassemia, cystic fibrosis and sickle cell disease) were not included in these analyses.

Age was classified as 50–59 years, 60–69 years and 70+ years. This was based on the participants’ approximate age at the peak of the first wave of infections (April 2020) and was calculated using the participants’ month and year of birth. Self-reported ethnicity was classified as white and other (including unknown). The Townsend deprivation score at baseline was classified into quintiles defined by the distribution in the UK Biobank as a whole. Body mass index and smoking status were also obtained from the baseline assessment data. Body mass index was inverse transformed and then rescaled by multiplying by 10. Smoking status was defined as current versus past, never or unknown. The other clinical risk factors were extracted from hospital records by selecting records with ICD9 or ICD10 codes for the disease of interest (S2 Table).
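The age grouping and the body mass index transformation can be illustrated as follows. The function names are hypothetical, and the handling of participants born in April (treated here as having already had their birthday) is an assumption, since only month and year of birth are available.

```python
def age_group(birth_year, birth_month, ref_year=2020, ref_month=4):
    """Approximate age group in April 2020 from month and year of birth."""
    # Assumption: a birthday in the reference month counts as already reached.
    age = ref_year - birth_year - (1 if birth_month > ref_month else 0)
    if age < 50:
        return None  # outside the age range analysed in the study
    if age < 60:
        return "50-59"
    if age < 70:
        return "60-69"
    return "70+"


def transformed_bmi(bmi_kg_m2):
    """Inverse of body mass index rescaled by 10, i.e. 10/BMI."""
    return 10.0 / bmi_kg_m2
```

With this transformation a body mass index of 25 kg/m² maps to 0.4 and one of 40 kg/m² maps to 0.25, so higher body mass index corresponds to a lower transformed value.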

Statistical methods

We used multivariable logistic regression to examine the association of risk factors with severity of COVID-19. We began with a base model that included SNP score, age group and gender. We then included all the candidate variables and used backwards step-wise selection to remove those with P>0.05. We then refined the final model by considering the addition of the removed candidate variables one at a time. Model selection was informed by examination of the Akaike information criterion and the Bayesian information criterion, with a decrease of >2 indicating a statistically significant improvement.
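The information-criterion comparison used during model selection can be sketched with the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The log-likelihood values below are invented purely for illustration; only the sample size of 1,582 and the decrease-of-more-than-2 rule come from the text.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


def larger_model_preferred(ll_small, k_small, ll_large, k_large, n, min_decrease=2.0):
    """Return (by_aic, by_bic): does the larger model lower each criterion by more than min_decrease?"""
    d_aic = aic(ll_small, k_small) - aic(ll_large, k_large)
    d_bic = bic(ll_small, k_small, n) - bic(ll_large, k_large, n)
    return d_aic > min_decrease, d_bic > min_decrease


# Hypothetical log-likelihoods for a model before and after adding one variable.
decision = larger_model_preferred(ll_small=-1000.0, k_small=4,
                                  ll_large=-990.0, k_large=5, n=1582)
```

Because the BIC penalty per parameter is ln(1582), roughly 7.4, it demands a larger gain in log-likelihood than the AIC before an extra variable is favoured.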

Model calibration was assessed using the Pearson–Windmeijer goodness-of-fit test and model discrimination was measured using the area under the receiver operating characteristic curve (AUC). To compare the effect sizes of the variables in the final model, we used the odds per adjusted standard deviation [20] using dummy variables for age group and ABO blood type. Sensitivity analyses were undertaken by including participants with no hospital records.
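For readers unfamiliar with the AUC, it equals the probability that a randomly chosen case has a higher predicted score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition on made-up scores; it is not the Stata routine used in the study.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0      # case ranked above control
            elif c == k:
                wins += 0.5      # ties count as one half
    return wins / (len(case_scores) * len(control_scores))


# Made-up predicted scores for three cases and three controls.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.4, 0.2])  # 7.5 of 9 pairs favour the case
```

The pairwise form is quadratic in sample size and is meant only to make the definition concrete; rank-based implementations are preferable for large data sets.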

We then used the intercept and beta coefficients from the final model to calculate the COVID-19 risk score for all UK Biobank participants.
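In a logistic model, the intercept and beta coefficients combine into a linear predictor, and exponentiating the sum of the betas for a given profile gives the odds relative to the reference profile. The sketch below illustrates this with a subset of the adjusted odds ratios reported in Table 2 (male 1.15, age 70+ 1.70, diabetes 1.63, hypertension 1.35). It is not the published score: the SNP score and remaining terms are left out, the intercept is not reported in the text and is set to zero here as a placeholder, and how the authors scaled their COVID-19 risk score is not stated.

```python
import math


def linear_predictor(intercept, betas, x):
    """Intercept plus the sum of beta * covariate value (absent covariates count as 0)."""
    return intercept + sum(beta * x.get(name, 0.0) for name, beta in betas.items())


def predicted_probability(intercept, betas, x):
    """Inverse-logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(intercept, betas, x)))


# Betas are natural logs of a subset of the adjusted odds ratios in Table 2.
betas = {
    "male": math.log(1.15),
    "age_70_plus": math.log(1.70),
    "diabetes": math.log(1.63),
    "hypertension": math.log(1.35),
}
profile = {"male": 1, "age_70_plus": 1, "diabetes": 1, "hypertension": 1}

# Odds relative to the reference profile (all four indicators equal to 0).
relative_odds = math.exp(linear_predictor(0.0, betas, profile))
```

For this illustrative profile the relative odds are simply the product of the four odds ratios, about 4.3.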

We used Stata (version 16.1) [21] for analyses; all statistical tests were two-sided and P<0.05 was considered nominally statistically significant.

Results and discussion

Of the 1,582 UK Biobank participants with a positive SARS-CoV-2 test result and hospital and SNP data available, 564 (35.7%) were from an outpatient setting and considered not to have severe disease (controls), while 1,018 (64.3%) were from an inpatient setting and considered to have severe disease (cases). Cases ranged in age from 51 to 82 years with a mean of 69.1 (standard deviation [SD] = 8.8) years. Controls ranged in age from 50 to 82 years with a mean of 65.0 (SD = 9.0) years. Mean body mass index was 29.0 kg/m² (SD = 5.4) for cases and 28.5 kg/m² (SD = 5.4) for controls. Body mass index was transformed to the inverse multiplied by 10 for all analyses and ranged from 0.2 to 0.6 for both cases and controls. The percentage of risk alleles in the SNP score ranged from 47.6 to 73.8 for cases and from 43.7 to 72.5 for controls. The distributions of the variables of interest for cases and controls and the unadjusted odds ratios and 95% confidence intervals (CI) are shown in Table 1.

Table 1. Characteristics of cases and controls and unadjusted odds ratios for risk of severe COVID-19.

| Variable | Category | Cases (N = 1,018) | Controls (N = 564) | Unadjusted odds ratio | 95% confidence interval | P value |
|---|---|---|---|---|---|---|
| Continuous variables | | Mean (SD) | Mean (SD) | | | |
| SNP score | % risk alleles | 62.1 (4.1) | 59.3 (4.7) | 1.16 | 1.13 to 1.19 | <0.001 |
| Inverse of body mass index (kg/m²) | 10/BMI | 0.36 (0.06) | 0.36 (0.06) | 0.15 | 0.03 to 0.79 | 0.03 |
| Categorical variables | | N (%) | N (%) | | | |
| Age group (years) | 50–59 | 218 (21.4) | 210 (37.2) | | | |
| | 60–69 | 210 (20.6) | 157 (27.8) | 1.29 | 0.97 to 1.71 | 0.08 |
| | 70+ | 590 (58.0) | 197 (34.9) | 2.89 | 2.25 to 3.70 | <0.001 |
| Gender | Female | 443 (43.5) | 298 (52.8) | | | |
| | Male | 575 (56.5) | 266 (47.2) | 1.45 | 1.18 to 1.79 | <0.001 |
| Ethnicity | White | 888 (87.2) | 489 (86.7) | | | |
| | Other | 123 (12.1) | 73 (12.9) | 0.93 | 0.68 to 1.26 | 0.64 |
| | Missing | 7 (0.7) | 2 (0.4) | | | |
| Quintile of Townsend deprivation index at baseline | 1 | 134 (13.2) | 84 (14.9) | | | |
| | 2 | 165 (16.2) | 95 (16.8) | 1.09 | 0.75 to 1.58 | 0.65 |
| | 3 | 179 (17.6) | 98 (17.4) | 1.14 | 0.79 to 1.65 | 0.47 |
| | 4 | 215 (21.1) | 124 (22.0) | 1.09 | 0.77 to 1.54 | 0.64 |
| | 5 | 325 (31.9) | 162 (28.7) | 1.26 | 0.90 to 1.75 | 0.18 |
| | Missing | 0 (0.0) | 1 (0.2) | | | |
| ABO blood type | O | 425 (41.8) | 235 (41.7) | | | |
| | A | 450 (44.2) | 249 (44.2) | 1.00 | 0.80 to 1.25 | 1.00 |
| | B | 113 (11.1) | 55 (9.8) | 1.14 | 0.79 to 1.63 | 0.49 |
| | AB | 30 (3.0) | 25 (4.4) | 0.66 | 0.38 to 1.15 | 0.15 |
| Smoking status at baseline | Never/previous | 882 (86.6) | 499 (88.5) | | | |
| | Current | 124 (12.2) | 60 (10.6) | 1.17 | 0.84 to 1.62 | 0.35 |
| | Missing | 12 (1.2) | 5 (0.9) | | | |
| Asthma | No | 852 (83.7) | 487 (86.4) | | | |
| | Yes | 166 (16.3) | 77 (13.7) | 1.23 | 0.92 to 1.65 | 0.16 |
| Autoimmune (rheumatoid arthritis/lupus/psoriasis) | No | 947 (93.0) | 547 (97.0) | | | |
| | Yes | 71 (7.0) | 17 (3.0) | 2.41 | 1.41 to 4.14 | 0.001 |
| Cancer–haematological | No | 972 (95.5) | 558 (98.9) | | | |
| | Yes | 46 (4.5) | 6 (1.1) | 4.40 | 1.87 to 10.37 | 0.001 |
| Cancer–non-haematological | No | 799 (78.5) | 486 (86.2) | | | |
| | Yes | 219 (21.5) | 78 (13.8) | 1.71 | 1.29 to 2.26 | <0.001 |
| Cerebrovascular disease | No | 847 (83.2) | 503 (89.2) | | | |
| | Yes | 171 (16.8) | 61 (10.8) | 1.66 | 1.22 to 2.28 | 0.001 |
| Diabetes | No | 765 (75.2) | 493 (87.4) | | | |
| | Yes | 253 (24.9) | 71 (12.6) | 2.30 | 1.72 to 3.06 | <0.001 |
| Heart disease | No | 633 (62.2) | 437 (77.5) | | | |
| | Yes | 385 (37.8) | 127 (22.5) | 2.09 | 1.66 to 2.65 | <0.001 |
| Hypertension | No | 419 (41.2) | 354 (62.8) | | | |
| | Yes | 599 (58.8) | 210 (37.2) | 2.41 | 1.95 to 2.98 | <0.001 |
| Immunocompromised | No | 1,001 (98.3) | 560 (99.3) | | | |
| | Yes | 17 (1.7) | 4 (0.7) | 2.38 | 0.80 to 7.10 | 0.12 |
| Kidney disease | No | 859 (84.4) | 521 (92.4) | | | |
| | Yes | 159 (15.6) | 43 (7.6) | 2.24 | 1.57 to 3.20 | <0.001 |
| Liver disease | No | 937 (92.0) | 541 (95.9) | | | |
| | Yes | 81 (8.0) | 23 (4.1) | 2.03 | 1.26 to 3.27 | 0.003 |
| Respiratory disease (excluding asthma) | No | 571 (56.1) | 486 (86.2) | | | |
| | Yes | 447 (43.9) | 78 (13.8) | 4.88 | 3.73 to 6.38 | <0.001 |
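For a binary risk factor, the unadjusted odds ratio in Table 1 can be recovered from the four cell counts as a cross-product ratio, with a Wald confidence interval on the log scale. The sketch below applies this to the respiratory disease row (447 and 571 cases with and without the condition, 78 and 486 controls); the function name is illustrative and the calculation is a hand check, not the Stata output.

```python
import math


def odds_ratio_ci(cases_exposed, cases_unexposed,
                  controls_exposed, controls_unexposed, z=1.96):
    """Cross-product odds ratio with a Wald 95% confidence interval."""
    odds_ratio = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)
    # Standard error of the log odds ratio: square root of the summed reciprocal counts.
    se = math.sqrt(1 / cases_exposed + 1 / cases_unexposed
                   + 1 / controls_exposed + 1 / controls_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Respiratory disease (excluding asthma), counts taken from Table 1.
resp_or, resp_lower, resp_upper = odds_ratio_ci(447, 571, 78, 486)
```

The result is approximately 4.88 (3.73 to 6.38), in line with the values reported for that row.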

The adjusted odds ratios for the variables included in the final model are shown in Table 2. This model included SNP score, age group, gender, ethnicity, ABO blood type, and a history of autoimmune disease (rheumatoid arthritis, lupus or psoriasis), haematological cancer, non-haematological cancer, diabetes, hypertension or respiratory disease (excluding asthma) and was a good fit to the data (Windmeijer’s H = 0.02, P = 0.88). The SNP score was strongly associated with severity of disease, increasing risk by 19% per percentage-point increase in risk alleles. The effect of age was only evident in the group aged 70 years and over, and while gender was not statistically significant (P = 0.26), it was retained because it was one of the three variables in the base model to which other variables were added. Ethnicity showed a 43% increase in risk for non-whites but was only marginally statistically significant (P = 0.06). The AB blood type was protective (P = 0.007), but the protective effect of blood type A and the increased risk for blood type B were not statistically significant (P = 0.10 and P = 0.41, respectively). Table 2 also shows the odds per adjusted standard deviation for the final model. This allows direct comparisons of the strength of the associations for each variable, regardless of the scales on which they were measured. The SNP score was, by far, the strongest predictor, followed by respiratory disease and age 70 years or older. Sensitivity analyses that included participants with no linked hospital records (S3 Table) did not change the conclusions.

Table 2. Final model for risk of severe COVID-19.

| Variable | Category | Adjusted odds ratio | 95% confidence interval | P value | Odds per adjusted standard deviation | 95% confidence interval |
|---|---|---|---|---|---|---|
| SNP score | % risk alleles | 1.19 | 1.15 to 1.22 | <0.001 | 2.18 | 1.91 to 2.48 |
| Age group (years) | 50–59 | | | | | |
| | 60–69 | 0.94 | 0.68 to 1.30 | 0.72 | 0.97 | 0.84 to 1.12 |
| | 70+ | 1.70 | 1.25 to 2.33 | 0.001 | 1.25 | 1.10 to 1.43 |
| Gender | Female | | | | | |
| | Male | 1.15 | 0.90 to 1.46 | 0.26 | 1.07 | 0.95 to 1.20 |
| Ethnicity | White | | | | | |
| | Other/missing | 1.43 | 0.99 to 2.05 | 0.06 | 1.12 | 1.00 to 1.26 |
| ABO blood type | O | | | | | |
| | A | 0.81 | 0.62 to 1.04 | 0.10 | 0.90 | 0.80 to 1.02 |
| | B | 1.19 | 0.79 to 1.78 | 0.41 | 1.05 | 0.93 to 1.18 |
| | AB | 0.42 | 0.22 to 0.79 | 0.007 | 0.84 | 0.74 to 0.95 |
| Autoimmune disease (rheumatoid arthritis/lupus/psoriasis) | No | | | | | |
| | Yes | 2.20 | 1.20 to 4.02 | 0.01 | 1.14 | 1.03 to 1.26 |
| Cancer–haematological | No | | | | | |
| | Yes | 2.82 | 1.10 to 7.21 | 0.03 | 1.11 | 1.01 to 1.22 |
| Cancer–non-haematological | No | | | | | |
| | Yes | 1.44 | 1.04 to 2.00 | 0.03 | 1.13 | 1.01 to 1.26 |
| Diabetes | No | | | | | |
| | Yes | 1.63 | 1.16 to 2.30 | 0.005 | 1.16 | 1.04 to 1.28 |
| Hypertension | No | | | | | |
| | Yes | 1.35 | 1.03 to 1.78 | 0.03 | 1.13 | 1.01 to 1.26 |
| Respiratory disease (excluding asthma) | No | | | | | |
| | Yes | 3.43 | 2.54 to 4.64 | <0.001 | 1.48 | 1.35 to 1.63 |

The receiver operating characteristic curves for the final model and for alternative models with clinical factors only (S5 Table); SNP score only (Table 1); and age and gender (S4 Table) are shown in Fig 1. The SNP score alone had an AUC of 0.680 (95% CI, 0.652 to 0.708). The model with age and gender had an AUC of 0.635 (95% CI, 0.607 to 0.662), while the model with clinical factors only had an AUC of 0.723 (95% CI, 0.698 to 0.749). Given that an AUC of 0.5 indicates no discrimination, improvements were calculated on the portion of the AUC above 0.5; on this basis, the model with clinical factors only was a 65% improvement over the model with age and gender (χ² = 57.97, df = 1, P<0.001). The full model had an AUC of 0.786 (95% CI, 0.763 to 0.808) and was a 28% improvement over the model with clinical factors only (χ² = 39.54, df = 1, P<0.001), a 59% improvement over the SNP score (χ² = 71.94, df = 1, P<0.001), and a 111% improvement over the model with age and gender (χ² = 113.67, df = 1, P<0.001).
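The percentage improvements quoted above are consistent with comparing the portions of each AUC that lie above 0.5. The sketch below reproduces that arithmetic from the rounded AUCs in the text; the function name is illustrative. Using rounded inputs, the comparison of the full model with the age and gender model comes out just under 112% rather than the reported 111%, a small difference that presumably reflects rounding of the published AUCs (an assumption, since the unrounded values are not given).

```python
def relative_improvement(auc_new, auc_old, chance=0.5):
    """Proportional gain in the part of the AUC that exceeds chance."""
    return (auc_new - chance) / (auc_old - chance) - 1.0


AUC_FULL, AUC_CLINICAL, AUC_SNP, AUC_AGE_GENDER = 0.786, 0.723, 0.680, 0.635

improvements = {
    "clinical vs age and gender": relative_improvement(AUC_CLINICAL, AUC_AGE_GENDER),
    "full vs clinical": relative_improvement(AUC_FULL, AUC_CLINICAL),
    "full vs SNP score": relative_improvement(AUC_FULL, AUC_SNP),
    "full vs age and gender": relative_improvement(AUC_FULL, AUC_AGE_GENDER),
}

for label, gain in improvements.items():
    print(f"{label}: {100 * gain:.1f}%")
```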

Fig 1. Receiver operating characteristic curves for models with different amounts of information.

The area under the receiver operating characteristic curve was 0.786 for the full model, 0.723 for the clinical model, 0.680 for the SNP score, and 0.635 for the age and gender model.

Fig 2 illustrates the difference in the distributions of the COVID-19 risk scores in cases and controls. The median score was 3.35 for cases and 0.90 for controls, with inter-quartile ranges of 6.70 and 1.34, respectively. Sixteen per cent of cases and 53% of controls had COVID-19 risk scores of less than 1, and 18% of cases and 25% of controls had scores ≥1 and <2. COVID-19 risk scores ≥2 were more common in cases than in controls, with 13% of cases and 9% of controls having scores ≥2 and <3, 8% of cases and 4% of controls having scores ≥3 and <4, and 45% of cases and 9% of controls having scores ≥4.

Fig 2. Distribution of the risk score for severe COVID-19 for (A) cases and (B) controls.

Note that 130 (13%) cases and 6 (1%) controls with scores of 15 or over have been omitted to facilitate the display of the distribution.

Fig 3 shows that the distribution of the COVID-19 risk score in the whole UK Biobank is similar to that for the controls in Fig 2B. The median COVID-19 risk score in the whole UK Biobank was 1.32 and the inter-quartile range was 1.80. Thirty-eight per cent of the UK Biobank have COVID-19 risk scores of less than 1, while 29% have scores ≥1 and <2, 13% have scores ≥2 and <3, 6% have scores ≥3 and <4, and 14% have scores of ≥4.

Fig 3. Distribution of risk score for severe COVID-19 in the 487,311 UK Biobank participants with SNP data available.

Note that 7,769 (1.8%) scores of 15 or over have been omitted to facilitate the display of the distribution.

One of the main issues of the COVID-19 pandemic is that of susceptibility to severe disease. We have shown that a comprehensive risk prediction test that quantifies the varying effects of clinical risk factors and a SNP risk score has an AUC of 0.786 and improves risk discrimination of severe COVID-19 by 111% compared with a model using age and gender (P<0.001). Examination of the odds per adjusted standard deviation (Table 2) shows that the SNP score is the strongest risk factor for severe COVID-19. While the SNP score explains more variance in disease severity than all of the other risk factors in the model combined, the full model discriminates better than the clinical factors alone or the SNP score alone (both P<0.001).

The strong associations observed in the model consisting of just age and gender (S4 Table) are attenuated by the inclusion of other risk factors. This is due to the comorbidities in the full model being more prevalent in older people and in men, and it is the comorbidities–not age and gender–that are associated with severe disease. Relying on age and gender alone to determine risk of severe COVID-19 will unnecessarily classify healthy older people as being at high risk and will fail to accurately quantify the increased risk for younger people with comorbidities.

Our study does have some limitations. We used source of test result as a proxy for severity of disease. Therefore, there may have been some misclassification of disease severity, but this would be likely to attenuate the magnitude of the associations. Townsend deprivation score, BMI and current smoking status were taken from the baseline assessment data and may not represent the participants’ current status. This may have contributed to these variables not being statistically significant. Until mid-May 2020, testing for COVID-19 in the UK was limited to those who had recognisable symptoms and were essential workers, contacts of known cases, hospitalised or had returned from overseas [22]. Therefore, many asymptomatic or very mild cases from the first wave of the pandemic will not have been identified in this dataset. Nevertheless, our results remain applicable to those who develop symptoms that warrant medical attention.

Conclusions

While the vast majority of the 487,311 UK Biobank participants with SNP data available are at low or only slightly elevated risk of severe COVID-19 (Fig 3), we can identify those who are likely to be at substantially increased risk. Our risk prediction test for severe COVID-19 in people aged 50 years or older has great potential for wide-reaching benefits in managing the risk for essential workers, in healthcare settings and in workplaces that seek to operate safely. The test will also enable individuals to make informed choices based on their personal risk. However, key to understanding the performance of our risk prediction test will be validation in independent data sets, work that we are planning to undertake in the near future.

Supporting information

S1 Table. Single-nucleotide polymorphisms.

(PDF)

S2 Table. Disease definitions.

(PDF)

S3 Table. Sensitivity analysis.

(PDF)

S4 Table. Model with age group and gender.

(PDF)

S5 Table. Model with clinical risk factors.

(PDF)

Acknowledgments

We wish to thank Mr Lawrence Whiting for his invaluable expertise in the management of large data files from the UK Biobank.

Data Availability

Access to the data used in this study can be obtained by applying directly to the UK Biobank at https://www.ukbiobank.ac.uk/register-apply/. The authors did not receive special access privileges to the data that others would not have; interested researchers will be able to access the data in the same manner by applying directly to the UK Biobank.

Funding Statement

The authors received no specific funding for this work. All authors are employed by a commercial company, Genetic Technologies Limited, which provided support in the form of salaries for all authors, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of all authors are articulated in the Author Contributions section.

References

  • 1. Shovlin CL, Vizcaychipi MP. Implications for COVID-19 triage from the ICNARC report of 2204 COVID-19 cases managed in UK adult intensive care units. Emerg Med J. 2020;37(6):332–3. Epub 2020/05/06. doi: 10.1136/emermed-2020-209791.
  • 2. European Centre for Disease Prevention and Control. Rapid risk assessment–coronavirus disease 2019 (COVID-19) in the EU/EEA and the UK–ninth update [Internet]. 2020 [cited 2020 September 21]. Available from: https://www.ecdc.europa.eu/en/publications-data/rapid-risk-assessment-coronavirus-disease-2019-covid-19-pandemic-ninth-update.
  • 3. World Bank. Global economic prospects. June 2020. Washington, DC: World Bank, 2020.
  • 4. Moreno C, Wykes T, Galderisi S, Nordentoft M, Crossley N, Jones N, et al. How mental health care should change as a consequence of the COVID-19 pandemic. Lancet Psychiatry. 2020;7(9):813–24. Epub 2020/07/20. doi: 10.1016/S2215-0366(20)30307-2.
  • 5. Kim L, Garg S, O’Halloran A, Whitaker M, Pham H, Anderson EJ, et al. Risk factors for intensive care unit admission and in-hospital mortality among hospitalized adults identified through the U.S. Coronavirus Disease 2019 (COVID-19)-Associated Hospitalization Surveillance Network (COVID-NET). Clin Infect Dis. 2020. Epub 2020/07/17. doi: 10.1093/cid/ciaa1012.
  • 6. Williamson EJ, Walker AJ, Bhaskaran K, Bacon S, Bates C, Morton CE, et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature. 2020;584(7821):430–6. Epub 2020/07/09. doi: 10.1038/s41586-020-2521-4.
  • 7. Petrilli CM, Jones SA, Yang J, Rajagopalan H, O’Donnell L, Chernyak Y, et al. Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective cohort study. BMJ. 2020;369:m1966. Epub 2020/05/24. doi: 10.1136/bmj.m1966.
  • 8. COVID-19 Host Genetics Initiative. COVID-19 Host Genetics Initiative: results [Internet]. 2020 [cited 2020 May 13]. Available from: https://www.covid19hg.org/results/.
  • 9. COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur J Hum Genet. 2020;28(6):715–8. Epub 2020/05/15. doi: 10.1038/s41431-020-0636-6.
  • 10. Ellinghaus D, Degenhardt F, Bujanda L, Buti M, Albillos A, Invernizzi P, et al. Genomewide association study of severe Covid-19 with respiratory failure. N Engl J Med. 2020;383(16):1522–34. Epub 2020/06/20. doi: 10.1056/NEJMoa2020283.
  • 11. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779. Epub 2015/04/01. doi: 10.1371/journal.pmed.1001779.
  • 12. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–9. Epub 2018/10/12. doi: 10.1038/s41586-018-0579-z.
  • 13. UK Biobank. UK Biobank makes health data available to tackle COVID-19 [Internet]. 2020 [cited 2020 August 20]. Available from: https://www.ukbiobank.ac.uk/2020/04/covid/.
  • 14. Machiela MJ, Chanock SJ. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31(21):3555–7. Epub 2015/07/04. doi: 10.1093/bioinformatics/btv402.
  • 15. Dayem Ullah AZ, Oscanoa J, Wang J, Nagano A, Lemoine NR, Chelala C. SNPnexus: assessing the functional relevance of genetic variation to facilitate the promise of precision medicine. Nucleic Acids Res. 2018;46(W1):W109–W13. Epub 2018/05/15. doi: 10.1093/nar/gky399.
  • 16. Mealiffe ME, Stokowski RP, Rhees BK, Prentice RL, Pettinger M, Hinds DA. Assessment of clinical validity of a breast cancer risk model combining genetic and clinical information. J Natl Cancer Inst. 2010;102(21):1618–27. Epub 2010/10/20. doi: 10.1093/jnci/djq388.
  • 17. Melzer D, Perry JR, Hernandez D, Corsi AM, Stevens K, Rafferty I, et al. A genome-wide association study identifies protein quantitative trait loci (pQTLs). PLoS Genet. 2008;4(5):e1000072. Epub 2008/05/10. doi: 10.1371/journal.pgen.1000072.
  • 18. Wolpin BM, Kraft P, Gross M, Helzlsouer K, Bueno-de-Mesquita HB, Steplowski E, et al. Pancreatic cancer risk and ABO blood group alleles: results from the pancreatic cancer cohort consortium. Cancer Res. 2010;70(3):1015–23. Epub 2010/01/28. doi: 10.1158/0008-5472.CAN-09-2993.
  • 19. Centers for Disease Control and Prevention. People with certain medical conditions [Internet]. 2020 [cited 2020 August 14]. Available from: https://www.cdc.gov/coronavirus/2019-ncov/need-extra-precautions/people-with-medical-conditions.html.
  • 20. Hopper JL. Odds per adjusted standard deviation: comparing strengths of associations for risk factors measured on different scales and across diseases and populations. Am J Epidemiol. 2015;182(10):863–7. Epub 2015/11/02. doi: 10.1093/aje/kwv193.
  • 21. StataCorp. Stata statistical software: Release 16. StataCorp LLC: College Station, TX; 2019.
  • 22. Our World in Data. Coronavirus (COVID-19) testing: COVID-19 testing policies [Internet]. 2020 [cited 2020 September 26]. Available from: https://ourworldindata.org/coronavirus-testing.

Decision Letter 0

Giuseppe Novelli

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

30 Dec 2020

PONE-D-20-34815

An integrated clinical and genetic model for predicting risk of severe COVID-19: A population-based case–control study

PLOS ONE

Dear Dr. Dite,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Feb 04 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Giuseppe Novelli

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2.  Thank you for stating the following in the Competing Interests section:

"I have read the journal's policy and the authors of this manuscript have the following competing interests: All authors are employed by Genetic Technologies Limited and have a patent pending for the work in this manuscript."

We note that one or more of the authors are employed by a commercial company: Genetic Technologies Limited.

2.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

2.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.  

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

2.3. We note that you have a patent relating to material pertinent to this article. Please provide an amended statement of Competing Interests to declare this patent (with details including name and number), along with any other relevant declarations relating to employment, consultancy, patents, products in development or modified products etc. Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

2.4. Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

3. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

4. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I was pleased to see this paper. Very crucial topic.

My only concerns are about the selection of the SNPs for the score, quality controls procedures and imputation of the data. Could the authors give more details about these points?

If I understand well from the paragraph "We used the percentage rather than a count because some of the eligible participants had missing data for some SNPs (9% had all SNPs genotyped, 82% were missing 1–5 SNPs and 9% were missing 6–15 SNPs)" the number of participants with all the SNPs genotyped is quite low.

How did the authors manage this point? Did the authors use an imputation method?

Reviewer #2: The principal aim of this case-control study is to prospect a new and more accurate prediction model of severe COVID-19, based on age, gender, clinical conditions and a SNP score built on a panel of 64 SNPs identified from published data.

The authors used the clinical baseline data collected at the UK Biobank for 1,582 participants aged >50 years, tested positive for COVID-19. These individuals were divided into two categories: severe disease (n=1,018) and without severe disease (n=564). They identified 62 SNPs from a meta-analysis conducted by the COVID-19 host Genetic Initiative consortium and two SNPs from the loci found by Ellinghaus et al., for a total of 64 SNPs. Clinical risk factors were identified on large epidemiological studies of electronic health records (rare monogenic diseases were not included). Multivariable logistic regression was used to examine the association of risk factors and severe COVID-19 with adjusted odds ratio for each variable.

The results showed a strong association between the SNP score and the severity of the disease, while gender was not statistically significant (p=0.26) and the effect of age was relevant only for >70 years.

Surely the COVID-19 pandemic has set the need for a more accurate and complete risk prediction model than the one based just on age, gender and comorbidities. The addition of genetic risk factors could provide useful information to the community, especially in terms of prevention (more than a triage/hospital setting; lines 58-59). However, we should bear in mind that an accurate prediction model in order to be useful should also be easily applicable to the community, in terms of availability and cost-effectiveness (therefore this kind of data should be available for every individual, including SNPs genotype). Even though I share the authors approach to go beyond age and gender to assess individual’s risk for COVID-19, analysing this study I personally found some weaknesses, for example:

As they point out themselves (line 257), the main limitation of this study relies in the given meaning to the source of test results. We can assume that outpatients did not present a severe form of disease at time of testing, but what about the prognosis? The same consideration applies to inpatients: surely not everyone of them developed a severe form of COVID-19 and some of them could have been hospitalized for quite something else. Consequentially, can we really talk about risk prediction of “severe” COVID-19? Moreover, as stated in lines 274-275, the prediction model should be tested in independent data sets to confirm its reliability.

As we are considering the percentage of risk alleles, it should be taken into account that the 64 SNPs are not known for every patient: only 9% has all SNPs genotyped, 9% of them has between 6 and 15 SNPs missing and the rest has 1-5 SNPs missing.

Concerning primary care data (Lines 90-91): maybe the authors could specify which kind of clinical information was not taken into account.

Tables and images are well done and easy to read, even though I would have specified for figure 3, which should refer to the “vast majority of UK Biobank participants” (line 268), how many participants without SARS-CoV-2 test results were actually taken into consideration (since we know that among them only 18,221 had SARS-CoV-2 test results; line 95).

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Feb 16;16(2):e0247205. doi: 10.1371/journal.pone.0247205.r002

Author response to Decision Letter 0


7 Jan 2021

The information below has also been provided in the Response to Reviewers document.

Reviewer #1:

Question:

I was pleased to see this paper. Very crucial topic.

My only concerns are about the selection of the SNPs for the score, quality controls procedures and imputation of the data. Could the authors give more details about these points?

Response:

The GWAS quality control, imputation and meta-analysis were performed by the COVID-19 Host Genetics Initiative consortium; details are available from the references provided.

We have added additional details of our SNP selection on lines 105–109, which now reads:

We identified 62 SNPs from the results of the ANA2 meta-analysis (release 2) of SARS-CoV-2 positive non-hospitalised versus hospitalised cases of COVID-19 conducted by the COVID-19 Host Genetics Initiative consortium [8, 9]. Because of the limited amount of data available at the time of release, we used P<0.0001 as the threshold for loci selection. We then removed any variants that were associated with hospitalisation in only one of the five studies in the meta-analysis.

We have also added the following sentence on line 114–115:

All of the variants had a Cochran’s Q heterogeneity test P>0.001.
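For readers unfamiliar with the heterogeneity filter mentioned above, the following is a minimal sketch of Cochran's Q for one variant in a fixed-effect, inverse-variance meta-analysis of five studies. The effect sizes and standard errors are invented for illustration and are not taken from the COVID-19 Host Genetics Initiative results, and the function names are hypothetical. The closed-form chi-square tail probability used here is valid only for an even number of degrees of freedom (four, for five studies).

```python
import math

def cochran_q(betas, ses):
    """Return (Q, df) for per-study effect estimates and their standard errors."""
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return q, len(betas) - 1

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Made-up log odds ratios and standard errors for five studies (illustration only).
betas = [0.30, 0.25, 0.40, 0.35, 0.20]
ses = [0.10, 0.12, 0.15, 0.11, 0.20]

q, df = cochran_q(betas, ses)
p_het = chi2_sf_even_df(q, df)
passes_filter = p_het > 0.001  # threshold quoted in the response above
print(f"Q = {q:.2f}, df = {df}, P = {p_het:.3f}, retained = {passes_filter}")
```

A variant whose study-level estimates agree closely gives a small Q and a large heterogeneity P value, so it would be retained under a P>0.001 rule.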

Question:

If I understand well from the paragraph "We used the percentage rather than a count because some of the eligible participants had missing data for some SNPs (9% had all SNPs genotyped, 82% were missing 1–5 SNPs and 9% were missing 6–15 SNPs)" the number of participants with all the SNPs genotyped is quite low.

How did the authors manage this point? Did the authors use an imputation method?

Response:

We did not impute the missing SNPs because, while only 9% of participants had all SNPs genotyped, very few participants had more than 5 SNPs missing.

Using a percentage rather than a count of risk alleles ensures that the SNP score for those with missing data does not underestimate risk. Because the SNPs will be missing completely at random, missing SNP data will not bias the calculation of the percentage.

Nevertheless, we have updated the information in the parentheses (lines 127–129) to be more informative. It now reads:

9% had all SNPs genotyped, 21% were missing 1 SNP, 26% were missing 2 SNPs, 18% were missing 3 SNPs, 10% were missing 4 SNPs, 7% were missing 5 SNPs, 8% were missing 6–10 SNPs and 1% were missing 11–15 SNPs
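The reasoning for using a percentage rather than a count can be illustrated with a short sketch. It assumes the score is the number of risk alleles divided by the number of alleles at genotyped SNPs (two per SNP), which is one natural reading of "percentage of risk alleles" and is not confirmed in detail by the response. The function name and the four-SNP panel are hypothetical; the actual panel has 64 SNPs.

```python
from typing import Optional, Sequence

def snp_score_percent(genotypes: Sequence[Optional[int]]) -> float:
    """Percentage of risk alleles among the alleles actually genotyped.

    Each entry is the number of risk alleles at one SNP (0, 1 or 2),
    or None when that SNP was not genotyped.
    """
    observed = [g for g in genotypes if g is not None]
    if not observed:
        raise ValueError("no genotyped SNPs")
    if any(g not in (0, 1, 2) for g in observed):
        raise ValueError("risk allele counts must be 0, 1 or 2")
    return 100.0 * sum(observed) / (2 * len(observed))

# Hypothetical participant: the same genotypes with and without one missing SNP.
complete = [2, 1, 1, 0]         # 4 risk alleles out of 8 alleles
with_missing = [2, 1, None, 0]  # 3 risk alleles out of 6 genotyped alleles

count_complete = sum(g for g in complete if g is not None)
count_with_missing = sum(g for g in with_missing if g is not None)

print(count_complete, count_with_missing)   # raw counts differ
print(snp_score_percent(complete), snp_score_percent(with_missing))
```

In this example the raw count drops from 4 to 3 when one SNP is missing, while the percentage stays at 50 because the denominator shrinks with the missing SNP.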

Reviewer #2:

Question:

The principal aim of this case-control study is to prospect a new and more accurate prediction model of severe COVID-19, based on age, gender, clinical conditions and a SNP score built on a panel of 64 SNPs identified from published data.

The authors used the clinical baseline data collected at the UK Biobank for 1,582 participants aged >50 years, tested positive for COVID-19. These individuals were divided into two categories: severe disease (n=1,018) and without severe disease (n=564). They identified 62 SNPs from a meta-analysis conducted by the COVID-19 host Genetic Initiative consortium and two SNPs from the loci found by Ellinghaus et al., for a total of 64 SNPs. Clinical risk factors were identified on large epidemiological studies of electronic health records (rare monogenic diseases were not included). Multivariable logistic regression was used to examine the association of risk factors and severe COVID-19 with adjusted odds ratio for each variable.

The results showed a strong association between the SNP score and the severity of the disease, while gender was not statistically significant (p=0.26) and the effect of age was relevant only for >70 years.

Surely the COVID-19 pandemic has set the need for a more accurate and complete risk prediction model than the one based just on age, gender and comorbidities. The addition of genetic risk factors could provide useful information to the community, especially in terms of prevention (more than a triage/hospital setting; lines 58-59). However, we should bear in mind that an accurate prediction model in order to be useful should also be easily applicable to the community, in terms of availability and cost-effectiveness (therefore this kind of data should be available for every individual, including SNPs genotype). Even though I share the authors approach to go beyond age and gender to assess individual’s risk for COVID-19, analysing this study I personally found some weaknesses, for example:

As they point out themselves (line 257), the main limitation of this study relies in the given meaning to the source of test results. We can assume that outpatients did not present a severe form of disease at time of testing, but what about the prognosis? The same consideration applies to inpatients: surely not everyone of them developed a severe form of COVID-19 and some of them could have been hospitalized for quite something else. Consequentially, can we really talk about risk prediction of “severe” COVID-19? Moreover, as stated in lines 274-275, the prediction model should be tested in independent data sets to confirm its reliability.

Response:

The UK Biobank did not have extensive clinical information available to determine severity of disease. We have explained that we considered our outcome measure to be a proxy for severity of disease. If there were to be some misclassification of severity of disease, this would bias the results towards the null.
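The statement that misclassification would bias results towards the null can be checked with a small worked example. The sketch below assumes the misclassification is non-differential, meaning the outcome is recorded with the same sensitivity and specificity regardless of risk-factor status; the response does not state this assumption explicitly. All counts and error rates are invented for illustration, and the function names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: severe and non-severe counts among people with the risk factor
    c, d: severe and non-severe counts among people without it
    """
    return (a * d) / (b * c)

def misclassify(severe, non_severe, sensitivity, specificity):
    """Expected observed counts when the outcome is recorded with error."""
    obs_severe = sensitivity * severe + (1 - specificity) * non_severe
    obs_non_severe = (1 - sensitivity) * severe + specificity * non_severe
    return obs_severe, obs_non_severe

# Invented true counts: 600/400 with the risk factor, 400/600 without it.
true_or = odds_ratio(600, 400, 400, 600)

# Same error rates in both groups (non-differential misclassification).
exposed_obs = misclassify(600, 400, sensitivity=0.8, specificity=0.9)
unexposed_obs = misclassify(400, 600, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(exposed_obs[0], exposed_obs[1],
                         unexposed_obs[0], unexposed_obs[1])

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

With these numbers the true odds ratio of 2.25 is attenuated to about 1.77, that is, it moves towards 1 without crossing it.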

The data file provided by the UK Biobank had, in many cases, multiple test results for participants. We aggregated the data so that participants who tested positive both in an outpatient setting and in hospital were considered to have severe disease (see lines 97–100). This addresses (to some extent) the problem of prognosis of disease raised by the reviewer. We acknowledge that we cannot identify participants who were hospitalised for reasons other than COVID-19.

In an emerging health crisis such as COVID-19, the perfect dataset does not exist and we must use data that is readily available. Our concerns regarding the classification of disease severity and the limited testing available early in the pandemic (the period from which this data was derived) were behind our decision not to divide the data into training and testing datasets. We believed that any inherent biases in the data would result in false validation of the risk prediction model. This is why we emphasised the need for validation in independent datasets.

Question:

As we are considering the percentage of risk alleles, it should be taken into account that the 64 SNPs are not known for every patient: only 9% has all SNPs genotyped, 9% of them has between 6 and 15 SNPs missing and the rest has 1-5 SNPs missing.

Response:

Please see the response above to the question from Reviewer 1.

Question:

Concerning primary care data (Lines 90-91): maybe the authors could specify which kind of clinical information was not taken into account.

Response:

We have updated the text on lines 85–87 as follows:

At the time of data extraction, primary care administrative data (general practitioner records relating to diagnoses, symptoms, referrals, laboratory test results and prescriptions for medication) was only available for just over half of the identified participants and was therefore not used in these analyses.

Question:

Tables and images are well done and easy to read, even though I would have specified for figure 3, which should refer to the “vast majority of UK Biobank participants” (line 268), how many participants without SARS-CoV-2 test results were actually taken into consideration (since we know that among them only 18,221 had SARS-CoV-2 test results; line 95).

Response:

On line 272, “the vast majority of UK Biobank participants” refers to those with low or only slightly increased risk among the whole UK Biobank, not the subset with COVID-19 test results in the main analyses.

We have updated the text on lines 272–273 as follows:

While the vast majority of the 487,311 UK Biobank participants with SNP data available are at low or only slightly elevated risk of severe COVID-19 (Fig 3)…

We have also updated the caption to Fig 3 (lines 240–241) as follows:

Fig 3. Distribution of risk score for severe COVID-19 in the 487,311 UK Biobank participants with SNP data available.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Giuseppe Novelli

3 Feb 2021

An integrated clinical and genetic model for predicting risk of severe COVID-19: A population-based case–control study

PONE-D-20-34815R1

Dear Dr. Dite,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Giuseppe Novelli

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Giuseppe Novelli

5 Feb 2021

PONE-D-20-34815R1

An integrated clinical and genetic model for predicting risk of severe COVID-19: A population-based case–control study

Dear Dr. Dite:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Prof. Giuseppe Novelli

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Single-nucleotide polymorphisms.

    (PDF)

    S2 Table. Disease definitions.

    (PDF)

    S3 Table. Sensitivity analysis.

    (PDF)

    S4 Table. Model with age group and gender.

    (PDF)

    S5 Table. Model with clinical risk factors.

    (PDF)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    Access to the data used in this study can be obtained by applying directly to the UK Biobank at https://www.ukbiobank.ac.uk/register-apply/. The authors did not receive special access privileges to the data that others would not have. Interested researchers will be able to access the data in the same manner by applying directly to the UK Biobank.


    Articles from PLoS ONE are provided here courtesy of PLOS
