PLOS One. 2023 Nov 16;18(11):e0287725. doi: 10.1371/journal.pone.0287725

Dataset of single nucleotide polymorphisms of immune-associated genes in patients with SARS-CoV-2 infection

Nikoletta Katsaouni 1,2,3,#, Pablo Llavona 4,#, Yascha Khodamoradi 5,#, Ann-Kathrin Otto 6, Stephanie Körber 7, Christof Geisen 7, Christian Seidl 7, Maria J G T Vehreschild 5, Sandra Ciesek 8,9,10, Jörg Ackermann 6, Ina Koch 6, Marcel H Schulz 1,2,3,‡,*, Daniela S Krause 4,7,11,12,13,‡,*
Editor: Asli Suner Karakulah 14
PMCID: PMC10653545  PMID: 37971979

Abstract

The SARS-CoV-2 pandemic has affected nations globally leading to illness, death, and economic downturn. Why disease severity, ranging from no symptoms to the requirement for extracorporeal membrane oxygenation, varies between patients is still incompletely understood. Consequently, we aimed at understanding the impact of genetic factors on disease severity in infection with SARS-CoV-2. Here, we provide data on demographics, ABO blood group, human leukocyte antigen (HLA) type, as well as next-generation sequencing data of genes in the natural killer cell receptor family, the renin-angiotensin-aldosterone and kallikrein-kinin systems and others in 159 patients with SARS-CoV-2 infection, stratified into seven categories of disease severity. We provide single-nucleotide polymorphism (SNP) data on the patients and a protein structural analysis as a case study on a SNP in the SIGLEC7 gene, which was significantly associated with the clinical score. Our data represent a resource for correlation analyses involving genetic factors and disease severity and may help predict outcomes in infections with future SARS-CoV-2 variants and aid vaccine adaptation.

Introduction

The pandemic due to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has devastatingly affected the entire globe, causing disease and death to millions, as well as unemployment, social hardship, and a shrinkage of global growth that represents the deepest recession since World War II [1, 2]. While vaccines have become available in industrialized nations and have favorably impacted infection rates, continuously evolving mutations in the viral genome are leading to new waves of infection and causing vaccines to become less effective [3].

Symptoms of infection with SARS-CoV-2 vary and range from no or mild symptoms, such as headaches, nasal congestion, fever, muscle pain, sore throat, and a loss of taste and smell, to severe illness. In severe cases, dyspnea, hypoxia, and lung involvement have been observed, which may lead to respiratory failure, shock, and multi-organ dysfunction requiring ventilation or extracorporeal membrane oxygenation [1].

Approximately 80% of patients with SARS-CoV-2 infection recover from the disease without the need for treatment, while approximately 5% are admitted to intensive care, 2.3% require mechanical ventilation and 1.4% of patients die [4]. Advanced age and comorbidities such as diseases of the cardiovascular system, chronic pulmonary diseases, diabetes, cancer, and immunosuppression have been considered risk factors for a severe clinical course or fatal outcome [5]. It is, however, unknown why even certain younger, otherwise healthy individuals or members of certain ethnic groups, may experience severe or even fatal disease or a condition characterized by fatigue and other symptoms termed long COVID.

Since the outbreak of the pandemic, various publications have reported that polymorphisms in certain genes or variations in the expression levels of proteins influence disease severity and outcome in SARS-CoV-2 infection. Prominent examples are the solute carrier family 6 member 20 (SLC6A20) [1], sialic acid-binding Ig-like lectin 7 (SIGLEC7; CD328) [6], and transmembrane serine protease 2 (TMPRSS2) [7]. SLC6A20 is a member of the family of Na+- and Cl−-coupled transporters expressed by proximal tubule kidney cells (and enterocytes), where they function as transporters of proline [8, 9]. SLC6A20 has been identified as part of a 3p21.31 gene cluster representing a genetic susceptibility locus in SARS-CoV-2 patients with respiratory failure [1]. By hetero-dimerization with an angiotensin-converting enzyme (ACE)2, SLC6A20 is associated with the renin-angiotensin system. In the renin-angiotensin system, ACE2 plays a key role in converting the vasoconstrictor angiotensin-II (Ang-II) into the vasodilator Ang1-7 and, thereby, lowering blood pressure [10]. TMPRSS2 promotes the uptake of SARS-CoV-2 via proteolytic cleavage of the ACE2 receptor. Single nucleotide polymorphisms in TMPRSS2 have been shown to vary amongst global populations [7] and to affect the susceptibility of a patient to infection with SARS-CoV-2 via modulation of splicing, miRNA expression, etc. [11]. SIGLEC7 facilitates the inhibition of the cytotoxicity of natural killer cells [12]. Expression of SIGLEC7 has been shown to correlate with SARS-CoV-2 levels in nasopharyngeal swabs or autopsies [6].

Based on these previous findings, we collected demographics, ABO blood group, human leukocyte antigen (HLA) type, data on hereditary prothrombotic factors, as well as next-generation sequencing data of genes in the natural killer cell receptor family, the renin-angiotensin-aldosterone and kallikrein-kinin systems in patients. Patients had been stratified into seven categories of disease severity according to the World Health Organization (WHO) Ordinal Scale for Clinical Improvement. We hypothesized that ABO blood group, HLA type, hereditary factors or certain SNPs in immune-, blood pressure- or inflammation-associated genes or a combination of these factors may impact severity of symptoms and influence outcome in infection with SARS-CoV-2. We, therefore, collected blood samples from 159 adult individuals with past or current infection with SARS-CoV-2 to test for factors that may predispose them to a severe outcome.

Taken together, our data may aid predictions on the susceptibility of individuals to novel virus variants, their anticipated disease severity, as well as on the efficiency of current and future vaccines to viral mutants. It is hoped that our data will support the rational design, adaptation, and optimization of vaccines to new viral strains.

Materials and methods

Patient consent

This study was performed as part of the CAP-Net Foundation’s competence network on community-acquired pneumonia and was conducted according to the guidelines of the Declaration of Helsinki and international standards of good clinical practice. The ethics committee of the Goethe University Frankfurt approved the protocol and any protocol amendments (Protocol number: 20–748). All enrolled patients provided a written informed consent form and specifically agreed to the performance of genetic studies, the transfer of their anonymized data to third parties and the use of data from their medical records in this research. All data were fully anonymized prior to their access. No minors were included in the study.

Patient collection

10 ml of peripheral blood were collected in EDTA from 159 patients, who had been or were currently infected with SARS-CoV-2 and who received medical care from the university hospital of the Goethe-University in Frankfurt am Main, Germany. All patient samples were collected between April 2020 and November 2021, a period when the virus strains B.1.1.7 and B.1.617 or B.1.617.2 were most prevalent. All included patients tested positive by polymerase chain reaction (PCR) at different institutions. DNA for HLA typing was available for 156 of the patients. 67 patients were female (42%), 90 patients were male (56%), and sex was not reported for two patients. SNP sequencing was conducted for all patients. The patients were between 18 and 92 years of age. The patients were stratified according to the World Health Organization Ordinal Scale for Clinical Improvement [13], with a

score 1 signifying very mild symptoms without compromise of activities,

score 2 signifying mild symptoms with a compromise of activities,

score 3 signifying moderate disease requiring admission to hospital (but no oxygen),

score 4 signifying hospital care with oxygen treatment,

score 5 signifying severe disease with high oxygen requirements,

score 6 signifying intubation and mechanical ventilation and

score 7 signifying ventilation plus the support of organs by pressors or extracorporeal membrane oxygenation (ECMO).

The scoring system for these patients was also compared to an updated scoring system [14]. For further data and future statistical analysis, patients were categorized as having mild (≤2, n = 94) or severe disease (≥3, n = 65), with the categorization ‘severe disease’ defining patients who were hospitalized. Respiratory hospitalization was defined as hospitalization due to dyspnea requiring oxygen treatment, mechanical ventilation, or ECMO (scores ≥4, n = 51). The steps of the experimental workflow can be found in Fig 1.
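The grouping described above can be written as a small helper. The following is a minimal Python sketch, not the authors' code; the function name `categorize` and the label strings are illustrative assumptions, and only the thresholds (≥3 for severe disease, ≥4 for respiratory hospitalization) are taken from the text.

```python
def categorize(score: int) -> dict:
    """Map a WHO ordinal score (1-7) to the groups used in the text.

    Hypothetical helper: the name and the returned labels are assumptions.
    """
    if not 1 <= score <= 7:
        raise ValueError(f"score must be between 1 and 7, got {score}")
    return {
        # 'severe' corresponds to hospitalized patients (score >= 3)
        "severity": "severe" if score >= 3 else "mild",
        # oxygen treatment, mechanical ventilation, or ECMO (score >= 4)
        "respiratory_hospitalization": score >= 4,
    }
```

For example, a patient with score 3 would be labeled severe but would not count as a respiratory hospitalization.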

Fig 1. Experimental workflow.


Table 1 (an Excel version is available online at https://doi.org/10.6084/m9.figshare.21803928.v1) lists the gender, age, clinical score, ABO type, Rhesus (Rh) type, HLA types, ethnicity, and clinical information on comorbidities for each patient. Information on the classification of the patients according to the WHO Clinical Improvement clinical progression scale [14] is provided and can be found in the column “Clinical Score”. Note that sample IDs 33, 45, and 79 are not included.

Table 1. Clinical metadata and HLA types for each individual patient in the cohort of 159 patients.

Note that sample IDs 33, 45, and 79 are not included. An Excel version is available online at https://doi.org/10.6084/m9.figshare.21803928.v1.

Sample | Sex | Age | Clinical Score | Revised Clinical Score | ABO Type | Rhesus | HLA-A | HLA-B | HLA-C | HLA-DRB1 | HLA-DQB1 | HLA-DPB1 | FVL | Prothrombin | PCR test | Ethnicity | Premorbidities
1 f 47 2 2 O pos 3; 30 35; 51 4; 14 1; 7 2; 5 4; 17 WT WT pos Caucasian
2 m 51 2 2 O pos 2; 3 7; 44 5; 7 4; 15 3; 6 4; 14 WT WT pos Caucasian
3 f 34 2 2 O pos 25; 32 14; 18 8; 12 7; 14 2; 5 4; 10 WT WT pos Caucasian
4 f 52 2 2 O pos 2; 15; 40 3; 3 11; 13 3; 6 3; 4 WT WT pos Caucasian
5 f 27 2 2 A pos 1; 8; 37 6; 7 3; 10 2; 5 4; 4 het WT pos Caucasian
6 m 67 2 2 A pos 2; 11 27; 51 2; 15 4; 9 3; 3 4; 4 WT WT pos Caucasian
7 m 78 6 10 B pos 1; 25 8; 18 7; 12 3; 4 2; 3 4; 4 WT WT pos Caucasian
8 f 71 6 9 A pos 11; 26 35; 4; 14; 15 5; 6 1; 4 WT WT pos Caucasian
9 m 52 7 7 A pos 1; 2 15; 51 1; 7 11; 13 3; 6 124; 4 WT WT pos Middle Eastern
10 m 61 6 10 A pos 2; 3 7; 51 2; 7 11; 15 3; 6 2; 4 hom WT pos Caucasian
11 m 37 2 2 O pos 2; 26 38; 40 3; 12 13; 13 3; 6 4; 4 pos Caucasian
12 m 32 4 4 O pos 2;3 42;57 17;18 3;13 4;6 1;2 WT WT pos Hispanic
13 f 68 4 5 O neg 32; 69 7; 38 7; 12 15; 15 6; 6 4; 4 WT WT pos Caucasian Hypothyroidism
14 f 40 1 1 A neg 2; 30 7; 13 6; 7 7; 11 2; 3 4; 14 WT WT pos North African Penicillin allergy
15 m 47 2 3 O pos 2; 24 15; 51 3; 14 13; 13 6; 6 4; 4 pos Caucasian
16 m 46 2 2 A pos 2; 11 8; 27 1; 7 4; 15 3; 6 2; 14 WT WT pos Caucasian Ulcerative colitis
17 m 41 1 2 O pos 2; 11 7; 7 7; 7 4; 15 3; 6 4; 4 pos Caucasian
18 f 37 2 3 A pos 2; 3 15; 37 3; 6 4; 10 3; 5 3; 4 WT WT pos Caucasian
19 m 67 4 4 AB pos 2;31 41;51 15;17 11; 3 4;4 WT WT pos Caucasian
20 m 50 4 5 AB pos 2; 24 51; 55 3; 16 4; 16 3; 5 1; 416 WT WT pos North African
21 f 22 2 2 B pos 3; 26 27; 56 1; 1 15; 16 5; 6 4; 10 WT WT pos Caucasian
22 f 35 2 2 O pos 2; 24 51; 51 1; 2 11; 3; 3 4; 4 WT WT pos Caucasian
23 m 27 2 2 A pos 2; 2 15; 51 3; 4 3; 4 2; 3 4; 4 WT WT pos Caucasian
24 m 50 4 8 A pos 2; 3 15; 41 14; 17 3; 4 2; 3 3; 650 WT WT pos Caucasian
25 m 65 4 10 A pos 2;29 7;40 3;15 1;10 5 4;10 het WT pos Caucasian
26 f 64 2 3 O pos 2; 24 7; 44 4; 7 7; 15 2; 6 4; 14 WT WT pos Caucasian
27 f 66 2 2 O pos 1; 3 7; 35 4; 7 7; 15 2; 6 4; 4 WT WT pos Caucasian
28 f 64 4 5 O pos 1; 3 7; 35 4; 7 7; 15 2; 6 4; 4 WT WT pos Caucasian
29 m 65 2 2 O pos 2; 2 15; 51 1; 3 1; 4 3; 5 2; 4 WT WT pos Caucasian COPD
30 f 63 1 2 O pos 2; 3 15; 18 3; 12 1; 11 3; 5 1; 4 WT WT pos Caucasian
31 f 45 3 4 B pos 3; 32 7; 39 7; 12 8; 15 4; 5 2; 4 WT WT pos Caucasian Atopic dermatitis
32 m 67 2 2 O pos 2; 3 39; 44 5; 7 7; 14 2; 5 4; 4 WT WT pos Caucasian Arthritis, Hashimoto’s Thyroiditis
34 m 49 1 1 O pos 1; 32 8; 38 7; 12 14; 15 5; 6 4; 4 WT WT pos Caucasian Coronary artery disease
35 m 39 3 4 O pos 11; 11 15; 55 3; 4 12; 12 3; 3 5; 5 WT WT pos East Asian
36 m 70 6 7 B neg 3; 23 7; 44 4; 7 15; 15 6; 6 3; 4 WT WT pos Caucasian
37 m 63 7 7 A pos 2; 11 7; 7 7; 7 4; 15 3; 6 4; 4 WT WT pos Middle Eastern
38 m 42 2 3 AB pos 1; 32 35; 57 4; 6 7; 13 3; 5 4; 4 WT het pos Caucasian
39 f 47 2 2 A neg 3; 25 7; 13 6; 7 3; 7 2; 2 4; 4 WT WT pos Caucasian
40 m 56 2 2 O neg 2; 3 15; 40 3; 3 4; 7 2; 3 6; 15 WT WT pos Caucasian
41 f 32 2 2 A neg 2; 68 14; 27 2; 8 13; 15 3; 6 2; 19 WT WT pos Caucasian
42 m 52 2 2 B neg 1; 24 8; 44 4; 7 3; 11 2; 3 1; 17 WT WT pos Middle Eastern
43 f 50 3 5 B pos 1; 2 8; 57 6; 7 3; 15 2; 6 4; 13 WT WT pos Caucasian Meningioma, chronically relapsing EBV infection, asthma, bone cyst left wrist, thyroid nodules
44 f 32 2 2 O pos 2; 2 44; 51 4; 4 7; 16 2; 5 13; 14 WT WT pos Caucasian
46 m 89 4 6 B pos 11; 33 14; 38 8; 12 1; 3 2; 5 1; 2 WT WT pos Caucasian Hypertension, atrial flutter
47 f 51 2 2 A pos 2; 3 15; 51 2; 6 13; 13 6; 6 4; 9 WT WT pos Caucasian Asthma
48 m 55 2 2 O pos 2; 24 7; 38 7; 12 15; 15 6; 6 2; 10 WT WT pos Caucasian
49 m 54 5 7 O pos 11; 68 53; 55 3; 4 1; 13 5; 6 4; 10 WT WT pos Caucasian Hypertension, arthritis, thrombosis (Microthrombosis in the fingers)
50 f 28 2 2 A pos 2; 24 40; 51 14; 15 8; 11 3; 3 4; 16 WT WT pos East Asian Systemic lupus erythematosus
51 m 30 2 2 A pos 2; 11 35; 56 4; 4 8; 15 4; 6 13; 26 WT WT pos East Asian Hypothyroidism
52 m 56 4 5 A pos 3; 24 7; 7 7; 7 4; 15 3; 6 3; 4 WT WT pos Caucasian
53 m 56 2 2 A pos 11; 24 40; 51 3; 15 4; 11 3; 3 3; 4 WT WT pos Caucasian Hypertension
54 m 38 2 2 A pos 3; 11 35; 35 4; 5 7; 11 2; 3 3; 4 WT WT pos Caucasian Hypertension
55 f 54 2 2 O pos 1; 2 8; 39 7; 7 3; 8 2; 4 3; 4 WT WT pos Caucasian
56 m 51 2 2 O pos 2; 25 8; 18 7; 12 3; 13 2; 6 4; 14 WT WT pos Caucasian
57 m 42 2 2 A pos 1; 3 8; 15 3; 7 3; 4 2; 3 1; 10 WT WT pos Caucasian
58 f 29 1 2 B pos 2; 32 27; 57 1; 6 3; 14 2; 5 4; 5 WT WT pos Caucasian (Pregnant)
59 m 59 4 5 O neg 29; 68 8; 45 6; 7 3; 4 2; 3 4; 10 WT WT pos Caucasian Asthma
60 2 2 A pos 2; 25 14; 18 8; 12 1; 16 5; 5 3; 4 WT WT pos Caucasian
61 f 42 2 2 O pos 2; 3 18; 38 7; 12 3; 15 2; 5 2; 126 WT WT pos Hispanic Hyperthyroidism
62 m 49 2 2 O pos 1; 2 15; 51 7; 16 4; 7 3; 3 4; 4 WT WT pos Caucasian Essential thrombocythemia
63 f 28 2 2 O pos 2; 2 15; 44 4; 16 4; 7 2; 3 1; 15 WT WT pos Caucasian
64 m 30 2 3 O neg 2; 68 7; 53 4; 7 13; 15 6; 6 4; 4 WT WT pos Caucasian
65 m 30 2 2 O pos 1; 24 27; 57 5; 6 7; 11 3; 3 4; 4 WT WT pos Caucasian
66 m 18 2 2 O neg 3; 32 7; 7 7; 7 8; 15 4; 6 4; 4 WT WT pos Caucasian
67 m 29 2 2 O neg 3; 3 7; 7 7; 7 15; 15 6; 6 3; 4 WT WT pos Caucasian Lymphoma 2014
68 f 39 2 2 O pos 1; 24 14; 27 6; 7 3; 7 2; 2 1; 17 WT WT pos Caucasian Psoriasis, allergic rhinitis
69 f 58 2 2 A pos 2; 2 40; 45 3; 6 10; 13 5; 6 3; 3 WT WT pos Caucasian
70 m 36 2 2 A neg 1; 32 8; 14 7; 8 3; 7 2; 2 4; 4 WT WT pos Caucasian
71 m 35 2 2 O pos 2; 68 40; 44 1; 5 1; 4 3; 5 5; 20 WT WT pos Caucasian Diabetes mellitus II, Hypertension
72 m 46 2 2 A pos 1; 3 7; 15 7; 7 4; 12 3; 3 4; 4 WT WT pos Caucasian none
73 f 33 1 2 A pos WT WT pos Caucasian Allergy (corn + pollen); Hypertension
74 f 60 2 2 O pos 3; 68 7; 44 7; 7 7; 13 2; 6 350; 905 WT WT pos Caucasian allergic asthma
75 m 46 2 4 A pos 2; 24 18; 50 6; 12 7; 11 2; 3 4; 9 WT WT pos Caucasian
76 f 50 1 2 A neg 1; 68 8; 53 4; 7 3; 13 2;6 104; 4 het WT pos Caucasian Lung carcinoma
77 f 51 2 2 O pos 24; 26 7; 27 2; 7 15; 15 6; 6 3; 4 WT WT pos Caucasian Rheumatoid arthritis
78 m 55 2 2 O pos 1; 24 7; 51 7; 14 13; 15 6; 6 2; 2 WT WT pos Caucasian
80 f 26 2 2 A pos 1; 30 8; 13 6; 7 3; 7 2; 2 4; 4 WT WT pos Caucasian Allergies (Cat; pollen)
81 m 59 2 2 A pos 23; 24 15; 56 3; 4 3; 7 2; 2 13; 13 WT WT pos Caucasian
82 f 32 1 2 A pos 1; 3 7; 55 3; 7 11; 15 3; 6 3; 4 WT WT pos Caucasian
83 f 40 2 2 B pos 2; 2 15; 44 3; 16 4; 14 3; 5 2; 2 WT WT pos Caucasian Hemophilia A, autoimmune thyroid disease
84 m 37 2 2 AB pos 2; 32 44; 50 5; 6 7; 11 2; 3 2; 4 WT WT pos Caucasian
85 m 30 2 2 B pos 3; 29 7; 35 4; 15 11; 13 3; 6 4; 17 WT WT pos Caucasian Allergy (Grasses), Hailey Hailey disease
86 f 39 2 2 A pos 2; 31 51; 51 15; 15 11; 13 3; 6 2; 2 WT WT pos Caucasian
87 f 41 2 2 B pos 1; 30 18; 37 5; 6 3; 10 2; 5 2; 2 WT WT pos Caucasian Malignant melanoma
88 f 22 2 2 O pos 2; 3 7; 14 7; 8 13; 15 3; 5 2; 2 WT WT pos East Asian Asthma
89 m 53 2 2 A pos pos Caucasian
90 m 50 2 2 O pos 3; 32 7; 7 7; 7 8; 15 4; 6 3; 16 WT WT pos Caucasian
91 m 25 2 2 O pos 3; 3 7; 57 6; 7 15; 15 6; 6 3; 4 WT WT pos Caucasian Diabetes mellitus II
92 m 57 2 2 A pos 1; 68 44; 44 4; 5 4; 7 2; 3 4; 4 WT WT pos Caucasian Atopic dermatitis, allergic asthma
93 m 35 2 2 B pos 1; 2 13; 15 3; 6 4; 7 2; 3 4; 17 WT WT pos Caucasian Hypertension
94 f 34 2 2 B pos 11; 33 14; 35 4; 8 1; 1 5; 5 2; 104 WT WT pos Caucasian
95 m 41 2 2 A pos 2; 25 40; 44 3; 4 4; 10 3; 5 3; 4 WT WT pos Caucasian
96 f 42 4 5 O pos 3; 26 38; 51 1; 12 13; 3; 6 4; 5 WT WT pos Caucasian
97 f 29 2 2 B pos 25; 31 40; 44 3; 5 4; 15 3; 6 1; 3 WT WT pos Caucasian Epilepsy
98 f 27 2 2 O pos 1; 11 7; 40 3; 7 4; 15 3; 6 1; 4 WT WT pos Caucasian Hashimoto’s Thyroiditis, slipped disc
99 f 49 2 2 O pos 23; 24 49; 49 7; 7 1; 11 3; 5 104; 4 WT WT pos Caucasian
100 f 25 1 2 B pos 2; 23 49; 49 7; 7 4; 11 3; 3 4; 13 WT WT pos Caucasian Hashimoto’s Thyroiditis
101 m 20 3 4 O pos 1; 25 18; 40 2; 5 3; 11 2; 3 4; 4 WT WT pos Caucasian
102 f 21 2 2 A pos 2; 24 35; 40 3; 4 11; 13 3; 6 4; 4 WT WT pos Caucasian
103 f 51 2 2 A pos 2; 3 40; 40 3; 3 4; 7 2; 3 4; 4 WT WT pos Caucasian Iron deficiency anemia
104 m 48 2 2 O pos 2; 2 7; 44 5; 7 13; 15 6; 6 4; 4 het WT pos Caucasian
105 f 49 2 2 A pos 2; 68 8; 53 4; 7 12; 13 3; 6 2; 4 het WT pos Caucasian
106 m 30 1 2 A pos 2; 2 13; 51 6; 15 7; 13 2; 6 4; 17 WT WT pos Caucasian Deficiency of factor VII
107 f 43 2 2 B pos 2; 30 13; 18 5; 6 3; 10 2; 5 3; 4 WT WT pos Caucasian
108 m 58 2 2 A pos 1; 3 7; 8 7; 7 9; 15 3; 6 2; 4 WT WT pos Caucasian
109 m 56 5 6 O pos 2; 24 15; 51 3; 15 13; 15 6; 6 2; 2 WT WT pos Caucasian
110 f 56 2 2 A pos 2; 68 14; 51 8; 15 8; 13 3; 4 2; 4 WT WT pos Caucasian
111 f 50 2 2 O pos 2; 2 7; 44 5; 7 4; 11 3; 3 1; 4 WT WT pos Caucasian
112 m 49 2 2 A pos 11; 30 18; 51 5; 15 3; 3 2; 2 4; 13 WT WT pos Caucasian Migraine; Endometriosis
113 f 49 2 2 O pos 1; 26 35; 44 3; 4 1; 7 2; 5 2; 17 WT WT pos Caucasian Hashimoto’s Thyroiditis
114 m 36 2 2 B pos 3; 26 7; 27 1; 7 1; 15 5; 6 2; 3 WT WT pos Caucasian Thyroid nodules
115 f 49 3 4 A neg 11; 29 45; 58 3; 6 13; 15 6; 6 2; 3 WT WT pos Caucasian
116 f 27 2 2 O pos 2; 3 38; 50 6; 12 7; 7 2; 3 4; 4 WT WT pos Caucasian Hashimoto’s Thyroiditis
117 f 41 2 2 O pos 2; 11 44; 51 7; 15 4; 11 3; 3 4; 15 WT WT pos Caucasian Migraine
118 m 55 3 4 O pos 2; 68 44; 44 4; 5 13; 13 6; 6 4; 14 WT WT pos Caucasian
119 f 40 2 2 O pos 2; 30 14; 18 5; 8 7; 11 2; 3 4; 4 WT WT pos Caucasian
120 f 22 2 2 A pos 68; 68 13; 50 6; 16 7; 13 2; 3 104; 4 WT WT pos Caucasian Allergic asthma, breast cancer
121 f 51 2 2 A pos 2; 68 7; 44 5; 7 7; 15 2; 6 2; 4 WT WT pos Black
122 f 33 2 2 A pos 1; 2 13; 18 6; 7 7; 11 2; 3 3; 4 WT WT pos Caucasian Psoriasis, gout, hip dysplasia
123 f 75 4 5 O pos 3; 3 7; 35 4; 7 1; 15 5; 6 4; 4 WT WT pos Caucasian Myocarditis, pericarditis
124 m 64 4 5 A pos 2; 32 35; 44 4; 5 15; 15 6; 6 2; 2 het het pos Caucasian
125 m 60 4 5 O pos 2; 23 8; 44 2; 7 3; 7 2; 2 4; 835 WT WT pos Caucasian Hashimoto’s Thyroiditis
126 m 81 7 9 O pos 1; 3 15; 51 7; 14 1; 13 5; 6 2; 4 WT WT pos Middle Eastern
127 m 56 5 6 O pos 24; 24 35; 51 04; 16 8; 13 3; 6 2; 4 WT WT pos North African
128 f 52 3 4 O pos 1; 3 7; 7 7; 7 12; 15 3; 6 4; 4 WT WT pos Middle Eastern
129 m 37 5 6 A pos 11; 32 35; 40 4; 15 4; 7 3; 3 4; 4 WT WT pos Caucasian
130 m 67 4 5 A pos 2; 68 44; 52 7; 12 11; 15 3; 6 4; 4 WT WT pos Middle Eastern
131 m 34 4 5 O pos 2; 30 39; 53 4; 7 4; 13 2; 6 104; 4 WT WT pos Middle Eastern
132 m 82 4 5 O pos 11; 24 35; 52 2; 12 11; 15 3; 6 3; 4 WT WT pos Black
133 m 34 3 4 A pos 11; 11 52; 52 12; 12 15; 15 6; 6 2; 2 WT WT pos North African
134 f 41 4 5 A pos 29; 30 53; 58 4; 7 1; 3 4; 5 1; 104 WT WT pos Caucasian
135 f 80 4 5 A pos 2; 3 7; 35 2; 7 12; 15 3; 6 4; 4 WT WT pos North African
136 m 41 3 4 AB pos 1; 32 35; 57 4; 6 7; 15 3; 6 4; 4 WT WT pos Caucasian
137 m 49 3 4 O pos 29; 29 15; 49 3; 7 12; 15 5; 6 2; 665 WT WT pos East Asian
138 m 52 4 5 A pos 2; 3 7; 35 4; 7 3; 16 2; 5 4; 4 WT WT pos Black
139 m 65 4 5 A pos 2; 24 38; 49 7; 12 11; 13 3; 6 2; 4 WT WT pos Caucasian
140 m 82 4 5 A pos 2; 2 15; 39 2; 3 1; 1 5; 5 4; 4 WT WT pos North African
141 m 65 5 6 A pos 11; 24 39; 51 7; 16 1; 13 5; 6 4; 10 WT WT pos Caucasian
142 m 59 4 5 A pos 2; 33 8; 53 4; 7 3; 13 2; 6 4; 4 WT WT pos Caucasian
143 m 76 4 5 O pos 3; 30 49; 55 1; 7 3; 11 2; 3 3; 4 WT WT pos Caucasian
144 f 22 4 5 A pos 2; 3 15; 35 3; 4 4; 13 6; 6 4; 4 WT WT pos Middle Eastern
145 f 68 4 5 O pos 2; 29 44; 51 15; 16 11; 15 3; 6 1; 4 pos Caucasian Hashimoto’s Thyroiditis
146 m 77 4 5 A pos 1; 2 15; 57 4; 7 1; 7 3; 5 4; 4 WT WT pos Caucasian
147 m 73 4 5 O pos 33; 68 15; 35 2; 4 13; 15 6; 6 1; 18 WT WT pos Caucasian
148 m 68 4 5 A pos 3; 30 13; 14 6; 8 1; 7 2; 5 4; 4 WT WT pos Caucasian
149 m 63 4 4 O pos 24; 26 51; 52 7; 12 11; 15 3; 6 4; 4 WT WT pos Caucasian
150 m 57 4 4 O pos 2; 29 13; 44 6; 16 7; 7 2; 2 4; 11 WT WT pos Middle Eastern
151 5 5 A pos 1; 32 40; 44 2; 2 1; 12 3; 5 2; 4 WT WT pos Caucasian
152 m 92 4 5 A pos 3; 3 15; 47 3; 6 4; 13 3; 6 1; 5 WT WT pos Caucasian
153 f 66 4 4 A pos 2; 68 18; 53 4; 7 4; 13 3; 6 4; 4 WT WT pos Middle Eastern
154 f 65 3 3 AB pos 2; 3 38; 44 5; 12 11; 13 3; 6 2; 2 het WT pos Caucasian
155 f 71 4 4 O neg 1; 68 44; 44 5; 5 1; 15 5; 6 4; 23 WT WT pos Caucasian
156 m 80 4 4 A neg 3; 3 7; 15 7; 7 4; 4 3;3 WT WT pos Caucasian
157 m 68 5 5 O neg 11; 23 44; 49 5; 7 4; 13 2; 6 WT WT pos Caucasian Hypertension, carcinoma of the larynx, hypothyroidism, prostate hyperplasia, throat polyp, esophageal-pulmonary fistula
158 m 70 4 4 A pos 2;68 27;44 2;7 11 3 1;4 het WT pos Caucasian
159 m 54 4 4 A pos 2;3 7;44 5;7 4;15 3;6 2;4 WT WT pos Caucasian
160 f 58 3 3 O pos 2;2 14;44 5;15 11;15 3;6 4;23 WT WT pos Caucasian Hypertension; allergic rhinitis
161 m 53 3 4 B pos 32;33 35;51 1;4 11 3 3;15 WT WT pos Caucasian Hypertension
162 m 29 3 4 A neg WT WT pos Caucasian post kidney transplantation, cystinosis, hepatitis E, hypertension, allergy (atropine)

Blood typing

Blood typing for A, B, and O was performed using the Bioclone Kit from Ortho-Clinical Diagnostics (Unterschleissheim, Germany) according to the manufacturer’s instructions. Rh was determined using the IgM Anti-D Mono-Type reagent from Medion Grifols Diagnostic AG (Düdingen, Switzerland).

DNA isolation

Red blood cells (RBCs) were lysed with ACK lysis buffer (Thermo Fisher, Waltham, MA). The remaining cells were washed, centrifuged (at 1780 × g for 15 min), and collected in resuspension buffer (NaCl 6 M, Na2EDTA pH 8), to which proteinase K (10 mg/ml) and 20% w/v sodium dodecyl sulfate (SDS) solution had been added for overnight cell lysis at 37°C. Cell debris was separated from nucleic acids after incubation in 6 M NaCl at 55°C for 15 min. The DNA was purified with 96% and 70% ethanol and, finally, resuspended in TE buffer.

HLA genotyping

Next-generation sequencing for HLA-A, -B, -C, -DRB1, -DQB1, and -DPB1 was performed by a high-resolution amplicon-based approach using NGSgo-MX6- and NGSgo-LibrX kits (GENDX, Utrecht, The Netherlands) according to the manufacturer’s instructions on an Illumina MiniSeq device (Illumina, San Diego, CA).

Genotyping for Factor V Leiden and prothrombin G20210A

The Factor V Leiden and prothrombin (20210G>A) genotypes (in the HGVS nomenclature version 20.05 NM_000130.4: c.1601G>A, p.(Arg534Gln) and NM_000506.4: c.*97G>A, respectively) were determined by sequence-specific PCR (Factor V Leiden Quicktype and Factor II 20210 G>A Quicktype, Attomol Molekulare Diagnostika GmbH, Bronkow, Germany) according to the manufacturer’s instructions. The samples were analyzed in a 3% agarose gel in comparison to a blank, a negative (wildtype DNA), and a heterozygous control.

Design of custom-made next-generation sequencing (NGS) panel

A customized NGS panel was designed to cover exonic and partially intronic regions within genes in the natural killer cell receptor family [15], the renin-angiotensin-aldosterone system, the kallikrein-kinin system, and other genes previously found to be relevant in SARS-CoV-2-infection [1]. In total, several genomic loci, spanning 144,830 bp, were targeted by sequencing. Overall, the custom-made NGS panel targeted the coding, untranslated, and splicing-regulatory intronic regions of 90 genes. For a list of the genes, we refer to Fig 2. Primer sets for the targeted regions were designed de novo by Illumina DesignStudio.

Fig 2. Genes targeted in the next-generation sequencing panel and list of rare variants.


NGS library preparation and sequencing (non-HLA sequencing)

The DNA library preparation was performed following the Illumina DNA Prep with Enrichment protocol (Illumina, San Diego, USA) according to the manufacturer’s instructions. 600 ng of genomic input DNA was used for the generation of the libraries. The libraries were purified with AMPure XP beads (Beckman Coulter, High Wycombe, UK). Additionally, libraries and sample pools were quantified using Qubit dsDNA HS kit and Qubit 4 (Thermo Fisher Scientific, Waltham, USA), whereas the library size was analyzed with the High Sensitivity D1000 assay and 4200 Tapestation (Agilent Technologies, Palo Alto, USA). Samples were pooled reaching a final 1.4 pM library solution, which, subsequently, was loaded on the Illumina MiniSeq platform (Illumina, San Diego, USA) for (150 bp x 2) paired-end sequencing.

Read analysis and variant calling

Reads generated from the targeted regions were aligned to the reference genome hg19/GRCh37 using the Burrows-Wheeler Aligner (0.7.7-isis-1.0.2) [16]. Indexed Sequence Alignment/Map (SAM) and its Binary Alignment/Map (BAM) file versions were computed using SAMtools (0.1.19-isis-1.0.3). Variants were called and annotated using the GATK package (v1.6-23-gf0210b3) [17]. Sequence data were processed using the GensearchNGS software suite (1.7.058, Phenosystems, Braine le Chateau, Belgium). Only samples with a mean coverage depth higher than 90 on-target passing-filter reads were considered for subsequent analysis. Data were filtered based on a variant coverage of at least ten reads and a minimum variant allele frequency (VAF) of 20%. The detected variants were assessed with the dbSNP database [18] (The Genome Aggregation Database, NHLBI Exome Sequencing Project, NCBI dbSNP, Human Gene Mutation Database). Further, a variant in silico prediction was carried out with PolyPhen-2 [19].
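The three filtering thresholds above can be illustrated with a short Python sketch. This is not the GensearchNGS implementation; the `VariantCall` record, the function `filter_calls`, and the interpretation of "variant coverage" as the read depth at the variant position are assumptions made for illustration.

```python
from dataclasses import dataclass

MIN_MEAN_DEPTH = 90      # sample level: mean on-target depth must exceed this
MIN_SITE_COVERAGE = 10   # variant level: at least ten reads at the site (assumed meaning)
MIN_VAF = 0.20           # variant level: minimum variant allele frequency


@dataclass
class VariantCall:
    """Hypothetical record for one variant observed in one sample."""
    sample_id: str
    variant_id: str
    total_reads: int   # reads covering the variant position
    alt_reads: int     # reads supporting the variant allele

    @property
    def vaf(self) -> float:
        # Fraction of reads at the site that carry the variant allele
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def filter_calls(calls, mean_depth_by_sample):
    """Keep calls from well-covered samples that pass coverage and VAF cut-offs."""
    kept = []
    for call in calls:
        if mean_depth_by_sample.get(call.sample_id, 0) <= MIN_MEAN_DEPTH:
            continue  # sample excluded: mean depth not higher than 90
        if call.total_reads < MIN_SITE_COVERAGE:
            continue  # fewer than ten reads at the variant position
        if call.vaf < MIN_VAF:
            continue  # variant allele frequency below 20%
        kept.append(call)
    return kept
```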

Descriptive statistics of age, gender, blood group and HLA

We wrote Python scripts (Python version 3.7.6) in Jupyter Notebook. We used modules from the scipy package (version 1.4.1) for statistical calculations and applied ML algorithms from the scikit-learn library (version 0.22.1).

Association testing

The focus of our analyses was on the genes SIGLEC7, ACE, SELL, TMPRSS2, and SLC6A20, since they were found to be associated with severity of SARS-CoV-2 infection in recent studies [1, 6, 7]. To explore SNPs that lead to increased severity and to find novel SNPs, we conducted an association analysis.

SNPs that occurred in fewer than four patients were excluded from the analysis. To test for significance, the p-values for the SNPs were calculated using logistic regression and adjusted for the covariates gender and age. The open-source whole-genome association analysis toolset PLINK (version 1.9) [19] was used to build the regression model. To adjust for multiple testing, the p-values were corrected using the false discovery rate approach by Benjamini-Hochberg [20].
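The carrier-count filter and the Benjamini-Hochberg correction can be sketched in plain Python. This is an illustrative sketch of the standard step-up procedure, not the PLINK implementation; the function names and the input layout (a mapping from SNP identifier to the set of carrier sample IDs) are assumptions.

```python
def keep_frequent_snps(carriers_by_snp, min_carriers=4):
    """Drop SNPs observed in fewer than `min_carriers` patients."""
    return {snp: carriers for snp, carriers in carriers_by_snp.items()
            if len(carriers) >= min_carriers}


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A SNP would then be reported as significant when its adjusted p-value falls below the chosen false discovery rate.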

To consider associations between all the variants within each gene, gene-based tests were applied. Accounting for multiple independent functional variants may increase the power to identify disease-associated genes. By aggregating the information on all the variants, we obtained a single p-value that corresponded to the significance of the association of the gene. We applied the versatile gene-based test for genome-wide association studies (VEGAS) [21] and the gene-based association test with extended Simes procedure (GATES) [22]. Concerning the role of HLA genotypes, the dataset had limited power for the detection of significant associations, similar to the statistical analysis by Schetelig et al. [23].

Analysis of mutation impact on protein structure

To find the SNP-containing structures in the Protein Data Bank (PDB) [24], we used UniProt [25]. We applied the SWISS-MODEL repository and the related analysis and visualization tool [26] to check the structural coverage. For unknown structures and regions without structural coverage, we analyzed predicted structures based on the AlphaFold database [27, 28]. For each predicted structure, AlphaFold provides a color-coded confidence score per residue to evaluate the prediction quality. To estimate the impact of a SNP on protein structure and function we ran the PolyPhen-2 software [19].

PolyPhen-2 annotates coding and nonsynonymous SNPs, and it predicts damaging missense mutations, using sequence-based and structure-based predictive features. To train and test PolyPhen-2, two pairs of datasets were generated. The first pair, HumDiv, is based on all 3,155 damaging alleles annotated in UniProt as causing human Mendelian diseases and affecting protein stability or function, as well as on 6,321 differences between human proteins and their closely related mammalian homologs, which have been assumed to be non-damaging. The second pair, HumVar3, consists of all the 13,032 human disease-causing mutations from UniProt and 8,946 human nonsynonymous single-nucleotide polymorphisms (nsSNPs) without annotated involvement in disease, which have been treated as non-damaging. PolyPhen-2 qualitatively classifies a mutation as benign, possibly damaging, or probably damaging, scoring with a value between 0 and 1, with 1 for damaging impact and 0 for non-damaging impact of the SNP on protein structure.

Integration of other studies

32 SNPs of five genes of interest, ACE, SIGLEC7, SELL, TMPRSS2, and SLC6A20, were rare variants and not reported in other datasets such as gnomAD (version 2.1.1) [29] or dbSNP. For a list of the 32 rare variants, we refer to Fig 2.

Results

Patient cohort

We gathered a dataset of patients with SARS-CoV-2 infection. For an outline of the experimental workflow and sequencing, we refer to Fig 1. Fig 2 lists 32 SNPs of the five genes ACE, SIGLEC7, SELL, TMPRSS2, and SLC6A20 that are rare variants and which had not been reported previously in other datasets such as gnomAD. Fig 3 depicts distributions of age, clinical score, ABO blood type, and HLA type for the patient cohort. We found no statistical significance for clinical scores of patients with any subgroup of ABO or HLA type; i.e., Mann-Whitney U tests led to significance levels greater than 5%.
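For readers who want to repeat such subgroup comparisons, the following is a minimal, standard-library Python sketch of a two-sided Mann-Whitney U test. It uses the normal approximation with tie correction and no continuity correction, so its p-values may differ slightly from those of the routine the authors used; the function name `mann_whitney_u` is an assumption.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate p-value).
    """
    x, y = list(x), list(y)
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    pooled = sorted(x + y)

    # Assign midranks: tied values share the mean of the ranks they occupy
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, 1.0  # all observations identical
    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p
```

With clinical scores of one ABO or HLA subgroup as `x` and the scores of all remaining patients as `y`, a p-value above 0.05 corresponds to the non-significant outcome reported above.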

Fig 3. Descriptive statistics.


(A) The number of patients within age groups. (B) The number of patients versus clinical score. (C) The number of patients versus ABO blood type. (D) The number of patients versus HLA-A type. (E) The number of patients versus HLA-B type. (F) The number of patients versus HLA-C type. (G) The number of patients versus HLA-DRB1 type. (H) The number of patients versus HLA-DQB1 type. (I) The number of patients versus HLA-DPB1 type. (J) Mean clinical score versus ABO blood type. (K) Mean clinical score versus HLA-A type. (L) Mean clinical score versus HLA-B type. (M) Mean clinical score versus HLA-C type. (N) Mean clinical score versus HLA-DRB1 type. (O) Mean clinical score versus HLA-DQB1 type. (P) Mean clinical score versus HLA-DPB1 type. (J-P) The black line shows the overall mean of the clinical score of 2.81. The black bullets show the mean clinical score for patient groups. Violin plots superimposed by box plots indicate the densities of clinical scores. We found no statistical significance for any of the subgroups in J-P regarding the deviation of their mean clinical score from the overall mean clinical score of 2.81, i.e., Mann-Whitney U tests gave significance values larger than 5%.

Fig 4 illustrates the role of age and gender in the clinical score. In the jitter plot in Fig 4A, the red and blue dots represent data points for female and male patients, respectively. The grey curve (unisex) shows the increase in mean clinical score with age for all patients. A high clinical score is associated with high age (Kendall tau 0.41, p = 1.2 × 10^−12, Stuart-Kendall tau-c test). The unisex curve has its maximal slope of 0.1874 ± 0.0495 [clinical score/year] at age 60.394 ± 3.466 years. The blue curve (male) shows the increase in mean clinical score for male patients. The blue curve has its maximal slope of 0.245 ± 0.124 [clinical score/year] at age 55.9 ± 2.8 years. A high clinical score in male patients is associated with higher age (Kendall tau 0.41, p = 1.2 × 10^−7, Stuart-Kendall tau-c test). The red curve (female) shows the increase in mean clinical score for female patients. The curve for females is steeper than the curve for males and has its maximal slope of 0.942 ± 0.552 [clinical score/year] at age 66.1 ± 1.0 years. A high clinical score in female patients is associated with higher age (Kendall tau 0.34, p = 1.0 × 10^−4, Stuart-Kendall tau-c test). For young (age < 40) and old (age > 70) individuals, the clinical scores of female and male patients were not significantly different (Mann-Whitney U test). For age > 50 years, the mean clinical score increases monotonically with age (Fig 4B), and the clinical scores differ significantly from those of younger patients (age < 50, Mann-Whitney U test). Across all ages, the clinical scores of females are significantly lower than those of males (Mann-Whitney U test, p = 1.02 × 10^−4) (Fig 4C). Fig 4D shows the receiver operating characteristics (ROC) of a random forest prediction of hospitalization, i.e., clinical score 4–7, based on age and gender. The blue line is the mean ROC curve of a Monte Carlo cross-validation with 100 random splits into 70% training and 30% test sets. The light gray shaded area indicates the standard deviation. The mean area under the curve (AUC) value of 0.72 ± 0.08 indicates the predictive power of age and gender for the hospitalization of a patient. For a discussion of the role of gender and age in Covid-19 severity, we refer to [30].
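The validation scheme behind Fig 4D (100 random 70%/30% splits, with the AUC summarized as mean ± standard deviation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: to stay within the Python standard library, the random forest is replaced by a simple nearest-centroid scorer, the data are synthetic, and all function names are hypothetical.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_centroids(rows, labels):
    """Stand-in model (NOT a random forest): mean feature vector per class."""
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def centroid_score(centroids, row):
    """Higher score means the row lies closer to the class-1 centroid."""
    d0 = sum((a - b) ** 2 for a, b in zip(row, centroids[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(row, centroids[1]))
    return d0 - d1


def monte_carlo_cv(rows, labels, n_splits=100, test_fraction=0.3, seed=0):
    """Repeated random train/test splits; returns (mean AUC, standard deviation)."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_test = round(test_fraction * len(rows))
    aucs = []
    while len(aucs) < n_splits:
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        y_train = [labels[i] for i in train]
        y_test = [labels[i] for i in test]
        if len(set(y_train)) < 2 or len(set(y_test)) < 2:
            continue  # redraw splits that lack one of the two classes
        model = fit_centroids([rows[i] for i in train], y_train)
        scores = [centroid_score(model, rows[i]) for i in test]
        aucs.append(auc(scores, y_test))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Synthetic cohort: features are [age, male]; label 1 stands for hospitalization.
data_rng = random.Random(1)
rows, labels = [], []
for _ in range(159):
    age = data_rng.uniform(20, 90)
    male = data_rng.choice([0, 1])
    rows.append([age, male])
    labels.append(1 if age + 8 * male + data_rng.gauss(0, 15) > 60 else 0)

mean_auc, sd_auc = monte_carlo_cv(rows, labels)
print(f"AUC = {mean_auc:.2f} +/- {sd_auc:.2f}")
```

Because the splits are drawn independently, test sets overlap between repetitions; the reported standard deviation therefore describes the variability across splits rather than an independent-sample error.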

Fig 4. Significance of age and gender.

(A) Jitter plot of clinical score versus age. (B) Mean clinical score versus age group. (C) Mean clinical scores of female versus male patients of all ages. (B, C) The black line shows the overall mean of the clinical score of 2.81. The black bullets show the mean clinical score for patient groups. Violin plots superimposed by box plots indicate the densities of clinical scores. (D) Receiver operating characteristics (ROC) of a random forest prediction of hospitalization, i.e., clinical score 4–7, based on age and gender.

Panel sequencing quality

On average, more than 1.2 M passing-filter (PF) reads were generated per sample, with 89.1% of them in Q3. Similarly, more than 1.2 M reads aligned to the reference genome, representing 98.7% of all PF reads and generating a mean coverage of 332.6x per sample with a coverage uniformity of 98.7%.

FASTQ files, which contain the sample sequences and base quality scores, were aligned to the reference genome and transformed into SAM files and their binary counterpart, BAM files. The latter were subjected to variant calling and annotation, generating VCF files. On average, 199.69 single-nucleotide variants (SNVs) were called per sample, with 89.75% of these being annotated in dbSNP. Fig 5A shows boxplots of the numbers of SNPs per patient, Fig 5B shows numbers of hemizygous, heterozygous, and homozygous SNPs per patient, and Fig 5C shows numbers of SNPs for patients with a low versus high clinical score.
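The per-sample numbers above (SNVs called, and the share annotated in dbSNP) can be derived from a VCF file roughly as follows. This is a minimal sketch rather than the authors' pipeline: it assumes that dbSNP annotation appears as an rs identifier in the ID column, the function name is hypothetical, and the inline records are illustrative. The first four reuse the coordinates and alleles listed in Table 2, and the last one is an invented indel that shows what is filtered out.

```python
# Tab-separated VCF body; only the first five columns are used by the sketch.
VCF_TEXT = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
17\t61562445\t.\tA\tT\t.\tPASS\t.
19\t51647798\trs993496436\tC\tG\t.\tPASS\t.
21\t42866296\trs3787950\tT\tC\t.\tPASS\t.
3\t45813993\trs2276858\tG\tA\t.\tPASS\t.
1\t1000\t.\tAT\tA\t.\tPASS\t.
"""


def summarize_snvs(vcf_text):
    """Count single-nucleotide variants and how many of them carry an rs ID."""
    n_snv = 0
    n_in_dbsnp = 0
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip meta-information and header lines
        chrom, pos, var_id, ref, alt = line.split("\t")[:5]
        alt_alleles = alt.split(",")
        # An SNV has a single reference base and single-base alternate alleles.
        is_snv = len(ref) == 1 and all(
            len(a) == 1 and a in "ACGT" for a in alt_alleles
        )
        if is_snv:
            n_snv += 1
            if var_id.startswith("rs"):
                n_in_dbsnp += 1
    return n_snv, n_in_dbsnp


n_snv, n_in_dbsnp = summarize_snvs(VCF_TEXT)
print(f"{n_snv} SNVs, {100 * n_in_dbsnp / n_snv:.1f}% with a dbSNP identifier")
```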

Fig 5. Single-nucleotide polymorphisms in patient cohort.

A) Number of SNPs per patient. An average of approximately 220 SNPs per patient was detected. B) Comparison of the number of SNPs with zygosity. C) Comparison of the occurrence of missense SNPs in patients with low versus high clinical score. D) The whole AlphaFold-predicted structure, AF-Q9Y286-F1, of human sialic acid-binding Ig-like lectin 7 (SIGLEC7, UniProt Q9Y286), the protein affected by the nonsynonymous SNP rs993496436. The mutated serine is illustrated in a stick-and-ball representation. It is located within the light blue-colored area in ribbon representation, which indicates a confidence score of 70 to 90, where 100 is the best score. The figure was generated using AlphaFold software. E) Magnification of the mutated serine, illustrated in a stick-and-ball representation as in D). The mutated serine is located within a light blue-colored range, which stands for a confidence score of 70 to 90.

Rare variants and structural properties

SNPs are associated with disease severity in infection with SARS-CoV-2. An average of approximately 220 SNPs per patient was collected and analyzed (Fig 5A). The majority of variants were heterozygous or homozygous, and only a few were hemizygous (Fig 5B). A slight enrichment of missense variants was observed in patients with high clinical score (Fig 5C). Specifically, a mean of 81.3 missense SNPs was determined for patients with low clinical score versus 83.1 missense SNPs for patients with high clinical score, and the distribution in the latter group was broader.

The statistical significance of the association between variants and disease severity was calculated by PLINK logistic regression, and the p-values were corrected for the covariates age and gender. The four SNPs with the strongest association with clinical score are listed in Table 2. The SNPs are located in the genes ACE, SIGLEC7, TMPRSS2, and SLC6A20. Three of them (rs993496436, rs3787950, rs2276858) were previously known, but the variant of the gene ACE, which is located on chromosome 17 at position 61562445, is not present in the dbSNP database.

Table 2. List of SNPs with the strongest association with clinical score.

The p-values are not corrected for multiple testing. The size of our data set (159 patients) is too small to reach a false discovery rate below 5%.

Chr | Start | Variant allele | dbSNP ID | Gene | Occurrence (mild/severe) | p-value
17 | 61562445 | A>T | no | ACE | 1/9 | 0.002
19 | 51647798 | C>G | rs993496436 | SIGLEC7 | 45/14 | 0.005
21 | 42866296 | T>C | rs3787950 | TMPRSS2 | 15/15 | 0.039
3 | 45813993 | G>A | rs2276858 | SLC6A20 | 10/2 | 0.045

A SNP in the ACE gene was identified in 10 patients, of whom 70% were male and 30% were female. Of these patients, 80% had a clinical score of 4, and 40% were in the oldest age group (60–69 years). A SNP in the SIGLEC7 gene was found in 59 patients, with a fairly even distribution between men (55%) and women (45%). Of these patients, 66% had a low clinical score of 2, and the distribution among the age groups was almost even. A SNP in the TMPRSS2 gene was identified in 30 patients (67% men, 33% women). Of these patients, 46% had a clinical score of 2, whereas the other clinical scores and the age groups were fairly evenly represented. A SNP in the SLC6A20 gene was found in 12 patients (73% men, 27% women). The majority of these patients (67%) had a clinical score of 2, and 46% were in the age group 40–49 years.

The four SNPs in SARS-CoV-2-related genes may be risk variants, but individually they were not strongly associated with disease severity. The size of our data set (159 patients) was too small to reach a false discovery rate below 5%. To increase the power of the analysis and to identify the aggregated risk of variants at the gene level, considering the whole set of SNPs simultaneously, we applied gene-based tests. The cumulative associations of the genes SIGLEC7, ACE, TMPRSS2, SELL, and SLC6A20 are listed in Table 3. The number of SNPs per gene varies and is provided in the second column, “SNPs per gene”. False discovery rates (with correction for multiple testing), which were determined by GATES and VEGAS tests, are shown. The genes SIGLEC7 and ACE reach significance (false discovery rate < 0.05). The significance for gene SIGLEC7 is associated with a group of patients who presented with at least one of two SNPs. The significance for gene ACE is associated with a group of patients who presented with at least one of twelve SNPs.

Table 3. Gene-based association test for the five SARS-CoV-2-related genes SIGLEC7, ACE, TMPRSS2, SELL, and SLC6A20.

The number of SNPs per gene considered for the test varies. False discovery rates (with correction for multiple testing) computed by two tests, GATES and VEGAS, are provided.

Gene | SNPs per gene | GATES | VEGAS
SIGLEC7 | 2 | 0.011 | 0.012
ACE | 13 | 0.021 | 0.020
TMPRSS2 | 8 | 0.220 | 0.254
SELL | 6 | 0.483 | 0.613
SLC6A20 | 16 | 0.534 | 0.398

Given the significant association of the SIGLEC7 gene with clinical score (Table 3), we explored the protein structure and function of the coding nonsynonymous SNP rs993496436 of this gene, hypothesizing that the SNP rs993496436 might be relevant for disease severity in infection with SARS-CoV-2. The relevance of SNP rs993496436 had been previously suggested by Sharif-Askari et al. [6].

The SNP in the SIGLEC7 gene represents a substitution of cytosine by guanine. At the protein level, this SNP causes a change from serine to cysteine at position 190. Based on the corresponding UniProt entry, Q9Y286, we found six protein structures in the PDB database (1NKO, 1O7S, 1O7V, 2DF3, 2G5R, 2HRL), but none of them covered position 190. AlphaFold predicted a structure, AF-Q9Y286-F1, which covered the sequence from 1 to 467 and hence included position 190 of the SNP (Fig 5D and 5E).
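How a single cytosine-to-guanine substitution can turn serine into cysteine follows directly from the standard genetic code, as the short sketch below illustrates. The actual codon at position 190 is not given here, so the code simply enumerates all serine codons. It assumes that the C>G change is read on the coding strand, and the function name is hypothetical.

```python
# Subset of the standard genetic code (DNA codons) needed for this example.
CODON_TABLE = {
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "AGT": "Ser", "AGC": "Ser",
    "TGT": "Cys", "TGC": "Cys",
}


def ser_to_cys_by_c_to_g():
    """List serine codons that a single C>G substitution converts into cysteine.

    Returns tuples of (serine codon, 1-based codon position, mutant codon).
    """
    hits = []
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "Ser":
            continue
        for index, base in enumerate(codon):
            if base != "C":
                continue
            mutant = codon[:index] + "G" + codon[index + 1:]
            # Codons missing from the subset table encode other residues or stop.
            if CODON_TABLE.get(mutant) == "Cys":
                hits.append((codon, index + 1, mutant))
    return hits


for codon, position, mutant in ser_to_cys_by_c_to_g():
    print(f"{codon} (Ser) -> {mutant} (Cys): C>G at codon position {position}")
```

Only the serine codons TCT and TCC qualify, in both cases through a change at the second codon position.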

AlphaFold computed a middle to low confidence score for prediction of the structure in the region around position 190. The mutation is located in a loop region of a beta-sheet. The flexibility of loop regions makes it difficult to estimate how the mutation affects the structure and function of SIGLEC7. The software PolyPhen-2 predicted the mutation to be benign for both training sets, HumDiv and HumVar, with scores of 0.447 and 0.309, respectively (for the PolyPhen-2 report, see S1 Fig). The mutation causes a change of residues from serine to cysteine. Serine and cysteine exhibit very different chemical features. Given the location of the amino acid change at the surface of the protein in a flexible loop region, one may speculate that the mutation may induce a significant change in the structure and function of the protein.

Discussion

Since the beginning of the pandemic, several manuscripts have been published on risk factors associated with a mild versus a severe course of infection with SARS-CoV-2 [31–35]. Our limited sample size did not allow any definitive conclusions on genetic risk factors such as ABO or HLA type directly influencing clinical course. Regardless, though small, our study has value in providing a dataset on patient demographics, blood, and HLA type, as well as SNPs in genes of the NK cell receptor, renin-angiotensin-aldosterone and kallikrein-kinin systems in correlation with their respective clinical course. In addition, we provide a case study and protein structure analysis on a SNP in the SIGLEC7 gene, which was significantly associated with clinical score.

Our finding of no significant association of HLA-A, -B, -C, and -DRB1 genotypes with the severity of infection with SARS-CoV-2 was in accordance with the results of another study [23]. For the HLA-DQB1 and -DPB1 genotypes, which were not studied by Schetelig et al. (2021), we likewise identified no genotype as a major risk factor. As the study by Schetelig et al. (2021) corrected for gender, age and age squared, body mass index (BMI) and BMI squared, the intake of medication for diabetes mellitus or arterial hypertension, and smoking status, a direct comparison of the results of our study and the study by Schetelig et al. (2021) is not possible. Both studies may suffer from insufficient statistical power to determine the role of HLA genotypes, as a much larger data set would be required to reach conclusive statements. In contrast, several studies, including ours, discovered an association of male gender and higher age with a more severe clinical course [30, 36, 37].

SIGLEC7, which acts as an inhibitor of the cytotoxicity of NK cells and was found to be associated with clinical course in our study, may be involved in the clinical course of infection with SARS-CoV-2 by its ability to mediate sialic acid-dependent binding to cells or to block signal transduction via dephosphorylation of signaling molecules [38]. Given the position of the SNP in the gene and its corresponding amino acid alteration, the mutation may affect the clinical course via structural and, consequently, functional changes in the protein. In addition, ACE2, a homologue of ACE, is the recognized receptor for SARS-CoV-2 [39], and infection of cells by SARS-CoV-2 is believed to alter ACE/ACE2 balance [40]. Our association of the SNP in the ACE gene with a clinical score of 4 and this SNP’s higher incidence in males of the oldest age group in our study may confirm the pathophysiologic role of ACE/ACE2 in infection with SARS-CoV-2. However, these data will need to be further validated mechanistically in future studies.

In conclusion, our work provides SNP data in immunoregulatory genes in patients with SARS-CoV-2 infection, stratified by clinical course. We present a structural analysis of the SIGLEC7 protein, in which a SNP that we found to be associated with clinical course is expected to alter protein structure and function. We hope that our findings will encourage further work on the still largely obscure functions of SIGLEC7 and help predict clinical outcomes in patients with SARS-CoV-2 infection and possibly other diseases.

Supporting information

S1 Fig. PolyPhen-2 report for SNP rs993496436 in the SIGLEC7 gene.

The prediction of SNP rs993496436 is benign. The SNP causes a residue change from serine to cysteine at position 190 in the amino acid chain.

(PDF)

S1 Table. For each participant, S1 Table lists the SNPs and zygosity information (0: homozygous for the reference, 1: heterozygous, 2: homozygous for the alternate).

S1 Table is also available online in the open data repository figshare, https://doi.org/10.6084/m9.figshare.20068868.v2. The title of the file is “SNPs per patient and zygosity information”.

(XLSX)
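The zygosity coding of S1 Table (0: homozygous for the reference, 1: heterozygous, 2: homozygous for the alternate) lends itself to simple per-patient summaries. The sketch below assumes the table has already been loaded into a nested dictionary; the layout, the patient labels, the genotype values, and the function name are all hypothetical, and only the coding itself is taken from the table description.

```python
from collections import Counter

# Coding as described for S1 Table.
ZYGOSITY_LABELS = {
    0: "homozygous reference",
    1: "heterozygous",
    2: "homozygous alternate",
}

# Hypothetical genotype calls: patient -> {SNP identifier -> zygosity code}.
genotypes = {
    "patient_A": {"rs993496436": 1, "rs3787950": 0, "rs2276858": 2},
    "patient_B": {"rs993496436": 0, "rs3787950": 1, "rs2276858": 1},
}


def zygosity_counts(calls):
    """Count how many SNPs of one patient fall into each zygosity class."""
    return Counter(ZYGOSITY_LABELS[code] for code in calls.values())


for patient, calls in genotypes.items():
    counts = zygosity_counts(calls)
    carried = counts["heterozygous"] + counts["homozygous alternate"]
    print(f"{patient}: {carried} variant SNPs, {dict(counts)}")
```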

Data Availability

The original data files are available at http://www.ncbi.nlm.nih.gov/bioproject/837053 with SRA number SRP376127. All data needed to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings can be found as follows: Clinical metadata and HLA types for each patient in the cohort of 159 patients: https://doi.org/10.6084/m9.figshare.21803928.v1. For each participant, Table S1 lists the SNPs and zygosity information (0: homozygous for the reference, 1: heterozygous, 2: homozygous for the alternate). Table S1 is also available online in the open data repository figshare, https://doi.org/10.6084/m9.figshare.20068868.v2. The title of the file is "SNPs per patient and zygosity information."

Funding Statement

This work was supported by the Goethe-Corona-Funds of the Goethe University Frankfurt to D.S.K. We acknowledge funding from the Alfons und Gertrud Kassel-Stiftung as part of the center for data science and AI and the DFG Cluster of Excellence Cardio Pulmonary Institute (CPI) [EXC 2026]. We also acknowledge funding from the consortia ACLF-I (Acute Liver Failure - Initiative) and ENABLE (Unraveling mechanisms driving cellular homeostasis, inflammation and infection to enable new approaches in translational medicine) (Hessian Ministry of the Arts and Sciences). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Ellinghaus D, Degenhardt F, Bujanda L, Buti M, Albillos A, Invernizzi P, et al. Genomewide Association Study of Severe Covid-19 with Respiratory Failure. N Engl J Med 2020 Oct 15; 383(16): 1522–1534. doi: 10.1056/NEJMoa2020283
  • 2. Elliott L. Nations must unite to halt global economic slowdown, says new IMF head. The Guardian 2019; 8 October.
  • 3. Alenquer M, Ferreira F, Lousa D, Valério M, Medina-Lopes M, Bergman ML, et al. Signatures in SARS-CoV-2 spike protein conferring escape to neutralizing antibodies. PLoS Pathog 2021 Aug; 17(8): e1009772. doi: 10.1371/journal.ppat.1009772
  • 4. Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N Engl J Med 2020 Apr 30; 382(18): 1708–1720. doi: 10.1056/NEJMoa2002032
  • 5. Jordan RE, Adab P, Cheng KK. Covid-19: risk factors for severe disease and death. BMJ 2020 Mar 26; 368: m1198. doi: 10.1136/bmj.m1198
  • 6. Saheb Sharif-Askari N, Saheb Sharif-Askari F, Mdkhana B, Al Heialy S, Alsafar HS, Hamoudi R, et al. Enhanced expression of immune checkpoint receptors during SARS-CoV-2 viral infection. Mol Ther Methods Clin Dev 2021 Mar 12; 20: 109–121. doi: 10.1016/j.omtm.2020.11.002
  • 7. Zhang C, Verma A, Feng Y, Melo MCR, McQuillan M, Hansen M, et al. Global patterns of genetic variation and association with clinical phenotypes at genes involved in SARS-CoV-2 infection. medRxiv 2021 Jul 1.
  • 8. Bae M, Roh JD, Kim Y, Kim SS, Han HM, Yang E, et al. SLC6A20 transporter: a novel regulator of brain glycine homeostasis and NMDAR function. EMBO Mol Med 2021 Feb 5; 13(2): e12632. doi: 10.15252/emmm.202012632
  • 9. Vuille-dit-Bille RN, Camargo SM, Emmenegger L, Sasse T, Kummer E, Jando J, et al. Human intestine luminal ACE2 and amino acid transporter expression increased by ACE-inhibitors. Amino Acids 2015 Apr; 47(4): 693–705. doi: 10.1007/s00726-014-1889-6
  • 10. Gemmati D, Tisato V. Genetic Hypothesis and Pharmacogenetics Side of Renin-Angiotensin-System in COVID-19. Genes (Basel) 2020 Sep 3; 11(9). doi: 10.3390/genes11091044
  • 11. Paniri A, Hosseini MM, Akhavan-Niaki H. First comprehensive computational analysis of functional consequences of TMPRSS2 SNPs in susceptibility to SARS-CoV-2 among different populations. J Biomol Struct Dyn 2021 Jul; 39(10): 3576–3593. doi: 10.1080/07391102.2020.1767690
  • 12. Arslan BA, Timucin AC. Immunotherapy approaches on innate immunity for SARS-Cov-2. Acta Virol 2020; 64(4): 389–395. doi: 10.4149/av_2020_401
  • 13. Patel NG, Bhasin A, Feinglass JM, Belknap SM, Angarone MP, Cohen ER, et al. Clinical Outcomes in Hospitalized Patients with COVID-19 on Therapeutic Anticoagulants. medRxiv 2020.
  • 14. WHO Working Group on the Clinical Characterisation and Management of COVID-19 infection. A minimal common outcome measure set for COVID-19 clinical research. Lancet Infect Dis 2020 Aug; 20(8): e192–e197. doi: 10.1016/S1473-3099(20)30483-7
  • 15. Sivori S, Vacca P, Del Zotto G, Munari E, Mingari MC, Moretta L. Human NK cells: surface receptors, inhibitory checkpoints, and translational applications. Cell Mol Immunol 2019 May; 16(5): 430–441. doi: 10.1038/s41423-019-0206-4
  • 16. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009 Jul 15; 25(14): 1754–1760. doi: 10.1093/bioinformatics/btp324
  • 17. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010 Sep; 20(9): 1297–1303. doi: 10.1101/gr.107524.110
  • 18. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 2001 Jan 1; 29(1): 308–311. doi: 10.1093/nar/29.1.308
  • 19. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007 Sep; 81(3): 559–575. doi: 10.1086/519795
  • 20. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 1995; 57(1): 289–300.
  • 21. Liu JZ, McRae AF, Nyholt DR, Medland SE, Wray NR, Brown KM, et al. A versatile gene-based test for genome-wide association studies. Am J Hum Genet 2010 Jul 9; 87(1): 139–145. doi: 10.1016/j.ajhg.2010.06.009
  • 22. Li MX, Gui HS, Kwan JS, Sham PC. GATES: a rapid and powerful gene-based association test using extended Simes procedure. Am J Hum Genet 2011 Mar 11; 88(3): 283–293. doi: 10.1016/j.ajhg.2011.01.019
  • 23. Schetelig J, Heidenreich F, Baldauf H, Trost S, Falk B, Hoßbach C, et al. Individual HLA-A, -B, -C, and -DRB1 Genotypes Are No Major Factors Which Determine COVID-19 Severity. Front Immunol 2021; 12: 698193. doi: 10.3389/fimmu.2021.698193
  • 24. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res 2000 Jan 1; 28(1): 235–242. doi: 10.1093/nar/28.1.235
  • 25. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res 2021 Jan 8; 49(D1): D480–D489. doi: 10.1093/nar/gkaa1100
  • 26. Bienert S, Waterhouse A, de Beer TA, Tauriello G, Studer G, Bordoli L, et al. The SWISS-MODEL Repository-new features and functionality. Nucleic Acids Res 2017 Jan 4; 45(D1): D313–D319. doi: 10.1093/nar/gkw1132
  • 27. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021 Aug; 596(7873): 583–589. doi: 10.1038/s41586-021-03819-2
  • 28. Varadi M, Anyango S, Deshpande M, Nair S, Natassia C, Yordanova G, et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res 2022; 50(D1): D439–D444. doi: 10.1093/nar/gkab1061
  • 29. Gudmundsson S, Singer-Berk M, Watts NA, Phu W, Goodrich JK, Solomonson M, et al. Variant interpretation using population databases: Lessons from gnomAD. Hum Mutat 2021 Dec 2. doi: 10.1002/humu.24309
  • 30. Gebhard C, Regitz-Zagrosek V, Neuhauser HK, Morgan R, Klein SL. Impact of sex and gender on COVID-19 outcomes in Europe. Biol Sex Differ 2020 May 25; 11(1): 29. doi: 10.1186/s13293-020-00304-9
  • 31. Smith N, Possémé C, Bondet V, Sugrue J, Townsend L, Charbit B, et al. Defective activation and regulation of type I interferon immunity is associated with increasing COVID-19 severity. Nat Commun 2022 Nov 25; 13(1): 7254.
  • 32. Angulo-Aguado M, Corredor-Orlandelli D, Carrillo-Martínez JC, Gonzalez-Cornejo M, Pineda-Mateus E, Rojas C, et al. Association Between the LZTFL1 rs11385942 Polymorphism and COVID-19 Severity in Colombian Population. Front Med (Lausanne) 2022; 9: 910098. doi: 10.3389/fmed.2022.910098
  • 33. Gavriilaki E, Asteris PG, Touloumenidou T, Koravou EE, Koutra M, Papayanni PG, et al. Genetic justification of severe COVID-19 using a rigorous algorithm. Clin Immunol 2021 May; 226: 108726. doi: 10.1016/j.clim.2021.108726
  • 34. Kaltoum ABO. Mutations and polymorphisms in genes involved in the infections by covid 19: a review. Gene Rep 2021 Jun; 23: 101062. doi: 10.1016/j.genrep.2021.101062
  • 35. Hu J, Li C, Wang S, Li T, Zhang H. Genetic variants are identified to increase risk of COVID-19 related mortality from UK Biobank data. medRxiv 2020 Nov 9. doi: 10.1101/2020.11.05.20226761
  • 36. Regina J, Papadimitriou-Olivgeris M, Burger R, Le Pogam MA, Niemi T, Filippidis P, et al. Epidemiology, risk factors and clinical course of SARS-CoV-2 infected patients in a Swiss university hospital: An observational retrospective study. PLoS One 2020; 15(11): e0240781. doi: 10.1371/journal.pone.0240781
  • 37. Scully EP, Haverfield J, Ursin RL, Tannenbaum C, Klein SL. Considering how biological sex impacts immune responses and COVID-19 outcomes. Nat Rev Immunol 2020 Jul; 20(7): 442–447. doi: 10.1038/s41577-020-0348-8
  • 38. Yamakawa N, Yasuda Y, Yoshimura A, Goshima A, Crocker PR, Vergoten G, et al. Discovery of a new sialic acid binding region that regulates Siglec-7. Sci Rep 2020 May 26; 10(1): 8647. doi: 10.1038/s41598-020-64887-4
  • 39. Wan Y, Shang J, Graham R, Baric RS, Li F. Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus. J Virol 2020 Mar 17; 94(7). doi: 10.1128/JVI.00127-20
  • 40. Beyerstedt S, Casaro EB, Rangel ÉB. COVID-19: angiotensin-converting enzyme 2 (ACE2) expression and tissue susceptibility to SARS-CoV-2 infection. Eur J Clin Microbiol Infect Dis 2021 May; 40(5): 905–919. doi: 10.1007/s10096-020-04138-6

Decision Letter 0

Asli Suner Karakulah

8 Mar 2023

PONE-D-23-03131

Dataset of single nucleotide polymorphisms of immune-associated genes in patients with SARS-CoV2 infection

PLOS ONE

Dear Dr. Katsaouni,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Your manuscript has been reviewed and requires modifications prior to making a decision. The comments of the reviewers are included at the bottom of this letter. Reviewers indicated that methods and results sections should be improved. We would be glad to consider a substantial revision of your work, where the reviewers' comments will be carefully addressed one by one.

Please submit your revised manuscript by Apr 22 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Asli Suner Karakulah, PhD

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

If you are reporting a retrospective study of medical records or archived samples, please ensure that you have discussed whether all data were fully anonymized before you accessed them and/or whether the IRB or ethics committee waived the requirement for informed consent. If patients provided informed written consent to have data from their medical records used in research, please include this information.

3. Thank you for stating the following financial disclosure: 

"This work was supported by the Goethe-Corona-Funds of the Goethe University Frankfurt to D.S.K. We acknowledge funding from the Alfons und Gertrud Kassel-Stiftung as part of the center for data science and AI and the DFG Cluster of Excellence Cardio Pulmonary Institute (CPI) [EXC 2026]. We also acknowledge funding from the consortia ACLF-I (Acute Liver Failure - Initiative) and ENABLE (Unraveling mechanisms driving cellular homeostasis, inflammation and infection to enable new approaches in translational medicine) (Hessian Ministry of the Arts and Sciences)."

 Please state what role the funders took in the study.  If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. Thank you for stating the following financial disclosure: 

"This work was supported by the Goethe-Corona-Funds of the Goethe University Frankfurt to D.S.K. We acknowledge funding from the Alfons und Gertrud Kassel-Stiftung as part of the center for data science and AI and the DFG Cluster of Excellence Cardio Pulmonary Institute (CPI) [EXC 2026]. We also acknowledge funding from the consortia ACLF-I (Acute Liver Failure - Initiative) and ENABLE (Unraveling mechanisms driving cellular homeostasis, inflammation and infection to enable new approaches in translational medicine) (Hessian Ministry of the Arts and Sciences)."

We note that one or more of the authors is affiliated with the funding organization (Goethe University Frankfurt), indicating the funder may have had some role in the design, data collection, analysis or preparation of your manuscript for publication.

In other words, the funder played an indirect role through the participation of the co-authors. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please do the following:

(1) Review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. These amendments should be made in the online form.

(2) Confirm in your cover letter that you agree with the following statement, and we will change the online submission form on your behalf: 

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

5. Thank you for stating the following in the Acknowledgments Section of your manuscript:

"This work was supported by the Goethe-Corona-Funds of the Goethe University Frankfurt to D.S.K. We acknowledge funding from the Alfons und Gertrud Kassel-Stiftung as part of the center for data science and AI and the DFG Cluster of Excellence Cardio Pulmonary Institute (CPI) [EXC 2026]. We also acknowledge funding from the consortia ACLF-I (Acute Liver Failure - Initiative) and ENABLE (Unraveling mechanisms driving cellular homeostasis, inflammation and infection to enable new approaches in translational medicine) (Hessian Ministry of the Arts and Sciences)."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

"This work was supported by the Goethe-Corona-Funds of the Goethe University Frankfurt to D.S.K. We acknowledge funding from the Alfons und Gertrud Kassel-Stiftung as part of the center for data science and AI and the DFG Cluster of Excellence Cardio Pulmonary Institute (CPI) [EXC 2026]. We also acknowledge funding from the consortia ACLF-I (Acute Liver Failure - Initiative) and ENABLE (Unraveling mechanisms driving cellular homeostasis, inflammation and infection to enable new approaches in translational medicine) (Hessian Ministry of the Arts and Sciences)."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

6. Thank you for stating the following in your Competing Interests section: "NO"

Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now

This information should be included in your cover letter; we will change the online submission form on your behalf.

7. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

8. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

9. Please include a new copy of Table 1 in your manuscript; the current table is difficult to read. Please follow the link for more information: https://blogs.plos.org/plos/2019/06/looking-good-tips-for-creating-your-plos-figures-graphics/


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper, the authors are proposed “Dataset of single nucleotide polymorphisms of immune-associated genes in patients with SARS-CoV2 infection”

The strengths of the paper are that it is well structured, the description of the related work is well done and that results are extensively compared to results of the similar research.

Minor revisions:

1. Authors should draw a graphical abstract of the proposed approach

2. Authors should justify the proposed approach.

3. Proofread the entire manuscript.

4. Authors should submit dataset sample in supplementary files, and some supplementary files are not open.

Reviewer #2: Dear authors,

First of all, I congratulate you for doing this fascinating study.

The article itself is well written, although it needs some corrections, as I mentioned in manuscript using track changes.

Please use the italic format for the name of genes and normal font for the protein’s names.

Success in your further research

Reviewer #3: It is necessary that the authors mention in the methods section, the program with which they analyzed the data of age, sex, blood group and HLA.

Likewise, to enrich the results, it is advisable to mention the clinical characteristics of the patients, mainly those who presented the 4 SNPs, as well as discuss the clinical data to reinforce the importance of the study.

It is necessary to unify universal terminology, such as SARS-CoV-2, COVID-19, among others throughout the article.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Mohadeseh Haji Abdolvahab

Reviewer #3: Yes: Gustavo J. Vazquez-Zapien

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Covid_Manuscript_Final-v2.docx

PLoS One. 2023 Nov 16;18(11):e0287725. doi: 10.1371/journal.pone.0287725.r002

Author response to Decision Letter 0


26 May 2023

We would like to thank the reviewers for reading the manuscript, for the kind response and for their helpful suggestions. In the revised version, changes are highlighted in blue boldface letters.

Reviewer #1:

In this paper, the authors are proposed “Dataset of single nucleotide polymorphisms of immune-associated genes in patients with SARS-CoV2 infection”. The strengths of the paper are that it is well structured, the description of the related work is well done and that results are extensively compared to results of the similar research.

Response: We thank the reviewer for these positive comments.

Minor revisions:

Authors should draw a graphical abstract of the proposed approach

Response: In the original manuscript we had provided a figure on the workflow of our study. However, in response to this reviewer’s suggestion we now also include an updated graphical abstract.

Authors should justify the proposed approach.

Response: We agree with the reviewer and have now added sentences to this effect to the introduction (page 5).

Proofread the entire manuscript.

Response: We agree with the reviewer and apologize for any previous typos. After making the requested changes, the manuscript has now been proofread again.

Authors should submit dataset sample in supplementary files, and some supplementary files are not open.

Response: We apologize for this. We now include the entire dataset in supplementary files and have checked that they can be opened.

Reviewer #2:

Dear authors, First of all, I congratulate you for doing this fascinating study. The article itself is well written, although it needs some corrections, as I mentioned in manuscript using track changes.

Response: We thank the reviewer for these encouraging words. We also thank you for explicitly pointing out errors and misleading sentences in the manuscript. Your work was a great help in the revision of the manuscript. We carefully proofread the final version. We made changes in the text according to your suggestions marked in the manuscript. Please see the list of changes below:

Page 15/16: The following sentence easily can be misunderstood:

"SIGLEC7 and ACE appear to have significant false discovery rates (p-value < 0.05 ) with 2 and 12 considered SNPs, respectively."

We changed the text to:

"Genes SIGLEC7 and ACE appear to have significant false discovery rates (p-value < 0.05). The significance for the gene SIGLEC7 is associated with a group of patients who presented with at least one of two SNPs. The significance for gene ACE is associated with a group of patients who presented with at least one of twelve SNPs."

Page 16: The following sentence easily can be misunderstood:

"The SNP in the SIGLEC7 gene exhibited a mutation from cytosine to guanine, causing a residue change from serine to cysteine at position 190 in the amino acid chain."

We changed the text to:

"The SNP in the SIGLEC7 gene represents a mutation from cytosine to guanine. During transcription to a chain of amino acids, this SNP causes a change from serine to cysteine at position 190."

Please use the italic format for the name of genes and normal font for the protein’s names.

Response: We agree with the reviewer and apologize. All gene names have been italicized in capital letters, while protein names are written in capital letters and are not italicized.

Success in your further research

Thank you!

Reviewer #3:

It is necessary that the authors mention in the methods section, the program with which they analyzed the data of age, sex, blood group and HLA.

Response: We agree with the reviewer and thank him/her for raising this point. On page 10 we have now added the following text:

"Descriptive statistics of age, gender, blood group, and HLA

For the analysis of statistics we wrote Python scripts (Python version 3.7.6) in Jupyter Notebook. We used modules from the scipy package (version 1.4.1) for statistical calculations and applied ML algorithms from the scikit-learn library (version 0.22.1)."
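As a rough illustration of the kind of descriptive summary such scripts might produce, the sketch below uses only the Python standard library rather than scipy or scikit-learn. The records, the field layout, and the numeric severity labels are invented for illustration and are not taken from the dataset or from the authors' notebooks.

```python
from collections import Counter, defaultdict
from statistics import median

# Toy records, made up for this sketch:
# (age, sex, ABO blood group, severity category label)
patients = [
    (34, "F", "A", 1),
    (58, "M", "O", 4),
    (71, "M", "A", 6),
    (45, "F", "B", 1),
    (66, "F", "O", 4),
]

# Group ages by severity category, then summarise each group by its median.
ages_by_severity = defaultdict(list)
for age, _sex, _abo, severity in patients:
    ages_by_severity[severity].append(age)

median_age = {sev: median(ages) for sev, ages in sorted(ages_by_severity.items())}

# Count how often each ABO blood group occurs in the toy cohort.
abo_counts = Counter(abo for _age, _sex, abo, _severity in patients)

for severity, value in median_age.items():
    print(f"severity {severity}: median age {value}")
print(dict(abo_counts))
```

The same grouping pattern would apply to sex or HLA type by swapping the field that is counted.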

Likewise, to enrich the results, it is advisable to mention the clinical characteristics of the patients, mainly those who presented the 4 SNPs, as well as discuss the clinical data to reinforce the importance of the study.

Response: We agree with the reviewer and thank him/her for this excellent suggestion. We have added details to the results section on page 15/16 and have discussed these data on page 19. We also politely refer the reader to our table with the patients’ metadata.

It is necessary to unify universal terminology, such as SARS-CoV-2, COVID-19, among others throughout the article.

Response: We agree with the reviewer. We have unified language with respect to SARS-CoV-2 and other terms, as suggested by this reviewer.

Attachment

Submitted filename: rebuttal_26_5_23.docx

Decision Letter 1

Asli Suner Karakulah

12 Jun 2023

Dataset of single nucleotide polymorphisms of immune-associated genes in patients with SARS-CoV2 infection

PONE-D-23-03131R1

Dear Dr. Katsaouni,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Asli Suner Karakulah, PhD

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

The authors addressed the reviewers' concerns and substantially improved the content of MS.

So, based on my own assessment as an academic editor, MS can be accepted in its current form.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: (No Response)

Reviewer #3: Dear Author.

Previous comments or suggestions have been answered and added to the manuscript. Thank you.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: Yes: Mohadeseh Haji Abdolvahab

Reviewer #3: Yes: Gustavo J. Vazquez-Zapien

**********

Acceptance letter

Asli Suner Karakulah

22 Jun 2023

PONE-D-23-03131R1

Dataset of single nucleotide polymorphisms of immune-associated genes in patients with SARS-CoV-2 infection

Dear Dr. Katsaouni:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Asli Suner Karakulah

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. PolyPhen-2 report for SNP rs993496436 in the SIGLEC7 gene.

    The prediction of SNP rs993496436 is benign. The SNP causes a residue change from serine to cysteine at position 190 in the amino acid chain.

    (PDF)
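    To make the serine-to-cysteine change concrete, the sketch below checks which serine codons become a cysteine codon when a single cytosine is replaced by guanine. It uses only the standard genetic code. The reference codon chosen for the example is hypothetical, since the actual codon and strand at position 190 are not stated here.

```python
# Subset of the standard genetic code (coding-strand DNA codons) covering
# all serine and cysteine codons.
CODON_TABLE = {
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "AGT": "Ser", "AGC": "Ser",
    "TGT": "Cys", "TGC": "Cys",
}


def substitute(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


# Hypothetical reference codon, chosen only to illustrate the mechanism.
reference_codon = "TCC"
variant_codon = substitute(reference_codon, 1, "G")
print(reference_codon, CODON_TABLE[reference_codon], "->",
      variant_codon, CODON_TABLE[variant_codon])

# Every serine codon in which one C-to-G substitution yields cysteine.
ser_to_cys = sorted(
    codon
    for codon, amino_acid in CODON_TABLE.items()
    if amino_acid == "Ser"
    for position, base in enumerate(codon)
    if base == "C" and CODON_TABLE.get(substitute(codon, position, "G")) == "Cys"
)
print(ser_to_cys)
```

    Only codons with a cytosine in the second position can make this transition in one step, which is why the list is short.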

    S1 Table. For each participant, S1 Table lists the SNPs and zygosity information (0: homozygous for the reference, 1: heterozygous, 2: homozygous for the alternate).

    S1 Table is also available online in the open data repository figshare, https://doi.org/10.6084/m9.figshare.20068868.v2. The title of the file is “SNPs per patient and zygosity information”.

    (XLSX)
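    The 0/1/2 zygosity coding described for S1 Table can be read as the number of alternate alleles a participant carries at a site. The sketch below shows that reading on made-up genotypes. The function names and the two-letter genotype strings are assumptions for illustration and do not reflect the actual column layout of the XLSX file.

```python
def encode_zygosity(genotype, ref, alt):
    """Map a two-allele genotype to 0, 1 or 2 alternate-allele copies.

    0: homozygous for the reference, 1: heterozygous,
    2: homozygous for the alternate.
    """
    alleles = tuple(genotype)
    if len(alleles) != 2 or any(a not in (ref, alt) for a in alleles):
        raise ValueError(f"unexpected genotype {genotype!r}")
    return sum(allele == alt for allele in alleles)


def alt_allele_frequency(codes):
    """Alternate-allele frequency from 0/1/2 codes of diploid samples."""
    return sum(codes) / (2 * len(codes))


# Made-up genotypes for four participants at one C/G site.
codes = [encode_zygosity(g, ref="C", alt="G") for g in ("CC", "CG", "GG", "CC")]
frequency = alt_allele_frequency(codes)
print(codes, frequency)
```

    Because each code counts alternate alleles, summing a column and dividing by twice the number of participants gives the alternate-allele frequency directly.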


    Data Availability Statement

    The original data files are available at http://www.ncbi.nlm.nih.gov/bioproject/837053 under SRA number SRP376127. All data needed to reach the conclusions drawn in the manuscript, and any additional data required to replicate the reported study findings, can be found as follows. Clinical metadata and HLA types for each patient in the cohort of 159 patients: https://doi.org/10.6084/m9.figshare.21803928.v1. For each participant, Table S1 lists the SNPs and zygosity information (0: homozygous for the reference, 1: heterozygous, 2: homozygous for the alternate). Table S1 is also available online in the open data repository figshare, https://doi.org/10.6084/m9.figshare.20068868.v2. The title of the file is "SNPs per patient and zygosity information."


    Articles from PLOS ONE are provided here courtesy of PLOS
