eLife. 2024 Oct 3;13:RP93666. doi: 10.7554/eLife.93666

Novel risk loci for COVID-19 hospitalization among admixed American populations

Silvia Diz-de Almeida 1,2,3,4,, Raquel Cruz 1,2,3,4,, Andre D Luchessi 5, José M Lorenzo-Salazar 6, Miguel López de Heredia 3, Inés Quintela 7, Rafaela González-Montelongo 6, Vivian Nogueira Silbiger 5, Marta Sevilla Porras 3,8, Jair Antonio Tenorio Castaño 1,3,8, Julian Nevado 1,3,8, Jose María Aguado 9,10,11, Carlos Aguilar 12, Sergio Aguilera-Albesa 2,13, Virginia Almadana 14, Berta Almoguera 3,15, Nuria Alvarez 16, Álvaro Andreu-Bernabeu 17,18, Eunate Arana-Arri 19,20, Celso Arango 17,18,21, María J Arranz 22, Maria-Jesus Artiga 23, Raúl C Baptista-Rosas 24,25,26, María Barreda- Sánchez 27,28, Moncef Belhassen-Garcia 29, Joao F Bezerra 30, Marcos AC Bezerra 31, Lucía Boix-Palop 32, María Brion 33,34, Ramón Brugada 34,35,36,37, Matilde Bustos 38, Enrique J Calderón 38,39,40, Cristina Carbonell 41,42, Luis Castano 3,19,43,44,45, Jose E Castelao 46, Rosa Conde-Vicente 47, M Lourdes Cordero-Lorenzana 48, Jose L Cortes-Sanchez 49,50, Marta Corton 3,15, M Teresa Darnaude 51, Alba De Martino-Rodríguez 52,53, Victor del Campo-Pérez 54, Aranzazu Diaz de Bustamante 51, Elena Domínguez-Garrido 55, Rocío Eirós 56, María Carmen Fariñas 57,58,59, María J Fernandez-Nestosa 60, Uxía Fernández-Robelo 61, Amanda Fernández-Rodríguez 11,62, Tania Fernández-Villa 40,63, Manuela Gago-Dominguez 7,64, Belén Gil-Fournier 65, Javier Gómez-Arrue 52,53, Beatriz González Álvarez 52,53, Fernan Gonzalez Bernaldo de Quirós 66, Anna González-Neira 16, Javier González-Peñas 17,18,21, Juan F Gutiérrez-Bautista 67, María José Herrero 68,69, Antonio Herrero-Gonzalez 70, María A Jimenez-Sousa 11,62, María Claudia Lattig 71,72, Anabel Liger Borja 73, Rosario Lopez-Rodriguez 3,15,74, Esther Mancebo 75,76, Caridad Martín-López 73, Vicente Martín 40,63, Oscar Martinez-Nieto 72,77, Iciar Martinez-Lopez 78,79, Michel F Martinez-Resendez 49, Angel Martinez-Perez 80, Juliana F Mazzeu 81,82,83, Eleuterio Merayo Macías 84, Pablo Minguez 3,15, Victor Moreno Cuerda 85,86, Silviene F Oliveira 83,87,88,89, Eva 
Ortega-Paino 23, Mara Parellada 17,18,21, Estela Paz-Artal 75,76,90, Ney PC Santos 91, Patricia Pérez-Matute 92, Patricia Perez 93, M Elena Pérez-Tomás 28, Teresa Perucho 94, Mellina Pinsach-Abuin 34,35, Guillermo Pita 16, Ericka N Pompa-Mera 95,96, Gloria L Porras-Hurtado 97, Aurora Pujol 3,98,99, Soraya Ramiro León 65, Salvador Resino 11,62, Marianne R Fernandes 91,100, Emilio Rodríguez-Ruiz 64,101, Fernando Rodriguez-Artalejo 40,102,103,104, José A Rodriguez-Garcia 105, Francisco Ruiz-Cabello 64,106,107, Javier Ruiz-Hornillos 108,109,110, Pablo Ryan 11,111,112,113, José Manuel Soria 80, Juan Carlos Souto 114, Eduardo Tamayo 115,116, Alvaro Tamayo-Velasco 117, Juan Carlos Taracido-Fernandez 70, Alejandro Teper 118, Lilian Torres-Tobar 119, Miguel Urioste 120, Juan Valencia-Ramos 121, Zuleima Yáñez 122, Ruth Zarate 123, Itziar de Rojas 124,125, Agustín Ruiz 124,125, Pascual Sánchez 126, Luis Miguel Real 127; SCOURGE Cohort Group, Encarna Guillen-Navarro 28,128,129,130, Carmen Ayuso 3,15, Esteban Parra 131, José A Riancho 3,57,58,59, Augusto Rojas-Martinez 132, Carlos Flores 6,133,134,135,, Pablo Lapunzina 1,3,8, Ángel Carracedo 3,4,7,64,
Editors: Siming Zhao 136, Murim Choi 137
PMCID: PMC11449485  PMID: 39361370

Abstract

The genetic basis of severe COVID-19 has been thoroughly studied, and many genetic risk factors shared between populations have been identified. However, reduced sample sizes from non-European groups have limited the discovery of population-specific common risk loci. In this second study nested in the SCOURGE consortium, we conducted a genome-wide association study (GWAS) for COVID-19 hospitalization in admixed Americans, comprising a total of 4702 hospitalized cases recruited by SCOURGE and seven other participating studies in the COVID-19 Host Genetics Initiative. We identified four genome-wide significant associations, two of which constitute novel loci and were first discovered in Latin American populations (BAZ2B and DDIAS). A trans-ethnic meta-analysis revealed another novel cross-population risk locus in CREBBP. Finally, we assessed the performance of a cross-ancestry polygenic risk score in the SCOURGE admixed American cohort. This study constitutes the largest GWAS for COVID-19 hospitalization in admixed Latin Americans conducted to date. It allowed us to reveal novel risk loci and emphasizes the need to consider the diversity of populations in genomic research.

Research organism: None

Introduction

To date, more than 50 loci associated with COVID-19 susceptibility, hospitalization, and severity have been identified using genome-wide association studies (GWAS) (Kanai et al., 2023; Pairo-Castineira et al., 2023). The COVID-19 Host Genetics Initiative (HGI) has made significant efforts (Niemi et al., 2021) to augment the power to identify disease loci by recruiting individuals from diverse populations and conducting a trans-ancestry meta-analysis. Despite this, the lack of genetic diversity and a focus on cases of European ancestries still predominate in the studies (Popejoy and Fullerton, 2016; Sirugo et al., 2019). In addition, while trans-ancestry meta-analyses are a powerful approach for discovering shared genetic risk variants with similar effects across populations (Li and Keating, 2014), they may fail to identify risk variants that have larger effects on particular underrepresented populations. Genetic disease risk has been shaped by the particular evolutionary history of populations and environmental exposures (Rosenberg et al., 2010). Their action is particularly important for infectious diseases due to the selective constraints that are imposed by host‒pathogen interactions (Karlsson et al., 2014; Kwok et al., 2021). Literature examples of this in COVID-19 severity include a DOCK2 gene variant in East Asians (Namkoong et al., 2022) and frequent loss-of-function variants in IFNAR1 and IFNAR2 genes in Polynesian and Inuit populations, respectively (Bastard et al., 2022; Duncan et al., 2022).

Including diverse populations in case‒control GWAS with unrelated participants usually requires a prior classification of individuals in genetically homogeneous groups, which are typically analyzed separately to control for population stratification effects (Peterson et al., 2019). Populations with recent admixture impose an additional challenge to GWAS due to their complex genetic diversity and linkage disequilibrium (LD) patterns, requiring the development of alternative approaches and a careful inspection of results to reduce false positives due to population structure (Rosenberg et al., 2010). In fact, there are benefits in study power from modeling the admixed ancestries either locally, at the regional scale in the chromosomes, or globally, across the genome, depending on factors such as the heterogeneity of the risk variant in frequencies or the effects among the ancestry strata (Mester et al., 2023). Despite the development of novel methods specifically tailored for the analysis of admixed populations (Atkinson et al., 2021), the lack of a standardized analysis framework and the difficulties in confidently clustering admixed individuals into particular genetic groups often lead to their exclusion from GWAS.

The Spanish Coalition to Unlock Research on Host Genetics on COVID-19 (SCOURGE) recruited COVID-19 patients between March and December 2020 from hospitals across Spain and from March 2020 to July 2021 in Latin America (https://www.scourge-covid.org). A first GWAS of COVID-19 severity among Spanish patients of European descent revealed novel disease loci and explored age- and sex-varying effects of genetic factors (Cruz et al., 2022). Here, we present the findings of a GWAS meta-analysis in admixed Latin American (AMR) populations, comprising individuals from the SCOURGE Latin American cohort and the HGI studies, which allowed us to identify two novel severe COVID-19 loci, BAZ2B and DDIAS. Further analyses modeling the admixture from three genetic ancestral components and performing a trans-ethnic meta-analysis led to the identification of an additional risk locus near CREBBP. We finally assessed a cross-ancestry polygenic risk score (PGS) model with variants associated with critical COVID-19.

Results

Meta-analysis of COVID-19 hospitalization in admixed Americans

Study cohorts

Within the SCOURGE consortium, we included 1608 hospitalized cases and 1887 controls (non-hospitalized COVID-19 patients) from Latin American countries and from recruitments of individuals of Latin American descent conducted in Spain (Supplementary file 1). Quality control details and estimation of global genetic inferred ancestry (GIA) (Figure 1—figure supplement 1) are described in ‘Materials and methods’, whereas clinical and demographic characteristics of patients included in the analysis are shown in Table 1. Summary statistics from the SCOURGE cohort were obtained under a logistic mixed model fitted with SAIGE (‘Materials and methods’). Another seven studies participating in the COVID-19 HGI consortium were included in the meta-analysis of COVID-19 hospitalization in admixed Americans (Figure 1).

Table 1. Demographic characteristics of the SCOURGE Latin American cohort.

| Variable | Non-hospitalized (N = 1887) | Hospitalized (N = 1625) |
|---|---|---|
| Age, mean years ± SD | 39.1 ± 11.9 | 54.1 ± 14.5 |
| Sex, N (%) | | |
| Female | 1253 (66.4) | 668 (41.1) |
| Global genetic inferred ancestry, % mean ± SD | | |
| European | 54.4 ± 16.2 | 39.4 ± 20.7 |
| African | 15.3 ± 12.7 | 9.1 ± 11.6 |
| Native American | 30.3 ± 19.8 | 51.3 ± 26.5 |
| Comorbidities, N (%) | | |
| Vascular/endocrinological | 488 (25.9) | 888 (64.5) |
| Cardiac | 60 (3.2) | 151 (9.3) |
| Nervous | 15 (0.8) | 61 (3.8) |
| Digestive | 14 (0.7) | 33 (2.0) |
| Onco-hematological | 21 (1.1) | 48 (3.0) |
| Respiratory | 76 (4.0) | 118 (7.3) |

Figure 1. Flow chart of this study.

Stage I of the study involved a meta-analysis of the Latin American genome-wide association studies (GWAS) from SCOURGE and the COVID-19 Host Genetics Initiative. The resulting meta-analysis was leveraged to prioritize genes by using a transcriptome-wide association study (TWAS), Bayesian fine-mapping and functional annotations, and to assess the generalizability of polygenic risk score (PGS) cross-population models in Latin Americans. Stage II involved two additional cross-population GWAS meta-analyses to further investigate the replicability of findings.

Figure 1—figure supplement 1. Global genetic inferred ancestry (GIA) composition in the SCOURGE Latin American cohort.

European (EUR), African (AFR), and Native American (AMR) GIA was derived with ADMIXTURE from a reference panel composed of Aymaran, Mayan, Nahuan, and Quechuan individuals of Native American genetic ancestry and randomly selected samples from the EUR and AFR 1KGP populations. The colors represent the different geographical sampling regions from which the admixed American individuals from SCOURGE were recruited.

GWAS meta-analysis

We performed a fixed-effects GWAS meta-analysis using the inverse of the variance as weights for the overlapping markers. The combined GWAS sample size consisted of 4702 admixed AMR hospitalized cases and 68,573 controls.
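The inverse-variance weighting described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name and the per-study estimates are hypothetical, and each study is assumed to contribute a log odds ratio and its standard error for the same effect allele.

```python
import math

def ivw_fixed_effects(betas, ses):
    """Combine per-study log odds ratios using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]          # w_i = 1 / SE_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)                      # SE of the pooled estimate
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p-value
    return beta, se, z, p

# Hypothetical per-study estimates for one variant (not data from the paper)
betas = [0.20, 0.15, 0.25]
ses = [0.05, 0.08, 0.10]
beta, se, z, p = ivw_fixed_effects(betas, ses)
print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, p = {p:.2e}")
```

Studies with smaller standard errors dominate the pooled estimate, which is why a large contributing cohort can drive the combined signal.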

This GWAS meta-analysis revealed genome-wide significant associations at four risk loci (Table 2, Figure 2; a quantile‒quantile plot is shown in Figure 2—figure supplement 1), two of which (BAZ2B and DDIAS) were novel discoveries. A Meta-Analysis Model-based Assessment of replicability (MAMBA) approach, which leverages the strength and consistency of associations across the contributing studies, supported a >90% probability that one of the novel loci (DDIAS) will replicate in future studies (Table 2). Four lead variants were identified, linked to 310 other variants (Supplementary files 2 and 3). A gene-based association test revealed a significant association in BAZ2B and in previously known COVID-19 risk loci: LZTFL1, XCR1, FYCO1, CCR9, and IFNAR2 (Supplementary file 4).

Table 2. Lead independent variants in the admixed AMR genome-wide association studies (GWAS) meta-analysis.

| SNP rsID | chr:pos | EA | NEA | OR (95% CI) | p-Value | EAF cases | EAF controls | Nearest gene | MAMBA PPR |
|---|---|---|---|---|---|---|---|---|---|
| rs13003835 | 2:159407982 | T | C | 1.20 (1.12–1.27) | 3.66E-08 | 0.563 | 0.429 | BAZ2B | 0.30 |
| rs35731912 | 3:45848457 | T | C | 1.65 (1.47–1.85) | 6.30E-17 | 0.087 | 0.056 | LZTFL1 | 0.95 |
| rs2477820 | 6:41535254 | A | T | 0.84 (0.79–0.89) | 1.89E-08 | 0.453 | 0.517 | FOXP4-AS1 | 0.18 |
| rs77599934 | 11:82906875 | G | A | 2.27 (1.70–3.04) | 2.26E-08 | 0.016 | 0.011 | DDIAS | 0.95 |

EA: effect allele; NEA: noneffect allele; EAF: effect allele frequency in the SCOURGE study; PPR: posterior probability of replicability.

Figure 2. Manhattan plot for the admixed AMR genome-wide association studies (GWAS) meta-analysis.

Probability thresholds at p=5 × 10⁻⁸ and p=5 × 10⁻⁵ are indicated by the horizontal lines. Genome-wide significant associations with COVID-19 hospitalizations were found on chromosome 2 (within BAZ2B), chromosome 3 (within LZTFL1), chromosome 6 (within FOXP4), and chromosome 11 (within DDIAS).

Figure 2—figure supplement 1. Quantile–quantile plot for the AMR genome-wide association studies (GWAS) meta-analysis.

A lambda inflation factor of 1.015 was obtained.
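As a rough illustration of how such an inflation factor is usually obtained, the sketch below converts p-values to 1-df chi-square statistics and divides their median by the expected median under the null (about 0.455). The function name is hypothetical, and the exact procedure used for the figure is not stated in this section.

```python
from statistics import NormalDist, median

def genomic_inflation(p_values):
    """Lambda = median observed chi-square (1 df) / expected null median."""
    nd = NormalDist()
    # For a two-sided p-value, the squared normal quantile at p/2 is the chi-square statistic.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2   # median of chi-square with 1 df, about 0.455
    return median(chi2) / expected_median

# Evenly spaced p-values mimic a well-calibrated test, so lambda is close to 1
null_like = [(i + 0.5) / 1001 for i in range(1001)]
print(round(genomic_inflation(null_like), 3))
```

Values close to 1, such as the 1.015 reported here, suggest little residual inflation from population structure or relatedness.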

Table 3. Novel variants in the SC-HGIALL and SC-HGI3POP meta-analyses (with respect to HGIv7).

Independent signals after LD clumping.

| SNP rsID | chr:pos | EA | NEA | OR (95% CI) | p-Value | Nearest gene | Analysis |
|---|---|---|---|---|---|---|---|
| rs76564172 | 16:3892266 | T | G | 1.31 (1.19–1.44) | 9.64E-09 | CREBBP | SC-HGI3POP |
| rs66833742 | 19:4063488 | T | C | 0.94 (0.92–0.96) | 1.89E-08 | ZBTB7A | SC-HGI3POP |
| rs66833742 | 19:4063488 | T | C | 0.94 (0.92–0.96) | 2.50E-08 | ZBTB7A | SC-HGIALL |
| rs2876034 | 20:6492834 | A | T | 0.95 (0.93–0.97) | 2.83E-08 | CASC20 | SC-HGIALL |

EA: effect allele; NEA: non-effect allele.

Located within the BAZ2B gene, the sentinel variant rs13003835 (Figure 3) is an intronic variant associated with an increased risk of COVID-19 hospitalization (odds ratio [OR]=1.20, 95% confidence interval [CI] = 1.12–1.27, p=3.66 × 10⁻⁸). This association was not previously reported in any GWAS of COVID-19 published to date. Interestingly, rs13003835 did not reach significance (p=0.972) in the COVID-19 HGI trans-ancestry meta-analysis including the five population groups (Kanai et al., 2023).

Figure 3. New loci associated with COVID-19 hospitalization in admixed American populations.

(A) Regional association plots for rs13003835 at chromosome 2 and rs77599934 at chromosome 11. (B) Allele frequency distribution across the 1000 Genomes Project populations for the lead variants rs13003835 and rs77599934. Retrieved from the Geography of Genetic Variants (GGV) web browser.

Figure 3—figure supplement 1. Regional association plots for the fine mapped loci in chromosomes 2 (A) and 16 (B).

Variants allocated to the 95% credible set by Bayesian fine mapping are shown in red; the sentinel variant is shown in blue.

The other novel risk locus is led by the sentinel variant rs77599934 (Figure 3), a rare intronic variant located in chromosome 11 within DDIAS and associated with the risk of COVID-19 hospitalization (OR = 2.27, 95% CI = 1.70–3.04, p=2.26 × 10⁻⁸).
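The OR, confidence interval, and p-value reported for a variant are tied together through the standard error of the log OR. The sketch below shows that relationship under a normal approximation; the function name is hypothetical, and because the published OR and CI are rounded, the recovered p-value is expected to match the reported one only in order of magnitude.

```python
import math
from statistics import NormalDist

def summary_from_or_ci(or_, ci_low, ci_high, level=0.95):
    """Recover log OR, its standard error, z, and a two-sided p-value from an OR and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for a 95% CI
    beta = math.log(or_)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Values reported in the text for rs77599934 (DDIAS)
beta, se, z, p = summary_from_or_ci(2.27, 1.70, 3.04)
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

The same conversion is a quick consistency check when an OR, its CI, and its p-value are quoted in different places.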

We also observed a suggestive association with rs2601183 in chromosome 15, located between ZNF774 and IQGAP1 (allele-G OR = 1.20, 95% CI = 1.12–1.29, p=6.11 × 10⁻⁸; see Supplementary file 2), which has not been reported in any other GWAS of COVID-19 to date.

The GWAS meta-analysis also pinpointed two significant variants at known loci, LZTFL1 and FOXP4. The SNP rs35731912 was previously associated with COVID-19 severity in EUR populations (Degenhardt et al., 2022), and it was mapped to LZTFL1. While rs2477820 is a novel risk variant within the FOXP4 gene, it is in moderate LD (r² = 0.295) with rs2496644, which has been linked to COVID-19 hospitalization (Kousathanas et al., 2022). This is consistent with the effects of LD in tag-SNPs when conducting GWAS in diverse populations.
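For readers less familiar with the LD statistic quoted here, the following sketch computes the squared correlation between two biallelic variants from phased haplotypes. It is a textbook definition rather than the tool used in the study; the function name and the toy haplotypes are hypothetical.

```python
def ld_r2(haplotypes):
    """Squared correlation (LD r2) between two biallelic variants.

    haplotypes: list of (a, b) pairs, each coded 0 or 1, one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either variant is monomorphic")
    return d * d / denom

# Toy example: the two alleles co-occur on most, but not all, chromosomes
toy = [(1, 1)] * 3 + [(1, 0)] * 1 + [(0, 0)] * 6
print(round(ld_r2(toy), 3))
```

Because r² depends on allele and haplotype frequencies, the same pair of variants can show different values in different populations, which is why a tag-SNP identified in one group may tag a causal variant poorly in another.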

None of the lead variants was associated with the comorbidities included in Table 1.

Functional mapping of novel risk variants

Variants belonging to the lead loci were prioritized by positional and expression quantitative trait loci (eQTL) mapping with FUMA, resulting in 31 mapped genes (Supplementary file 5). Within the region surrounding the lead variant in chromosome 2, FUMA prioritized four genes in addition to BAZ2B (PLA2R1, LY75, WDSUB1, and CD302). rs13003835 (allele C) is an eQTL of LY75 in the esophagus mucosa (NES = 0.27) and of BAZ2B-AS in whole blood (NES = 0.27), while rs2884110 (r² = 0.85) is an eQTL of LY75 in lung (NES = 0.22). As for chromosome 11, rs77599934 (allele G) is in moderate-to-strong LD (r² = 0.776) with rs60606421 (G deletion, allele -), which is an eQTL associated with a reduced expression of DDIAS in the lungs (NES = −0.49, allele -). The sentinel variant for the region in chromosome 15 is in perfect LD (r² = 1) with rs601183, an eQTL of ZNF774 in the lung.

Bayesian fine mapping

We used several approaches to narrow down the prioritized loci to a set of most likely genes driving the associations. First, we computed 95% credible sets for causal variants and annotated them with VEP and V2G aggregate scoring. The 95% credible set for the region of chromosome 2 around rs13003835 included 76 variants (Supplementary file 6); a regional plot is shown in Figure 3—figure supplement 1, and VEP and V2G annotations are included in Supplementary files 7 and 8. The V2G score prioritized BAZ2B as the most likely gene driving the association. However, the approach failed to converge on a 95% credible set for the region in chromosome 11.
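To make the notion of a 95% credible set concrete, the sketch below uses Wakefield approximate Bayes factors under a single-causal-variant assumption. This is one common construction and is an assumption here: the fine-mapping method actually used is described in ‘Materials and methods’, not in this section. The prior standard deviation of 0.2, the function names, and the toy variants are all hypothetical.

```python
import math

def log_abf(beta, se, prior_sd=0.2):
    """Log of the Wakefield approximate Bayes factor in favor of association."""
    v, w = se ** 2, prior_sd ** 2
    z2 = (beta / se) ** 2
    return 0.5 * math.log(v / (v + w)) + 0.5 * z2 * w / (v + w)

def credible_set(variants, coverage=0.95):
    """variants: list of (variant_id, beta, se). Returns [(variant_id, posterior), ...]."""
    logs = [log_abf(b, s) for _, b, s in variants]
    top = max(logs)
    weights = [math.exp(l - top) for l in logs]      # subtract the max to avoid overflow
    total = sum(weights)
    ranked = sorted(((w / total, vid) for w, (vid, _, _) in zip(weights, variants)),
                    reverse=True)
    chosen, cumulative = [], 0.0
    for posterior, vid in ranked:                    # add variants until coverage is reached
        chosen.append((vid, posterior))
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen

# Toy region with one strong signal (hypothetical values)
region = [("v1", 0.30, 0.05), ("v2", 0.10, 0.05), ("v3", 0.02, 0.05)]
print(credible_set(region))
```

When many variants in strong LD carry similar evidence, the posterior mass is spread thinly and the set grows large, as with the 76 variants reported for chromosome 2.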

Transcriptome-wide association study (TWAS)

Five novel genes, namely, SLC25A37, SMARCC1, CAMP, TYW3, and S100A12 (Supplementary file 9), were found to be significantly associated in the cross-tissue TWAS. To our knowledge, these genes have not been reported previously in any COVID-19 TWAS or GWAS analyses published to date. In the single-tissue analyses, ATP5O and CXCR6 were significantly associated in the lungs, CCR9 was significantly associated in whole blood, and IFNAR2 and SLC25A37 were associated in lymphocytes.

Likewise, we carried out TWAS analyses using the models trained in the admixed populations. However, no significant gene pairs were detected in this case. The top 10 genes with the lowest p-values for each of the datasets (Puerto Ricans, Mexicans, African Americans, and pooled cohorts) are shown in Supplementary file 10. Although not significant, KCNC3 was repeated in the four analyses, whereas MAPKAPK3, NAPSA, and THAP5 were repeated in three out of four. Both NAPSA and KCNC3 are located on chromosome 19 and were reported in the latest HGI meta-analysis (Kanai et al., 2023).

All mapped genes from analyses conducted in AMR populations are shown in Figure 4, and associations for the two novel variants with expression are shown in Figure 4—figure supplement 1.

Figure 4. Summary of the results from gene prioritization strategies used for genetic associations in AMR populations.

The genome-wide association studies (GWAS) Catalog association for BAZ2B-AS was with the FEV/FVC ratio. Literature-based evidence is further explored in ‘Discussion’.

Figure 4—figure supplement 1. Gene‒tissue pairs for which either rs13003835 or rs60606421 is a significant expression quantitative trait locus (eQTL) at false discovery rate (FDR) < 0.05 (data retrieved from https://gtexportal.org/home/snp/).

rs13003835 (chromosome 2) maps to the BAZ2B, LY75, and PLA2R1 genes. The lead variant of chromosome 11, rs77599934, was not an eQTL, so we used an LD proxy variant (rs60606421). The DDIAS and PRCP genes map closely to this variant. NES and p-values correspond to the normalized effect size (and direction) of eQTL-gene associations and the p-value for the tissue, respectively.

Genetic architecture of COVID-19 hospitalization in AMR populations

Allele frequencies of rs13003835 and rs77599934 across ancestries

Neither rs13003835 (BAZ2B) nor rs77599934 (DDIAS) was significantly associated in the COVID-19 HGI B2 cross-population or population-specific meta-analyses. Thus, we investigated their allele frequencies (AF) across populations and compared their effect sizes.

According to gnomAD v3.1.2, the T allele at rs13003835 (BAZ2B) has an AF of 43% in admixed AMR groups, while AF is lower in the EUR populations (16%) and in the global sample (29%). Local ancestry inference (LAI) reported by gnomAD shows that within the Native American component, the risk allele T is the major allele, whereas it is the minor allele within the African and European LAI components. These large differences in AF might underlie the association found in AMR populations. However, when comparing effect sizes between populations, we found that they were in opposite directions between SAS-AMR and EUR-AFR-EAS and that there was large heterogeneity among them (Figure 5). We queried SNPs with p < 0.01 within 50 kb windows of the lead variant in each of the other populations. The variant with the lowest p-value in the EUR population was rs559179177 (p=1.72 × 10⁻⁴), which is in perfect LD (r² = 1) in the 1KGP EUR population with our sentinel variant (rs13003835), and in moderate LD (r² = 0.4) in AMR populations. Since this variant was absent from the AMR analysis, probably due to its low frequency, it could not be meta-analyzed. Power calculations revealed that the EUR analysis was underpowered for this variant to achieve genome-wide significance (77.6%, assuming an effect size of 0.46, EAF = 0.0027, and number of cases/controls as shown in the HGI website for B2-EUR). In the cross-population meta-analysis (B2-ALL), rs559179177 obtained a p-value of 5.9 × 10⁻⁴.

Figure 5. Forest plot showing effect sizes and the corresponding confidence intervals for the sentinel variants identified in the AMR meta-analysis across populations.

All beta values with their corresponding CIs were retrieved from the B2 population-specific meta-analysis from the HGI v7 release, except for AMR, for which the beta value and CI from the HGIAMR-SCOURGE meta-analysis are represented.

rs77599934 (DDIAS) had an AF of 1.1% for the G allele in the non-hospitalized controls (Table 2), in line with the recorded gnomAD AF of 1% in admixed AMR groups. This variant has the potential to be a population-specific variant, given the allele frequencies in other population groups, such as EUR (0% in Finnish, 0.025% in non-Finnish), EAS (0%) and SAS (0.042%), and its larger effect size compared with AFR populations (Figure 5). Examining the LAI, the G allele occurs at a 10.8% frequency in the African component, while it is almost absent in the Native American and European components. Due to its low MAF, rs77599934 was not analyzed in the COVID-19 HGI B2 cross-population meta-analysis and was only present in the HGI B2 AFR population-specific meta-analysis, precluding the comparison (Figure 5). For this reason, we retrieved the variant with the lowest p-value within a 50 kb region around rs77599934 in the COVID-19 HGI cross-population analysis to investigate whether it was in moderate-to-strong LD with our sentinel variant. The variant with the smallest p-value was rs75684040 (OR = 1.07, 95% CI = 1.03–1.12, p=1.84 × 10⁻³). However, LD calculations using the 1KGP phase 3 dataset indicated that rs77599934 and rs75684040 were poorly correlated (r² = 0.11). As for AFR populations, the variant with the lowest p-value was rs138860115 (p=8.3 × 10⁻³), but it was not correlated with the lead SNP of this locus.

Cross-population meta-analyses

We carried out two cross-ancestry inverse variance-weighted fixed-effects meta-analyses with the admixed AMR GWAS meta-analysis results to evaluate whether the discovered risk loci replicated when considering other population groups. In doing so, we also identified novel cross-population COVID-19 hospitalization risk loci.

First, we combined the SCOURGE Latin American GWAS results with the HGI B2 ALL analysis (Supplementary file 11). We refer to this analysis as the SC-HGIALL meta-analysis. Out of the 40 genome-wide significant loci associated with COVID-19 hospitalization in the last HGI release (Kanai et al., 2023), this study replicated 39, and the association was stronger than in the original study in 29 of those (Supplementary file 12). However, the variant rs13003835 located in BAZ2B did not replicate (OR = 1.00, 95% CI = 0.98–1.03, p=0.644).

In this cross-ancestry meta-analysis, we replicated two associations that were not found in HGIv7, although they were sentinel variants in the latest GenOMICC meta-analysis (Pairo-Castineira et al., 2023). We found an association at the CASC20 locus led by the variant rs2876034 (OR = 0.95, 95% CI = 0.93–0.97, p=2.83 × 10⁻⁸). This variant is in strong LD with the sentinel variant of that study (rs2326788, r² = 0.92), which was associated with critical COVID-19 (Pairo-Castineira et al., 2023). In addition, this meta-analysis identified the variant rs66833742 near ZBTB7A associated with COVID-19 hospitalization (OR = 0.94, 95% CI = 0.92–0.96, p=2.50 × 10⁻⁸). Notably, rs66833742 and its perfect proxy rs67602344 (r² = 1) are also associated with upregulation of ZBTB7A in whole blood and in esophageal mucosa. This variant was previously associated with COVID-19 hospitalization (Pairo-Castineira et al., 2023).

In a second analysis, we also explored the associations across the defined admixed AMR, EUR, and AFR ancestral sources by combining through meta-analysis the SCOURGE Latin American GWAS results with the HGI studies in EUR, AFR, and admixed AMR and excluding those from EAS and SAS (Supplementary file 13). We refer to this as the SC-HGI3POP meta-analysis. The association at rs13003835 (BAZ2B, OR = 1.01, 95% CI = 0.98–1.03, p=0.605) was not replicated, and rs77599934 near DDIAS could not be assessed, although the association at the ZBTB7A locus was confirmed (rs66833742, OR = 0.94, 95% CI = 0.92–0.96, p=1.89 × 10⁻⁸). The variant rs76564172 located near CREBBP also reached statistical significance (OR = 1.31, 95% CI = 1.19–1.44, p=9.64 × 10⁻⁹; Table 3). The sentinel variant of the region linked to CREBBP (in the trans-ancestry meta-analysis) was also subjected to Bayesian fine mapping (Supplementary file 6). Eight variants were included in the credible set for the region in chromosome 16 (meta-analysis SC-HGI3POP).

Polygenic risk score models

Using the 49 variants associated with disease severity that are shared across populations according to the HGIv7, we constructed a PGS model to assess its generalizability in the admixed AMR (Supplementary file 14). First, we calculated the PGS for the SCOURGE Latin Americans and explored the association with COVID-19 hospitalization under a logistic regression model. The PGS model was associated with a 1.48-fold increase in COVID-19 hospitalization risk per PGS standard deviation. It also explained a slightly larger proportion of variance (∆R² = 1.07%) than the baseline model.
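A PGS of this kind is typically a weighted sum of effect-allele dosages, standardized before modeling. The sketch below illustrates that construction and how a per-SD OR of 1.48 translates into relative odds, assuming all other covariates are held fixed. The variant identifiers, weights, dosages, and function names are hypothetical and are not the 49 variants or weights used in the study.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) over variants with a weight."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)

def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 within the cohort."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

def relative_odds(z_score, or_per_sd=1.48):
    """Odds relative to an individual at the cohort mean, for a given per-SD OR."""
    return or_per_sd ** z_score

# Hypothetical log-OR weights and genotype dosages for three individuals
weights = {"varA": 0.18, "varB": -0.06, "varC": 0.50}
cohort = [
    {"varA": 0, "varB": 2, "varC": 0},
    {"varA": 1, "varB": 1, "varC": 0},
    {"varA": 2, "varB": 0, "varC": 1},
]
z_scores = standardize([polygenic_score(d, weights) for d in cohort])
print([round(relative_odds(z), 2) for z in z_scores])
```

Under this model, an individual two standard deviations above the mean has roughly 1.48² ≈ 2.19 times the odds of hospitalization of an individual at the mean.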

Subsequently, we divided the individuals into PGS deciles and percentiles to assess their risk stratification. The median percentile among controls was 40, while in cases, it was 63. Those in the top PGS decile exhibited a 2.89-fold (95% CI = 2.37–3.54, p=1.29 × 10⁻⁷) greater risk compared to individuals in the deciles between 4 and 6 (corresponding to scores around the median of the distribution).
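The stratification step can be sketched as a rank-based decile assignment followed by an odds ratio between two strata. This is a simplified, unadjusted 2×2 calculation with a Woolf-type confidence interval; the study may have used a covariate-adjusted regression instead, and the counts and function names below are hypothetical.

```python
import math

def assign_deciles(scores):
    """Rank-based decile (1 = lowest, 10 = highest) for each score, in input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(scores) + 1
    return deciles

def odds_ratio(cases_top, controls_top, cases_ref, controls_ref):
    """Unadjusted OR of the top stratum vs. the reference stratum with a 95% CI."""
    or_ = (cases_top * controls_ref) / (controls_top * cases_ref)
    se = math.sqrt(1 / cases_top + 1 / controls_top + 1 / cases_ref + 1 / controls_ref)
    low = math.exp(math.log(or_) - 1.96 * se)
    high = math.exp(math.log(or_) + 1.96 * se)
    return or_, low, high

# Hypothetical counts: top decile vs. deciles 4 to 6 combined
print(odds_ratio(cases_top=240, controls_top=110, cases_ref=450, controls_ref=600))
```

Comparing against the middle deciles rather than the lowest one anchors the estimate to individuals of roughly average genetic risk.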

We also examined the distribution of PGS across a five-level severity scale to further determine if there was any correspondence between clinical severity and genetic risk. Median PGS were lower in the asymptomatic and mild groups, whereas higher median scores were observed in the moderate, severe, and critical patients (Figure 6). We fitted a multinomial model using the asymptomatic class as a reference and calculated the OR for each category (Supplementary file 15), observing that the disease genetic risk was similar among asymptomatic, mild, and moderate patients. Given that the PGS was built with variants associated with critical disease and/or hospitalization and that the categories severe and critical correspond to hospitalized patients, these results underscore the ability of cross-ancestry PGS for risk stratification even in an admixed population.

Figure 6. Polygenic risk distribution for COVID-19 hospitalization.

(A) Polygenic risk stratified by polygenic risk score (PGS) deciles comparing each risk group against the lowest risk group (OR–95% CI). (B) Distribution of the PGS in each of the severity scale classes. 0, asymptomatic; 1, mild disease; 2, moderate disease; 3, severe disease; 4, critical disease.

Discussion

We have conducted the largest GWAS meta-analysis of COVID-19 hospitalization in admixed AMR to date. While the genetic risk basis discovered for COVID-19 is largely shared among populations, trans-ancestry meta-analyses on this disease have primarily included EUR samples. This dominance of GWAS in Europeans and the subsequent bias in sample sizes can mask population-specific genetic risks (i.e., variants that are monomorphic in some populations) or be less powered to detect risk variants having higher allele frequencies in population groups other than Europeans. In this sense, after combining data from admixed AMR patients, we found two risk loci that were first discovered in a GWAS of Latin American populations. Interestingly, the sentinel variant rs77599934 in the DDIAS gene is a rare coding variant (~1% for allele G) with a large effect on COVID-19 hospitalization that is nearly monomorphic in most of the other populations. This has likely led to its exclusion from the cross-population meta-analyses conducted to date, leaving it undetected.

Fine mapping of the region harboring DDIAS did not reveal further information about which gene is most likely to be causal or about the functional consequences of the risk variant, but our sentinel variant was in strong LD with an eQTL that is associated with reduced gene expression of DDIAS in the lung. DDIAS, known as damage-induced apoptosis suppressor gene, is itself a plausible candidate gene. It has been linked to DNA damage repair mechanisms: research has shown that depletion of DDIAS leads to an increase in ATM phosphorylation and the formation of p53-binding protein (53BP1) foci, a known biomarker of DNA double-strand breaks, suggesting a potential role in double-strand break repair (Brunette et al., 2019). Interestingly, a study found that infection by SARS-CoV-2 also triggered the phosphorylation of ATM kinase and inhibited repair mechanisms, causing the accumulation of DNA damage (Gioia et al., 2023). This gene has also been proposed as a potential biomarker for lung cancer after finding that it interacts with STAT3 in lung cancer cells, regulating IL-6 (Im et al., 2020; Im et al., 2023) and thus mediating inflammatory processes, while another study determined that its blockade inhibited lung cancer cell growth (Won et al., 2014). Another prioritized gene from this region was PRCP, an angiotensinase that shares substrate specificity with the ACE2 receptor. It has been positively linked to hypertension and some studies have raised hypotheses on its role in COVID-19 progression, particularly in relation to the development of pro-thrombotic events (Angeli et al., 2023; Silva-Aguiar et al., 2020).

The risk region found in chromosome 2 harbors more than one gene. The lead variant rs13003835 is located within BAZ2B, and it increases the expression of the antisense BAZ2B gene in whole blood. BAZ2B encodes one of the regulatory subunits of the Imitation switch (ISWI) chromatin remodelers (Li et al., 2021) constituting the BRF-1/BRF-5 complexes with SMARCA1 and SMARCA5, respectively. Interestingly, it was discovered that lnc-BAZ2B promotes macrophage activation through regulation of BAZ2B expression. Its overexpression resulted in pulmonary inflammation and elevated levels of MUC5AC in mice with asthma (Xia et al., 2021). This variant was also an eQTL for LY75 (encoding lymphocyte antigen 75) in the esophageal mucosa tissue. Lymphocyte antigen 75 is involved in immune processes through antigen presentation in dendritic cells and endocytosis (Mahnke et al., 2000) and has been associated with inflammatory diseases, representing a compelling candidate for the region. Increased expression of LY75 has been detected within hours after infection by SARS-CoV-2 (Mitchell et al., 2013; Sims et al., 2013). It is worth noting that differences in AF for this variant suggest that analyses in AMR populations might be more powered to detect the association, supporting the necessity of population-specific studies.

A third novel risk region was observed on chromosome 15 between the IQGAP1 and ZNF774 genes, although it did not reach genome-wide significance.

Secondary analyses revealed five TWAS-associated genes, some of which have already been linked to severe COVID-19. In a comprehensive multitissue gene expression profiling study (Gómez-Carballa et al., 2022), decreased expression of CAMP and S100A8/S100A9 genes in patients with severe COVID-19 was observed, while another study detected the upregulation of SLC25A37 among patients with severe COVID-19 (Policard et al., 2021). SMARCC1 is a subunit of the SWI/SNF chromatin remodeling complex that has been identified as proviral for SARS-CoV-2 and other coronavirus strains through a genome-wide screen (Wei et al., 2021). This complex is crucial for ACE2 expression and viral entry into the cell (Wei et al., 2023). However, it should be noted that using eQTL mostly from European populations, such as those in GTEx, could result in reduced power to detect associations.

To explore the genetic architecture of the trait among admixed AMR populations, we performed two cross-ancestry meta-analyses including the SCOURGE Latin American cohort GWAS findings. We found that the two novel risk variants were not associated with COVID-19 hospitalization outside the population-specific meta-analysis, highlighting the importance of complementing trans-ancestry meta-analyses with group-specific analyses. Notably, this analysis did not replicate the association at the DSTYK locus, which was associated with severe COVID-19 in Brazilian individuals with higher European admixture (Pereira et al., 2022). This lack of replication aligns with the initial hypothesis of that study suggesting that the risk haplotype was derived from European populations, as we reduced the weight of this ancestral contribution in our study by excluding those individuals.

Moreover, these cross-ancestry meta-analyses pointed to three loci that were not genome-wide significant in the HGIv7 ALL meta-analysis: a novel locus at CREBBP and two loci at ZBTB7A and CASC20 that were reported in another meta-analysis. CREBBP and ZBTB7A achieved a stronger significance when considering only the EUR, AFR, and admixed AMR GIA groups. According to a recent study, elevated levels of the ZBTB7A gene promote a quasihomeostatic state between coronaviruses and host cells, preventing cell death by regulating oxidative stress pathways (Zhu et al., 2022). This gene is involved in several signaling pathways, such as B- and T-cell differentiation (Gupta et al., 2020). On a separate note, CREBBP encodes the CREB binding protein (CBP), which is involved in transcriptional activation and is known to positively regulate the type I interferon response through virus-induced phosphorylation of IRF-3 (Yoneyama et al., 1998). In addition, the CREB/CBP interaction has been implicated in SARS-CoV-2 infection (Yang et al., 2023) via the cAMP/PKA pathway. In fact, cells with suppressed CREBBP gene expression exhibit reduced replication of the so-called Delta and Omicron SARS-CoV-2 variants (Yang et al., 2023).

We developed a cross-population PGS model, which effectively stratified individuals based on their genetic risk and demonstrated consistency with the clinical severity classification of the patients. Only a few polygenic scores have been derived from COVID-19 GWAS data. Horowitz et al., 2022 developed a score using 6 and 12 associated variants (PGS ID: PGP000302) and reported an associated OR (top 10% vs rest) of 1.38 for risk of hospitalization in European populations, whereas the OR for Latin American populations was 1.56. Since their sample size and the number of variants included in the PGS were lower, direct comparisons are not straightforward. Nevertheless, our analysis provides the first results for a PGS applied to a relatively large AMR cohort, which will be of value for future analyses of PGS transferability.

This study is subject to limitations, mostly concerning sample recruitment and composition. The SCOURGE Latin American sample size is small, and the GWAS is likely underpowered. Another limitation is the difference in case-control recruitment across sampling regions which, although controlled for, may reduce the ability to observe significant associations driven by different compositions of the populations. In this sense, the identified risk loci might not replicate in a cohort lacking any of the parental population sources from the three-way admixture. Likewise, we could not explicitly control for socioenvironmental factors that could have affected COVID-19 spread and hospitalization rates, although genetic principal components are known to capture nongenetic factors. Finally, we must acknowledge the lack of a replication cohort. Because few studies have been conducted, we used all the available GWAS data for COVID-19 hospitalization in admixed AMR in this meta-analysis. Therefore, no independent studies remained in which to replicate or validate the results. These concerns may be addressed in the future by including more AMR GWAS in the meta-analysis, both by involving diverse populations in study designs and by supporting research from countries in Latin America.

This study provides novel insights into the genetic basis of COVID-19 severity, emphasizing the importance of considering host genetic factors by using non-European populations, especially of admixed sources. Such complementary efforts can pin down new variants and increase our knowledge on the host genetic factors of severe COVID-19.

Materials and methods

Key resources table.

| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| Commercial assay or kit | Chemagic DNA Blood 100 kit | PerkinElmer Chemagen Technologies GmbH | | |
| Software, algorithm | Axiom Analysis Suite | Thermo Fisher Scientific | | Version 4.0.3.3 |
| Software, algorithm | PLINK | Purcell et al., 2007; https://www.cog-genomics.org/plink/ | RRID:SCR_001757 | Version 1.9; v2 |
| Software, algorithm | TOPMed Imputation Server | https://imputation.biodatacatalyst.nhlbi.nih.gov/ | | Version 2 |
| Software, algorithm | ADMIXTURE | Alexander et al., 2009; https://dalexander.github.io/admixture/ | RRID:SCR_001263 | Version 1.3.0 |
| Software, algorithm | SAIGEgds | Zheng and Davis, 2021; https://www.bioconductor.org/packages/release/bioc/html/SAIGEgds.html | | Version 1.10.0 |
| Software, algorithm | METAL | Willer et al., 2010; https://csg.sph.umich.edu/abecasis/metal/ | RRID:SCR_002013 | Version 2011-03-25 |
| Software, algorithm | FUMA | Watanabe et al., 2017; https://fuma.ctglab.nl/ | RRID:SCR_017521 | Version 1.5.2 |
| Software, algorithm | MAMBA | McGuire et al., 2021; https://github.com/dan11mcguire/mamba | | Version 1 |
| Software, algorithm | S-PrediXcan; S-MultiXcan | Barbeira et al., 2018; https://github.com/hakyimlab/MetaXcan | RRID:SCR_016739 | Version 1 |
| Software, algorithm | GTEx v8 mashr prediction models | https://predictdb.org/post/2021/07/21/gtex-v8-models-on-eqtl-and-sqtl/ | | |
| Other | GWAS Catalog | https://www.ebi.ac.uk/gwas/ | RRID:SCR_012745 | Section ‘Definition of the genetic risk loci and putative functional impact’ |

GWAS in Latin Americans from SCOURGE

The SCOURGE Latin American cohort

A total of 3729 COVID-19-positive cases were recruited across five countries from Latin America (Mexico, Brazil, Colombia, Paraguay, and Ecuador) by 13 participating centers (Supplementary file 1) from March 2020 to July 2021. In addition, we included 1082 COVID-19-positive individuals recruited between March and December 2020 in Spain who either had evidence of origin from a Latin American country or showed inferred genetic admixture between AMR, EUR, and AFR (with <0.05% SAS/EAS). These individuals were excluded from a previous SCOURGE study that focused on participants with European genetic ancestries (Cruz et al., 2022). We used hospitalization as a proxy for disease severity and defined COVID-19-positive patients who underwent hospitalization as a consequence of the infection as cases and those who did not need hospitalization due to COVID-19 as controls.

Samples and data were collected with informed consent after the approval of the Ethics and Scientific Committees from the participating centers and by the Galician Ethics Committee Ref 2020/197. Recruitment of patients from IMSS (in Mexico City) was approved by the National Committee of Clinical Research from Instituto Mexicano del Seguro Social, Mexico (protocol R-2020-785-082).

Samples and data were processed following normalized procedures. The REDCap electronic data capture tool (Harris et al., 2009; Harris et al., 2019), hosted at Centro de Investigación Biomédica en Red (CIBER) from the Instituto de Salud Carlos III (ISCIII), was used to collect and manage demographic, epidemiological, and clinical variables. Subjects were diagnosed with COVID-19 based on quantitative PCR tests (79.3%) or according to clinical (2.2%) or laboratory procedures (antibody tests: 16.3%; other microbiological tests: 2.2%).

SNP array genotyping

Genomic DNA was obtained from peripheral blood and isolated using the Chemagic DNA Blood 100 kit (PerkinElmer Chemagen Technologies GmbH), following the manufacturer’s recommendations.

Samples were genotyped with the Axiom Spain Biobank Array (Thermo Fisher Scientific) following the manufacturer’s instructions in the Santiago de Compostela Node of the National Genotyping Center (CeGen-ISCIII; http://www.usc.es/cegen). This array contains probes for genotyping a total of 757,836 SNPs. Clustering and genotype calling were performed using Axiom Analysis Suite v4.0.3.3 software.

Quality control steps and variant imputation

A quality control (QC) procedure using PLINK 1.9 (Purcell et al., 2007) was applied to both samples and the genotyped SNPs. We excluded variants with a minor allele frequency (MAF) <1%, a call rate <98%, and markers strongly deviating from Hardy–Weinberg equilibrium expectations (p<1 × 10⁻⁶) with mid-p adjustment. We also explored the excess of heterozygosity to discard potential cross-sample contamination. Samples missing >2% of the variants were filtered out. Subsequently, we kept the autosomal SNPs, removed high-LD regions, and conducted LD pruning (windows of 1000 SNPs, with a step size of 80 and an r² threshold of 0.1) to assess kinship and estimate the global ancestral proportions. Kinship was evaluated based on IBD values, removing one individual from each pair with PI_HAT > 0.25 that showed a Z0, Z1, and Z2 coherent pattern (according to the theoretical expected values for each relatedness level). Genetic principal components (PCs) were calculated with PLINK with the subset of LD pruned variants.
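As an illustration of the variant-level filters described above, the following minimal Python sketch computes the call rate and MAF of a single biallelic variant and applies the 98% and 1% thresholds. It is not the PLINK implementation: the function name `variant_qc`, the 0/1/2 genotype coding, and the use of `None` for missing calls are assumptions made for this example, and the Hardy–Weinberg mid-p test is omitted.

```python
from typing import Optional, Sequence


def variant_qc(genotypes: Sequence[Optional[int]],
               maf_min: float = 0.01,
               call_rate_min: float = 0.98) -> dict:
    """Summarise one biallelic variant.

    genotypes: alternate-allele counts (0, 1 or 2) per sample; None = missing.
    Returns the call rate, the minor allele frequency and a keep/drop flag.
    """
    if not genotypes:
        raise ValueError("at least one sample is required")
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if not called:
        return {"call_rate": 0.0, "maf": 0.0, "keep": False}
    # Each called diploid genotype contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    keep = maf >= maf_min and call_rate >= call_rate_min
    return {"call_rate": call_rate, "maf": maf, "keep": keep}


common = variant_qc([0] * 90 + [1] * 10)                      # MAF 5%, fully called
poorly_called = variant_qc([0] * 85 + [1] * 10 + [None] * 5)  # call rate 95%
rare = variant_qc([0] * 99 + [1])                             # MAF 0.5%
```

In this toy input, only the first variant passes both thresholds; the second fails on call rate and the third on MAF.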

Genotypes were imputed with the TOPMed version r2 reference panel (GRCh38) using the TOPMed Imputation Server, and variants with Rsq < 0.3 or with MAF <1% were filtered out. A total of 4348 individuals and 10,671,028 genetic variants were included in the analyses.

Genetic admixture estimation

Global GIA, which refers to the genetic similarity to the reference individuals used, was estimated with ADMIXTURE (Alexander et al., 2009) v1.3 software following a two-step procedure. First, we randomly sampled 79 European (EUR) and 79 African (AFR) samples from The 1000 Genomes Project (1KGP) (Auton et al., 2015) and merged them with the 79 Native American (AMR) samples from Mao et al., 2007, keeping the biallelic SNPs. LD-pruned variants were selected from this merge using the same parameters as in the QC. We then ran an unsupervised analysis with K = 3 to redefine and homogenize the clusters and to compose a refined reference for the analyses, retaining only the individuals with ≥95% membership in a particular cluster. As a result, 20 AFR, 18 EUR, and 38 AMR individuals were removed. The same LD-pruned variants data from the remaining individuals were merged with the SCOURGE Latin American cohort to perform supervised clustering and estimate admixture proportions. A total of 471 samples from the SCOURGE cohort with >80% estimated European GIA were removed to reduce the weight of the European ancestral component, leaving a total of 3512 admixed Latin American (AMR) subjects for downstream analyses.
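The two thresholds used in this procedure (≥95% cluster membership for reference individuals and >80% European GIA for exclusion from the cohort) can be sketched as simple filters over the estimated admixture proportions. This is a hypothetical illustration only: the function names, the dictionary layout, and the toy proportions are assumptions, and the proportions themselves would come from ADMIXTURE output rather than from this code.

```python
def refine_reference(memberships: dict, threshold: float = 0.95) -> dict:
    """Keep reference samples whose largest cluster proportion is >= threshold."""
    return {sid: q for sid, q in memberships.items()
            if max(q.values()) >= threshold}


def drop_high_european(cohort: dict, max_eur: float = 0.80) -> dict:
    """Remove cohort samples whose estimated EUR proportion exceeds max_eur."""
    return {sid: q for sid, q in cohort.items() if q["EUR"] <= max_eur}


# Toy proportions (illustrative values, not study data).
reference = {
    "ref1": {"AFR": 0.99, "EUR": 0.01, "AMR": 0.00},
    "ref2": {"AFR": 0.10, "EUR": 0.70, "AMR": 0.20},  # too mixed for a reference
}
cohort = {
    "s1": {"AFR": 0.05, "EUR": 0.85, "AMR": 0.10},    # removed (>80% EUR)
    "s2": {"AFR": 0.10, "EUR": 0.45, "AMR": 0.45},
}

refined = refine_reference(reference)
admixed = drop_high_european(cohort)
```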

Association analysis

The results for the SCOURGE Latin American GWAS were obtained by testing for COVID-19 hospitalization as a surrogate of severity. To accommodate the continuum of GIA in the cohort, we opted for a joint testing of all the individuals as a single study using a mixed regression model as this approach has demonstrated a greater power and sufficient control of population structure (Wojcik et al., 2019). The SCOURGE cohort consisted of 3512 COVID-19-positive patients: cases (n = 1625) were defined as hospitalized COVID-19 patients, and controls (n = 1887) were defined as non-hospitalized COVID-19-positive patients.

Logistic mixed regression models were fitted using the SAIGEgds (Zheng and Davis, 2021) package in R, which implements the two-step mixed SAIGE (Zhou et al., 2018) model methodology and the SPA test. Baseline covariables included sex, age (continuous), and the first 10 PCs. To account for potential heterogeneity in the recruitment and hospitalization criteria across the participating countries, we adjusted the models by groups of the recruitment areas classified into six categories: Brazil, Colombia, Ecuador, Mexico, Paraguay, and Spain. This dataset has not been used in any previously published GWAS of COVID-19.

Meta-analysis of Latin American populations

The results of the SCOURGE Latin American cohort were meta-analyzed with the AMR HGI-B2 data, constituting our primary analysis. Summary results from the HGI freeze 7 B2 analysis corresponding to the admixed AMR population were obtained from the public repository (April 8, 2022: https://www.covid19hg.org/results/r7/), summing up 3077 cases and 66,686 controls from seven contributing studies. We selected the B2 phenotype definition because it offered more power, and the presence of population controls not ascertained for COVID-19 does not have a drastic impact on the association results.

The meta-analysis was performed using an inverse-variance weighting method in METAL (Willer et al., 2010). The average allele frequency was calculated, and variants with low imputation quality (Rsq < 0.3) were filtered out, leaving 10,121,172 variants for meta-analysis.

Heterogeneity between studies was evaluated with Cochran’s Q-test. The inflation of results was assessed based on a genomic control (lambda).
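To make the quantities in this section concrete, the sketch below computes a fixed-effect inverse-variance weighted estimate, Cochran's Q, and the genomic control lambda using only the Python standard library. It is not METAL itself: the function names are assumptions, and the heterogeneity p-value is only returned for the two-study case (one degree of freedom), where it has a closed form through the complementary error function.

```python
import math
from statistics import NormalDist, median


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis of one variant."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    # With two studies Q has 1 degree of freedom: P(chi2_1 > q) = erfc(sqrt(q/2)).
    q_p = math.erfc(math.sqrt(q / 2.0)) if len(betas) == 2 else None
    return {"beta": beta, "se": se, "z": z, "p": p, "q": q, "q_p": q_p}


def genomic_lambda(z_scores):
    """Median observed chi-square divided by the median of a 1-df chi-square."""
    expected_median = NormalDist().inv_cdf(0.75) ** 2  # about 0.455
    return median([z * z for z in z_scores]) / expected_median


concordant = ivw_meta([0.2, 0.2], [0.1, 0.1])
discordant = ivw_meta([0.1, 0.3], [0.1, 0.1])
```

In the second toy call the two studies share the pooled estimate of 0.2 but disagree with each other, which shows up as Q = 2 rather than 0.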

Replicability of associations

The model-based method MAMBA (McGuire et al., 2021) was used to calculate the posterior probability of replication for each of the lead variants (PPR; the posterior probability that an SNP has a non-zero replicable effect). We defined PPR <0.1 as a low posterior probability of replication, following the original paper, whereas variants with a PPR >0.9 were considered consistent and likely to replicate in future studies. Variants with p<1 × 10⁻⁵ were clumped and combined with random pruned variants from the 1KGP AMR reference panel. Then, MAMBA was applied to the set of significant and non-significant variants.

Each of the lead variants was also tested for association with the main comorbidities in the SCOURGE cohort with logistic regression models (adjusted by the same base covariables as the GWAS).

Definition of the genetic risk loci and putative functional impact

Definition of lead variant and novel loci

To define the lead variants in the loci that were genome-wide significant, LD-clumping was performed on the meta-analysis data using a p-value threshold of <5 × 10⁻⁸, a clump distance of 1500 kb, an independence threshold of r² = 0.1, and the SCOURGE cohort genotype data as the LD reference panel. Independent loci were deemed a novel finding if they met the following criteria: (1) p-value<5 × 10⁻⁸ in the meta-analysis and p-value>5 × 10⁻⁸ in the HGI B2 ALL meta-analysis or in the HGI B2 AMR, AFR, and EUR analyses when considered separately; (2) Cochran’s Q-test p-value for heterogeneity of effects is <0.05/Nloci, where Nloci is the number of independent variants with p<5 × 10⁻⁸; and (3) the nearest gene has not been previously described in the latest HGIv7 update.
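The clumping step can be pictured as a greedy procedure: take the most significant variant as a lead, absorb nearby variants in LD with it, and repeat with what is left. The sketch below is a simplified stand-in written for illustration, not PLINK's clumping routine; the `clump` function, the variant dictionaries, and the pairwise `r2` lookup table are assumptions, and pairs absent from the lookup are treated as being in no LD.

```python
def clump(variants, r2, p_lead=5e-8, window_bp=1_500_000, r2_min=0.1):
    """Greedy LD clumping; returns lead variant IDs in order of significance.

    variants: list of dicts with keys "id", "chrom", "pos" and "p".
    r2: dict mapping frozenset({id_a, id_b}) to the pairwise r-squared.
    """
    ordered = sorted(variants, key=lambda v: v["p"])
    assigned = set()
    leads = []
    for v in ordered:
        if v["id"] in assigned or v["p"] >= p_lead:
            continue
        leads.append(v["id"])
        assigned.add(v["id"])
        for other in ordered:
            if other["id"] in assigned:
                continue
            nearby = (other["chrom"] == v["chrom"]
                      and abs(other["pos"] - v["pos"]) <= window_bp)
            ld = r2.get(frozenset((v["id"], other["id"])), 0.0)
            if nearby and ld >= r2_min:
                assigned.add(other["id"])  # absorbed into the clump led by v
    return leads


# Toy example with hypothetical variant IDs.
toy_variants = [
    {"id": "A", "chrom": 2, "pos": 1_000_000, "p": 1e-10},
    {"id": "B", "chrom": 2, "pos": 1_200_000, "p": 1e-9},
    {"id": "C", "chrom": 2, "pos": 1_100_000, "p": 1e-8},
]
toy_r2 = {
    frozenset(("A", "B")): 0.50,
    frozenset(("A", "C")): 0.01,
    frozenset(("B", "C")): 0.02,
}
toy_leads = clump(toy_variants, toy_r2)
```

Here B is absorbed by A (r² = 0.5), whereas C remains an independent lead because its LD with A is below the threshold.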

Annotation and initial mapping

Functional annotation was performed with FUMA (Watanabe et al., 2017) for those variants with a p-value<5 × 10⁻⁸ or in moderate-to-strong LD (r² > 0.6) with the lead variants, where the LD was calculated from the 1KGP AMR panel. Genetic risk loci were defined by collapsing LD blocks within 250 kb. Then, genes, scaled CADD v1.4 scores, and RegulomeDB v1.1 scores were annotated for the resulting variants with ANNOVAR in FUMA (Watanabe et al., 2017). Gene-based analysis was also performed using MAGMA (de Leeuw et al., 2015) as implemented in FUMA under the SNP-wide mean model using the 1KGP AMR reference panel. Significance was set at a threshold p<2.66 × 10⁻⁶ (which assumes that variants can be mapped to a total of 18,817 genes).
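The gene-based significance threshold quoted above is a Bonferroni correction of a 0.05 error rate over 18,817 genes. A short check of that arithmetic follows; the helper name `bonferroni_threshold` is an assumption introduced here for illustration.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


gene_threshold = bonferroni_threshold(18_817)
print(f"{gene_threshold:.2e}")  # prints 2.66e-06
```

The same helper would apply to any per-tissue correction in which 0.05 is divided by the number of genes tested.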

FUMA allowed us to perform initial gene mapping by two approaches: (1) positional mapping, which assigns variants to genes by physical distance using 10 kb windows; and (2) eQTL mapping based on GTEx v.8 data from whole blood, lungs, lymphocytes, and esophageal mucosa tissues, establishing a false discovery rate (FDR) of 0.05 to declare significance for variant–gene pairs.

Subsequently, to assign the variants to the most likely gene driving the association, we refined the candidate genes by fine mapping the discovered regions.

Bayesian fine-mapping

To conduct Bayesian fine mapping, credible sets for the genetic loci considered novel findings were calculated on the results from each of the three meta-analyses to identify a subset of variants most likely containing the causal variant at the 95% confidence level, assuming that there is a single causal variant and that it has been tested. We used corrcoverage (https://cran.rstudio.com/web/packages/corrcoverage/index.html) for R to calculate the posterior probabilities of the variant being causal for all variants with an r² > 0.1 with the leading SNP and within 1 Mb, except for the novel variant in chromosome 19, for which we used a window of 0.5 Mb. Variants were added to the credible set until the sum of the posterior probabilities was ≥0.95.
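The logic of a single-causal-variant credible set can be sketched as follows: convert each variant's effect estimate into an approximate Bayes factor, normalize the Bayes factors into posterior probabilities, and add variants in decreasing order of probability until 0.95 is reached. The code below is a generic illustration using Wakefield's approximate Bayes factor and is not the corrcoverage package; in particular, the prior standard deviation of 0.2 on the log odds ratio and the variant IDs are assumptions made for the example.

```python
import math


def log_abf(beta: float, se: float, prior_sd: float = 0.2) -> float:
    """Log of Wakefield's approximate Bayes factor in favour of association."""
    v = se ** 2            # sampling variance of the effect estimate
    w = prior_sd ** 2      # prior variance of the true effect (assumed)
    z = beta / se
    return 0.5 * math.log(v / (v + w)) + 0.5 * z * z * w / (v + w)


def credible_set(variants: dict, coverage: float = 0.95):
    """variants maps an ID to (beta, se); returns (credible set, posteriors)."""
    logs = {vid: log_abf(b, s) for vid, (b, s) in variants.items()}
    top = max(logs.values())
    # Subtract the maximum before exponentiating to avoid overflow.
    unnorm = {vid: math.exp(l - top) for vid, l in logs.items()}
    total = sum(unnorm.values())
    pp = {vid: u / total for vid, u in unnorm.items()}
    chosen, cumulative = [], 0.0
    for vid in sorted(pp, key=pp.get, reverse=True):
        chosen.append(vid)
        cumulative += pp[vid]
        if cumulative >= coverage:
            break
    return chosen, pp


# Toy locus with hypothetical IDs: one strong signal and two weak neighbours.
toy_locus = {"rs_lead": (0.30, 0.05), "rs_b": (0.10, 0.05), "rs_c": (0.00, 0.05)}
cs, posteriors = credible_set(toy_locus)
```

With one variant far more significant than its neighbours, nearly all of the posterior mass falls on it and the credible set contains that variant alone.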

VEP and V2G annotation

We used the Variant-to-Gene (V2G) score to prioritize the genes that were most likely affected by the functional evidence based on eQTL, chromatin interactions, in silico functional predictions, and distance between the prioritized variants and transcription start site (TSS), based on data from the Open Targets Genetics portal (Ghoussaini et al., 2021). The data integration and the weighting of each of the datasets are described in detail at https://genetics-docs.opentargets.org/our-approach/data-pipeline. V2G is a score for ranking the functional genomics evidence that supports the connection between variants and genes (the higher the score, the more likely the variant is to be functionally implicated in the assigned gene). We used VEP release 111 (https://www.ensembl.org/info/docs/tools/vep/index.html; accessed April 10, 2024; McLaren et al., 2016) to annotate the following: gene symbol, function (exonic, intronic, intergenic, non-coding RNA, etc.), impact, feature type, feature, and biotype.

We queried the GWAS Catalog (date of accession: 1/07/2024) for evidence of association of each of the prioritized genes with traits related to lung diseases or phenotypes. Lastly, those genes that were linked to COVID-19, infection, or lung diseases in the reviewed literature were classified as ‘literature evidence’.

Transcriptome-wide association studies

TWAS were conducted using the pretrained prediction models with MASHR-computed effect sizes on GTEx v8 datasets (Barbeira et al., 2019a; Barbeira et al., 2021). The results from the Latin American meta-analysis were harmonized and integrated with the prediction models through S-PrediXcan (Barbeira et al., 2018) for lung, whole blood, lymphocyte, and esophageal mucosal tissues. Statistical significance was set at p-value<0.05 divided by the number of genes that were tested for each tissue. Subsequently, we leveraged results for all 49 tissues and ran a multitissue TWAS (S-MultiXcan) to improve the power for association, as demonstrated recently (Barbeira et al., 2019b). TWAS was also performed using recently published gene expression datasets derived from a cohort of African Americans, Puerto Ricans, and Mexican Americans (GALA II-SAGE) (Kachuri et al., 2023).

Cross-population meta-analyses

We conducted two additional meta-analyses to investigate the ability of combining populations to replicate our discovered risk loci. This methodology enabled the comparison of effects and the significance of associations in the novel risk loci between the results from analyses that included or excluded other population groups.

The first meta-analysis comprised the five populations analyzed within HGI (B2-ALL). Additionally, to evaluate the three GIA components within the SCOURGE Latin American cohort (Bryc et al., 2010), we conducted a meta-analysis of the admixed AMR, EUR, and AFR cohorts (B2). All summary statistics were retrieved from the HGI repository. We applied the same meta-analysis methodology and filters as in the admixed AMR meta-analysis.

Cross-population polygenic risk score

A PGS for critical COVID-19 was derived by combining the variants associated with hospitalization or disease severity that have been discovered to date. We curated a list of lead variants that were (1) associated with either severe disease or hospitalization in the latest HGIv7 release (Kanai et al., 2023) (using the hospitalization weights) or (2) associated with severe disease in the latest GenOMICC meta-analysis (Pairo-Castineira et al., 2023) that were not reported in the latest HGI release. A total of 48 markers were used in the PGS model (see Supplementary file 13) since two variants were absent from our study.

Scores were calculated and normalized for the SCOURGE Latin American cohort with PLINK 1.9. This cross-ancestry PGS was used as a predictor for hospitalization (COVID-19-positive patients who were hospitalized vs COVID-19-positive patients who did not necessitate hospital admission) by fitting a logistic regression model. Prediction accuracy for the PGS was assessed by performing 500 bootstrap resamples of the increase in the pseudo-R2. We also divided the sample into deciles and percentiles to assess risk stratification. The models were fit for the dependent variable adjusting for sex, age, the first 10 PCs, and the sampling region (in the admixed AMR cohort) with and without the PGS, and the partial pseudo-R2 was computed and averaged among the resamples.
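As a rough illustration of the scoring and stratification steps, the sketch below builds a PGS as a weighted sum of effect-allele dosages, standardizes it, and computes the odds ratio of being a case in the top 10% of the score versus the rest. It is a simplified stand-in rather than the PLINK scoring routine or the regression models used in the study: the function names, the treatment of missing dosages as zero, the continuity correction for empty cells, and the toy data are all assumptions.

```python
from statistics import mean, pstdev


def polygenic_scores(dosages: dict, weights: dict) -> dict:
    """Standardised PGS per sample.

    dosages: sample ID -> {variant ID: effect-allele dosage in [0, 2]}.
    weights: variant ID -> per-allele weight (for example a log odds ratio).
    Missing dosages are treated as 0 here, which is a simplification.
    """
    raw = {s: sum(w * d.get(v, 0.0) for v, w in weights.items())
           for s, d in dosages.items()}
    values = list(raw.values())
    mu, sd = mean(values), pstdev(values)
    return {s: (x - mu) / sd for s, x in raw.items()}


def top_fraction_odds_ratio(scores: dict, is_case: dict, top: float = 0.10) -> float:
    """Odds ratio of case status for the top fraction of scores vs the rest."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, round(len(ranked) * top))
    top_ids = set(ranked[:n_top])
    a = sum(1 for s in top_ids if is_case[s])           # cases, top group
    b = len(top_ids) - a                                # controls, top group
    c = sum(1 for s in ranked[n_top:] if is_case[s])    # cases, rest
    d = len(ranked) - n_top - c                         # controls, rest
    if 0 in (a, b, c, d):
        # Continuity correction so the ratio stays defined (assumption).
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


# Toy cohort of 20 samples with a single hypothetical variant "v1".
toy_weights = {"v1": 0.5}
toy_dosages = {f"s{i}": {"v1": i / 10} for i in range(20)}
toy_status = {f"s{i}": (i == 19 or i < 6) for i in range(20)}

toy_scores = polygenic_scores(toy_dosages, toy_weights)
toy_or = top_fraction_odds_ratio(toy_scores, toy_status)
```

In the toy cohort the top decile holds one case and one control, while the remaining 18 samples hold 6 cases and 12 controls, giving an odds ratio of 2.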

A clinical severity scale was used in a multinomial regression model to further evaluate the power of this cross-ancestry PGS for risk stratification. These severity strata were defined as follows: (0) asymptomatic; (1) mild, that is, with symptoms, but without pulmonary infiltrates or need of oxygen therapy; (2) moderate, that is, with pulmonary infiltrates affecting <50% of the lungs or need of supplemental oxygen therapy; (3) severe disease, that is, with hospital admission and PaO2 <65 mmHg or SaO2 <90%, PaO2/FiO2<300, SaO2/FiO2<440, dyspnea, respiratory frequency ≥22 bpm, and infiltrates affecting >50% of the lungs; and (4) critical disease, that is, with an admission to the ICU or need of mechanical ventilation (invasive or noninvasive).

Acknowledgements

The contribution of the Centro Nacional de Genotipado (CEGEN) and the Centro de Supercomputación de Galicia (CESGA) for funding this project by providing supercomputing infrastructures is acknowledged. The authors are also particularly grateful for the supply of material and the collaboration of patients, health professionals from participating centers, and biobanks, namely Biobanc-Mur and the biobanks of the Complexo Hospitalario Universitario de A Coruña, Complexo Hospitalario Universitario de Santiago, Hospital Clínico San Carlos, Hospital La Fe, Hospital Universitario Puerta de Hierro Majadahonda-Instituto de Investigación Sanitaria Puerta de Hierro-Segovia de Arana, Hospital Ramón y Cajal, IDIBGI, IdISBa, IIS Biocruces Bizkaia, and IIS Galicia Sur, as well as the biobanks of the Sistema de Salud de Aragón, Sistema Sanitario Público de Andalucía, and Banco Nacional de ADN. This work was supported by the Instituto de Salud Carlos III (COV20_00622 to AC, COV20/00792 to MB, COV20_00181 to CA, COV20_1144 to MAJS and AFR, PI20/00876 to CF); European Union (ERDF) ‘A way of making Europe’; Fundación Amancio Ortega and Banco de Santander (to AC); Estrella de Levante SA and Colabora Mujer Association (to EGN); Obra Social La Caixa (to RB); Agencia Estatal de Investigación (RTC-2017-6471-1 to CF); Cabildo Insular de Tenerife (CGIEU0000219140 ‘Apuestas científicas del ITER para colaborar en la lucha contra la COVID-19’ to CF); and Fundación Canaria Instituto de Investigación Sanitaria de Canarias (PIFIISC20/57 to CF). SD-DA was supported by a Xunta de Galicia predoctoral fellowship.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Carlos Flores, Email: cflores@ull.edu.es.

Ángel Carracedo, Email: angel.carracedo@usc.es.

Siming Zhao, Dartmouth College, United States.

Murim Choi, Seoul National University, Republic of Korea.

Funding Information

This paper was supported by the following grants:

  • Instituto de Salud Carlos III COV20_00622 to Ángel Carracedo.

  • Instituto de Salud Carlos III COV20/00792 to Matilde Bustos.

  • Instituto de Salud Carlos III COV20_00181 to Carmen Ayuso.

  • Instituto de Salud Carlos III COV20_1144 to Amanda Fernández-Rodríguez, María A Jimenez-Sousa.

  • Instituto de Salud Carlos III PI20/00876 to Carlos Flores.

  • European Regional Development Fund A way of making Europe to Ángel Carracedo.

  • Fundación Amancio Ortega to Ángel Carracedo.

  • Banco Santander to Ángel Carracedo.

  • Estrella de Levante S.A. to Encarna Guillen-Navarro.

  • Colabora Mujer Association to Encarna Guillen-Navarro.

  • Obra Social La Caixa to Raúl C Baptista-Rosas.

  • Agencia Estatal de Investigación RTC-2017-6471-1 to Carlos Flores.

  • Cabildo Insular de Tenerife CGIEU0000219140 to Carlos Flores.

  • Fundación Canaria Instituto de Investigación Sanitaria de Canarias PIFIISC20/57 to Carlos Flores.

  • Xunta de Galicia to Silvia Diz-de Almeida.

  • Axencia GAIN to Silvia Diz-de Almeida.

Additional information

Competing interests

No competing interests declared.

Ethics

Human subjects: The Ethics and Scientific Committees of the participating centres and the Galician Ethics Committee (Ref 2020/197) gave ethical approval for this work. Recruitment of patients from IMSS (in Mexico City) was approved by the National Committee of Clinical Research, Instituto Mexicano del Seguro Social, Mexico (protocol R-2020-785-082).

Additional files

Supplementary file 1. Participating centers.
elife-93666-supp1.xlsx (14.5KB, xlsx)
Supplementary file 2. Independent variants with p-value<1e-05 in the SC-HGI_AMR GWAS meta-analysis (hg38).

EA: effect allele; NEA: non-effect allele; EAF: effect allele frequency; EAF_avg: averaged effect allele frequency; FreqSE: standard error of averaged effect allele frequency; SCOURGE_AMR: SCOURGE Latin-America; HGIB2_AMR: HGI meta-analysis of AMR studies.

elife-93666-supp2.xlsx (25KB, xlsx)
Supplementary file 3. Annotated SNPs in moderate-to-strong LD with lead SNPs of the genome-wide significant loci in the SC-HGI_AMR GWAS meta-analysis, with ANNOVAR.

NEA: non-effect allele; EA: effect allele; r2: maximum r2 of the SNP with one of the independent SNPs; IndSigSNP: the independent SNP which has the maximum r2 value with the SNP; dist: distance to the nearest gene; func: functional consequence of the SNP on the gene; CADD: CADD score; RDB: RegulomeDB score; minChrState: the minimum 15-core chromatin state across 127 tissues/cell types; commonChrState: the most common 15-core chromatin state across 127 tissues/cell types; posMapFilt: 1 if the SNP was used for positional mapping, 0 otherwise; eqtlMapFilt: 1 if the SNP was used for eQTL mapping, 0 otherwise.

elife-93666-supp3.xlsx (54KB, xlsx)
Supplementary file 4. Results from the MAGMA gene-based analysis in the SC-HGI_AMR GWAS meta-analysis (hg37).

NSNPS: number of SNPs in the gene; NPARAM: the number of relevant parameters used in the model; ZSTAT: z statistics.

elife-93666-supp4.xlsx (1.4MB, xlsx)
Supplementary file 5. Prioritized genes by eQTL and positional mapping by FUMA in the SC-HGI_AMR GWAS meta-analysis results (hg37).

HUGO: HGNC gene symbol; pLI: pLI score from ExAC database, probability of being intolerant to loss of function (higher the score, higher the intolerance); ncRVIS: non-coding residual variation intolerance score (higher the score, higher intolerance to non-coding variation); posMapSNPs: number of SNPs mapped by positional mapping; posMapMaxCADD: the maximum CADD score of mapped SNPs by positional mapping; eqtlMapSNPS: the number of SNPs mapped to the genes based on eQTL mapping; eqtlMapminP: the minimum eQTL p-value of mapped SNPs; eqtlMapminQ: the minimum eQTL FDR of mapped SNPs; eqtlMapts: tissue of mapped eQTLs; eqtlDirection: consequential direction of mapped eQTL SNPs after aligning the risk alleles; minGwasP: minimum GWAS p-value of mapped eQTLs; IndSigSNPs: independent SNPs that are in LD with the mapped SNPs.

elife-93666-supp5.xlsx (17.4KB, xlsx)
Supplementary file 6. Fine-mapped credible set derived with corrcoverage (95%) for the associated region in chromosome 2 (BAZ2B).
elife-93666-supp6.xlsx (18KB, xlsx)
Supplementary file 7. VEP annotations for the variants included in the fine-mapped credible sets for the novel associated loci in chromosome 2 (hg38).
elife-93666-supp7.xlsx (24.2KB, xlsx)
Supplementary file 8. V2G scores for the variants included in the fine-mapped credible sets in the novel risk loci from chromosomes 2 and 16 (hg38).

Shaded in green, the prioritized gene by the V2G score.

elife-93666-supp8.xlsx (18.7KB, xlsx)
Supplementary file 9. MultiXcan results for the SC-HGI_AMR GWAS meta-analysis.

N: number of tissues available for the gene; n_indep: number of independent components of variation kept among the tissues' predictions; p_i_best: best p-value of single tissue S-prediXcan association; t_i_best: name of best single tissue S-prediXcan association; p_i_worst: worst p-value of single tissue S-prediXcan association; t_i_worst: name of worst single tissue S-prediXcan association; eigen_max: eigenvalue of the top independent component in the SVD decomposition of predicted expression correlation; eigen_min: eigenvalue of the last independent component in the SVD decomposition of predicted expression correlation; eigen_min_kept: eigenvalue of the smallest independent component that was kept in the SVD decomposition of predicted expression correlation; z_min: minimum z-score among single-tissue S-prediXcan associations; z_max: maximum z-score among single-tissue S-prediXcan associations; z_mean: mean z-score among single tissue S-prediXcan associations; z_sd: standard deviation of the mean z-score among single-tissue S-prediXcan associations; tmi: trace of T*T', where T is the correlation of predicted expression levels for different tissues multiplied by its SVD pseudo-inverse and is an estimate for the number of independent components of variation in predicted expression across tissues.

elife-93666-supp9.xlsx (4.2MB, xlsx)
Supplementary file 10. Top 10 genes for the TWAS trained with the GALA II-SAGE models in admixed Americans.

Bonferroni correction thresholds: Pooled p<4.19E-06; PR p<4.99E-06; MX p<5.19E-06; AA p<4.67E-06. Var_g: variance of the gene expression; pred_perf_r2: cross-validated R2 of tissue model’s correlation to gene’s measured transcriptome; pred_perf_qval: q-value of tissue model’s correlation to gene’s measured transcriptome; n_snps_used: number of SNPs from GWAS used in S-prediXcan analysis; n_snps_in_cov: number of SNPs in the covariance matrix; n_snps_in_model: number of SNPs in the model; best_gwas_p: the highest p-value from GWAS SNPs used in this model; largest_weight: the largest weight in this model.

elife-93666-supp10.xlsx (18.6KB, xlsx)
Supplementary file 11. Independent variants with p-value<1e-05 in the SC-HGI_ALL GWAS meta-analysis (hg38).

EA: effect allele; NEA: non-effect allele; EAF_avg: averaged effect allele frequency; FreqSE: standard error of averaged effect allele frequency.

elife-93666-supp11.xlsx (68.5KB, xlsx)
Supplementary file 12. Results of the 40 lead variants associated with COVID-19 hospitalization in the HGIv7 (hg38).

SC-HGI_ALL: meta-analysis SCOURGE-HGI_ALL; SC-HGI_AMR: meta-analysis SCOURGE-HGI_AMR; SC-HGI_3POP: meta-analysis SCOURGE-HGI_3POP.

Supplementary file 13. Independent variants with p-value<1e-05 in the SC-HGI_3POP GWAS meta-analysis (hg38).

EA: effect allele; NEA: non-effect allele; EAF_avg: averaged effect allele frequency; FreqSE: standard error of averaged effect allele frequency.

elife-93666-supp13.xlsx (88.1KB, xlsx)
Supplementary file 14. Instruments used in the polygenic risk score model (hg38).
elife-93666-supp14.xlsx (13.5KB, xlsx)
Supplementary file 15. Multinomial regression results.

Reference class for the multinomial regression is ‘asymptomatic’.

elife-93666-supp15.xlsx (18.8KB, xlsx)
MDAR checklist

Data availability

Summary statistics from the SCOURGE Latin American GWAS and the analysis scripts are available from the public repository https://github.com/CIBERER/Scourge-COVID19 (copy archived at CIBERER, 2024).

The following previously published datasets were used:

GTEx Consortium. 2020. GTEx V8. GTEx Portal. Single-Tissue cis-QTL

1000 Genomes Project Consortium. 2016. 1000 Genomes Phase 3. PLINK 2.0 Resources. 1000 Genomes phase 3

COVID-19 Host Genetics Initiative. 2022. COVID19-hg GWAS meta-analyses round 7. COVID-19 hg repository. r7

References

  1. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Research. 2009;19:1655–1664. doi: 10.1101/gr.094052.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Angeli F, Zappa M, Reboldi G, Gentile G, Trapasso M, Spanevello A, Verdecchia P. The spike effect of acute respiratory syndrome coronavirus 2 and coronavirus disease 2019 vaccines on blood pressure. European Journal of Internal Medicine. 2023;109:12–21. doi: 10.1016/j.ejim.2022.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Atkinson EG, Maihofer AX, Kanai M, Martin AR, Karczewski KJ, Santoro ML, Ulirsch JC, Kamatani Y, Okada Y, Finucane HK, Koenen KC, Nievergelt CM, Daly MJ, Neale BM. Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power. Nature Genetics. 2021;53:195–204. doi: 10.1038/s41588-020-00766-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR, 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Barbeira AN, Dickinson SP, Bonazzola R, Zheng J, Wheeler HE, Torres JM, Torstenson ES, Shah KP, Garcia T, Edwards TL, Stahl EA, Huckins LM, GTEx Consortium, Nicolae DL, Cox NJ, Im HK. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nature Communications. 2018;9:1825. doi: 10.1038/s41467-018-03621-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Barbeira AN, Bonazzola R, Gamazon ER, Liang Y, Park Y, Kim-Hellmuth S, Wang G, Jiang Z, Zhou D, Hormozdiari F, Liu B, Rao A, Hamel AR, Pividori MD, Aguet F, Bastarache L, Jordan DM, Verbanck M, Do R, Im HK. GWAS and GTEx QTL integration. 0.1. Zenodo. 2019a. doi: 10.5281/zenodo.3518299. [DOI]
  7. Barbeira AN, Pividori M, Zheng J, Wheeler HE, Nicolae DL, Im HK. Integrating predicted transcriptome from multiple tissues improves association detection. PLOS Genetics. 2019b;15:e1007889. doi: 10.1371/journal.pgen.1007889. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Barbeira AN, Bonazzola R, Gamazon ER, Liang Y, Park Y, Kim-Hellmuth S, Wang G, Jiang Z, Zhou D, Hormozdiari F, Liu B, Rao A, Hamel AR, Pividori MD, Aguet F, GTEx GWAS Working Group, Bastarache L, Jordan DM, Verbanck M, Do R, GTEx Consortium, Stephens M, Ardlie K, McCarthy M, Montgomery SB, Segrè AV, Brown CD, Lappalainen T, Wen X, Im HK. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biology. 2021;22:49. doi: 10.1186/s13059-020-02252-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bastard P, Hsiao KC, Zhang Q, Choin J, Best E, Chen J, Gervais A, Bizien L, Materna M, Harmant C, Roux M, Hawley NL, Weeks DE, McGarvey ST, Sandoval K, Barberena-Jonas C, Quinto-Cortés CD, Hagelberg E, Mentzer AJ, Robson K, Coulibaly B, Seeleuthner Y, Bigio B, Li Z, Uzé G, Pellegrini S, Lorenzo L, Sbihi Z, Latour S, Besnard M, Adam de Beaumais T, Jacqz Aigrain E, Béziat V, Deka R, Esera Tulifau L, Viali S, Reupena MS, Naseri T, McNaughton P, Sarkozy V, Peake J, Blincoe A, Primhak S, Stables S, Gibson K, Woon ST, Drake KM, Hill AVS, Chan CY, King R, Ameratunga R, Teiti I, Aubry M, Cao-Lormeau VM, Tangye SG, Zhang SY, Jouanguy E, Gray P, Abel L, Moreno-Estrada A, Minster RL, Quintana-Murci L, Wood AC, Casanova JL. A loss-of-function IFNAR1 allele in Polynesia underlies severe viral diseases in homozygotes. The Journal of Experimental Medicine. 2022;219:e20220028. doi: 10.1084/jem.20220028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Brunette GJ, Jamalruddin MA, Baldock RA, Clark NL, Bernstein KA. Evolution-based screening enables genome-wide prioritization and discovery of DNA repair genes. PNAS. 2019;116:19593–19599. doi: 10.1073/pnas.1906559116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Bryc K, Velez C, Karafet T, Moreno-Estrada A, Reynolds A, Auton A, Hammer M, Bustamante CD, Ostrer H. Colloquium paper: genome-wide patterns of population structure and admixture among Hispanic/Latino populations. PNAS. 2010;107:8954–8961. doi: 10.1073/pnas.0914618107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. CIBERER. Scourge-COVID19. swh:1:rev:4051b1e44a25033f8ff2bbfe2469b641c18246b3. Software Heritage. 2024. https://archive.softwareheritage.org/swh:1:dir:2678aa2c995f50a9ec01edc33485a43e1ed7d021;origin=https://github.com/CIBERER/Scourge-COVID19;visit=swh:1:snp:5fe4675a63aa619b0a4092aae7167b1561d001f5;anchor=swh:1:rev:4051b1e44a25033f8ff2bbfe2469b641c18246b3
  13. Cruz R, Diz-de Almeida S, López de Heredia M, Quintela I, Ceballos FC, Pita G, Lorenzo-Salazar JM, González-Montelongo R, Gago-Domínguez M, Sevilla Porras M, Tenorio Castaño JA, Nevado J, Aguado JM, Aguilar C, Aguilera-Albesa S, Almadana V, Almoguera B, Alvarez N, Andreu-Bernabeu Á, Arana-Arri E, Arango C, Arranz MJ, Artiga M-J, Baptista-Rosas RC, Barreda-Sánchez M, Belhassen-Garcia M, Bezerra JF, Bezerra MAC, Boix-Palop L, Brion M, Brugada R, Bustos M, Calderón EJ, Carbonell C, Castano L, Castelao JE, Conde-Vicente R, Cordero-Lorenzana ML, Cortes-Sanchez JL, Corton M, Darnaude MT, De Martino-Rodríguez A, Del Campo-Pérez V, Diaz de Bustamante A, Domínguez-Garrido E, Luchessi AD, Eiros R, Estigarribia Sanabria GM, Carmen Fariñas M, Fernández-Robelo U, Fernández-Rodríguez A, Fernández-Villa T, Gil-Fournier B, Gómez-Arrue J, González Álvarez B, Gonzalez Bernaldo de Quirós F, González-Peñas J, Gutiérrez-Bautista JF, Herrero MJ, Herrero-Gonzalez A, Jimenez-Sousa MA, Lattig MC, Liger Borja A, Lopez-Rodriguez R, Mancebo E, Martín-López C, Martín V, Martinez-Nieto O, Martinez-Lopez I, Martinez-Resendez MF, Martinez-Perez A, Mazzeu JF, Merayo Macías E, Minguez P, Moreno Cuerda V, Silbiger VN, Oliveira SF, Ortega-Paino E, Parellada M, Paz-Artal E, Santos NPC, Pérez-Matute P, Perez P, Pérez-Tomás ME, Perucho T, Pinsach-Abuin ML, Pompa-Mera EN, Porras-Hurtado GL, Pujol A, Ramiro León S, Resino S, Fernandes MR, Rodríguez-Ruiz E, Rodriguez-Artalejo F, Rodriguez-Garcia JA, Ruiz Cabello F, Ruiz-Hornillos J, Ryan P, Soria JM, Souto JC, Tamayo E, Tamayo-Velasco A, Taracido-Fernandez JC, Teper A, Torres-Tobar L, Urioste M, Valencia-Ramos J, Yáñez Z, Zarate R, Nakanishi T, Pigazzini S, Degenhardt F, Butler-Laporte G, Maya-Miles D, Bujanda L, Bouysran Y, Palom A, Ellinghaus D, Martínez-Bueno M, Rolker S, Amitrano S, Roade L, Fava F, Spinner CD, Prati D, Bernardo D, Garcia F, Darcis G, Fernández-Cadenas I, Holter JC, Banales JM, Frithiof R, Duga S, Asselta R, Pereira AC, 
Romero-Gómez M, Nafría-Jiménez B, Hov JR, Migeotte I, Renieri A, Planas AM, Ludwig KU, Buti M, Rahmouni S, Alarcón-Riquelme ME, Schulte EC, Franke A, Karlsen TH, Valenti L, Zeberg H, Richards B, Ganna A, Boada M, de Rojas I, Ruiz A, Sánchez-Juan P, Real LM, SCOURGE Cohort Group, HOSTAGE Cohort Group, GRA@CE Cohort Group, Guillen-Navarro E, Ayuso C, González-Neira A, Riancho JA, Rojas-Martinez A, Flores C, Lapunzina P, Carracedo A. Novel genes and sex differences in COVID-19 severity. Human Molecular Genetics. 2022;31:3789–3806. doi: 10.1093/hmg/ddac132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. de Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA: generalized gene-set analysis of GWAS data. PLOS Computational Biology. 2015;11:e1004219. doi: 10.1371/journal.pcbi.1004219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Degenhardt F, Ellinghaus D, Juzenas S, Lerga-Jaso J, Wendorff M, Maya-Miles D, Uellendahl-Werth F, ElAbd H, Rühlemann MC, Arora J, Özer O, Lenning OB, Myhre R, Vadla MS, Wacker EM, Wienbrandt L, Blandino Ortiz A, de Salazar A, Garrido Chercoles A, Palom A, Ruiz A, Garcia-Fernandez A-E, Blanco-Grau A, Mantovani A, Zanella A, Holten AR, Mayer A, Bandera A, Cherubini A, Protti A, Aghemo A, Gerussi A, Ramirez A, Braun A, Nebel A, Barreira A, Lleo A, Teles A, Kildal AB, Biondi A, Caballero-Garralda A, Ganna A, Gori A, Glück A, Lind A, Tanck A, Hinney A, Carreras Nolla A, Fracanzani AL, Peschuck A, Cavallero A, Dyrhol-Riise AM, Ruello A, Julià A, Muscatello A, Pesenti A, Voza A, Rando-Segura A, Solier A, Schmidt A, Cortes B, Mateos B, Nafria-Jimenez B, Schaefer B, Jensen B, Bellinghausen C, Maj C, Ferrando C, de la Horra C, Quereda C, Skurk C, Thibeault C, Scollo C, Herr C, Spinner CD, Gassner C, Lange C, Hu C, Paccapelo C, Lehmann C, Angelini C, Cappadona C, Azuure C, COVICAT study group, Aachen Study (COVAS) Bianco C, Cea C, Sancho C, Hoff DAL, Galimberti D, Prati D, Haschka D, Jiménez D, Pestaña D, Toapanta D, Muñiz-Diaz E, Azzolini E, Sandoval E, Binatti E, Scarpini E, Helbig ET, Casalone E, Urrechaga E, Paraboschi EM, Pontali E, Reverter E, Calderón EJ, Navas E, Solligård E, Contro E, Arana-Arri E, Aziz F, Garcia F, García Sánchez F, Ceriotti F, Martinelli-Boneschi F, Peyvandi F, Kurth F, Blasi F, Malvestiti F, Medrano FJ, Mesonero F, Rodriguez-Frias F, Hanses F, Müller F, Hemmrich-Stanisak G, Bellani G, Grasselli G, Pezzoli G, Costantino G, Albano G, Cardamone G, Bellelli G, Citerio G, Foti G, Lamorte G, Matullo G, Baselli G, Kurihara H, Neb H, My I, Kurth I, Hernández I, Pink I, de Rojas I, Galván-Femenia I, Holter JC, Afset JE, Heyckendorf J, Kässens J, Damås JK, Rybniker J, Altmüller J, Ampuero J, Martín J, Erdmann J, Banales JM, Badia JR, Dopazo J, Schneider J, Bergan J, Barretina J, Walter J, Hernández Quero J, Goikoetxea J, Delgado J, Guerrero JM, 
Fazaal J, Kraft J, Schröder J, Risnes K, Banasik K, Müller KE, Gaede KI, Garcia-Etxebarria K, Tonby K, Heggelund L, Izquierdo-Sanchez L, Bettini LR, Sumoy L, Sander LE, Lippert LJ, Terranova L, Nkambule L, Knopp L, Gustad LT, Garbarino L, Santoro L, Téllez L, Roade L, Ostadreza M, Intxausti M, Kogevinas M, Riveiro-Barciela M, Berger MM, Schaefer M, Niemi MEK, Gutiérrez-Stampa MA, Carrabba M, Figuera Basso ME, Valsecchi MG, Hernandez-Tejero M, Vehreschild MJGT, Manunta M, Acosta-Herrera M, D’Angiò M, Baldini M, Cazzaniga M, Grimsrud MM, Cornberg M, Nöthen MM, Marquié M, Castoldi M, Cordioli M, Cecconi M, D’Amato M, Augustin M, Tomasi M, Boada M, Dreher M, Seilmaier MJ, Joannidis M, Wittig M, Mazzocco M, Ciccarelli M, Rodríguez-Gandía M, Bocciolone M, Miozzo M, Imaz Ayo N, Blay N, Chueca N, Montano N, Braun N, Ludwig N, Marx N, Martínez N, Norwegian SARS-CoV-2 Study group. Cornely OA, Witzke O, Palmieri O, Pa Study Group. Faverio P, Preatoni P, Bonfanti P, Omodei P, Tentorio P, Castro P, Rodrigues PM, España PP, Hoffmann P, Rosenstiel P, Schommers P, Suwalski P, de Pablo R, Ferrer R, Bals R, Gualtierotti R, Gallego-Durán R, Nieto R, Carpani R, Morilla R, Badalamenti S, Haider S, Ciesek S, May S, Bombace S, Marsal S, Pigazzini S, Klein S, Pelusi S, Wilfling S, Bosari S, Volland S, Brunak S, Raychaudhuri S, Schreiber S, Heilmann-Heimbach S, Aliberti S, Ripke S, Dudman S, Wesse T, Zheng T, STORM Study group, The Humanitas Task Force, The Humanitas Gavazzeni Task Force. 
Bahmer T, Eggermann T, Illig T, Brenner T, Pumarola T, Feldt T, Folseraas T, Gonzalez Cejudo T, Landmesser U, Protzer U, Hehr U, Rimoldi V, Monzani V, Skogen V, Keitel V, Kopfnagel V, Friaza V, Andrade V, Moreno V, Albrecht W, Peter W, Poller W, Farre X, Yi X, Wang X, Khodamoradi Y, Karadeniz Z, Latiano A, Goerg S, Bacher P, Koehler P, Tran F, Zoller H, Schulte EC, Heidecker B, Ludwig KU, Fernández J, Romero-Gómez M, Albillos A, Invernizzi P, Buti M, Duga S, Bujanda L, Hov JR, Lenz TL, Asselta R, de Cid R, Valenti L, Karlsen TH, Cáceres M, Franke A. Detailed stratified GWAS analysis for severe COVID-19 in four European populations. Human Molecular Genetics. 2022;31:3945–3966. doi: 10.1093/hmg/ddac158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Duncan CJA, Skouboe MK, Howarth S, Hollensen AK, Chen R, Børresen ML, Thompson BJ, Stremenova Spegarova J, Hatton CF, Stæger FF, Andersen MK, Whittaker J, Paludan SR, Jørgensen SE, Thomsen MK, Mikkelsen JG, Heilmann C, Buhas D, Øbro NF, Bay JT, Marquart HV, de la Morena MT, Klejka JA, Hirschfeld M, Borgwardt L, Forss I, Masmas T, Poulsen A, Noya F, Rouleau G, Hansen T, Zhou S, Albrechtsen A, Alizadehfar R, Allenspach EJ, Hambleton S, Mogensen TH. Life-threatening viral disease in a novel form of autosomal recessive IFNAR2 deficiency in the Arctic. The Journal of Experimental Medicine. 2022;219:e20212427. doi: 10.1084/jem.20212427. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Ghoussaini M, Mountjoy E, Carmona M, Peat G, Schmidt EM, Hercules A, Fumis L, Miranda A, Carvalho-Silva D, Buniello A, Burdett T, Hayhurst J, Baker J, Ferrer J, Gonzalez-Uriarte A, Jupp S, Karim MA, Koscielny G, Machlitt-Northen S, Malangone C, Pendlington ZM, Roncaglia P, Suveges D, Wright D, Vrousgou O, Papa E, Parkinson H, MacArthur JAL, Todd JA, Barrett JC, Schwartzentruber J, Hulcoop DG, Ochoa D, McDonagh EM, Dunham I. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Research. 2021;49:D1311–D1320. doi: 10.1093/nar/gkaa840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Gioia U, Tavella S, Martínez-Orellana P, Cicio G, Colliva A, Ceccon M, Cabrini M, Henriques AC, Fumagalli V, Paldino A, Presot E, Rajasekharan S, Iacomino N, Pisati F, Matti V, Sepe S, Conte MI, Barozzi S, Lavagnino Z, Carletti T, Volpe MC, Cavalcante P, Iannacone M, Rampazzo C, Bussani R, Tripodo C, Zacchigna S, Marcello A, d’Adda di Fagagna F. SARS-CoV-2 infection induces DNA damage, through CHK1 degradation and impaired 53BP1 recruitment, and cellular senescence. Nature Cell Biology. 2023;25:550–564. doi: 10.1038/s41556-023-01096-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Gómez-Carballa A, Rivero-Calle I, Pardo-Seco J, Gómez-Rial J, Rivero-Velasco C, Rodríguez-Núñez N, Barbeito-Castiñeiras G, Pérez-Freixo H, Cebey-López M, Barral-Arca R, Rodriguez-Tenreiro C, Dacosta-Urbieta A, Bello X, Pischedda S, Currás-Tuala MJ, Viz-Lasheras S, Martinón-Torres F, Salas A, Antonio AG, Cristina CS. A multi-tissue study of immune gene expression profiling highlights the key role of the nasal epithelium in COVID-19 severity. Environmental Research. 2022;210:112890. doi: 10.1016/j.envres.2022.112890. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Gupta S, Singh AK, Prajapati KS, Kushwaha PP, Shuaib M, Kumar S. Emerging role of ZBTB7A as an oncogenic driver and transcriptional repressor. Cancer Letters. 2020;483:22–34. doi: 10.1016/j.canlet.2020.04.015. [DOI] [PubMed] [Google Scholar]
  21. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--A metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics. 2009;42:377–381. doi: 10.1016/j.jbi.2008.08.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, McLeod L, Delacqua G, Delacqua F, Kirby J, Duda SN, REDCap Consortium. The REDCap consortium: Building an international community of software platform partners. Journal of Biomedical Informatics. 2019;95:103208. doi: 10.1016/j.jbi.2019.103208. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Horowitz JE, Kosmicki JA, Damask A, Sharma D, Roberts GHL, Justice AE, Banerjee N, Coignet MV, Yadav A, Leader JB, Marcketta A, Park DS, Lanche R, Maxwell E, Knight SC, Bai X, Guturu H, Sun D, Baltzell A, Kury FSP, Backman JD, Girshick AR, O’Dushlaine C, McCurdy SR, Partha R, Mansfield AJ, Turissini DA, Li AH, Zhang M, Mbatchou J, Watanabe K, Gurski L, McCarthy SE, Kang HM, Dobbyn L, Stahl E, Verma A, Sirugo G, Regeneron Genetics Center, Ritchie MD, Jones M, Balasubramanian S, Siminovitch K, Salerno WJ, Shuldiner AR, Rader DJ, Mirshahi T, Locke AE, Marchini J, Overton JD, Carey DJ, Habegger L, Cantor MN, Rand KA, Hong EL, Reid JG, Ball CA, Baras A, Abecasis GR, Ferreira MAR. Genome-wide analysis provides genetic evidence that ACE2 influences COVID-19 risk and yields risk scores associated with severe disease. Nature Genetics. 2022;54:382–392. doi: 10.1038/s41588-021-01006-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Im JY, Kim BK, Lee KW, Chun SY, Kang MJ, Won M. DDIAS promotes STAT3 activation by preventing STAT3 recruitment to PTPRM in lung cancer cells. Oncogenesis. 2020;9:1. doi: 10.1038/s41389-019-0187-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Im JY, Kang MJ, Kim BK, Won M. DDIAS, DNA damage-induced apoptosis suppressor, is a potential therapeutic target in cancer. Experimental & Molecular Medicine. 2023;1–7:00974-6. doi: 10.1038/s12276-023-00974-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Kachuri L, Mak ACY, Hu D, Eng C, Huntsman S, Elhawary JR, Gupta N, Gabriel S, Xiao S, Keys KL, Oni-Orisan A, Rodríguez-Santana JR, LeNoir MA, Borrell LN, Zaitlen NA, Williams LK, Gignoux CR, Burchard EG, Ziv E. Gene expression in African Americans, Puerto Ricans and Mexican Americans reveals ancestry-specific patterns of genetic architecture. Nature Genetics. 2023;55:952–963. doi: 10.1038/s41588-023-01377-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Kanai M, Andrews SJ, Cordioli M, Stevens C, Neale BM, Daly M, Ganna A, Pathak GA, Iwasaki A, Karjalainen J, Mehtonen J, Pirinen M, Chwialkowska K, Trankiem A, Balaconis MK, Veerapen K, Wolford BN, Ahmad HF, Andrews S. A second update on mapping the human genetic architecture of COVID-19. Nature. 2023;621:06355-3. doi: 10.1038/s41586-023-06355-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Karlsson EK, Kwiatkowski DP, Sabeti PC. Natural selection and infectious disease in human populations. Nature Reviews. Genetics. 2014;15:379–393. doi: 10.1038/nrg3734. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Kousathanas A, Pairo-Castineira E, Rawlik K, Stuckey A, Odhams CA, Walker S, Russell CD, Malinauskas T, Wu Y, Millar J, Shen X, Elliott KS, Griffiths F, Oosthuyzen W, Morrice K, Keating S, Wang B, Rhodes D, Klaric L, Zechner M, Parkinson N, Siddiq A, Goddard P, Donovan S, Maslove D, Nichol A, Semple MG, Zainy T, Maleady-Crowe F, Todd L, Salehi S, Knight J, Elgar G, Chan G, Arumugam P, Patch C, Rendon A, Bentley D, Kingsley C, Kosmicki JA, Horowitz JE, Baras A, Abecasis GR, Ferreira MAR, Justice A, Mirshahi T, Oetjens M, Rader DJ, Ritchie MD, Verma A, Fowler TA, Shankar-Hari M, Summers C, Hinds C, Horby P, Ling L, McAuley D, Montgomery H, Openshaw PJM, Elliott P, Walsh T, Tenesa A, GenOMICC investigators, 23andMe investigators, COVID-19 Human Genetics Initiative, Fawkes A, Murphy L, Rowan K, Ponting CP, Vitart V, Wilson JF, Yang J, Bretherick AD, Scott RH, Hendry SC, Moutsianas L, Law A, Caulfield MJ, Baillie JK. Whole-genome sequencing reveals host factors underlying critical COVID-19. Nature. 2022;607:97–103. doi: 10.1038/s41586-022-04576-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Kwok AJ, Mentzer A, Knight JC. Host genetics and infectious disease: new tools, insights and translational opportunities. Nature Reviews. Genetics. 2021;22:137–153. doi: 10.1038/s41576-020-00297-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Li YR, Keating BJ. Trans-ethnic genome-wide association studies: advantages and challenges of mapping in diverse populations. Genome Medicine. 2014;6:91. doi: 10.1186/s13073-014-0091-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Li Y, Gong H, Wang P, Zhu Y, Peng H, Cui Y, Li H, Liu J, Wang Z. The emerging role of ISWI chromatin remodeling complexes in cancer. Journal of Experimental & Clinical Cancer Research. 2021;40:346. doi: 10.1186/s13046-021-02151-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Mahnke K, Guo M, Lee S, Sepulveda H, Swain SL, Nussenzweig M, Steinman RM. The dendritic cell receptor for endocytosis, DEC-205, can recycle and enhance antigen presentation via major histocompatibility complex class II-positive lysosomal compartments. The Journal of Cell Biology. 2000;151:673–684. doi: 10.1083/jcb.151.3.673. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Mao X, Bigham AW, Mei R, Gutierrez G, Weiss KM, Brutsaert TD, Leon-Velarde F, Moore LG, Vargas E, McKeigue PM, Shriver MD, Parra EJ. A genomewide admixture mapping panel for Hispanic/Latino populations. American Journal of Human Genetics. 2007;80:1171–1178. doi: 10.1086/518564. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. McGuire D, Jiang Y, Liu M, Weissenkampen JD, Eckert S, Yang L, Chen F, Berg A, Vrieze S, Jiang B, Li Q, Liu DJ, GWAS and Sequencing Consortium of Alcohol and Nicotine Use (GSCAN). Model-based assessment of replicability for genome-wide association meta-analysis. Nature Communications. 2021;12:1964. doi: 10.1038/s41467-021-21226-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, Flicek P, Cunningham F. The ensembl variant effect predictor. Genome Biology. 2016;17:122. doi: 10.1186/s13059-016-0974-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Mester R, Hou K, Ding Y, Meeks G, Burch KS, Bhattacharya A, Henn BM, Pasaniuc B. Impact of cross-ancestry genetic architecture on GWAS in admixed populations. bioRxiv. 2023 doi: 10.1101/2023.01.20.524946. [DOI] [PMC free article] [PubMed]
  38. Mitchell HD, Eisfeld AJ, Sims AC, McDermott JE, Matzke MM, Webb-Robertson B-JM, Tilton SC, Tchitchek N, Josset L, Li C, Ellis AL, Chang JH, Heegel RA, Luna ML, Schepmoes AA, Shukla AK, Metz TO, Neumann G, Benecke AG, Smith RD, Baric RS, Kawaoka Y, Katze MG, Waters KM. A network integration approach to predict conserved regulators related to pathogenicity of influenza and SARS-CoV respiratory viruses. PLOS ONE. 2013;8:e69374. doi: 10.1371/journal.pone.0069374. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Namkoong H, Edahiro R, Takano T, Nishihara H, Shirai Y, Sonehara K, Tanaka H, Azekawa S, Mikami Y, Lee H, Hasegawa T, Okudela K, Okuzaki D, Motooka D, Kanai M, Naito T, Yamamoto K, Wang QS, Saiki R, Ishihara R, Matsubara Y, Hamamoto J, Hayashi H, Yoshimura Y, Tachikawa N, Yanagita E, Hyugaji T, Shimizu E, Katayama K, Kato Y, Morita T, Takahashi K, Harada N, Naito T, Hiki M, Matsushita Y, Takagi H, Aoki R, Nakamura A, Harada S, Sasano H, Kabata H, Masaki K, Kamata H, Ikemura S, Chubachi S, Okamori S, Terai H, Morita A, Asakura T, Sasaki J, Morisaki H, Uwamino Y, Nanki K, Uchida S, Uno S, Nishimura T, Ishiguro T, Isono T, Shibata S, Matsui Y, Hosoda C, Takano K, Nishida T, Kobayashi Y, Takaku Y, Takayanagi N, Ueda S, Tada A, Miyawaki M, Yamamoto M, Yoshida E, Hayashi R, Nagasaka T, Arai S, Kaneko Y, Sasaki K, Tagaya E, Kawana M, Arimura K, Takahashi K, Anzai T, Ito S, Endo A, Uchimura Y, Miyazaki Y, Honda T, Tateishi T, Tohda S, Ichimura N, Sonobe K, Sassa CT, Nakajima J, Nakano Y, Nakajima Y, Anan R, Arai R, Kurihara Y, Harada Y, Nishio K, Ueda T, Azuma M, Saito R, Sado T, Miyazaki Y, Sato R, Haruta Y, Nagasaki T, Yasui Y, Hasegawa Y, Mutoh Y, Kimura T, Sato T, Takei R, Hagimoto S, Noguchi Y, Yamano Y, Sasano H, Ota S, Nakamori Y, Yoshiya K, Saito F, Yoshihara T, Wada D, Iwamura H, Kanayama S, Maruyama S, Yoshiyama T, Ohta K, Kokuto H, Ogata H, Tanaka Y, Arakawa K, Shimoda M, Osawa T, Tateno H, Hase I, Yoshida S, Suzuki S, Kawada M, Horinouchi H, Saito F, Mitamura K, Hagihara M, Ochi J, Uchida T, Baba R, Arai D, Ogura T, Takahashi H, Hagiwara S, Nagao G, Konishi S, Nakachi I, Murakami K, Yamada M, Sugiura H, Sano H, Matsumoto S, Kimura N, Ono Y, Baba H, Suzuki Y, Nakayama S, Masuzawa K, Namba S, Suzuki K, Naito Y, Liu Y-C, Takuwa A, Sugihara F, Wing JB, Sakakibara S, Hizawa N, Shiroyama T, Miyawaki S, Kawamura Y, Nakayama A, Matsuo H, Maeda Y, Nii T, Noda Y, Niitsu T, Adachi Y, Enomoto T, Amiya S, Hara R, Yamaguchi Y, Murakami T, Kuge T, Matsumoto K, Yamamoto 
Y, Yamamoto M, Yoneda M, Kishikawa T, Yamada S, Kawabata S, Kijima N, Takagaki M, Sasa N, Ueno Y, Suzuki M, Takemoto N, Eguchi H, Fukusumi T, Imai T, Fukushima M, Kishima H, Inohara H, Tomono K, Kato K, Takahashi M, Matsuda F, Hirata H, Takeda Y, Koh H, Manabe T, Funatsu Y, Ito F, Fukui T, Shinozuka K, Kohashi S, Miyazaki M, Shoko T, Kojima M, Adachi T, Ishikawa M, Takahashi K, Inoue T, Hirano T, Kobayashi K, Takaoka H, Watanabe K, Miyazawa N, Kimura Y, Sado R, Sugimoto H, Kamiya A, Kuwahara N, Fujiwara A, Matsunaga T, Sato Y, Okada T, Hirai Y, Kawashima H, Narita A, Niwa K, Sekikawa Y, Nishi K, Nishitsuji M, Tani M, Suzuki J, Nakatsumi H, Ogura T, Kitamura H, Hagiwara E, Murohashi K, Okabayashi H, Mochimaru T, Nukaga S, Satomi R, Oyamada Y, Mori N, Baba T, Fukui Y, Odate M, Mashimo S, Makino Y, Yagi K, Hashiguchi M, Kagyo J, Shiomi T, Fuke S, Saito H, Tsuchida T, Fujitani S, Takita M, Morikawa D, Yoshida T, Izumo T, Inomata M, Kuse N, Awano N, Tone M, Ito A, Nakamura Y, Hoshino K, Maruyama J, Ishikura H, Takata T, Odani T, Amishima M, Hattori T, Shichinohe Y, Kagaya T, Kita T, Ohta K, Sakagami S, Koshida K, Hayashi K, Shimizu T, Kozu Y, Hiranuma H, Gon Y, Izumi N, Nagata K, Ueda K, Taki R, Hanada S, Kawamura K, Ichikado K, Nishiyama K, Muranaka H, Nakamura K, Hashimoto N, Wakahara K, Sakamoto K, Omote N, Ando A, Kodama N, Kaneyama Y, Maeda S, Kuraki T, Matsumoto T, Yokote K, Nakada T-A, Abe R, Oshima T, Shimada T, Harada M, Takahashi T, Ono H, Sakurai T, Shibusawa T, Kimizuka Y, Kawana A, Sano T, Watanabe C, Suematsu R, Sageshima H, Yoshifuji A, Ito K, Takahashi S, Ishioka K, Nakamura M, Masuda M, Wakabayashi A, Watanabe H, Ueda S, Nishikawa M, Chihara Y, Takeuchi M, Onoi K, Shinozuka J, Sueyoshi A, Nagasaki Y, Okamoto M, Ishihara S, Shimo M, Tokunaga Y, Kusaka Y, Ohba T, Isogai S, Ogawa A, Inoue T, Fukuyama S, Eriguchi Y, Yonekawa A, Kan-O K, Matsumoto K, Kanaoka K, Ihara S, Komuta K, Inoue Y, Chiba S, Yamagata K, Hiramatsu Y, Kai H, Asano K, Oguma T, Ito Y, 
Hashimoto S, Yamasaki M, Kasamatsu Y, Komase Y, Hida N, Tsuburai T, Oyama B, Takada M, Kanda H, Kitagawa Y, Fukuta T, Miyake T, Yoshida S, Ogura S, Abe S, Kono Y, Togashi Y, Takoi H, Kikuchi R, Ogawa S, Ogata T, Ishihara S, Kanehiro A, Ozaki S, Fuchimoto Y, Wada S, Fujimoto N, Nishiyama K, Terashima M, Beppu S, Yoshida K, Narumoto O, Nagai H, Ooshima N, Motegi M, Umeda A, Miyagawa K, Shimada H, Endo M, Ohira Y, Watanabe M, Inoue S, Igarashi A, Sato M, Sagara H, Tanaka A, Ohta S, Kimura T, Shibata Y, Tanino Y, Nikaido T, Minemura H, Sato Y, Yamada Y, Hashino T, Shinoki M, Iwagoe H, Takahashi H, Fujii K, Kishi H, Kanai M, Imamura T, Yamashita T, Yatomi M, Maeno T, Hayashi S, Takahashi M, Kuramochi M, Kamimaki I, Tominaga Y, Ishii T, Utsugi M, Ono A, Tanaka T, Kashiwada T, Fujita K, Saito Y, Seike M, Watanabe H, Matsuse H, Kodaka N, Nakano C, Oshio T, Hirouchi T, Makino S, Egi M, Biobank Japan Project. Omae Y, Nannya Y, Ueno T, Katayama K, Ai M, Fukui Y, Kumanogoh A, Sato T, Hasegawa N, Tokunaga K, Ishii M, Koike R, Kitagawa Y, Kimura A, Imoto S, Miyano S, Ogawa S, Kanai T, Fukunaga K, Okada Y. DOCK2 is involved in the host genetics and biology of severe COVID-19. Nature. 2022;609:754–760. doi: 10.1038/s41586-022-05163-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Niemi MEK, Karjalainen J, Liao RG, Neale BM, Daly M, Ganna A, Pathak GA, Andrews SJ, Kanai M, Veerapen K, Fernandez-Cadenas I, Schulte EC, Striano P, Marttila M, Minica C, Marouli E, Karim MA, Wendt FR, Savage J. Mapping the human genetic architecture of COVID-19. Nature. 2021;600:472–477. doi: 10.1038/s41586-021-03767-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Pairo-Castineira E, Rawlik K, Bretherick AD, Qi T, Wu Y, Nassiri I, McConkey GA, Zechner M, Klaric L, Griffiths F, Oosthuyzen W, Kousathanas A, Richmond A, Millar J, Russell CD, Malinauskas T, Thwaites R, Morrice K, Keating S, Maslove D, Nichol A, Semple MG, Knight J, Shankar-Hari M, Summers C, Hinds C, Horby P, Ling L, McAuley D, Montgomery H, Openshaw PJM, Begg C, Walsh T, Tenesa A, Flores C, Riancho JA, Rojas-Martinez A, Lapunzina P, GenOMICC Investigators. SCOURGE Consortium. ISARICC Investigators. 23andMe COVID-19 Team. Yang J, Ponting CP, Wilson JF, Vitart V, Abedalthagafi M, Luchessi AD, Parra EJ, Cruz R, Carracedo A, Fawkes A, Murphy L, Rowan K, Pereira AC, Law A, Fairfax B, Hendry SC, Baillie JK. GWAS and meta-analysis identifies 49 genetic variants underlying critical COVID-19. Nature. 2023;617:764–768. doi: 10.1038/s41586-023-06034-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Pereira AC, Bes TM, Velho M, Marques E, Jannes CE, Valino KR, Dinardo CL, Costa SF, Duarte AJS, Santos AR, Mitne-Neto M, Medina-Pestana J, Krieger JE. Genetic risk factors and COVID-19 severity in Brazil: results from BRACOVID study. Human Molecular Genetics. 2022;31:3021–3031. doi: 10.1093/hmg/ddac045. [DOI] [PubMed] [Google Scholar]
  43. Peterson RE, Kuchenbaecker K, Walters RK, Chen C-Y, Popejoy AB, Periyasamy S, Lam M, Iyegbe C, Strawbridge RJ, Brick L, Carey CE, Martin AR, Meyers JL, Su J, Chen J, Edwards AC, Kalungi A, Koen N, Majara L, Schwarz E, Smoller JW, Stahl EA, Sullivan PF, Vassos E, Mowry B, Prieto ML, Cuellar-Barboza A, Bigdeli TB, Edenberg HJ, Huang H, Duncan LE. Genome-wide association studies in ancestrally diverse populations: opportunities, methods, pitfalls, and recommendations. Cell. 2019;179:589–603. doi: 10.1016/j.cell.2019.08.051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Policard M, Jain S, Rego S, Dakshanamurthy S. Immune characterization and profiles of SARS-CoV-2 infected patients reveals potential host therapeutic targets and SARS-CoV-2 oncogenesis mechanism. Virus Research. 2021;301:198464. doi: 10.1016/j.virusres.2021.198464. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature. 2016;538:161–164. doi: 10.1038/538161a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC. PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007;81:559–575. doi: 10.1086/519795. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Rosenberg NA, Huang L, Jewett EM, Szpiech ZA, Jankovic I, Boehnke M. Genome-wide association studies in diverse populations. Nature Reviews. Genetics. 2010;11:356–366. doi: 10.1038/nrg2760. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Silva-Aguiar RP, Peruchetti DB, Rocco PRM, Schmaier AH, E Silva PMR, Martins MA, Carvalho VF, Pinheiro AAS, Caruso-Neves C. Role of the renin-angiotensin system in the development of severe COVID-19 in hypertensive patients. American Journal of Physiology. Lung Cellular and Molecular Physiology. 2020;319:L596–L602. doi: 10.1152/ajplung.00286.2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Sims AC, Tilton SC, Menachery VD, Gralinski LE, Schäfer A, Matzke MM, Webb-Robertson BJM, Chang J, Luna ML, Long CE, Shukla AK, Burkett SE, Zornetzer G, Tseng CTK, Metz TO, Pickles R, McWeeney S, Smith RD, Katze MG, Waters KM, Baric RS. Release of severe acute respiratory syndrome coronavirus nuclear import block enhances host transcription in human lung cells. Journal of Virology. 2013;87:3885–3902. doi: 10.1128/JVI.02520-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Sirugo G, Williams SM, Tishkoff SA. The missing diversity in human genetic studies. Cell. 2019;177:26–31. doi: 10.1016/j.cell.2019.02.048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nature Communications. 2017;8:1826. doi: 10.1038/s41467-017-01261-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Wei J, Alfajaro MM, DeWeirdt PC, Hanna RE, Lu-Culligan WJ, Cai WL, Strine MS, Zhang SM, Graziano VR, Schmitz CO, Chen JS, Mankowski MC, Filler RB, Ravindra NG, Gasque V, de Miguel FJ, Patil A, Chen H, Oguntuyo KY, Abriola L, Surovtseva YV, Orchard RC, Lee B, Lindenbach BD, Politi K, van Dijk D, Kadoch C, Simon MD, Yan Q, Doench JG, Wilen CB. Genome-wide CRISPR screens reveal host factors critical for SARS-CoV-2 infection. Cell. 2021;184:76–91. doi: 10.1016/j.cell.2020.10.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Wei J, Patil A, Collings CK, Alfajaro MM, Liang Y, Cai WL, Strine MS, Filler RB, DeWeirdt PC, Hanna RE, Menasche BL, Ökten A, Peña-Hernández MA, Klein J, McNamara A, Rosales R, McGovern BL, Luis Rodriguez M, García-Sastre A, White KM, Qin Y, Doench JG, Yan Q, Iwasaki A, Zwaka TP, Qi J, Kadoch C, Wilen CB. Pharmacological disruption of mSWI/SNF complex activity restricts SARS-CoV-2 infection. Nature Genetics. 2023;55:471–483. doi: 10.1038/s41588-023-01307-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Wojcik GL, Graff M, Nishimura KK, Tao R, Haessler J, Gignoux CR, Highland HM, Patel YM, Sorokin EP, Avery CL, Belbin GM, Bien SA, Cheng I, Cullina S, Hodonsky CJ, Hu Y, Huckins LM, Jeff J, Justice AE, Kocarnik JM, Lim U, Lin BM, Lu Y, Nelson SC, Park S-SL, Poisner H, Preuss MH, Richard MA, Schurmann C, Setiawan VW, Sockell A, Vahi K, Verbanck M, Vishnu A, Walker RW, Young KL, Zubair N, Acuña-Alonso V, Ambite JL, Barnes KC, Boerwinkle E, Bottinger EP, Bustamante CD, Caberto C, Canizales-Quinteros S, Conomos MP, Deelman E, Do R, Doheny K, Fernández-Rhodes L, Fornage M, Hailu B, Heiss G, Henn BM, Hindorff LA, Jackson RD, Laurie CA, Laurie CC, Li Y, Lin D-Y, Moreno-Estrada A, Nadkarni G, Norman PJ, Pooler LC, Reiner AP, Romm J, Sabatti C, Sandoval K, Sheng X, Stahl EA, Stram DO, Thornton TA, Wassel CL, Wilkens LR, Winkler CA, Yoneyama S, Buyske S, Haiman CA, Kooperberg C, Le Marchand L, Loos RJF, Matise TC, North KE, Peters U, Kenny EE, Carlson CS. Genetic analyses of diverse populations improves discovery for complex traits. Nature. 2019;570:514–518. doi: 10.1038/s41586-019-1310-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Won KJ, Im JY, Yun CO, Chung KS, Kim YJ, Lee JS, Jung YJ, Kim BK, Song KB, Kim YH, Chun HK, Jung KE, Kim MH, Won M. Human Noxin is an anti-apoptotic protein in response to DNA damage of A549 non-small cell lung carcinoma. International Journal of Cancer. 2014;134:2595–2604. doi: 10.1002/ijc.28600. [DOI] [PubMed] [Google Scholar]
  57. Xia L, Wang X, Liu L, Fu J, Xiao W, Liang Q, Han X, Huang S, Sun L, Gao Y, Zhang C, Yang L, Wang L, Qian L, Zhou Y. lnc-BAZ2B promotes M2 macrophage activation and inflammation in children with asthma through stabilizing BAZ2B pre-mRNA. The Journal of Allergy and Clinical Immunology. 2021;147:921–932. doi: 10.1016/j.jaci.2020.06.034. [DOI] [PubMed] [Google Scholar]
  58. Yang Q, Tang J, Cao J, Liu F, Fu M, Xue B, Zhou A, Chen S, Liu J, Zhou Y, Shi Y, Peng W, Chen X. SARS-CoV-2 infection activates CREB/CBP in cellular cyclic AMP-dependent pathways. Journal of Medical Virology. 2023;95:e28383. doi: 10.1002/jmv.28383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Yoneyama M, Suhara W, Fukuhara Y, Fukuda M, Nishida E, Fujita T. Direct triggering of the type I interferon system by virus infection: activation of a transcription factor complex containing IRF-3 and CBP/p300. The EMBO Journal. 1998;17:1087–1095. doi: 10.1093/emboj/17.4.1087. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Zheng X, Davis JW. SAIGEgds-an efficient statistical tool for large-scale PheWAS with mixed models. Bioinformatics. 2021;37:728–730. doi: 10.1093/bioinformatics/btaa731. [DOI] [PubMed] [Google Scholar]
  61. Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, LeFaive J, VandeHaar P, Gagliano SA, Gifford A, Bastarache LA, Wei WQ, Denny JC, Lin M, Hveem K, Kang HM, Abecasis GR, Willer CJ, Lee S. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nature Genetics. 2018;50:1335–1341. doi: 10.1038/s41588-018-0184-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Zhu X, Trimarco JD, Williams CA, Barrera A, Reddy TE, Heaton NS. ZBTB7A promotes virus-host homeostasis during human coronavirus 229E infection. Cell Reports. 2022;41:111540. doi: 10.1016/j.celrep.2022.111540. [DOI] [PMC free article] [PubMed] [Google Scholar]

eLife assessment

Siming Zhao 1

The authors conducted a valuable GWAS meta-analysis for COVID-19 hospitalization in admixed American populations and prioritized risk variants and genes. The evidence supporting the claims of the authors is solid. The work will be of interest to scientists studying the genetic basis of COVID pathogenesis.

Reviewer #1 (Public review):

Anonymous

Summary:

This paper conducted a GWAS meta-analysis for COVID-19 hospitalization among admixed American populations. The authors identified four genome-wide significant associations, including two novel loci (BAZ2B and DDIAS), and an additional risk locus near CREBBP using cross-ancestry meta-analysis. They utilized multiple strategies to prioritize risk variants and target genes. Finally, they constructed and assessed a polygenic risk score model with 49 variants associated with critical COVID-19 conditions.

Strengths:

Given that most of the previous studies were done in European ancestries, this study provides unique findings about the genetics of COVID-19 in admixed American populations. The GWAS data would be a valuable resource for the community. The authors conducted comprehensive analyses using multiple different strategies, including Bayesian fine mapping, colocalization, TWAS, etc., to prioritize risk variants and target genes. The polygenic risk score (PGS) result demonstrated the ability of the cross-population PGS model for COVID-19 risk stratification.

Weaknesses:

(1) One of the major limitations of this study is that the GWAS sample size is relatively small, which limits its power.

(2) Lack of replication cohort.

(3) Colocalization and TWAS used eQTL data from GTEx data, which are mainly from European ancestries.

Comments on latest version:

The authors addressed most of my concerns.

Reviewer #2 (Public review):

Anonymous

This is a genome-wide association study of COVID-19 in individuals of admixed American ancestry (AMR) recruited from Brazil, Colombia, Ecuador, Mexico, Paraguay and Spain. After quality control and admixture analysis, a total of 3,512 individuals were interrogated for 10,671,028 genetic variants (genotyped + imputed). The genetic association results for these cohorts were meta-analyzed with the results from The Host Genetics Initiative (HGI), involving 3,077 cases and 66,686 controls. The authors found two novel genetic loci associated with COVID-19 at 2q24.2 (rs13003835) and 11q14.1 (rs77599934), and two other independent signals at 3p21.31 (rs35731912) and 6p21.1 (rs2477820) already reported as associated with COVID-19 in previous GWASs. Additional meta-analysis with other HGI studies also suggested risk variants near CREBBP, ZBTB7A and CASC20 genes.

Strengths:

These findings rely on state-of-the-art methods in the field of Statistical Genomics and help to address the issue of a low number of GWASs in non-European populations, ultimately contributing to reducing health inequalities across the globe.

Weaknesses:

There is no replication cohort, as acknowledged by the authors (page 29, line 587) and no experimental validation to assess the biological effect of putative causal variants/genes. Thus, the study provides good evidence of association, rather than causation, between the genetic variants and COVID-19.

Comments on latest version:

The issues identified in the first round of review were well addressed by the authors in the revised version of the manuscript.

Reviewer #3 (Public review):

Anonymous

Summary:

In the context of the SCOURGE consortium's research, the authors conduct a GWAS meta-analysis on 4,702 hospitalized individuals of admixed American descent suffering from COVID-19. This study identified four significant genetic associations, including two loci initially discovered in Latin American cohorts. Furthermore, a trans-ethnic meta-analysis highlighted an additional novel risk locus in the CREBBP gene, underscoring the critical role of genetic diversity in understanding the pathogenesis of COVID-19.

Strengths:

(1) The study identified two novel severe COVID-19 loci (BAZ2B and DDIAS) by the largest GWAS meta-analysis for COVID-19 hospitalization in admixed Americans.

(2) With a trans-ethnic meta-analysis, an additional risk locus near CREBBP was identified.

Weaknesses:

(1) The GWAS power is limited due to the relatively small number of cases.

(2) There is no replication study for the novel severe COVID-19 loci, which may lead to false positive findings.

(3) The variants selected for the PGS appear arbitrary and may not leverage the GWAS findings.

(4) The TWAS models were predominantly trained on European samples, and there is no replication study for the findings as well.

eLife. 2024 Oct 3;13:RP93666. doi: 10.7554/eLife.93666.3.sa4

Author response

Silvia Diz-de Almeida 1, Raquel Cruz 2, Andre D Luchessi 3, José M Lorenzo-Salazar 4, Miguel López de Heredia 5, Inés Quintela 6, Rafaela González-Montelongo 7, Vivian Nogueira Silbiger 8, Marta Sevilla Porras 9, Jair Antonio Tenorio Castaño 10, Julian Nevado 11, Jose María Aguado 12, Carlos Aguilar 13, Sergio Aguilera-Albesa 14, Virginia Almadana 15, Berta Almoguera 16, Nuria Alvarez 17, Álvaro Andreu-Bernabeu 18, Eunate Arana-Arri 19, Celso Arango 20, María J Arranz 21, Maria-Jesus Artiga 22, Raúl C Baptista-Rosas 23, María Barreda- Sánchez 24, Moncef Belhassen-Garcia 25, Joao F Bezerra 26, Marcos AC Bezerra 27, Lucía Boix-Palop 28, María Brion 29, Ramón Brugada 30, Matilde Bustos 31, Enrique J Calderón 32, Cristina Carbonell 33, Luis Castano 34, Jose E Castelao 35, Rosa Conde-Vicente 36, M Lourdes Cordero-Lorenzana 37, Jose L Cortes-Sanchez 38, Marta Corton 39, M Teresa Darnaude 40, Alba De Martino-Rodríguez 41, Victor del Campo-Pérez 42, Aranzazu Diaz de Bustamante 43, Elena Domínguez-Garrido 44, Rocío Eirós 45, María Carmen Fariñas 46, María J Fernandez-Nestosa 47, Uxía Fernández-Robelo 48, Amanda Fernández-Rodríguez 49, Tania Fernández-Villa 50, Manuela Gago-Dominguez 51, Belén Gil-Fournier 52, Javier Gómez-Arrue 53, Beatriz González Álvarez 54, Fernan Gonzalez Bernaldo de Quirós 55, Anna González-Neira 56, Javier González-Peñas 57, Juan F Gutiérrez-Bautista 58, María José Herrero 59, Antonio Herrero-Gonzalez 60, María A Jimenez-Sousa 61, María Claudia Lattig 62, Anabel Liger Borja 63, Rosario Lopez-Rodriguez 64, Esther Mancebo 65, Caridad Martín-López 66, Vicente Martín 67, Oscar Martinez-Nieto 68, Iciar Martinez-Lopez 69, Michel F Martinez-Resendez 70, Angel Martinez-Perez 71, Juliana F Mazzeu 72, Eleuterio Merayo Macías 73, Pablo Minguez 74, Victor Moreno Cuerda 75, Silviene F Oliveira 76, Eva Ortega-Paino 77, Mara Parellada 78, Estela Paz-Artal 79, Ney PC Santos 80, Patricia Pérez-Matute 81, Patricia Perez 82, M Elena Pérez-Tomás 83, Teresa Perucho 84, 
Mellina Pinsach-Abuin 85, Guillermo Pita 86, Ericka N Pompa-Mera 87, Gloria L Porras-Hurtado 88, Aurora Pujol 89, Soraya Ramiro León 90, Salvador Resino 91, Marianne R Fernandes 92, Emilio Rodríguez-Ruiz 93, Fernando Rodriguez-Artalejo 94, José A Rodriguez-Garcia 95, Francisco Ruiz-Cabello 96, Javier Ruiz-Hornillos 97, Pablo Ryan 98, José Manuel Soria 99, Juan Carlos Souto 100, Eduardo Tamayo 101, Alvaro Tamayo-Velasco 102, Juan Carlos Taracido-Fernandez 103, Alejandro Teper 104, Lilian Torres-Tobar 105, Miguel Urioste 106, Juan Valencia-Ramos 107, Zuleima Yáñez 108, Ruth Zarate 109, Itziar de Rojas 110, Agustín Ruiz 111, Pascual Sánchez 112, Luis Miguel Real 113, Encarna Guillen-Navarro 114, Carmen Ayuso 115, Esteban Parra 116, José A Riancho 117, Augusto Rojas-Martinez 118, Carlos Flores 119, Pablo Lapunzina 120, Ángel Carracedo 121

The following is the authors’ response to the original reviews.

Public Reviews:

Reviewer #1 (Public Review):

Summary:

This paper conducted a GWAS meta-analysis for COVID-19 hospitalization among admixed American populations. The authors identified four genome-wide significant associations, including two novel loci (BAZ2B and DDIAS), and an additional risk locus near CREBBP using cross-ancestry meta-analysis. They utilized multiple strategies to prioritize risk variants and target genes. Finally, they constructed and assessed a polygenic risk score model with 49 variants associated with critical COVID-19 conditions.

Strengths:

Given that most of the previous studies were done in European ancestries, this study provides unique findings about the genetics of COVID-19 in admixed American populations. The GWAS data would be a valuable resource for the community. The authors conducted comprehensive analyses using multiple different strategies, including Bayesian fine mapping, colocalization, TWAS, etc., to prioritize risk variants and target genes. The polygenic risk score (PGS) result demonstrated the ability of the cross-population PGS model for COVID-19 risk stratification.

Thank you very much for the positive comments and the willingness to revise this manuscript.

Weaknesses:

(1) One of the major limitations of this study is that the GWAS sample size is relatively small, which limits its power.

(2) The fine mapping section is unclear and there is a lack of information. The authors assumed one causal signal per locus, and only provided credible sets, but did not provide posterior inclusion probabilities (PIP) for the variants to be causal.

(3) Colocalization and TWAS used eQTL data from GTEx data, which are mainly from European ancestries. It is unclear how much impact the ancestry mismatch would have on the result. The readers should be cautious when interpreting the results and designing follow-up studies.

We agree that the sample size is relatively small. Even so, it was sufficient to reveal novel risk loci, which supports the robustness of the main findings. We have indicated this limitation at the end of the Discussion section.

Thank you for raising this point. As suggested, we have also used SuSiE, which allows for more than one causal signal per locus. However, in this case the results did not differ from those obtained with the original Bayesian fine mapping performed with corrcoverage. Regarding the PIP, at the fine-mapping stage we are inclined to put more weight on the functional annotations of the variants in the credible set than on their statistical contributions to the signal. For this reason, we prefer not to rely on the PIP of the variants and instead prioritize variants that were enriched in functional annotations.

This is a good point regarding the lack of diversity in GTEx data. We have also used data from AMR populations (GALA II-SAGE models), although these were only available for blood tissue. Regarding the ancestry mismatch between datasets, several studies have attempted to explore its impact. Gay et al. (PMID: 32912333) studied local ancestry effects on eQTLs from the GTEx consortium and concluded that adjusting eQTLs for local ancestry yields only a modest improvement over using global ancestry (as done in GTEx). Moreover, the colocalization results did not differ significantly between adjustment for local ancestry and adjustment for global ancestry. In addition, Mogil et al. (PMID: 30096133) observed that genes with higher heritability share genetic architecture between populations. Nevertheless, both studies showed decreased power and poorer predictive performance for gene expression because of reduced diversity in eQTL analyses. As a consequence of the ancestry mismatch, we now warn readers that this may compromise signal detection (Discussion, lines 531-533).

Reviewer #2 (Public Review):

This is a genome-wide association study of COVID-19 in individuals of admixed American ancestry (AMR) recruited from Brazil, Colombia, Ecuador, Mexico, Paraguay, and Spain. After quality control and admixture analysis, a total of 3,512 individuals were interrogated for 10,671,028 genetic variants (genotyped + imputed). The genetic association results for these cohorts were meta-analyzed with the results from The Host Genetics Initiative (HGI), involving 3,077 cases and 66,686 controls. The authors found two novel genetic loci associated with COVID-19 at 2q24.2 (rs13003835) and 11q14.1 (rs77599934), and two other independent signals at 3p21.31 (rs35731912) and 6p21.1 (rs2477820) already reported as associated with COVID-19 in previous GWASs. Additional meta-analysis with other HGI studies also suggested risk variants near CREBBP, ZBTB7A, and CASC20 genes.

Strengths:

These findings rely on state-of-the-art methods in the field of Statistical Genomics and help to address the issue of a low number of GWASs in non-European populations, ultimately contributing to reducing health inequalities across the globe.

Thank you very much for the positive comments and the willingness to revise this manuscript.

Weaknesses:

There is no replication cohort, as acknowledged by the authors (page 29, line 587), and no experimental validation to assess the biological effect of putative causal variants/genes. Thus, the study provides good evidence of association, rather than causation, between the genetic variants and COVID-19. Lastly, I consider it crucial to report the results for the SCOURGE Latin American GWAS, in addition to its meta-analysis with HGI results, since HGI data has a different phenotype scheme (Hospitalized COVID vs Population) compared to SCOURGE (Hospitalized COVID vs Non-hospitalized COVID).

We essentially agree with the reviewer that one of the main limitations of the study is the lack of a replication stage, because all available datasets were used in a one-stage analysis. To aid interpretation of the findings in the absence of a replication stage, we have now assessed the replicability of the novel loci using the Meta-Analysis Model-based Assessment of replicability (MAMBA) approach (PMID: 33785739) and included the posterior probabilities of replication in Table 2. We also further explored the potential replicability of signals in other populations. We agree that the results should be interpreted in terms of associations given the lack of functional validation of the main findings, so we have slightly modified the discussion.

As suggested, the SCOURGE Latin American GWAS summary is now accessible by direct request to the Consortium GitHub repository (https://github.com/CIBERER/Scourge-COVID19) (lines 797-799). We have also included the results from the SCOURGE GWAS analysis for the replication of the 40 lead variants in Supplementary Table 12. Results from the SCOURGE GWAS for the lead variants in the AMR meta-analysis with HGI were already included in Supplementary Table 2. Of note, we have not been able to conduct the meta-analysis with the same hospitalization scheme as in the HGI study, since the population-specific results for those analyses were not publicly released. However, sensitivity analyses included in the supplementary material of the COVID-19 Host Genetics Initiative (2021) indicated that there were no significant differences in effects (odds ratios) between analyses using population controls and those using only non-hospitalized COVID-19 patients.

Reviewer #3 (Public Review):

Summary:

In the context of the SCOURGE consortium's research, the authors conduct a GWAS meta-analysis on 4,702 hospitalized individuals of admixed American descent suffering from COVID-19. This study identified four significant genetic associations, including two loci initially discovered in Latin American cohorts. Furthermore, a trans-ethnic meta-analysis highlighted an additional novel risk locus in the CREBBP gene, underscoring the critical role of genetic diversity in understanding the pathogenesis of COVID-19.

Strengths:

(1) The study identified two novel severe COVID-19 loci (BAZ2B and DDIAS) by the largest GWAS meta-analysis for COVID-19 hospitalization in admixed Americans.

(2) With a trans-ethnic meta-analysis, an additional risk locus near CREBBP was identified.

Thank you very much for the positive comments and the willingness to revise this manuscript.

Weaknesses:

(1) The GWAS power is limited due to the relatively small number of cases.

(2) There is no replication study for the novel severe COVID-19 loci, which may lead to false positive findings.

We agree that the sample size is relatively small. Even so, it was sufficient to reveal novel risk loci, which supports the robustness of the main findings. We have indicated this limitation at the end of the Discussion section.

Regarding the lack of a replication study, we have now assessed the replicability of the novel loci using the Meta-Analysis Model-based Assessment of replicability (MAMBA) approach (PMID: 33785739). We have included the posterior probabilities of replication in Table 2.

(3) Significant differences exist in the ages between cases and controls, which could potentially introduce biased confounders. I'm curious about how the authors treated age as a covariate. For instance, did they use ten-year intervals? This needs clarification for reproducibility.

Thank you for raising this point. Age was included as a continuous variable. This is now indicated in line 667 (Material and Methods).

(4) "Those in the top PGS decile exhibited a 5.90-fold (95% CI=3.29-10.60, p=2.79×10⁻⁹) greater risk compared to individuals in the lowest decile". I would recommend comparing with the 40-60% PGS decile rather than the lowest decile, as the lowest PGS decile does not represent 'normal controls'.

Thank you. In the revised version, the PGS categories were compared following this recommendation (lines 461-463).
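To illustrate the kind of comparison the reviewer recommends, the following is a minimal, unadjusted sketch that contrasts the top PGS decile with the 40-60% band using a 2x2 odds ratio and a Wald 95% confidence interval. It is not the authors' analysis, which may well have used regression models with covariates; the function names and the rank-based grouping are assumptions made for illustration only.

```python
import math


def odds_ratio_wald(cases_exposed, controls_exposed, cases_ref, controls_ref):
    """Odds ratio with a Wald 95% CI from a 2x2 table.

    All four cell counts must be greater than zero, otherwise the
    ratio or its standard error is undefined.
    """
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_exposed + 1 / controls_exposed + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


def top_decile_vs_middle(scores, is_case):
    """Compare the top 10% of scores against the 40-60% band (hypothetical helper).

    scores  : list of PGS values, one per individual
    is_case : list of 0/1 flags (1 = hospitalized case), same order as scores
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    # Integer arithmetic avoids floating-point surprises at the cut points.
    middle = order[n * 4 // 10 : n * 6 // 10]
    top = order[n * 9 // 10 :]

    def count(indices):
        cases = sum(is_case[i] for i in indices)
        return cases, len(indices) - cases

    cases_top, controls_top = count(top)
    cases_mid, controls_mid = count(middle)
    return odds_ratio_wald(cases_top, controls_top, cases_mid, controls_mid)
```

Using the middle band as the reference makes the estimate read as risk relative to a typical individual rather than relative to the least-exposed tail of the distribution.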

(5) In the field of PGS, it's common to require an independent dataset for training and testing the PGS model. Here, there seems to be an overfitting issue due to using the same subjects for both training and testing the variants.

We are sorry for the misunderstanding. In fact, we followed the standard practice to avoid overfitting of the PGS model and used different training and testing datasets. The training data (GWAS) were from the HGI-B2 ALL meta-analysis, in which our AMR GWAS was not included. The PGS model was then tested in the SCOURGE AMR cohort. However, it is true that we did test a version of the PGS that added the newly discovered variants in the SCOURGE cohort. To avoid potential overfitting from adding the new loci, we have excluded from the manuscript the results that included the newly discovered variants.

(6) The variants selected for the PGS appear arbitrary and may not leverage the GWAS findings without an independent training dataset.

Again, we are sorry for the misunderstanding. The PGS model was built with 43 variants associated with hospitalization or severity in the HGI v7 results and 7 variants that were discovered by the GenOMICC consortium in their latest study and were not in the latest HGI release. The variants are listed in Supplementary Table 14, where we have now annotated the discovery GWAS.
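As a rough illustration of how a score of this kind is typically computed, the sketch below sums effect-allele dosages weighted by effect sizes taken from an external discovery GWAS. It is a generic weighted-sum example rather than the authors' pipeline; the variant identifiers, weights, and the choice to skip variants without a genotype are hypothetical.

```python
def polygenic_score(weights, dosages):
    """Weighted sum of effect-allele dosages for one individual.

    weights : dict mapping variant ID -> log odds ratio from the discovery GWAS
    dosages : dict mapping variant ID -> effect-allele dosage (0 to 2)

    Variants with no dosage available are skipped (an assumption of this
    sketch); the number of variants actually used is returned alongside
    the score so that incomplete scores can be flagged.
    """
    score = 0.0
    used = 0
    for variant_id, beta in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            continue
        score += beta * dosage
        used += 1
    return score, used


# Hypothetical weights and genotypes, for illustration only.
example_weights = {"variant_A": 0.2, "variant_B": -0.1, "variant_C": 0.5}
example_dosages = {"variant_A": 2, "variant_B": 1}
example_score, example_used = polygenic_score(example_weights, example_dosages)
```

Because the weights come from a dataset that does not include the tested individuals, the score can be evaluated in the target cohort without reusing the same subjects for training and testing.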

(7) The TWAS models were predominantly trained on European samples, and there is no replication study for the findings as well.

This is a good point regarding the lack of diversity in GTEx data. We have also used data from AMR populations (GALA II-SAGE models), although these were only available for blood tissue. Regarding the ancestry mismatch between datasets, several studies have attempted to explore its impact. Gay et al. (PMID: 32912333) studied local ancestry effects on eQTLs from the GTEx consortium and concluded that adjusting eQTLs for local ancestry yields only a modest improvement over using global ancestry (as done in GTEx). Moreover, the colocalization results did not differ significantly between adjustment for local ancestry and adjustment for global ancestry. In addition, Mogil et al. (PMID: 30096133) observed that genes with higher heritability share genetic architecture between populations. Nevertheless, both studies showed decreased power and poorer predictive performance for gene expression because of reduced diversity in eQTL analyses. As a consequence of the ancestry mismatch, we now warn readers that this may compromise signal detection (Discussion, lines 531-533).

Recommendations for the authors:

Reviewer #1 (Recommendations For The Authors):

(1) The authors mentioned the fine mapping method did not converge for the locus in chr 11. I would consider trying a different fine-mapping method (such as SuSiE or FINEMAP). It would be helpful to provide posterior inclusion probabilities (PIP) for the variants in fine mapping results and plot the PIP values in the regional association plots.

As suggested, we have also used SuSiE, which allows for more than one causal signal per locus. However, in this case the results were not different from those obtained with the original Bayesian colocalization performed with corrcoverage. SuSiE's fine-mapping for chromosome 11 prioritized a single variant, which is likely due to the low frequency of the variant. Thus, we have maintained the fine-mapping as originally described in the previous version of the manuscript but have now included the credible set in Supplementary Table 6.

Regarding the PIP, at the fine-mapping stage we are inclined to put more weight on the functional annotations of the variants in the credible set than on their statistical contributions to the signal. This is why we prefer not to rank variants by PIP but to prioritize variants that were enriched in functional annotations.
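For readers unfamiliar with the terms, the sketch below illustrates the textbook construction of a 95% credible set from per-variant posterior probabilities: variants are ranked by posterior probability and added until the cumulative probability reaches the target coverage. This is only the generic definition, not the coverage correction applied by corrcoverage nor the SuSiE model, and the variant names and probabilities are hypothetical.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.95) -> List[str]:
    """Return the smallest top-ranked set of variants whose posterior
    probabilities sum to at least `coverage`."""
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    selected: List[str] = []
    cumulative = 0.0
    for variant, probability in ranked:
        selected.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return selected


# Hypothetical posterior probabilities for four variants at one locus.
example_posteriors = {"v1": 0.60, "v2": 0.30, "v3": 0.07, "v4": 0.03}
print(credible_set(example_posteriors))
```

When a single variant carries nearly all of the posterior mass, the set collapses to that one variant, which is the situation reported for the chromosome 11 locus.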

(2) Please provide more detailed information about the VEP and V2G analysis and how to interpret those results. My understanding of V2G is that it includes different sources of information (such as molecular QTLs and chromatin interactions from different tissues/cell types, etc.). It is unclear what sources of information and weight settings were used in the V2G model.

Thank you for raising this point. As suggested, we have clarified the basis for VEP and V2G and the interpretation of their results (lines 732-743).

(3) The authors identified multiple genes with different strategies, e.g. FUMA, V2G, COLOC, TWAS, etc. How many genes were found/supported by evidence provided by multiple methods? It could be helpful to have a table summarizing the risk genes found by different strategies, and the evidence supporting the genes. e.g. which genes are found by which methods, and the biological functions of the genes, etc.

Thank you for raising this point. As suggested, we have now added a new figure (Figure 5) to summarize the findings from the multiple methods used.

(4) It would be helpful to make the code/scripts available for reproducibility.

As suggested, the SCOURGE Latin American GWAS summary and the analysis scripts (https://github.com/CIBERER/Scourge-COVID19/tree/main/scripts/novel-risk-hosp-AMR-2024) are now accessible in the Consortium GitHub repository (https://github.com/CIBERER/Scourge-COVID19) (lines 806-807).

(5) The fonts in some of the figures (e.g. Figure 2) are hard to read.

Thank you. We have now included the figures as SVG files.

Reviewer #2 (Recommendations For The Authors):

- The abstract lacks a conclusion sentence.

Thank you. As suggested, we have included two additional sentences with broad conclusions from the study. We preferred to avoid relying on conclusions related to known or new biological links of the prioritized genes given the lack of functional validation of main findings.

- Regarding the association analysis (page 27, line 677), I wonder if some of the 10 principal components (PCs) are capturing information about the recruitment areas (countries). It may be relevant to test for multicollinearity among these variables.

Because some, but not all, of the categories might be correlated with a given PC, we have calculated GVIF values for the main variables so that the categorical variable is assessed as a single entity. The scaled GVIF^(1/(2*Df)) value for the categorical variable is 1.52. Squaring this value gives 2.31, which can then be compared against the usual rule-of-thumb for VIF values.
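The arithmetic in this response can be checked with a short sketch. The function names are illustrative, and the raw GVIF and the degrees of freedom of the country variable are not given in the response, so only the reported scaled value (1.52) is used as input.

```python
def scaled_gvif(gvif: float, df: int) -> float:
    """Scale a generalized VIF by its degrees of freedom: GVIF^(1/(2*Df))."""
    return gvif ** (1.0 / (2.0 * df))


def vif_equivalent(scaled: float) -> float:
    """Square the scaled GVIF so it can be read on the ordinary VIF scale."""
    return scaled ** 2


# Value reported in the response for the categorical (country) variable.
reported_scaled_gvif = 1.52
print(round(vif_equivalent(reported_scaled_gvif), 2))  # 2.31
```

For a term with a single degree of freedom the scaling reduces to the square root of the VIF, so squaring recovers the ordinary VIF, which is why the squared value is the one compared against common cut-offs such as 5 or 10.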

- Still on the topic of association analysis, did the authors adjust the logistic model for comorbidities variables from Table 1? Given these comorbidities also have a genetic component and their distribution differs between non-hospitalized vs hospitalized, I am concerned that comorbidities might be confounding the association between genetic variants and COVID.

We did not adjust for comorbidities because the HGI studies were not adjusted either and we aimed to be as aligned as possible with HGI. However, as suggested, we have now tested the association between each of the comorbidities in Table 1 and each of the variants in Table 2, using the comorbidities as dependent variables and adjusting for the main covariates (age, sex, PCs and country of recruitment). None of the variants was significantly associated with the comorbidities (line 333).

- If I understood correctly, the 49 genetic variants used to develop the polygenic risk score model (PRS) were based on the HGI total sample size (data release 7), which is predominantly of European ancestry. I am concerned about the prediction accuracy in the AMR population (PRS transferability issue).

We have searched the literature for other PRS in order to compare the associated OR in our cohort with ORs calculated in European populations. Horowitz et al. (2022) reported an OR of 1.38 for the top 10% with respect to hospitalization risk in European individuals using a GRS with 12 variants.

We acknowledge that this might be an issue, and it is now explained in the Discussion of the revised version (lines 561-568). However, as this is the first time a PRS for COVID-19 has been applied to a relatively large AMR cohort, we believe that this analysis will be of value for further work on PRS transferability, providing a source for comparison in future studies.

- On page 23, line 579, the authors acknowledge their "GWAS is underpowered". This sentence requires a sample/power calculation, otherwise, I suggest using "is likely underpowered".

Thanks for the input. We have modified the sentence as suggested.

Reviewer #3 (Recommendations For The Authors):

I wonder if the authors have an approximate date when the GWAS summary statistic will be available. I reviewed some manuscripts in the past, and the authors claimed they would deposit the data soon, but in fact it would not happen until 2 years later.

The summary statistics are already available from the SCOURGE Consortium repository https://github.com/CIBERER/Scourge-COVID19 (lines 806-807).

Associated Data

    Data Citations

    1. Consortium GTEx. 2020. GTEx V8. GTEx Portal. Single-Tissue cis-QTL
    2. Consortium Genomes Project 2016. 1000 Genomes Phase 3. PLINK 2.0 Resources. 1000 Genomes phase 3
    3. COVID-19 Host Genetics Initiative 2022. COVID19-hg GWAS meta-analyses round 7. COVID-19 hg repository. r7

    Supplementary Materials

    Supplementary file 1. Participating centers.
    elife-93666-supp1.xlsx (14.5KB, xlsx)
    Supplementary file 2. Independent variants with p-value<1e-05 in the SC-HGI_AMR GWAS meta-analysis (hg38).

    EA: effect allele; NEA: non-effect allele; EAF: effect allele frequency; EAF_avg: averaged effect allele frequency; FreqSE: standard error of averaged effect allele frequency; SCOURGE_AMR: SCOURGE Latin-America; HGIB2_AMR: HGI meta-analysis of AMR studies.

    elife-93666-supp2.xlsx (25KB, xlsx)
    Supplementary file 3. Annotated SNPs in moderate-to-strong LD with lead SNPs of the genome-wide significant loci in the SC-HGI_AMR GWAS meta-analysis, with ANNOVAR.

    NEA: non-effect allele; EA: effect allele; r2: maximum r2 of the SNP with one of the independent SNPs; IndSigSNP: the independent SNP which has the maximum r2 value with the SNP; dist: distance to the nearest gene; func: functional consequence of the SNP on the gene; CADD: CADD score; RDB: RegulomeDB score; minChrState: the minimum 15-core chromatin state across 127 tissues/cell types; commonChrState: the most common 15-core chromatin state across 127 tissues/cell types; posMapFilt: 1 if the SNP was used for positional mapping, 0 otherwise; eqtlMapFilt: 1 if the SNP was used for eQTL mapping, 0 otherwise.

    elife-93666-supp3.xlsx (54KB, xlsx)
    Supplementary file 4. Results from the MAGMA gene-based analysis in the SC-HGI_AMR GWAS meta-analysis (hg37).

    NSNPS: number of SNPs in the gene; NPARAM: the number of relevant parameters used in the model; ZSTAT: z statistics.

    elife-93666-supp4.xlsx (1.4MB, xlsx)
    Supplementary file 5. Prioritized genes by eQTL and positional mapping by FUMA in the SC-HGI_AMR GWAS meta-analysis results (hg37).

    HUGO: HGNC gene symbol; pLI: pLI score from ExAC database, probability of being intolerant to loss of function (higher the score, higher the intolerance); ncRVIS: non-coding residual variation intolerance score (higher the score, higher intolerance to non-coding variation); posMapSNPs: number of SNPs mapped by positional mapping; posMapMaxCADD: the maximum CADD score of mapped SNPs by positional mapping; eqtlMapSNPS: the number of SNPs mapped to the genes based on eQTL mapping; eqtlMapminP: the minimum eQTL p-value of mapped SNPs; eqtlMapminQ: the minimum eQTL FDR of mapped SNPs; eqtlMapts: tissue of mapped eQTLs; eqtlDirection: consequential direction of mapped eQTL SNPs after aligning the risk alleles; minGwasP: minimum GWAS p-value of mapped eQTLs; IndSigSNPs: independent SNPs that are in LD with the mapped SNPs.

    elife-93666-supp5.xlsx (17.4KB, xlsx)
    Supplementary file 6. Fine-mapped credible set derived with corrcoverage (95%) for the associated region in chromosome 2 (BAZ2B).
    elife-93666-supp6.xlsx (18KB, xlsx)
    Supplementary file 7. VEP annotations for the variants included in the fine-mapped credible sets for the novel associated loci in chromosome 2 (hg38).
    elife-93666-supp7.xlsx (24.2KB, xlsx)
    Supplementary file 8. V2G scores for the variants included in the fine-mapped credible sets in the novel risk loci from chromosomes 2 and 16 (hg38).

    Shaded in green, the prioritized gene by the V2G score.

    elife-93666-supp8.xlsx (18.7KB, xlsx)
    Supplementary file 9. MultiXcan results for the SC-HGI_AMR GWAS meta-analysis.

    N: number of tissues available for the gene; n_indep: number of independent components of variation kept among the tissues' predictions; p_i_best: best p-value of single tissue S-prediXcan association; t_i_best: name of best single tissue S-prediXcan association; p_i_worst: worst p-value of single tissue S-prediXcan association; t_i_worst: name of worst single tissue S-prediXcan association; eigen_max: eigenvalue of the top independent component in the SVD decomposition of predicted expression correlation; eigen_min: eigenvalue of the last independent component in the SVD decomposition of predicted expression correlation; eigen_min_kept: eigenvalue of the smallest independent component that was kept in the SVD decomposition of predicted expression correlation; z_min: minimum z-score among single-tissue S-prediXcan associations; z_max: maximum z-score among single-tissue S-prediXcan associations; z_mean: mean z-score among single tissue S-prediXcan associations; z_sd: standard deviation of the mean z-score among single-tissue S-prediXcan associations; tmi: trace of T*T', where T is the correlation of predicted expression levels for different tissues multiplied by its SVD pseudo-inverse and is an estimate for the number of independent components of variation in predicted expression across tissues.

    elife-93666-supp9.xlsx (4.2MB, xlsx)
    Supplementary file 10. Top 10 genes for the TWAS trained with the GALA II-SAGE models in admixed Americans.

    Bonferroni correction thresholds: Pooled p<4.19E-06; PR p<4.99E-06; MX p<5.19E-06; AA p<4.67E-06. Var_g: variance of the gene expression; pred_perf_r2: cross-validated R2 of tissue model’s correlation to gene’s measured transcriptome; pref_perf_qval: qval of tissue model’s correlation to gene’s measured transcriptome; n_snps_used: number of snps from GWAS used in S-prediXcan analysis; n_snp_in_cov: number of snps in the covariance matrix; n_snps_in_model: number of snps in the model; best_gwas_p: the highest p-value from GWAS snps used in this model; largest_weight: the largest weight in this model.

    elife-93666-supp10.xlsx (18.6KB, xlsx)
    Supplementary file 11. Independent variants with p-value<1e-05 in the SC-HGI_ALL GWAS meta-analysis (hg38).

    EA: effect allele; NEA: non-effect allele; EAF_avg: averaged effect allele frequency; FreqSE: standard error of averaged effect allele frequency.

    elife-93666-supp11.xlsx (68.5KB, xlsx)
    Supplementary file 12. Results of the 40 lead variants associated with COVID-19 hospitalization in the HGIv7 (hg38).

    SC-HGI_ALL: meta-analysis SCOURGE-HGI_ALL; SC-HGI_AMR: meta-analysis SCOURGE-HGI_AMR; SC-HGI_3POP: meta-analysis SCOURGE-HGI_3POP.

    Supplementary file 13. Independent variants with p-value<1e-05 in the SC-HGI_3POP GWAS meta-analysis (hg38).

    EA: effect allele; NEA: non-effect allele; EAF_avg: average effect allele frequency; FreqSE: standard error of averaged effect allele frequency.

    elife-93666-supp13.xlsx (88.1KB, xlsx)
    Supplementary file 14. Instruments used in the polygenic risk score model (hg38).
    elife-93666-supp14.xlsx (13.5KB, xlsx)
    Supplementary file 15. Multinomial regression results.

    Reference class for the multinomial regression is ‘asymptomatic’.

    elife-93666-supp15.xlsx (18.8KB, xlsx)
    MDAR checklist

    Data Availability Statement

    Summary statistics from the SCOURGE Latin American GWAS and the analysis scripts are available from the public repository https://github.com/CIBERER/Scourge-COVID19 (copy archived at CIBERER, 2024).

    The following previously published datasets were used:

    Consortium GTEx. 2020. GTEx V8. GTEx Portal. Single-Tissue cis-QTL

    Consortium Genomes Project 2016. 1000 Genomes Phase 3. PLINK 2.0 Resources. 1000 Genomes phase 3

    COVID-19 Host Genetics Initiative 2022. COVID19-hg GWAS meta-analyses round 7. COVID-19 hg repository. r7


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
