eLife. 2021 Aug 17;10:e69719. doi: 10.7554/eLife.69719

A proteome-wide genetic investigation identifies several SARS-CoV-2-exploited host targets of clinical relevance

Mohd Anisul 1,2, Jarrod Shilts 1, Jeremy Schwartzentruber 1,2, James Hayhurst 2,3, Annalisa Buniello 2,3, Elmutaz Shaikho Elhaj Mohammed 4, Jie Zheng 5, Michael Holmes 6,7, David Ochoa 2,3, Miguel Carmona 2,3, Joseph Maranville 4, Tom R Gaunt 5, Valur Emilsson 8,9, Vilmundur Gudnason 8,9, Ellen M McDonagh 2,3, Gavin J Wright 1,10, Maya Ghoussaini 1,2,†,✉,#, Ian Dunham 1,2,3,†,✉,#
Editors: John W Schoggins11, Jos W Van der Meer12
PMCID: PMC8457835  PMID: 34402426

Abstract

Background:

The virus SARS-CoV-2 can exploit biological vulnerabilities (e.g. host proteins) in susceptible hosts that predispose to the development of severe COVID-19.

Methods:

To identify host proteins that may contribute to the risk of severe COVID-19, we undertook proteome-wide genetic colocalisation tests, and polygenic (pan) and cis-Mendelian randomisation analyses leveraging publicly available protein and COVID-19 datasets.

Results:

Our analytic approach identified several known targets (e.g. ABO, OAS1), but also nominated new proteins such as soluble Fas (colocalisation probability >0.9, p=1 × 10⁻⁴), implicating Fas-mediated apoptosis as a potential target for COVID-19 risk. The polygenic (pan) and cis-Mendelian randomisation analyses showed consistent associations of genetically predicted ABO protein with several COVID-19 phenotypes. The ABO signal is highly pleiotropic, and a look-up of proteins associated with the ABO signal revealed that the strongest association was with soluble CD209. We demonstrated experimentally that CD209 directly interacts with the spike protein of SARS-CoV-2, suggesting a mechanism that could explain the ABO association with COVID-19.

Conclusions:

Our work provides a prioritised list of host targets potentially exploited by SARS-CoV-2 and is a precursor for further research on CD209 and FAS as therapeutically tractable targets for COVID-19.

Funding:

MAK, JSc, JH, AB, DO, MC, EMM, MG, ID were funded by Open Targets. J.Z. and T.R.G were funded by the UK Medical Research Council Integrative Epidemiology Unit (MC_UU_00011/4). JSh and GJW were funded by the Wellcome Trust Grant 206194. This research was funded in part by the Wellcome Trust [Grant 206194]. For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Research organism: Human

eLife digest

Individuals who become infected with the virus that causes COVID-19 can experience a wide variety of symptoms. These can range from no symptoms or minor symptoms to severe illness and death. Key demographic factors, such as age, gender and race, are known to affect how susceptible an individual is to infection. However, molecular factors, such as unique gene mutations and gene expression levels can also have a major impact on patient responses by affecting the levels of proteins in the body. Proteins that are too abundant or too scarce may mean the difference between dying from or surviving COVID-19.

Identifying the molecular factors in a host that affect how viruses can infect individuals, evade immune defences or trigger severe illness, could provide new ways to treat patients with COVID-19. Such factors are likely to remain constant, even when the virus mutates into new strains. Hence, insights would likely apply across all virus strains, including current strains, such as alpha and delta, and any new strains that may emerge in the future.

Using such a ‘natural experiment’ approach, Karim et al. compared the genetic profiles of over 30,000 COVID-19 patients and a million healthy individuals. Nine proteins were found to have an impact on COVID-19 infection and disease severity. Four proteins were ranked as top priorities for potential treatment targets. One protein, called CD209 (also known as DC-SIGN), is involved in how the virus enters the host cells, and had one of the strongest associations with COVID-19. Two proteins, called IL-6R and FAS, were involved in the immune response and could be responsible for the immune over-activation often seen in severe COVID-19. Finally, one protein, called OAS1, formed part of the body’s innate antiviral defence system and appeared to reduce susceptibility to COVID-19.

Knowing more about the proteins that influence the severity of COVID-19 opens up new ways to predict, protect and treat patients who may have severe or fatal reactions to infection. Indeed, one of the identified proteins (IL-6R) had already been targeted in recent clinical trials with some encouraging results. Considering CD209 as a potential receptor for the virus could provide another avenue for therapeutics, similar to previously successful approaches to block the virus’ known interaction with a receptor protein. Ultimately, this research could supply an entirely new set of treatment options to help combat the COVID-19 pandemic.

Introduction

At the current time, the coronavirus disease 2019 (COVID-19) pandemic is implicated in the deaths of more than 4 million people worldwide (Dong et al., 2020). Although effective vaccines have been developed to substantially reduce mortality and morbidity due to severe COVID-19, the emergence of mutated strains of the SARS-CoV-2 virus has challenged the effectiveness of existing vaccines and raised the urgency of identifying alternate therapeutic pathways to target the virus (Tegally, 2020; Erik et al., 2020; Collier et al., 2021). Nevertheless, it is likely that the mutated strains of SARS-CoV-2 will continue to exploit the same vulnerable host biology to bind onto and infect cells and, in susceptible individuals, evade immune defences and promote the excessive host inflammatory response that is characteristic of severe COVID-19 (Gordon et al., 2020a). Therefore, the identification of host proteins that play roles in COVID-19 susceptibility and severity remains crucial to the development of therapeutics as host protein mechanisms are independent of genomic mutations in the virus. An improved understanding of these therapeutically relevant virus-host pathways may also be important in combating viruses beyond SARS-CoV-2 (Perrin-Cocon et al., 2020).

Several large-scale systematic experimental efforts have identified key host proteins that interact with viral proteins in the pathogenesis of severe COVID-19 (Gordon et al., 2020a; Gordon et al., 2020b; Bouhaddou et al., 2020). These notably include efforts to identify direct interactions with the spike protein of SARS-CoV-2, which mediates virus attachment onto receptors to infect host cells and is also the basis of most vaccines (Shang et al., 2020; Harvey et al., 2021). To complement in vitro host protein characterisation efforts, several groups have leveraged genetic datasets of human proteins and COVID-19 disease to identify therapeutically actionable candidate host proteins that are likely to play roles in enhancing COVID-19 susceptibility or to be involved in the pathogenesis of severe COVID-19 (Pairo-Castineira et al., 2021; Zhou et al., 2021). One of the approaches used was Mendelian randomisation (MR). MR simulates the design of randomised trials, with the underlying principle that randomisation of alleles at conception offers the opportunity to examine approximate differences in average risk of disease between comparable groups in a population that differ only in the distribution of the risk factor of interest (Davies et al., 2018), for example, protein abundance (Zheng et al., 2020). This allows the use of alleles as genetic instruments representing genetically predicted protein levels to proxy effects of pharmacological modulation of the protein. Some of the clinically actionable proteins identified by the MR approach are part of type I interferon signalling (encoded by genes: IFNAR2, TYK2, OAS1) and interleukin-6 (IL-6) signalling pathways (IL6R). Only one of these proteins (encoded by OAS1) had any evidence of genetic colocalisation, that is, evidence that genetic associations of the protein and COVID outcomes shared the same causal genetic signal (Zhou et al., 2021). 
An additional protein that was supported by both MR and genetic colocalisation tests was ABO (Zhou et al., 2021), reported in several published genome-wide association studies (GWAS) of COVID-19 (Pairo-Castineira et al., 2021; Ellinghaus et al., 2020). In response to the first published GWAS of COVID-19, we reported findings that link the ABO signal with a number of clinically actionable targets including coagulation factors (von Willebrand factor [vWF], and Factor VIII [F8]), IL-6, and CD209/DC-SIGN (Karim et al., 2020).

However, in most of the previous MR studies (Pairo-Castineira et al., 2021; Zhou et al., 2021), investigators only used curated cis-acting variants (genetic variants near or in the gene encoding the relevant protein) as genetic instruments to represent effects of genetically predicted protein concentrations, rather than genome-wide instruments. While the use of cis-acting variants can minimise the risk of horizontal pleiotropic effects (i.e. associations driven by other proteins not on the causal pathway for the disease), it can suffer from lower power than a genome-wide analysis due to fewer available instruments (Zheng et al., 2020). Furthermore, in previous protein-COVID-19 MR studies, genetic colocalisation tests were carried out only for protein-phenotype associations that were significant in the MR analysis, potentially excluding many protein-phenotype associations that may share the same causal genetic signal but are underpowered in a proteome-wide MR approach.

In the present study, we expanded on these previous reports by undertaking a proteome-wide two-sample pan- and cis-MR analysis using the Sun et al. GWAS (Sun et al., 2018) of plasma protein concentrations and several COVID-19 GWAS phenotypes from the ICDA COVID-19 Host Genetics Initiative (October 2020 release) (Huang et al., 2020). First, we showed that genetically predicted circulating ABO protein was associated with COVID-19 susceptibility and severity and the lead ABO signal was associated strongly with plasma concentrations of soluble CD209. Second, we collected evidence for a direct mechanism of interaction between the SARS-CoV-2 spike protein and human CD209 protein. Third, we performed proteome-wide genetic colocalisation tests, followed by single-instrument cis-MR analysis, and we report additional novel targets of therapeutic relevance. Finally, we examined associated phenotypes using the colocalising signals from the Open Targets Genetics portal (http://genetics.opentargets.org) to shed light on the biological basis of association of the proteins with the COVID-19 phenotypes.

Materials and methods

Key resources table.

Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Cell line (Homo sapiens) | HEK293-E | Yves Durocher, PMID:11788735 | RRID:CVCL_6974 |
Transfected construct (Homo sapiens) | pCMV6-CD209 | Origene | Cat.# SC304915 | Plasmid for CD209 cDNA expression in cell-based binding assay
Transfected construct (Homo sapiens) | pTT3-ACE2-BLH | PMID:33432067 | | Plasmid for recombinant ACE2 extracellular domain, for plate-based assays as the immobilised form
Transfected construct (Homo sapiens) | pTT3-CD209-BLH | This paper | | Plasmid for recombinant CD209 extracellular domain for plate-based assays as the immobilised form
Transfected construct (Homo sapiens) | pTT3-Cd4d3+ d4 | Addgene | RRID:Addgene_32402 | Plasmid for recombinant tag control (Cd4 domains 3 and 4)
Transfected construct (Homo sapiens) | pTT3-SPIKE-COMP-BLac | This paper | | Plasmid for recombinant SARS-CoV-2 spike extracellular domain for plate-based assays as the soluble form
Transfected construct (Homo sapiens) | pTT3-BirA-FLAG | Addgene | RRID:Addgene_64395 | Biotin ligase plasmid for recombinant protein biotinylation
Peptide, recombinant protein | Streptavidin R-phycoerythrin | BioLegend | Cat.# 405245 | For tetramer staining in cell-based binding assay
Chemical compound, drug | DAPI (4',6-diamidino-2-phenylindole) | BioLegend | Cat.# 422801 | 1 μM for flow cytometry live/dead staining
Chemical compound, drug | D-biotin | Sigma-Aldrich | Cat.# 2031 | 100 μM supplemented to cell culture media for biotinylation
Software, algorithm | R (version 4.0.3) | R Foundation | www.r-project.org; RRID:SCR_001905 | Analysis and generating plots

Genetic associations of proteins

We primarily used Sun et al. protein GWAS data (Sun et al., 2018; Emilsson et al., 2018) for the pan-/cis-MR analyses and for performing genetic colocalisation tests (described below). The pan-/cis-MR effects were expressed per standard deviation (SD) higher genetically predicted plasma protein concentrations. Two additional proteomic datasets (Emilsson et al., 2018; Suhre et al., 2017) were used to identify proteins associated with the ABO locus. The genotyping protocols and QC of these proteomic studies have been described previously (Sun et al., 2018; Emilsson et al., 2018; Suhre et al., 2017). All three of the proteomic studies have used the SOMAscan assay platform (an aptamer-based protein detection platform) to detect and quantify protein abundance (Gold et al., 2012).

Genetic associations of COVID-19

We used seven meta-analysed COVID-19 datasets from the October 2020 release of the ICDA COVID-HGI group (https://www.covid19hg.org/results/r4/). These seven COVID-19 outcomes are A1 (very severe respiratory confirmed COVID vs. not hospitalised COVID), A2 (very severe respiratory confirmed COVID vs. population), B1 (hospitalised COVID vs. not hospitalised COVID), B2 (hospitalised COVID vs. population), C1 (COVID vs. lab/self-reported negative), C2 (COVID vs. population), and D1 (predicted COVID from self-reported symptoms vs. predicted or self-reported non-COVID). Definitions of these outcomes are provided in Supplementary file 1.

Harmonisation of protein and COVID summary statistics

Prior to analyses, we performed a liftover of datasets that reported genomic coordinates using the GRCh37 assembly to GRCh38. We also checked and ensured that the effect allele in a GWAS locus is the alternative allele in the forward strand of the reference genome. To infer strand for palindromic variants (variants with A/T or G/C alleles, i.e. variants with the same pair of letters on the forward strand as on the reverse strand), we first checked the orientation of all non-palindromic variants with respect to the reference genome to assess whether there was a strand consensus of 99% or more. For example, for a given GWAS, if ≥99% of the non-palindromic variants were on the forward strand, we assumed that the palindromic variants would also be on the forward strand; otherwise, the palindromic variants were excluded from analyses. Details of the harmonisation workflow are provided in our GitHub pages (EBISPOT, 2020; Opentargets Inc, 2021).
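The strand-consensus rule for palindromic variants can be illustrated with a minimal Python sketch. This is not the harmoniser's actual code: the tuple layout (GWAS allele 1, GWAS allele 2, reference allele, alternative allele) and the function names `is_palindromic`, `strand_of`, and `forward_consensus` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Optional, Tuple

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical record layout: (gwas_allele_1, gwas_allele_2, ref_allele, alt_allele)
Variant = Tuple[str, str, str, str]


def is_palindromic(a1: str, a2: str) -> bool:
    """True for A/T or C/G SNPs, whose allele pair reads the same on both strands."""
    return len(a1) == 1 and len(a2) == 1 and COMPLEMENT.get(a1) == a2


def strand_of(a1: str, a2: str, ref: str, alt: str) -> Optional[str]:
    """Classify a non-palindromic SNP as 'forward', 'reverse', or None (no match)."""
    if {a1, a2} == {ref, alt}:
        return "forward"
    if {COMPLEMENT.get(a1), COMPLEMENT.get(a2)} == {ref, alt}:
        return "reverse"
    return None


def forward_consensus(variants: Iterable[Variant], threshold: float = 0.99) -> bool:
    """True if at least `threshold` of the classifiable non-palindromic SNPs are forward.

    When this returns True, palindromic variants are assumed to be on the
    forward strand; otherwise they would be excluded from the analyses.
    """
    calls = [strand_of(*v) for v in variants if not is_palindromic(v[0], v[1])]
    calls = [c for c in calls if c is not None]
    if not calls:
        return False
    return calls.count("forward") / len(calls) >= threshold
```

In this sketch the palindromic variants never contribute to the consensus; they are only kept or dropped depending on what the non-palindromic variants show.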

Mendelian randomisation

To construct genetic instruments for MR analysis, we selected near-independent (r² = 0.05) genetic variants from across the genome (‘pan’-instruments) or from within ±1 Mbp from the transcription start site (TSS) of the gene encoding the protein (‘cis’-instruments) associated with the encoded protein abundance at p ≤ 5 × 10⁻⁸ for pan-MR analyses and at a less stringent p ≤ 1 × 10⁻⁵ for cis-MR analyses (this p-value corrects for the number of proteins in the druggable genome; Schmidt, 2020). We used the generalised summary data-based Mendelian randomisation (GSMR) approach with the heterogeneity-independent instrument (HEIDI)-outlier flag turned on to carry out the pan- and cis-MR analyses (Zhu et al., 2018). The GSMR software, using the HEIDI-outlier method, removes potentially pleiotropic instruments and accounts for the residual correlation between instruments (important as we are using near-independent genetic instruments). To select near-independent genetic instruments and account for linkage disequilibrium (LD) in the MR analyses, we used genotype data from 10,000 randomly sampled UK Biobank participants to create a reference LD matrix, which is ancestry-matched to the pQTL data we used. For each COVID-19 outcome, we used the Benjamini–Hochberg FDR (False Discovery Rate) threshold of 5% for significance, adjusting for 2042 tests in cis-MR analyses and 1286 tests in pan-MR analyses. For trans-acting instruments in pan-MR associations, variants were mapped to their respective cis-gene that had the highest overall V2G score in the Open Targets Genetics portal (Ghoussaini, 2021; Mountjoy, 2020; Open Targets Genetics, 2019a).
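As an illustration of the 5% FDR threshold, the following is a minimal sketch of the standard Benjamini–Hochberg step-up procedure in plain Python. It is a generic textbook implementation, not the code used in the study, and the function name `benjamini_hochberg` is an assumption made for this example.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Return, for each test, whether it is significant at the given FDR.

    Step-up rule: find the largest rank k (1-based, p-values sorted ascending)
    such that p_(k) <= (k / m) * fdr, then reject every test ranked <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

Applied per COVID-19 outcome, `p_values` would hold one MR p-value per protein tested, so `m` would correspond to the number of tests quoted above for the cis-MR or pan-MR analysis.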

Colocalisation analysis and phenome-wide association study

To identify shared causal genetic signals between protein and COVID outcomes, we used the Bayesian method of genetic colocalisation implemented in the coloc R package (Giambartolomei, 2014) using the marginal association statistics for each trait (i.e. assuming one independent signal in each region). We used beta and standard errors of cis-pQTLs of phenotype pairs as inputs. The default priors in coloc were used, that is, the prior of an SNP (single nucleotide polymorphism)-trait association is 1 × 10⁻⁴, and the prior of an SNP associating with both traits is 1 × 10⁻⁵. For each COVID-19 outcome, a posterior probability for shared causal genetic signal (PP.H4) threshold of more than 0.8 was used to identify shared causal genetic variants. For colocalising signals, we carried out a phenome-wide association study (PheWAS) using GWAS summary statistics (n ≈ 3000 GWAS) from the Open Targets Genetics portal (Ghoussaini, 2021; Mountjoy, 2020).
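To make the colocalisation hypotheses concrete, here is a hedged Python sketch of the single-causal-variant calculation that coloc performs with approximate Bayes factors. It is an illustrative re-implementation, not the coloc package itself. The function names are hypothetical, and the prior effect-size standard deviations passed to `log_abf` (commonly 0.15 for quantitative traits and 0.2 for case-control traits) are stated here as assumptions rather than taken from the text; the priors `p1`, `p2`, and `p12` follow the defaults quoted above.

```python
import math
from typing import Dict, List, Sequence


def log_abf(beta: float, se: float, prior_sd: float) -> float:
    """Wakefield's approximate log Bayes factor for one SNP-trait association."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def log_sum(xs: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_diff(a: float, b: float) -> float:
    """log(exp(a) - exp(b)); returns -inf if the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1: List[float], labf2: List[float],
                     p1: float = 1e-4, p2: float = 1e-4,
                     p12: float = 1e-5) -> Dict[str, float]:
    """Posterior probabilities of H0..H4 for two traits over the same SNPs."""
    if len(labf1) != len(labf2) or len(labf1) < 2:
        raise ValueError("need the same SNPs (at least two) for both traits")
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = [
        0.0,                                                    # H0: no association
        math.log(p1) + s1,                                      # H1: trait 1 only
        math.log(p2) + s2,                                      # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),   # H3: two distinct variants
        math.log(p12) + s12,                                    # H4: one shared variant
    ]
    norm = log_sum(log_h)
    return {f"PP.H{i}": math.exp(v - norm) for i, v in enumerate(log_h)}
```

Under this sketch, a region would be called colocalising when `coloc_posteriors(...)["PP.H4"]` exceeds 0.8, and the ratio of `PP.H4` to `PP.H3` indicates whether a shared signal is more likely than two distinct ones.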

Evidence against aptamer binding artefacts

To guard against variants that are associated with proteins because of aptamer or epitope binding artefacts (which tend to be missense variants) (Joshi and Mayr, 2018), we first assessed whether genetic instruments for MR or coloc-based single-SNP MR analysis were associated with corresponding gene expression (i.e. whether they were also cis-eQTLs). This used gene expression data from the Open Targets Genetics portal (Ghoussaini, 2021). SNPs that were not cis-eQTLs were investigated further by identifying whether they were (or were in LD at r² = 0.8 with) missense variants. To query if variants were missense or in LD with missense variants, we used the functional consequence data from Open Targets Genetics (Ghoussaini, 2021) (which used gnomAD v2 for variant effect prediction annotation, Lek, 2016). The reasoning was, if missense variants also had effects on corresponding gene expression, the causal inference using the missense variants as genetic instruments was unlikely to be biased even if the effect estimates were invalid.

Where cis-pQTLs were not cis-eQTLs and were missense variants (or in LD with missense variants at r² = 0.8) affecting the respective genes, these proteins were flagged and excluded from any further downstream analyses on the basis that the missense variant(s) might influence aptamer binding and produce biased effect estimates. Where cis-pQTLs were also cis-eQTLs and were missense variants (or in LD with missense variants) for the respective genes, although the effect estimates would not be valid, the causal inference using the instruments is unlikely to be biased; hence, these variants were retained in supplementary files and estimates of probes represented by these variants were flagged (using an asterisk) in the main figures. The rest, where cis-pQTLs had an effect on gene expression but were not missense variants or in LD with missense variants, were included in all analyses and presented without restrictions.
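The triage rules above can be summarised as a small decision function. The sketch below is only an illustration of the stated logic: the `Instrument` record, its field names, and the returned labels are hypothetical, and the combination that the text does not describe (neither a cis-eQTL nor linked to a missense variant) is reported explicitly rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instrument:
    """Hypothetical summary of one cis-pQTL instrument."""
    protein: str
    is_cis_eqtl: bool           # also associated with expression of the gene
    is_or_tags_missense: bool   # missense, or in LD (r2 = 0.8) with a missense variant


def triage(inst: Instrument) -> str:
    """Apply the aptamer-artefact rules described in the text."""
    if inst.is_or_tags_missense and not inst.is_cis_eqtl:
        # Possible aptamer-binding artefact: excluded from downstream analyses.
        return "exclude"
    if inst.is_or_tags_missense and inst.is_cis_eqtl:
        # Causal inference kept, effect estimate flagged with an asterisk.
        return "retain_flagged"
    if inst.is_cis_eqtl:
        # Expression-supported and not missense: presented without restrictions.
        return "retain"
    # Not a cis-eQTL and not missense: this case is not described in the text.
    return "not_covered_by_stated_rules"
```

For example, `triage(Instrument("ABO", True, True))` returns `"retain_flagged"`, which matches the asterisk convention used in the main figures.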

Recombinant protein production

Recombinant human receptors and SARS-CoV-2 spike protein extracellular domains were expressed and purified as previously described (Shilts et al., 2021). Briefly, the full extracellular domain sequences of each were expressed as soluble secreted proteins in HEK293 cells. All proteins were affinity-purified using their hexahistidine tags. For biotinylated proteins, co-transfection of secreted BirA ligase in the presence of 100 µM D-biotin resulted in the covalent addition of a biotin group to an acceptor peptide tag, also as described previously (Kerr and Wright, 2012). The extracellular domain of CD209 (Q9NNX6) was defined as beginning at Pro114, while the full cDNA sequence was acquired from OriGene (#SC304915).

Plate-based protein binding assay

The binding of biotinylated human receptor extracellular domains to pentameric SARS-CoV-2 spike protein was measured using the avidity-based extracellular interaction assay (AVEXIS) as previously described (Bushell et al., 2008). Briefly, the wells of a streptavidin-coated 96-well plate were saturated with biotinylated bait of either CD209, ACE2, or a previously described negative-control construct consisting only of the C-terminal protein tags shared by all other recombinant proteins (rat Cd4(d3+4)-linker-Bio-6xHis) (Voulgaraki, 2005; Galaway and Wright, 2020). Across these baits, we applied a dilution series of the full SARS-CoV-2 spike protein extracellular domain pentamerised by a peptide sequence from the cartilage oligomeric matrix protein with a beta lactamase reporter. After washing, binding was measured by hydrolysis of a colorimetric nitrocefin substrate whose product was quantified by light absorbance at 485 nm.

Cell-based receptor binding assay

HEK293 cells were transiently transfected as described previously (Bartholdson, 2012) with expression plasmids encoding full-length cDNA of CD209 (Origene SC304915), or a mock transfection lacking the expression plasmid. Separately, recombinant biotinylated spike protein was tetramerised around streptavidin conjugated to phycoerythrin as previously described (Sharma et al., 2018). Cells were incubated with tetramers of spike or a control construct of protein tags before being analysed on a flow cytometer as previously described (Shilts et al., 2021). HEK293 cell lines were provided by Yves Durocher (National Research Council, Canada). Cell lines were authenticated upon first receipt by DNA sequencing. All cell lines were regularly tested for mycoplasma by PCR (Surrey Diagnostics) and found to be negative throughout the experiments. These cell lines are not listed by ICLAC as ‘commonly misidentified’.

Code availability

Codes used to harmonise summary statistics are provided in https://github.com/EBISPOT/gwas-sumstats-harmoniser (EBISPOT, 2020). Codes for pan- and cis-MR analyses are provided on the GSMR website (https://cnsgenomics.com/software/gcta/#GSMR). Codes for genetic colocalisation analyses are provided on the coloc GitHub page (https://github.com/chr1swallace/coloc; Wallace, 2021). All codes used in the paper to reproduce results are provided in https://github.com/mohdkarim/covid_paper (copy archived at swh:1:rev:4ab9f9b17ffde57f7831ea555394290ba240a2b9; Anisul, 2021).

Results

Pan- and cis-MR analyses support the role of circulating ABO protein concentrations and soluble IL-6R in COVID-19 risk

Our multi-instrument MR analysis used both genetic variants from across the genome (pan-MR) and genetic variants near or in the gene encoding the relevant protein (cis-MR) to investigate associations of genetically predicted plasma protein concentrations with the risk of COVID-19 outcomes. The COVID-19 outcome definitions are provided in Supplementary file 1. Although the pan-MR analysis leveraged genetic data from both cis- and trans-acting pQTLs (with a selection of pQTLs from across the genome automated by GSMR’s built-in HEIDI-outlier exclusion method), for some protein-COVID-19 pairs that were associated at 5% FDR, the associations with COVID-19 outcomes were exclusively driven by trans-acting pQTLs or cis-acting genetic instruments. For example, although six proteins were represented by both cis- and trans-acting genetic instruments, two (ABO and IL6R) were represented only by cis-acting variants and one (SELE) was driven entirely by trans-acting instruments (mainly ABO trans-pQTLs) (Supplementary file 2). Overall, the pan-MR analysis revealed nine distinct protein probes associated with four COVID outcomes at an FDR of 5% (Figure 1A). The pQTLs selected by GSMR to represent these nine probes were also cis-eQTLs (as curated for the Open Targets Genetics portal; Open Targets Genetics, 2019) and, except for the ABO signal via rs8176719-insertionC (hereafter referred to as rs8176719-insC – a frameshift mutation that inserts a guanine nucleotide in the 258th position of exon 6), were not missense variants or in LD with missense variants (Supplementary file 3), minimising the possibility that SNPs with artefactual associations with proteins were used as genetic instruments for the majority of significant pan-MR associations.

Figure 1. Flowcharts illustrating the process of (A) pan-Mendelian randomisation (MR) and (B) cis-MR and genetic colocalisation.

Both pan- and cis-MR methods used Sun et al., 2018 as the source of genetic instruments and the UK Biobank downsampled 10k (UKBd10k) individual genotype data as reference panel. We selected near-independent genetic instruments and performed two-sample MR analysis using generalised summary data-based Mendelian randomisation that adjusted for residual correlation between instruments. Genetic colocalisation analysis was used to estimate posterior probabilities of shared causal genetic signal between protein and outcomes. A posterior probability of shared causal genetic signal of more than 0.8 (i.e. a PP.H4 or posterior probability for hypothesis 4 > 0.8) was used as evidence of genetic colocalisation. The dashed line separates analysis (above the line) from target curation (below the line). *Only three proteins with pan-MR evidence of association with COVID also had cis-MR evidence support at nominal cis-MR p-value<0.05.

While the pan-MR analysis used genetic data from across the genome, the cis-MR analysis restricted genetic instrument selection to those near (within 1 Mb of TSS) or in the gene encoding the protein. Three proteins with pan-MR associations were supported by corresponding cis-MR associations (Figure 1A and B, Supplementary file 4): ABO, ICAM-1, and IL-6R. Among these three, only ABO and IL-6R proteins had some evidence of genetic colocalisation with posterior probabilities (PP.H4) more than 0.9 and 0.4, respectively, of a shared genetic signal between protein and COVID-19 phenotype (Figure 2). Although the PP.H4 of IL-6R was weak (0.4), the ratio of H4 to H3 favoured colocalisation (H4/H3 = 3.6), indicating that a common signal for the IL-6R protein and the COVID-19 outcome is more likely than two independent signals driving the association.

Figure 2. Forest plot illustrating associations of genetically predicted plasma protein concentrations with selected COVID-19 phenotypes.

The black point estimates represent odds ratios (ORs) of COVID-19 outcome per standard deviation (SD) increase of genetically predicted protein abundance using genetic instruments from across the genome (pan-Mendelian randomisation [pan-MR]). The blue point estimates represent OR of COVID outcome per SD increase of genetically predicted protein abundance using genetic instruments near or in the gene encoding the protein (cis-MR). Error bars represent 95% confidence intervals (95% CI). The areas of the squares are proportional to the inverse of the variance of the log ORs. For each COVID phenotype, pan-MR associations at FDR 5% were retained. Each row under a COVID phenotype represents a pQTL and includes the number of cases in the COVID phenotype (nCases), the number of SNPs used as genetic instruments for the protein (nSNPs), the posterior probability that protein and COVID traits colocalise (PP.H4), the posterior probability evidence for vs. against shared causal variants (log2(H4/H3)), and the candidate colocalising signal (coloc_SNP). * denotes proteins that have coloc_SNP that are either missense variants or in linkage disequilibrium with missense variants, rendering their effect estimates potentially biased.

Genetically predicted ABO concentration was associated with risk in four out of seven COVID-19 outcomes (Figure 2). These four outcomes represented both susceptibility (e.g. COVID-19 vs. population, cis-MR odds ratio [OR] [95% CI] per SD genetically predicted ABO concentrations: 1.08 [1.05, 1.10], p=7 × 10⁻¹⁰) and severity (e.g. hospitalised COVID-19 vs. population, cis-MR OR [95% CI]: 1.12 [1.08, 1.17], p=1 × 10⁻⁸) of COVID-19. Genetically predicted soluble IL-6R was only associated with lower risk of hospitalised COVID-19 compared to population-based controls (cis-MR OR [95% CI] per SD genetically predicted IL-6R: 0.94 [0.91, 0.97], p=8 × 10⁻⁴) (Figure 2).
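For readers unfamiliar with how an OR per SD of genetically predicted protein is obtained from summary statistics, the sketch below shows the simplest case: a single-instrument Wald ratio with a first-order standard error. This is a deliberately simplified stand-in, not the multi-instrument GSMR estimator that produced the estimates above, and the function name and argument names are hypothetical.

```python
import math
from typing import Tuple


def wald_ratio_or(beta_protein: float, beta_outcome: float,
                  se_outcome: float, z_crit: float = 1.96
                  ) -> Tuple[float, float, float, float]:
    """Single-SNP MR estimate as (OR, CI lower, CI upper, two-sided p-value).

    beta_protein: SNP effect on protein level, in SD units per allele.
    beta_outcome: SNP effect on the outcome, in log-odds per allele.
    se_outcome:   standard error of beta_outcome.
    The standard error ignores uncertainty in beta_protein (first-order approximation).
    """
    if beta_protein == 0:
        raise ValueError("the instrument must be associated with the protein")
    log_or = beta_outcome / beta_protein
    se = se_outcome / abs(beta_protein)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return (math.exp(log_or),
            math.exp(log_or - z_crit * se),
            math.exp(log_or + z_crit * se),
            p_value)
```

An OR below 1 from such a calculation, as reported for soluble IL-6R, corresponds to lower risk per SD higher genetically predicted protein, and an OR above 1, as reported for ABO, corresponds to higher risk.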

When examining the SNPs involved in the pan-MR associations of the nine probes, all probes except IL-6R and ABO had at least one trans-acting SNP, and in all these cases, at least one of the trans-acting SNPs was assigned to the ABO gene by the Open Targets Genetics V2G pipeline (Table 1), re-confirming the pervasive pleiotropy of the ABO genetic signal. Furthermore, when examining the consistency of pan-MR associations of these nine probes across all seven COVID-19 outcomes, the protein probes that have trans-acting ABO SNPs exhibited a similar association profile to the ABO protein probe, being associated only with COVID-19 outcomes that have population-based controls (Supplementary file 5).

Table 1. Summary of proteins reported in our study and the different sources of evidence supporting their prioritisation.

Protein Supported by multi-instrument pan-MR Supported by multi-instrument cis-MR Supported by GC and single-SNP cis-MR Experimental support Existing drugs Previously reported
No. of cis-acting SNPs No. of trans-acting SNPs Trans-acting gene(s)*
ABO 93 0 None x x
QSOX2 16 63 ABO, OBP2B, ADAMTS13 x x x x x
CD209 8 45 ABO, SURF6 x x x
FAM3D 8 44 ABO, SULT2B1, FAM83E, NTN5, FUT2 x x x x x
SELE 0 60 ABO, FAM118B, RALGDS, OBP2B, ADAMTS13, SURF1 x x x x
ADGRF5 5 9 ABO, IL6ST, ADAMTS13 x x x x x
ICAM1 71 1 ABO x x x
TIE1 2 18 ABO, ST3GAL6, GBGT1, SURF6 x x x x x
IL6R 62 0 None x x
FAS No No No x x x x
OAS1 No No No x x x
THBS3 No No No x x x x

Detailed description of each column is provided in Supplementary file 9.

* Where trans-acting SNPs are used, genes assigned to SNPs with the highest variant-to-gene scores in Open Targets Genetics were used for annotation. GC: genetic colocalisation; MR: Mendelian randomisation.

CD209/DC-SIGN: a proposed alternate receptor for SARS-CoV-2

The ABO signal (rs8176719-insC) contributes to the determination of non-O blood groups and regulates circulating levels of both ABO and several non-ABO proteins (Yamagata University Genomic Cohort Consortium (YUGCC), 2014; Arguinano et al., 2018). We explored proteome-wide associations of rs8176719-insC in three separate proteomic datasets (Emilsson, 2020). Aside from the ABO protein, the ABO signal rs8176719-insC showed the strongest association (Sun et al: p=6.03 × 10⁻²⁵⁸, Emilsson et al: p=1.00 × 10⁻³⁰⁷, Suhre et al: p=1.27 × 10⁻⁷⁵) with higher plasma concentrations of soluble CD209 in all three datasets (associations from two datasets illustrated in Figure 3, and associations from all three datasets tabulated in Supplementary file 6). To validate this as a relevant target for COVID-19, we experimentally tested whether CD209 directly interacts with SARS-CoV-2, as had been recently proposed based on similarities to SARS-CoV-1, which was reported to bind CD209 (Yang, 2020). We used human cells to generate recombinant SARS-CoV-2 spike protein, spanning the full-length extracellular domain according to a design previously established to retain functionality (Shilts et al., 2021). We found that purified spike protein indeed could directly attach onto human cells expressing CD209 but not control cells, suggesting that CD209 could act as a receptor for viral attachment onto host cells (Figure 4A). Furthermore, in a direct binding assay testing purified soluble CD209 and the viral spike protein, we could detect binding that was specific and comparable to the primary known receptor for SARS-CoV-2, ACE2 (Figure 4B).

Figure 3. Proteome-wide association of the ABO signal (rs8176719-insC) in (A) Sun et al. and (B) Emilsson et al. datasets.

The x-axis represents the chromosome for the gene encoding the protein. The y-axis represents the p-value of the per-allele association of rs8176719-insC (or an SNP in high linkage disequilibrium at r2 >0.8 with rs8176719-insC) with the proteins in Sun et al. and Emilsson et al. datasets. The red triangles point downwards and denote the inverse association of the ABO signal with the protein. The blue triangles point upwards and denote the positive association of the ABO signal with the protein. Only proteins that were considered significant at the study-specific Bonferroni-corrected p-value thresholds are displayed in this plot and tabulated in Supplementary file 6. (Supplementary file 6 also reports associations from an additional protein dataset – Suhre et al.).

Figure 4. In vitro binding experiments with purified SARS-CoV-2 spike protein confirm human CD209 as a functional binding target.

(A) Human cell lines overexpressing cell-surface CD209 protein gain the ability to specifically bind SARS-CoV-2 spike. The density plots represent flow cytometry measurements of HEK293 cells stained with fluorescently conjugated tetramers of SARS-CoV-2 spike protein or a tag-only protein control. Blue distributions are cells with surface CD209, while red are control-transfected cells. Light shades indicate a negative control tetramer that was used for staining, while dark shades are stained with spike protein. (B) Purified recombinant CD209 ectodomains interact with the spike protein of SARS-CoV-2 in an in vitro binding assay. A dilution series of purified spike protein was applied over immobilised CD209, ACE2 (positive control), or a negative control protein. A plot of quantified absorbance is displayed alongside a representative assay plate. Error bars are standard deviations of two replicates.

Proteome-wide genetic colocalisation implicates additional proteins in COVID-19 risk including FAS, SCARA5, and OAS1

To identify additional proteins associated with the risk of COVID-19, we conducted proteome-wide genetic colocalisation tests followed by single-SNP MR analysis (Supplementary file 7). This ‘coloc-first’ approach identified four proteins (ABO, FAS, OAS1, THBS3) with evidence of genetic colocalisation (PP.H4 >0.8) with four out of seven COVID-19 phenotypes (Figure 5). Two of these (FAS and THBS3) have not, to the best of our knowledge, been reported in proteomic MR studies of COVID-19 to date, which have examined colocalisation evidence only after MR.

Figure 5. Forest plot illustrating associations of genetically predicted plasma protein concentrations that colocalised with the selected COVID-19 phenotypes (PP.H4 > 0.6).

The black point estimates represent odds ratios (ORs) of COVID-19 outcome per standard deviation (SD) increase of genetically predicted protein abundance using single-SNP colocalising signals (coloc_SNP). Error bars represent the 95% confidence interval around the estimates. The areas of the squares are proportional to the inverse of the variance of the log ORs. * denotes proteins that have coloc_SNP that are either missense variants or in linkage disequilibrium with missense variants, rendering their effect estimates potentially biased.

Consistent with pan- and cis-MR findings, there was evidence of genetic colocalisation between the ABO protein and six out of seven COVID-19 phenotypes (Figure 6), with similar MR estimates when the colocalising SNP was used to perform single-SNP cis-MR.

Figure 6. Regional association plots arranged to mirror the genetic associations of the colocalising proteins (FAS, ABO, and OAS1) with their respective COVID-19 phenotypes.

The top panels represent genetic associations of the selected COVID-19 phenotypes, and the bottom panels represent genetic associations of the protein from the Sun et al. dataset. The x-axis in each panel represents the genomic locations in or around the genes encoding FAS, ABO, and OAS1. The y-axis in each panel represents the p-value of the genetic associations.

The coloc-first approach revealed a common genetic signal between OAS1 and COVID-19 in two out of seven COVID-19 phenotypes (PP.H4 = 0.88 in COVID-19 vs. population, and 0.82 in hospitalised COVID-19 vs. population). However, the SNPs representing the common genetic signal between OAS1 and COVID-19 phenotypes (12:112919637:G:A and 12:112919388:G:A) were missense variants in the OAS1 gene or in LD with missense variants at r2 >0.8, rendering their effect estimates potentially biased due to aptamer binding effects (see Materials and methods). Despite this, the same variants have effects on gene expression (as assessed in any of several tissues curated by Open Targets Genetics; Open Targets Genetics, 2019), which is independent of aptamer binding, suggesting that causal inference regarding OAS1 protein and COVID-19 risk may still be valid. We have, therefore, presented OAS1 estimates in Figure 5 but flagged them with an asterisk denoting the effect estimates as potentially biased.

Unlike ABO, OAS1 could not be tested using the multi-instrument MR approach due to an insufficient number of valid instruments, highlighting the complementary value of the genetic colocalisation approach alongside multi-SNP MR methods. In single-SNP MR analyses, genetically predicted higher OAS1 was associated with lower risk of severe COVID-19 vs. population (OR [95% CI] per SD genetically predicted OAS1 concentrations: 0.52 [0.42, 0.65], p=5 × 10–9), hospitalised COVID-19 vs. population (0.63 [0.53, 0.75], p=1 × 10–7) and susceptibility to COVID-19 vs. population (0.79 [0.71, 0.87], p=2 × 10–6).
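As an illustration of how a single-SNP MR estimate of this kind can be obtained, the sketch below assumes the standard Wald ratio estimator (SNP-outcome effect divided by SNP-exposure effect) with a first-order standard error. The estimator actually used is described in the Materials and methods, which are not reproduced in this section, so this is an assumption. The function name and the input values are hypothetical and are not taken from the study data.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome, z_crit=1.96):
    """Single-SNP MR estimate via the Wald ratio (assumed estimator).

    beta_exposure: per-allele effect of the SNP on protein level (SD units)
    beta_outcome:  per-allele effect of the SNP on the outcome (log odds)
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    log_or = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    return {
        "or": math.exp(log_or),
        "ci_low": math.exp(log_or - z_crit * se),
        "ci_high": math.exp(log_or + z_crit * se),
        "se_log_or": se,
    }


# Hypothetical summary statistics for illustration only.
example = wald_ratio(beta_exposure=0.50, beta_outcome=-0.20, se_outcome=0.05)
print(example)
```

The result is expressed as an odds ratio per SD increase in genetically predicted protein concentration, which is the scale used for the estimates reported above.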

Proteins that exhibited association in only one of the COVID-19 phenotypes included circulating FAS and THBS3. Genetically predicted elevated FAS (indicated by two FAS probes: FAS.9459.7.3 and FAS.5392.73.2) and THBS3 were associated with a higher risk of severe COVID-19 (OR [95% CI] per SD genetically predicted FAS concentrations indicated by FAS.9459.7.3: 1.38 [1.17, 1.62], p=1 × 10–4) and COVID-19 vs. lab/self-reported negative COVID-19 (OR [95% CI] per SD genetically predicted THBS3 concentrations: 1.70 [1.30, 2.22], p=9 × 10–5), respectively.
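The odds ratios, confidence intervals, and p-values quoted above can be cross-checked against one another. The sketch below recovers the standard error of the log OR from each reported 95% CI and derives a two-sided p-value, assuming a normal approximation on the log-odds scale (an assumption made here for illustration; the function name is hypothetical). Because the published estimates are rounded to two decimal places, the implied p-values should be expected to match the reported ones only approximately.

```python
import math


def p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z and a two-sided p-value from an OR and its 95% CI."""
    log_or = math.log(or_est)
    # The CI is symmetric on the log scale: width = 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Estimates as reported in the text: (OR, lower 95% CI, upper 95% CI).
reported = {
    "OAS1, severe COVID-19 vs population": (0.52, 0.42, 0.65),
    "OAS1, hospitalised COVID-19 vs population": (0.63, 0.53, 0.75),
    "OAS1, COVID-19 vs population": (0.79, 0.71, 0.87),
    "FAS (FAS.9459.7.3), severe COVID-19": (1.38, 1.17, 1.62),
    "THBS3, COVID-19 vs negative": (1.70, 1.30, 2.22),
}

results = {label: p_from_or_ci(*vals) for label, vals in reported.items()}
for label, (se, z, p) in results.items():
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.1e}")
```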

PheWAS with colocalising variants provides additional biological insights for the basis of associations of the proteins with risk of COVID-19

For the proteins with evidence of genetic colocalisation between protein and COVID-19 phenotype, we used their lead variants (or variants they tag at r2 >0.6 if a lead variant was not reported in a GWAS) to identify additional associated phenotypes. At p<1 × 10–5 (Bonferroni corrected for the ~3000 phenotypes in the Open Targets Genetics portal), most of the variants exhibited associations with haematological indices, with some, like the ABO signal, also associated with other COVID-19-relevant phenotypes (Supplementary file 8). For example, the ABO signal was associated with monocyte count, deep vein thrombosis (DVT), and pulmonary embolism (PE). OAS1 and THBS3 variants were associated with platelet counts. For FAS, there were no additional phenotypic associations at p<1 × 10–5 shown by its colocalising variant.
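The Bonferroni threshold quoted above can be reproduced with simple arithmetic. The sketch below assumes a family-wise error rate of 0.05, which is not stated explicitly in the text, and uses the approximate count of 3000 phenotypes; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# alpha = 0.05 is an assumption; ~3000 phenotypes is taken from the text.
threshold = bonferroni_threshold(alpha=0.05, n_tests=3000)
print(f"{threshold:.2e}")
```

Under these assumptions the exact threshold is about 1.7 × 10–5, so the cut-off of 1 × 10–5 used here is a slightly more conservative rounding.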

Discussion

Our systematic proteome-wide MR and genetic colocalisation analysis supported several previously proposed proteins and suggested additional clinically actionable targets for COVID-19 (Table 1). Of particular note, we provided pan- and cis-MR evidence with strong genetic colocalisation support for the ABO signal for most COVID-19 phenotypes. Although the ABO protein itself is not clinically actionable, the ABO signal was linked to plasma concentrations of several clinically tractable targets. We demonstrated that CD209, the protein we had found to have the strongest association with this ABO signal, interacts directly with the SARS-CoV-2 spike protein, providing further evidence for a plausible mechanism. Our analyses also supported the role of soluble IL-6R in hospitalised COVID-19: the evidence from pan- and cis-MR analyses was accompanied by only limited evidence of genetic colocalisation, but was supported by the recent COVID-19 clinical trials of tocilizumab (whose effect is partially mimicked by the IL-6R instrument used in the present study). Using a proteome-wide ‘colocalisation-first’ approach, we recapitulated previously reported targets (e.g. OAS1) and uncovered additional novel proteins that may play causal roles in COVID-19 susceptibility (THBS3), or severity (FAS).

Our proteome-wide genetic colocalisation analysis prioritised soluble Fas (sFas, also known as soluble CD95) receptor protein in the very severe COVID-19 phenotype. This finding was not reported in previous proteomic MR studies of COVID-19 most likely because they only assessed evidence of shared signals for targets prioritised by an MR-first approach. The soluble Fas receptor is reported to act as a decoy receptor competing with the trans-membrane Fas receptor for Fas ligand (FasL) (Cheng, 1994). Genetically predicted higher circulating sFas is, therefore, likely to represent effects of lower Fas-FasL signalling and, in our study, was associated with a higher risk of very severe COVID-19. Fas-FasL signalling typically initiates a cascade of intracellular programmes that result in cell death or apoptosis. Fas-mediated apoptosis plays a central role in T- and B-cell homeostasis (Hao, 2008), preventing the emergence of autoreactive or overactive immune cells (Hao, 2008; Butt, 2015). Excessive inflammation by hyperactive T-cells and autoantibodies was reported to underlie several cases of severe COVID-19 (Khamsi, 2021). To what extent sFas contributes to the excessive pro-inflammatory response in severe COVID-19 remains to be determined. Furthermore, Fas-mediated apoptosis of virus-infected cells is a major mechanism of resolution of viral infections (Thomson, 2001). Delayed apoptosis is reported to be one of the strategies exploited by SARS-CoV-2 in the early stages of infection to facilitate viral replication (Thomson, 2001; Ivanisenko et al., 2020). Additional insights for the role of sFas in COVID-19 can be gleaned from the results of recent drug trials. For example, Fas is one of the major targets of lopinavir-ritonavir – a combination HIV protease inhibitor (Sorbera et al., 2020); clinical trials of lopinavir-ritonavir failed to provide any therapeutic benefits beyond standard care in hospitalised COVID-19 patients (Cao et al., 2020). 
On the other hand, clinical trials testing dexamethasone demonstrated beneficial effects on survival for COVID-19 patients who were on respiratory support (RECOVERY Collaborative Group et al., 2021). In addition to its anti-inflammatory effects, dexamethasone downregulates molecules associated with decelerating apoptosis (Achuthan, 2018), including sFas (Joashi et al., 2002). An in-depth assessment of the specific role of soluble Fas in COVID-19, including whether or not it contributes to the beneficial effects of dexamethasone, is warranted in future studies.

Observational studies were the first to report differences in risk of severe COVID-19 based on ABO blood groups, although with some conflicting reports (Zhao, 2020; Zietz et al., 2020; Bhattacharjee et al., 2020). GWAS of COVID-19 susceptibility have, however, consistently reported a signal in the ABO locus (Bhattacharjee et al., 2020; Shelton, 2020), despite prior observations that controls used in the first published GWAS of COVID-19 (Ellinghaus et al., 2020) may be over-represented for blood group O (the most common blood group) and can result in associations due to selection bias. However, in a meta-analysis of GWAS of COVID-19, the ABO signal remained even when Ellinghaus et al. was excluded.

Furthermore, using these GWAS data, we and Katz et al. (in a preprint) (Katz et al., 2020; Karim et al., 2020) had previously linked the ABO signal with CD209/DC-SIGN protein, clotting factors, coagulation disorders, and concentrations of IL-6, all potential risk factors for COVID-19. In the present study, we build on previous work and show consistent cis- and pan-MR associations of genetically predicted circulating ABO protein with an expanded list of COVID phenotypes which colocalise with the ABO signal, supporting a shared genetic signal of ABO protein and the COVID-19 phenotypes.

We show that, next to the ABO protein, the ABO signal had the strongest association with the CD209 protein relative to other proteins and present experimental evidence of binding of CD209 with the full-length spike protein of SARS-CoV-2, independently but consistent with a concurrent preprint (Amraie, 2020). CD209 is a receptor on monocyte-derived dendritic cells (moDCs) that was shown, before this binding interaction was known, to facilitate entry of replication-competent SARS-CoV-2 and demonstrated to switch off the type I interferon signalling pathways necessary for transcription of several antiviral genes (Yang, 2020). The soluble isoforms of CD209 measured in our proteome datasets are known to correlate in expression levels to the membrane isoforms (Mummidi et al., 2001; Plazolles, 2011), making it plausible that the signal we observed is associated with greater abundance of CD209 as a cell-surface viral receptor. Alternatively, in the context of other viruses known to directly bind CD209 as we show here for SARS-CoV-2, soluble CD209 has been demonstrated to modulate infection, such as by promoting endocytosis if the soluble CD209 coating the virus acts as opsonins (Plazolles, 2011). Further research would be beneficial to reveal which of these mechanisms explain the association we observed. These findings may also help interpret the clinical significance of the higher CD209 gene expression in immune cells (extracted from bronchoalveolar lavage fluid) in severe COVID patients than healthy controls (Gao, 2020). It should be noted that the present study developed a therapeutic hypothesis for CD209 based solely on the strong evidence of association of the ABO signal with plasma concentrations of CD209 and evidence from the pan-MR association of CD209 with COVID-19 phenotypes (the pan-MR associations being driven mainly by trans-acting ABO SNPs) with no corresponding support of cis-MR or colocalisation. 
This suggests that while cis-MR and colocalisation analyses can support pan-MR associations of a target with disease, the lack of cis-MR or colocalisation for a target is not necessarily evidence against its therapeutic relevance.

In the present study, we also found that genetically predicted higher OAS1 – an interferon-induced broad-spectrum antiviral enzyme – was associated with lower risk of both susceptibility and severity of COVID-19, consistent with findings of a recently published report (Zhou et al., 2021). A large clinical trial of systemically administered interferons failed to show any substantial therapeutic benefits for severe COVID-19 (WHO Solidarity Trial Consortium et al., 2021). However, the strong evidence from human genetics supports reconsidering the role of interferon-based therapies in a new light, especially with respect to timing of administration (on which current genetic studies are unable to provide any insights) and route (systemic vs. nebulised) (Monk, 2020).

Non-O blood group individuals generally have higher risk of DVT and other coagulation disorders than O blood group individuals (Groot, 2020). The ABO signal, which largely determines the non-O blood groups, was also associated with DVT, PE, and higher levels of vWF and F8; vWF binds to and protects F8 from biological degradation (Federici, 2003). F8 is a key protein in the intrinsic coagulation pathway that activates Factor X and induces the formation of fibrin – the central component of blood clots (Bhopale and Nanda, 2003). Both DVT and PE are reported to affect almost a third of ICU-admitted COVID-19 patients (Malas, 2020). While several clinical trials evaluating the efficacy of anticoagulants for severe COVID-19 are underway, the National Institute for Health and Care Excellence in the UK has suggested screening all hospitalised COVID-19 patients for any contraindications to anticoagulant use and offering prophylactic anticoagulation to eligible patients (National Institute for Health and Care Excellence, 2020).

We found moderate evidence for the role of IL-6 signalling in COVID-19 in agreement with a previous report (Bovijn et al., 2021). However, there was ambiguous evidence of genetic colocalisation (PP.H4: 0.46). Nevertheless, there was more support for a shared genetic signal between sIL-6R and hospitalised COVID-19 than for them to be driven by independent signals (H4/H3 = 3.6). As noted by Bovijn and colleagues (Bovijn et al., 2021), with some caveats, the phenotypic consistency of associations between the IL-6R genetic instrument and the pharmacological effect of tocilizumab enables potential use of the IL-6R instrument to investigate therapeutic or adverse effects of tocilizumab. Although a previous report showed largely neutral effects of tocilizumab compared to placebo in hospitalised COVID-19 patients (Stone et al., 2020), two recent trials (REMAP-CAP [Anthony C and Paul R, 2021] and RECOVERY [RECOVERY Collaborative Group, 2021]) with a longer follow-up period showed beneficial effects on survival at 90 days, consistent with the prediction of a protective effect using the tocilizumab-mimicking IL-6R genetic instrument in the present study and the previous report.
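To make the relationship between the two colocalisation figures quoted above explicit, the sketch below works through the arithmetic. It assumes the usual five-hypothesis colocalisation framework, in which H3 denotes distinct causal variants for the two traits, H4 denotes a single shared causal variant, and the five posterior probabilities sum to one. The function name is hypothetical; only PP.H4 = 0.46 and H4/H3 = 3.6 are taken from the text.

```python
def implied_posteriors(pp_h4, h4_to_h3_ratio):
    """Derive PP.H3 and the remaining posterior mass from PP.H4 and H4/H3."""
    if h4_to_h3_ratio <= 0:
        raise ValueError("ratio must be positive")
    pp_h3 = pp_h4 / h4_to_h3_ratio
    # Posterior mass left for H0, H1 and H2 (no signal in one or both traits).
    pp_rest = 1.0 - pp_h4 - pp_h3
    if pp_rest < 0:
        raise ValueError("inputs imply a total posterior above 1")
    return pp_h3, pp_rest


pp_h4 = 0.46  # reported PP.H4
pp_h3, pp_rest = implied_posteriors(pp_h4, h4_to_h3_ratio=3.6)

# Probability of a shared signal, given that both traits have a signal at all.
share = pp_h4 / (pp_h4 + pp_h3)
print(f"PP.H3={pp_h3:.3f}, PP(H0+H1+H2)={pp_rest:.3f}, H4|(H3 or H4)={share:.3f}")
```

Read this way, a modest PP.H4 alongside a ratio above 3 suggests that much of the remaining posterior mass lies with hypotheses in which at least one trait lacks a detectable signal at the locus, rather than with distinct causal variants.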

The major strengths of our study include the use of both genome-wide and local genetic instruments for MR analysis, the proteome-wide genetic colocalisation tests to nominate additional proteins of therapeutic relevance, and the expanded list of COVID-19 phenotypes analysed. We showed consistency of the association of ABO with the different COVID-19 phenotypes for both instrument selection strategies. Proteome-wide colocalisation tests implicated additional proteins that likely lacked sufficient genetic instruments to be detected by the multi-instrument GSMR method. For our top-ranked association with the CD209 protein, we provide experimental evidence for a mechanism that implicates CD209 as having a potentially causal role in disease pathology. Our experiments provide both direct evidence of biochemical binding between the purified spike protein of SARS-CoV-2 and CD209, and verification that this interaction occurs in live human cells. Host-directed therapies involving pathogen binding receptors have previously been developed against other infectious diseases where pathogen mutations or variants stymied more traditional approaches (Zenonos et al., 2015).

Our study also has several limitations. The reliability of the MR approach depends on the selection of the appropriate genetic instruments for the exposure (Schmidt, 2020). Where proteins are the exposure, the use of genetic instruments from across the genome can result in more instruments and potentially higher power to detect associations. However, the inclusion of a broader set of genetic instruments for protein-MR analysis can lead to associations not mediated by the protein under investigation (i.e. horizontal pleiotropy). In these cases, the use of genetic variants near or in the locus encoding the protein (cis-acting SNPs) can provide more specific estimates of risk, albeit at a potential power cost, associated with genetically predicted concentrations of the protein under investigation (Schmidt, 2020). A key problem of the latter approach is the selection of correlated genetic instruments that can lead to numerical approximation errors (Gkatzionis et al., 2021). In the present study, we leveraged both pan- and cis-MR approaches and used an MR method (GSMR) that automates the selection of near-independent genetic instruments and performs MR adjusting for any residual correlation (Zhu et al., 2018). Nevertheless, horizontal pleiotropy can also affect cis-MR analyses when different variants from the same gene region represent different biological pathways, indicated by heterogeneous effect estimates, or driven by a single variant with a large effect (e.g. missense variants) (Gkatzionis et al., 2021). To prevent the selection of heterogeneous instruments and minimise the selection of variants with large effects, the multi-instrument GSMR method used in the present study implements the HEIDI test which excludes genetic variants with strong or heterogeneous effects. 
The exclusion of missense variants with potential aptamer binding effects is evidenced in our study, where SNPs in 96% of nominally significant protein probes associated with COVID-19 also had effects on corresponding gene expression in different tissues across gene expression datasets as curated by our portal (Open Targets Genetics, 2019). Even while using cis-acting genetic instruments, the MR associations can be confounded due to LD between cis-pQTLs and disease-associated SNPs, and this is at least partially mitigated by genetic colocalisation tests. However, the genetic colocalisation tests used in our study assumed a single causal variant in each locus and will, therefore, result in more false-negative tests if there is more than one trait-associated causal variant. An additional issue is related to the selection of COVID-19 GWAS datasets used for analyses. Most protein-MR studies have used COVID-19 phenotypes with population-based controls, given that their larger number of controls provides additional power to detect signals, but at the cost of not being able to distinguish signals relevant to disease progression. While study designs with milder/asymptomatic cases as controls are useful to study disease progression, they are frequently underpowered and, because the selection of study participants is conditioned on the outcome, are susceptible to collider-stratification bias (Griffith, 2020). To enable a comprehensive assessment, we used all published COVID-19 phenotypes (October 2020 freeze), irrespective of controls used and, as expected, found most signals in COVID-19 phenotypes with population-based controls. For one of the targets (CD209), although we experimentally demonstrate binding of CD209 with the spike protein of SARS-CoV-2, understanding the functional significance of CD209 for viral entry and any immunological relevance during infection requires further research. 
Finally, although we nominate several targets that may be therapeutically relevant for COVID-19, clinical trials are required for definitive assessments and to guide therapy. For example, the findings related to the ABO signal strongly implicated the adverse role of dysregulated coagulation in COVID-19 specifically in non-O blood group individuals; whether pre-emptive use of anticoagulants guided by blood groups can prevent severe COVID-19 is subject to findings of trials such as the ongoing ACTIV-4 trial (NCT04505774) (U.S. National Library of Medicine, 2020).

In conclusion, we integrated genetic investigation with functional assessments of CD209, a receptor in moDCs, and postulated that this target may convey the COVID-19 risk of the ABO signal. Based on proteome-wide genetic colocalisation and MR, we also prioritised sFas for more detailed investigations of its therapeutic relevance to severe COVID-19 risk.

Acknowledgements

MAK, JSc, JH, AB, DO, MC, EMM, MG, and ID were funded by Open Targets. JZ and TRG were funded by the UK Medical Research Council Integrative Epidemiology Unit (MC_UU_00011/4). JSh and GJW were funded by the Wellcome Trust Grant 206194. This research was funded in part by the Wellcome Trust (grant 206194). For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Mohd Anisul, Email: mk31@sanger.ac.uk.

Maya Ghoussaini, Email: mg29@sanger.ac.uk.

Ian Dunham, Email: dunham@ebi.ac.uk.

John W Schoggins, University of Texas Southwestern Medical Center, United States.

Jos W Van der Meer, Radboud University Medical Centre, Netherlands.

Funding Information

This paper was supported by the following grants:

  • Wellcome Trust Grant 206194 to Mohd Anisul, Jarrod Shilts, Jeremy Schwartzentruber, Gavin J Wright, Maya Ghoussaini.

  • Medical Research Council MC_UU_00011/4 to Jie Zheng, Tom R Gaunt.

  • Open Targets to Mohd Anisul, Jeremy Schwartzentruber, James Hayhurst, Annalisa Buniello, David Ochoa, Miguel Carmona, Ellen M McDonagh, Maya Ghoussaini, Ian Dunham.

Additional information

Competing interests

Open Targets is a pre-competitive partnership currently involving the Wellcome Sanger Institute, EMBL-EBI, BMS, GSK, and Sanofi. Research is funded by financial and in-kind contributions from each of the partners.

none.

Open Targets is a pre-competitive partnership currently involving the Wellcome Sanger Institute, EMBL-EBI, BMS, GSK, and Sanofi. Research is funded by financial and in-kind contributions from each of the partners. ES is also a full-time employee of Bristol-Myers Squibb.

Dr Holmes has consulted for Boehringer Ingelheim, and in adherence to the University of Oxford’s Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU) staff policy, did not accept personal honoraria or other payments from pharmaceutical companies.

JM is a full-time employee of Bristol-Myers Squibb and retains stock or stock options in Bristol-Myers Squibb. The author has no other competing interests to declare.

TG received grants from Biogen and GlaxoSmithKline. The author has no other competing interests to declare.

Open Targets is a pre-competitive partnership currently involving the Wellcome Sanger Institute, EMBL-EBI, BMS, GSK, and Sanofi. Research is funded by financial and in-kind contributions from each of the partners. ID also received travel costs within the last 36 months from Takeda for speaking at their Reverse Translation Symposium. The author has no other competing interests to declare.

Author contributions

Conceptualization, Data curation, Formal analysis, Visualization, Writing – original draft, Investigation.

Conceptualization, Project administration, Writing – original draft, Investigation.

Conceptualization, Supervision, Investigation.

Data curation, Resources, Software, Investigation.

Data curation, Resources, Investigation.

Conceptualization, Writing – review and editing, Supervision, Investigation.

Supervision, Investigation.

Conceptualization, Supervision, Project administration, Writing – original draft, Investigation.

Data curation, Resources, Software, Investigation.

Data curation, Validation, Resources.

Conceptualization, Supervision, Investigation.

Supervision, Investigation.

Validation, Resources, Investigation.

Resources, Supervision, Investigation.

Resources, Supervision, Investigation.

Conceptualization, Writing – review and editing, Supervision, Investigation.

Conceptualization, Supervision, Investigation.

Conceptualization, Validation, Supervision, Investigation.

Ethics

Human subjects: All institutions contributing cohorts to the COVID-19 Host Genetics Initiative and INTERVAL (Sun et al) study for proteomics received ethics approval from their respective research ethics review boards. All participants in the INTERVAL study provided informed consent before joining the INTERVAL study with approval from the National Research Ethics (11/EE/0538). Ethics statements of studies that contributed participant data to the COVID-19 Host Genetics Initiative are provided in Supplementary Table 1 of their recently published paper (https://www.nature.com/articles/s41586-021-03767-x).

Additional files

Supplementary file 1. Definitions of COVID outcomes.
elife-69719-supp1.xlsx (10.7KB, xlsx)
Supplementary file 2. Summary of proteins prioritised by pan-Mendelian randomisation.
elife-69719-supp2.xlsx (10.9KB, xlsx)
Supplementary file 3. Pan-Mendelian randomisation outcomes at p<0.05, each association divided into cis- or trans-pQTLs.
elife-69719-supp3.xlsx (241.6KB, xlsx)
Supplementary file 4. Cis-Mendelian randomisation outcomes at p<0.05.
elife-69719-supp4.xlsx (305.4KB, xlsx)
Supplementary file 5. Evaluation of pan-Mendelian randomisation association of protein probes that have passed the 5% FDR.
elife-69719-supp5.xlsx (13KB, xlsx)
Supplementary file 6. Protein-wide association studies (at study-specific Bonferroni thresholds) of the ABO signal using three proteomic datasets (Sun et al., Emilsson et al., Suhre et al.).
elife-69719-supp6.xlsx (16.7KB, xlsx)
Supplementary file 7. Proteome-wide genetic colocalisation results.
elife-69719-supp7.xlsx (4.5MB, xlsx)
Supplementary file 8. Phenome-wide association study (p<0.05) from Open Targets Genetics portal for each colocalising variant.
elife-69719-supp8.xlsx (310.7KB, xlsx)
Supplementary file 9. Key to Table 1.
elife-69719-supp9.xlsx (9.7KB, xlsx)
Transparent reporting form

Data availability

Summary data used for genetic analyses are publicly available (Sun et al can be downloaded from GWAS catalog https://www.ebi.ac.uk/gwas/downloads/summary-statistics and COVID-19 HGI summary statistics can be downloaded from their website https://www.covid19hg.org/results/). Data generated from our study are provided in the supplementary files (pan-MR and cis-MR association results filtered at p < 0.05 and no filters applied to colocalisation results).

The following previously published datasets were used:

Sun BB. 2018. Genomic atlas of the human plasma proteome. GWAS Catalog. GCST005806

COVID-19 Host Genetics Initiative 2021. Mapping the human genetic architecture of COVID-19 by worldwide meta-analysis. COVID GWAS meta-analysis results. release 4

References

  1. Achuthan A. Glucocorticoids promote apoptosis of proinflammatory monocytes by inhibiting ERK activity. Cell Death & Disease. 2018;9:267. doi: 10.1038/s41419-018-0332-4.
  2. Amraie R. CD209L/L-SIGN and CD209/DC-SIGN act as receptors for SARS-CoV-2 and are differentially expressed in lung and kidney epithelial and endothelial cells. bioRxiv. 2020. doi: 10.1101/2020.06.22.165803.
  3. Anisul M. Covid_paper. swh:1:rev:4ab9f9b17ffde57f7831ea555394290ba240a2b9. Software Heritage. 2021. https://github.com/mohdkarim/covid_paper
  4. Anthony C G, Paul R M. Interleukin-6 Receptor Antagonists in Critically Ill Patients with COVID-19 Preliminary Report. medRxiv. 2021. doi: 10.1101/2021.01.07.21249390.
  5. Arguinano A-AA, Ndiaye NC, Masson C, Visvikis-Siest S. Pleiotropy of ABO gene: Correlation of RS644234 with e-selectin and lipid levels. Clinical Chemistry and Laboratory Medicine. 2018;56:748–754. doi: 10.1515/cclm-2017-0347.
  6. Bartholdson SJ. Semaphorin-7A is an erythrocyte receptor for P. falciparum merozoite-specific TRAP homolog, MTRAP. PLOS Pathogens. 2012;8:e1003031. doi: 10.1371/journal.ppat.1003031.
  7. Bhattacharjee S, Banerjee M, Pal R. ABO blood groups and severe outcomes in COVID-19: A meta-analysis. Postgraduate Medical Journal. 2020. doi: 10.1136/postgradmedj-2020-139248.
  8. Bhopale GM, Nanda RK. Blood coagulation factor VIII: An overview. Journal of Biosciences. 2003;28:783–789. doi: 10.1007/BF02708439.
  9. Bouhaddou M, Memon D, Meyer B, White KM, Rezelj VV, Correa Marrero M, Polacco BJ, Melnyk JE, Ulferts S, Kaake RM, Batra J, Richards AL, Stevenson E, Gordon DE, Rojc A, Obernier K, Fabius JM, Soucheray M, Miorin L, Moreno E, Koh C, Tran QD, Hardy A, Robinot R, Vallet T, Nilsson-Payant BE, Hernandez-Armenta C, Dunham A, Weigang S, Knerr J, Modak M, Quintero D, Zhou Y, Dugourd A, Valdeolivas A, Patil T, Li Q, Hüttenhain R, Cakir M, Muralidharan M, Kim M, Jang G, Tutuncuoglu B, Hiatt J, Guo JZ, Xu J, Bouhaddou S, Mathy CJP, Gaulton A, Manners EJ, Félix E, Shi Y, Goff M, Lim JK, McBride T, O’Neal MC, Cai Y, Chang JCJ, Broadhurst DJ, Klippsten S, De Wit E, Leach AR, Kortemme T, Shoichet B, Ott M, Saez-Rodriguez J, tenOever BR, Mullins RD, Fischer ER, Kochs G, Grosse R, García-Sastre A, Vignuzzi M, Johnson JR, Shokat KM, Swaney DL, Beltrao P, Krogan NJ. The global phosphorylation landscape of SARS-CoV-2 infection. Cell. 2020;182:685–712. doi: 10.1016/j.cell.2020.06.034.
  10. Bovijn J, Lindgren CM, Holmes MV. Genetic IL-6R variants and therapeutic inhibition of IL-6 receptor signalling in COVID-19 - Authors’ reply. The Lancet. Rheumatology. 2021;3:e97–e98. doi: 10.1016/S2665-9913(20)30415-X. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Bushell KM, Söllner C, Schuster-Boeckler B, Bateman A, Wright GJ. Large-scale screening for novel low-affinity extracellular protein interactions. Genome Research. 2008;18:622–630. doi: 10.1101/gr.7187808. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Butt D. FAS Inactivation Releases Unconventional Germinal Center B Cells that Escape Antigen Control and Drive IgE and Autoantibody Production. Immunity. 2015;42:890–902. doi: 10.1016/j.immuni.2015.04.010. [DOI] [PubMed] [Google Scholar]
  13. Cao B, Wang Y, Wen D, Liu W, Wang J, Fan G, Ruan L, Song B, Cai Y, Wei M, Li X, Xia J, Chen N, Xiang J, Yu T, Bai T, Xie X, Zhang L, Li C, Yuan Y, Chen H, Li H, Huang H, Tu S, Gong F, Liu Y, Wei Y, Dong C, Zhou F, Gu X, Xu J, Liu Z, Zhang Y, Li H, Shang L, Wang K, Li K, Zhou X, Dong X, Qu Z, Lu S, Hu X, Ruan S, Luo S, Wu J, Peng L, Cheng F, Pan L, Zou J, Jia C, Wang J, Liu X, Wang S, Wu X, Ge Q, He J, Zhan H, Qiu F, Guo L, Huang C, Jaki T, Hayden FG, Horby PW, Zhang D, Wang C. A Trial of Lopinavir–Ritonavir in Adults Hospitalized with Severe Covid-19. New England Journal of Medicine. 2020;382:1787–1799. doi: 10.1056/NEJMoa2001282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Cheng J. Protection from Fas-mediated apoptosis by a soluble form of the Fas molecule. Science. 1994;263:1759–1762. doi: 10.1126/science.7510905. [DOI] [PubMed] [Google Scholar]
  15. Collier D, Anna DM, Isabella ATM, Ferreira BM, Rawling D. SARS-CoV-2 B.1.1.7 Escape from mRNA Vaccine-Elicited Neutralizing Antibodies. medRxiv. 2021 doi: 10.1101/2021.01.19.21249840. [DOI]
  16. Davies NM, Holmes MV, Davey Smith G. Reading Mendelian randomisation studies: A guide, glossary, and checklist for clinicians. BMJ. 2018;362:k601. doi: 10.1136/bmj.k601. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. The Lancet. Infectious Diseases. 2020;20:533–534. doi: 10.1016/S1473-3099(20)30120-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. EBISPOT. Ebispot/gwas-sumstats-harmoniser. 9dcf511. GitHub. 2020. https://github.com/EBISPOT/gwas-sumstats-harmoniser
  19. Ellinghaus D, Degenhardt F, Bujanda L, Buti M, Albillos A, Invernizzi P, Fernández J, Prati D, Baselli G, Asselta R, Grimsrud MM, Milani C, Aziz F, Kässens J, May S, Wendorff M, Wienbrandt L, Uellendahl-Werth F, Zheng T, Yi X, de Pablo R, Chercoles AG, Palom A, Garcia-Fernandez AE, Rodriguez-Frias F, Zanella A, Bandera A, Protti A, Aghemo A, Lleo A, Biondi A, Caballero-Garralda A, Gori A, Tanck A, Carreras Nolla A, Latiano A, Fracanzani AL, Peschuck A, Julià A, Pesenti A, Voza A, Jiménez D, Mateos B, Nafria Jimenez B, Quereda C, Paccapelo C, Gassner C, Angelini C, Cea C, Solier A, Pestaña D, Muñiz-Diaz E, Sandoval E, Paraboschi EM, Navas E, García Sánchez F, Ceriotti F, Martinelli-Boneschi F, Peyvandi F, Blasi F, Téllez L, Blanco-Grau A, Hemmrich-Stanisak G, Grasselli G, Costantino G, Cardamone G, Foti G, Aneli S, Kurihara H, ElAbd H, My I, Galván-Femenia I, Martín J, Erdmann J, Ferrusquía-Acosta J, Garcia-Etxebarria K, Izquierdo-Sanchez L, Bettini LR, Sumoy L, Terranova L, Moreira L, Santoro L, Scudeller L, Mesonero F, Roade L, Rühlemann MC, Schaefer M, Carrabba M, Riveiro-Barciela M, Figuera Basso ME, Valsecchi MG, Hernandez-Tejero M, Acosta-Herrera M, D’Angiò M, Baldini M, Cazzaniga M, Schulzky M, Cecconi M, Wittig M, Ciccarelli M, Rodríguez-Gandía M, Bocciolone M, Miozzo M, Montano N, Braun N, Sacchi N, Martínez N, Özer O, Palmieri O, Faverio P, Preatoni P, Bonfanti P, Omodei P, Tentorio P, Castro P, Rodrigues PM, Blandino Ortiz A, de Cid R, Ferrer R, Gualtierotti R, Nieto R, Goerg S, Badalamenti S, Marsal S, Matullo G, Pelusi S, Juzenas S, Aliberti S, Monzani V, Moreno V, Wesse T, Lenz TL, Pumarola T, Rimoldi V, Bosari S, Albrecht W, Peter W, Romero-Gómez M, D’Amato M, Duga S, Banales JM, Hov JR, Folseraas T, Valenti L, Franke A, Karlsen TH. Genomewide association study of severe covid-19 with respiratory failure. The New England Journal of Medicine. 2020;383:1522–1534. doi: 10.1056/NEJMoa2020283. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Emilsson V, Ilkov M, Lamb JR, Finkel N, Gudmundsson EF, Pitts R, Hoover H, Gudmundsdottir V, Horman SR, Aspelund T, Shu L, Trifonov V, Sigurdsson S, Manolescu A, Zhu J, Olafsson Ö, Jakobsdottir J, Lesley SA, To J, Zhang J, Harris TB, Launer LJ, Zhang B, Eiriksdottir G, Yang X, Orth AP, Jennings LL, Gudnason V. Co-regulatory networks of human serum proteins link genetics to disease. Science. 2018;361:769–773. doi: 10.1126/science.aaq1327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Emilsson V. Human serum proteome profoundly overlaps with genetic signatures of disease. Cold Spring Harbor Laboratory. 2020 doi: 10.1101/2020.05.06.080440. [DOI] [Google Scholar]
  22. Erik V, Swapnil M, Meera C, Jeffrey C B. Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from Linking Epidemiological and Genetic Data. medRxiv. 2020 doi: 10.1101/2020.12.30.20249034. [DOI]
  23. Federici AB. The Factor VIII/von Willebrand Factor complex: Basic and clinical issues. Haematologica. 2003;88:EREP02. [PubMed] [Google Scholar]
  24. Galaway F, Wright GJ. Rapid and sensitive large-scale screening of low affinity extracellular receptor protein interactions by using reaction induced inhibition of gaussia luciferase. Scientific Reports. 2020;10:10522. doi: 10.1038/s41598-020-67468-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Gao C. SARS-CoV-2 Spike Protein Interacts with Multiple Innate Immune Receptors. Cold Spring Harbor Laboratory. 2020 doi: 10.1101/2020.07.29.227462. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Ghoussaini M. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Research. 2021;49:D1311–D1320. doi: 10.1093/nar/gkaa840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Giambartolomei C. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLOS Genetics. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Gkatzionis A, Burgess S, Newcombe PJ. Statistical Methods for Cis-Mendelian Randomization. arXiv. 2021 doi: 10.1002/gepi.22506. https://arxiv.org/abs/2101.04081 [DOI] [PMC free article] [PubMed]
  29. Gold L, Walker JJ, Wilcox SK, Williams S. Advances in human proteomics at high scale with the Somascan proteomics platform. New Biotechnology. 2012;29:543–549. doi: 10.1016/j.nbt.2011.11.016. [DOI] [PubMed] [Google Scholar]
  30. Gordon DE, Hiatt J, Bouhaddou M, Rezelj VV, Ulferts S, Braberg H, Jureka AS, Obernier K, Guo JZ, Batra J, Kaake RM, Weckstein AR, Owens TW, Gupta M, Pourmal S, Titus EW, Cakir M, Soucheray M, McGregor M, Cakir Z, Jang G, O’Meara MJ, Tummino TA, Zhang Z, Foussard H, Rojc A, Zhou Y, Kuchenov D, Hüttenhain R, Xu J, Eckhardt M, Swaney DL, Fabius JM, Ummadi M, Tutuncuoglu B, Rathore U, Modak M, Haas P, Haas KM, Naing ZZC, Pulido EH, Shi Y, Barrio-Hernandez I, Memon D, Petsalaki E, Dunham A, Marrero MC, Burke D, Koh C, Vallet T, Silvas JA, Azumaya CM, Billesbølle C, Brilot AF, Campbell MG, Diallo A, Dickinson MS, Diwanji D, Herrera N, Hoppe N, Kratochvil HT, Liu Y, Merz GE, Moritz M, Nguyen HC, Nowotny C, Puchades C, Rizo AN, Schulze-Gahmen U, Smith AM, Sun M, Young ID, Zhao J, Asarnow D, Biel J, Bowen A, Braxton JR, Chen J, Chio CM, Chio US, Deshpande I, Doan L, Faust B, Flores S, Jin M, Kim K, Lam VL, Li F, Li J, Li YL, Li Y, Liu X, Lo M, Lopez KE, Melo AA, Nguyen P, Paulino J, Pawar KI, Peters JK, Safari M, Sangwan S, Schaefer K, Thomas PV, Thwin AC, Trenker R, Tse E, Tsui TKM, Wang F, Whitis N, Yu Z, Zhang K, Zhang Y, Zhou F, Saltzberg D, QCRG Structural Biology Consortium. Hodder AJ, Shun-Shion AS, Williams DM, White KM, Rosales R, Kehrer T, Miorin L, Moreno E, Patel AH, Rihn S, Khalid MM, Vallejo-Gracia A, Fozouni P, Simoneau CR, Roth TL, Wu D, Karim MA, Ghoussaini M, Dunham I, Berardi F, Weigang S, Chazal M, Park J, Logue J, McGrath M, Weston S, Haupt R, Hastie CJ, Elliott M, Brown F, Burness KA, Reid E, Dorward M, Johnson C, Wilkinson SG, Geyer A, Giesel DM, Baillie C, Raggett S, Leech H, Toth R, Goodman N, Keough KC, Lind AL, Zoonomia Consortium. 
Klesh RJ, Hemphill KR, Carlson-Stevermer J, Oki J, Holden K, Maures T, Pollard KS, Sali A, Agard DA, Cheng Y, Fraser JS, Frost A, Jura N, Kortemme T, Manglik A, Southworth DR, Stroud RM, Alessi DR, Davies P, Frieman MB, Ideker T, Abate C, Jouvenet N, Kochs G, Shoichet B, Ott M, Palmarini M, Shokat KM, García-Sastre A, Rassen JA, Grosse R, Rosenberg OS, Verba KA, Basler CF, Vignuzzi M, Peden AA, Beltrao P, Krogan NJ. Comparative host-coronavirus protein interaction networks reveal pan-viral disease mechanisms. Science. 2020a;370:eabe9403. doi: 10.1126/science.abe9403. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Gordon DE, Jang GM, Bouhaddou M, Xu J, Obernier K, White KM, O’Meara MJ, Rezelj VV, Guo JZ, Swaney DL, Tummino TA, Hüttenhain R, Kaake RM, Richards AL, Tutuncuoglu B, Foussard H, Batra J, Haas K, Modak M, Kim M, Haas P, Polacco BJ, Braberg H, Fabius JM, Eckhardt M, Soucheray M, Bennett MJ, Cakir M, McGregor MJ, Li Q, Meyer B, Roesch F, Vallet T, Mac Kain A, Miorin L, Moreno E, Naing ZZC, Zhou Y, Peng S, Shi Y, Zhang Z, Shen W, Kirby IT, Melnyk JE, Chorba JS, Lou K, Dai SA, Barrio-Hernandez I, Memon D, Hernandez-Armenta C, Lyu J, Mathy CJP, Perica T, Pilla KB, Ganesan SJ, Saltzberg DJ, Rakesh R, Liu X, Rosenthal SB, Calviello L, Venkataramanan S, Liboy-Lugo J, Lin Y, Huang X-P, Liu Y, Wankowicz SA, Bohn M, Safari M, Ugur FS, Koh C, Savar NS, Tran QD, Shengjuler D, Fletcher SJ, O’Neal MC, Cai Y, Chang JCJ, Broadhurst DJ, Klippsten S, Sharp PP, Wenzell NA, Kuzuoglu-Ozturk D, Wang H-Y, Trenker R, Young JM, Cavero DA, Hiatt J, Roth TL, Rathore U, Subramanian A, Noack J, Hubert M, Stroud RM, Frankel AD, Rosenberg OS, Verba KA, Agard DA, Ott M, Emerman M, Jura N, von Zastrow M, Verdin E, Ashworth A, Schwartz O, d’Enfert C, Mukherjee S, Jacobson M, Malik HS, Fujimori DG, Ideker T, Craik CS, Floor SN, Fraser JS, Gross JD, Sali A, Roth BL, Ruggero D, Taunton J, Kortemme T, Beltrao P, Vignuzzi M, García-Sastre A, Shokat KM, Shoichet BK, Krogan NJ. A SARS-COV-2 protein interaction map reveals targets for drug repurposing. Nature. 2020b;583:459–468. doi: 10.1038/s41586-020-2286-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Griffith GJ. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nature Communications. 2020;11:5749. doi: 10.1038/s41467-020-19478-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Groot HE. Genetically Determined ABO Blood Group and its Associations With Health and Disease. Arteriosclerosis, Thrombosis, and Vascular Biology. 2020;40:830–838. doi: 10.1161/ATVBAHA.119.313658. [DOI] [PubMed] [Google Scholar]
  34. Hao Z. Fas receptor expression in germinal-center B cells is essential for T and B lymphocyte homeostasis. Immunity. 2008;29:615–627. doi: 10.1016/j.immuni.2008.07.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Harvey WT, Carabelli AM, Jackson B, Gupta RK, Thomson EC, Harrison EM, Ludden C, Reeve R, Rambaut A, COVID-19 Genomics UK (COG-UK) Consortium. Peacock SJ, Robertson DL. SARS-COV-2 variants, spike mutations and immune escape. Nature Reviews. Microbiology. 2021;19:409–424. doi: 10.1038/s41579-021-00573-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. European Journal of Human Genetics. 2020;28:715–718. doi: 10.1038/s41431-020-0636-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Ivanisenko NV, Seyrek K, Kolchanov NA, Ivanisenko VA, Lavrik IN. The role of death domain proteins in host response upon sars-cov-2 infection: Modulation of programmed cell death and translational applications. Cell Death Discovery. 2020;6:101. doi: 10.1038/s41420-020-00331-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Joashi U, Tibby SM, Turner C, Mayer A, Austin C, Anderson D, Durward A, Murdoch IA. Soluble fas may be a proinflammatory marker after cardiopulmonary bypass in children. The Journal of Thoracic and Cardiovascular Surgery. 2002;123:137–144. doi: 10.1067/mtc.2002.118685. [DOI] [PubMed] [Google Scholar]
  39. Joshi A, Mayr M. In Aptamers they trust: The caveats of the Somascan biomarker discovery platform from Somalogic. Circulation. 2018;138:2482–2485. doi: 10.1161/CIRCULATIONAHA.118.036823. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Karim M, Dunham I, Ghoussaini M. Mining a gwas of severe covid-19. The New England Journal of Medicine. 2020;383:2588–2589. doi: 10.1056/NEJMc2025747. [DOI] [PubMed] [Google Scholar]
  41. Katz DH, Tahir UA, Ngo D, Benson MD, Bick AG, Pampana A, Gao Y, Keyes MJ, Correa A, Sinha S, Shen D, Yang Q, Robbins JM, Chen ZZ, Cruz DE, Peterson B, Natarajan P, Vasan RS, Smith G, Wang TJ, Gerszten RE. Proteomic Profiling in Biracial Cohorts Implicates DC-SIGN as a Mediator of Genetic Risk in COVID-19. medRxiv. 2020 doi: 10.1101/2020.06.09.20125690. [DOI]
  42. Kerr JS, Wright GJ. Avidity-based extracellular interaction screening (AVEXIS) for the scalable detection of low-affinity extracellular receptor-ligand interactions. Journal of Visualized Experiments. 2012:e3881. doi: 10.3791/3881. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Khamsi R. Rogue antibodies could be driving severe covid-19. Nature. 2021;590:29–31. doi: 10.1038/d41586-021-00149-1. [DOI] [PubMed] [Google Scholar]
  44. Lek M. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536:285–291. doi: 10.1038/nature19057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Malas MB. Thromboembolism risk of COVID-19 is high and associated with a higher risk of mortality: A systematic review and meta-analysis. EClinicalMedicine. 2020;29:100639. doi: 10.1016/j.eclinm.2020.100639. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Monk PD. Safety and efficacy of inhaled nebulised interferon beta-1a (SNG001) for treatment of SARS-CoV-2 infection: a randomised, double-blind, placebo-controlled, phase 2 trial. The Lancet Respiratory Medicine. 2020;9:196–206. doi: 10.1016/S2213-2600(20)30511-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Mountjoy E. Open Targets Genetics: An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Cold Spring Harbor Laboratory. 2020 doi: 10.1101/2020.09.16.299271. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Mummidi S, Catano G, Lam L, Hoefle A, Telles V, Begum K, Jimenez F, Ahuja SS, Ahuja SK. Extensive repertoire of membrane-bound and soluble dendritic cell-specific icam-3-grabbing nonintegrin 1 (dc-sign1) and dc-sign2 isoforms. Inter-individual variation in expression of dc-sign transcripts. The Journal of Biological Chemistry. 2001;276:33196–33212. doi: 10.1074/jbc.M009807200. [DOI] [PubMed] [Google Scholar]
  49. National Institute for Health and Care Excellence . COVID-19 Rapid Guideline: Reducing the Risk of Venous Thromboembolism in over 16s with COVID-19. London: National Institute for Health and Care Excellence, Clinical Guidelines; 2020. [PubMed] [Google Scholar]
  50. Open Targets Genetics. Opentargets/genetics-v2g-data. 172d598. GitHub. 2019. https://github.com/opentargets/genetics-v2g-data
  51. Open Targets Genetics. Assigning variants to genes (v2g). Open Targets Genetics. 2019a. https://genetics-docs.opentargets.org/our-approach/data-pipeline
  52. Opentargets Inc. Genetics-sumstat-harmoniser. 733f3e7. GitHub. 2021. https://github.com/opentargets/genetics-sumstat-harmoniser
  53. Pairo-Castineira E, Clohisey S, Klaric L, Bretherick AD, Rawlik K, Pasko D, Walker S, Parkinson N, Fourman MH, Russell CD, Furniss J, Richmond A, Gountouna E, Wrobel N, Harrison D, Wang B, Wu Y, Meynert A, Griffiths F, Oosthuyzen W, Kousathanas A, Moutsianas L, Yang Z, Zhai R, Zheng C, Grimes G, Beale R, Millar J, Shih B, Keating S, Zechner M, Haley C, Porteous DJ, Hayward C, Yang J, Knight J, Summers C, Shankar-Hari M, Klenerman P, Turtle L, Ho A, Moore SC, Hinds C, Horby P, Nichol A, Maslove D, Ling L, McAuley D, Montgomery H, Walsh T, Pereira AC, Renieri A, GenOMICC Investigators. ISARIC4C Investigators. COVID-19 Human Genetics Initiative. 23andMe Investigators. BRACOVID Investigators. Gen-COVID Investigators. Shen X, Ponting CP, Fawkes A, Tenesa A, Caulfield M, Scott R, Rowan K, Murphy L, Openshaw PJM, Semple MG, Law A, Vitart V, Wilson JF, Baillie JK. Genetic mechanisms of critical illness in covid-19. Nature. 2021;591:92–98. doi: 10.1038/s41586-020-03065-y. [DOI] [PubMed] [Google Scholar]
  54. Perrin-Cocon L, Diaz O, Jacquemin C, Barthel V, Ogire E, Ramière C, André P, Lotteau V, Vidalain PO. The current landscape of coronavirus-host protein-protein interactions. Journal of Translational Medicine. 2020;18:319. doi: 10.1186/s12967-020-02480-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Plazolles N. Pivotal advance: The promotion of soluble DC-SIGN release by inflammatory signals and its enhancement of cytomegalovirus-mediated cis-infection of myeloid dendritic cells. Journal of Leukocyte Biology. 2011;89:329–342. doi: 10.1189/jlb.0710386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. RECOVERY Collaborative Group Tocilizumab in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial. Lancet. 2021;397:1637–1645. doi: 10.1016/S0140-6736(21)00676-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. RECOVERY Collaborative Group. Horby P, Lim WS, Emberson JR, Mafham M, Bell JL, Linsell L, Staplin N, Brightling C, Ustianowski A, Elmahi E, Prudon B, Green C, Felton T, Chadwick D, Rege K, Fegan C, Chappell LC, Faust SN, Jaki T, Jeffery K, Montgomery A, Rowan K, Juszczak E, Baillie JK, Haynes R, Landray MJ. Dexamethasone in Hospitalized Patients with Covid-19. The New England Journal of Medicine. 2021;384:693–704. doi: 10.1056/NEJMoa2021436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Schmidt AF. Genetic drug target validation using Mendelian randomisation. Nature Communications. 2020;11:3255. doi: 10.1038/s41467-020-16969-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Shang J, Wan Y, Luo C, Ye G, Geng Q, Auerbach A, Li F. Cell entry mechanisms of SARS-COV-2. PNAS. 2020;117:11727–11734. doi: 10.1073/pnas.2003138117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Sharma S, Bartholdson SJ, Couch ACM, Yusa K, Wright GJ. Genome-scale identification of cellular pathways required for cell surface recognition. Genome Research. 2018;28:1372–1382. doi: 10.1101/gr.231183.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Shelton JF. Trans-Ethnic Analysis Reveals Genetic and Non-Genetic Associations with COVID-19 Susceptibility and Severity. medRxiv. 2020 doi: 10.1101/2020.09.04.20188318. [DOI]
  62. Shilts J, Crozier TWM, Greenwood EJD, Lehner PJ, Wright GJ. No evidence for basigin/CD147 as a direct SARS-CoV-2 spike binding receptor. Scientific Reports. 2021;11:413. doi: 10.1038/s41598-020-80464-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Sorbera L, Graul A, Dulsat C. Taking aim at a fast-moving target: Targets to watch for SARS-COV-2 and COVID-19. Drugs of the Future. 2020;45:239. doi: 10.1358/dof.2020.45.4.3150676. [DOI] [Google Scholar]
  64. Stone JH, Frigault MJ, Serling-Boyd NJ, Fernandes AD, Harvey L, Foulkes AS, Horick NK, Healy BC, Shah R, Bensaci AM, Woolley AE, Nikiforow S, Lin N, Sagar M, Schrager H, Huckins DS, Axelrod M, Pincus MD, Fleisher J, Sacks CA, Dougan M, North CM, Halvorsen Y-D, Thurber TK, Dagher Z, Scherer A, Wallwork RS, Kim AY, Schoenfeld S, Sen P, Neilan TG, Perugino CA, Unizony SH, Collier DS, Matza MA, Yinh JM, Bowman KA, Meyerowitz E, Zafar A, Drobni ZD, Bolster MB, Kohler M, D’Silva KM, Dau J, Lockwood MM, Cubbison C, Weber BN, Mansour MK, BACC Bay Tocilizumab Trial Investigators Efficacy of tocilizumab in patients hospitalized with covid-19. The New England Journal of Medicine. 2020;383:2333–2344. doi: 10.1056/NEJMoa2028836. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Suhre K, Arnold M, Bhagwat AM, Cotton RJ, Engelke R, Raffler J, Sarwath H, Thareja G, Wahl A, DeLisle RK, Gold L, Pezer M, Lauc G, El-Din Selim MA, Mook-Kanamori DO, Al-Dous EK, Mohamoud YA, Malek J, Strauch K, Grallert H, Peters A, Kastenmüller G, Gieger C, Graumann J. Erratum: Connecting genetic risk to disease end points through the human blood plasma proteome. Nature Communications. 2017;8:14357. doi: 10.1038/ncomms15345. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, Burgess S, Jiang T, Paige E, Surendran P, Oliver-Williams C, Kamat MA, Prins BP, Wilcox SK, Zimmerman ES, Chi A, Bansal N, Spain SL, Wood AM, Morrell NW, Bradley JR, Janjic N, Roberts DJ, Ouwehand WH, Todd JA, Soranzo N, Suhre K, Paul DS, Fox CS, Plenge RM, Danesh J, Runz H, Butterworth AS. Genomic atlas of the human plasma proteome. Nature. 2018;558:73–79. doi: 10.1038/s41586-018-0175-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Tegally H. Emergence and Rapid Spread of a New Severe Acute Respiratory Syndrome-Related Coronavirus 2 (SARS-COV-2) Lineage with Multiple Spike Mutations in South Africa. medRxiv. 2020 doi: 10.1101/2020.12.21.20248640. [DOI]
  68. Thomson BJ. Viruses and apoptosis. International Journal of Experimental Pathology. 2001;82:65–76. doi: 10.1111/j.1365-2613.2001.iep0082-0065-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. U.S. National Library of Medicine. Anti-thrombotics for adults hospitalized with COVID-19 (ACTIV-4). 2020. [April 10, 2021]. https://clinicaltrials.gov/ct2/show/record/NCT04505774
  70. Voulgaraki D. Multivalent recombinant proteins for probing functions of leucocyte surface proteins such as the CD200 receptor. Immunology. 2005;115:337–346. doi: 10.1111/j.1365-2567.2005.02161.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Wallace C. Coloc. 624e94b. GitHub. 2021. https://github.com/chr1swallace/coloc
  72. WHO Solidarity Trial Consortium. Pan H, Peto R, Henao-Restrepo A-M, Preziosi M-P, Sathiyamoorthy V, Abdool Karim Q, Alejandria MM, Hernández García C, Kieny M-P, Malekzadeh R, Murthy S, Reddy KS, Roses Periago M, Abi Hanna P, Ader F, Al-Bader AM, Alhasawi A, Allum E, Alotaibi A, Alvarez-Moreno CA, Appadoo S, Asiri A, Aukrust P, Barratt-Due A, Bellani S, Branca M, Cappel-Porter HBC, Cerrato N, Chow TS, Como N, Eustace J, García PJ, Godbole S, Gotuzzo E, Griskevicius L, Hamra R, Hassan M, Hassany M, Hutton D, Irmansyah I, Jancoriene L, Kirwan J, Kumar S, Lennon P, Lopardo G, Lydon P, Magrini N, Maguire T, Manevska S, Manuel O, McGinty S, Medina MT, Mesa Rubio ML, Miranda-Montoya MC, Nel J, Nunes EP, Perola M, Portolés A, Rasmin MR, Raza A, Rees H, Reges PPS, Rogers CA, Salami K, Salvadori MI, Sinani N, Sterne JAC, Stevanovikj M, Tacconelli E, Tikkinen KAO, Trelle S, Zaid H, Røttingen J-A, Swaminathan S. Repurposed antiviral drugs for covid-19 - interim WHO solidarity trial results. The New England Journal of Medicine. 2021;384:497–511. doi: 10.1056/NEJMoa2023184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Yamagata University Genomic Cohort Consortium (YUGCC) Pleiotropic effect of common variants at ABO glycosyltranferase locus in 9q32 on plasma levels of pancreatic lipase and angiotensin converting enzyme. PLOS ONE. 2014;9:e55903. doi: 10.1371/journal.pone.0055903. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Yang D. Attenuated Interferon and Proinflammatory Response in SARS-CoV-2–Infected Human Dendritic Cells Is Associated With Viral Antagonism of STAT1 Phosphorylation. The Journal of Infectious Diseases. 2020;222:734–745. doi: 10.1093/infdis/jiaa356. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Zenonos ZA, Dummler SK, Müller-Sienerth N, Chen J, Preiser PR, Rayner JC, Wright GJ. Basigin is a druggable target for host-oriented antimalarial interventions. The Journal of Experimental Medicine. 2015;212:1145–1151. doi: 10.1084/jem.20150032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Zhao J. Relationship between the ABO Blood Group and the COVID-19 Susceptibility. Clinical Infectious Diseases. 2020;73:328–331. doi: 10.1093/cid/ciaa1150. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Zheng J, Haberland V, Baird D, Walker V, Haycock PC, Hurle MR, Gutteridge A, Erola P, Liu Y, Luo S, Robinson J, Richardson TG, Staley JR, Elsworth B, Burgess S, Sun BB, Danesh J, Runz H, Maranville JC, Martin HM, Yarmolinsky J, Laurin C, Holmes MV, Liu JZ, Estrada K, Santos R, McCarthy L, Waterworth D, Nelson MR, Smith GD, Butterworth AS, Hemani G, Scott RA, Gaunt TR. Phenome-wide mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nature Genetics. 2020;52:1122–1131. doi: 10.1038/s41588-020-0682-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Zhou S, Butler-Laporte G, Nakanishi T, Morrison DR, Afilalo J, Afilalo M, Laurent L, Pietzner M, Kerrison N, Zhao K, Brunet-Ratnasingham E, Henry D, Kimchi N, Afrasiabi Z, Rezk N, Bouab M, Petitjean L, Guzman C, Xue X, Tselios C, Vulesevic B, Adeleye O, Abdullah T, Almamlouk N, Chen Y, Chassé M, Durand M, Paterson C, Normark J, Frithiof R, Lipcsey M, Hultström M, Greenwood CMT, Zeberg H, Langenberg C, Thysell E, Pollak M, Mooser V, Forgetta V, Kaufmann DE, Richards JB. A neanderthal oas1 isoform protects individuals of european ancestry against covid-19 susceptibility and severity. Nature Medicine. 2021;27:659–667. doi: 10.1038/s41591-021-01281-1. [DOI] [PubMed] [Google Scholar]
  79. Zhu Z, Zheng Z, Zhang F, Wu Y, Trzaskowski M, Maier R, Robinson MR, McGrath JJ, Visscher PM, Wray NR, Yang J. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nature Communications. 2018;9:224. doi: 10.1038/s41467-017-02317-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Zietz M, Zucker J, Tatonetti NP. Associations between blood type and covid-19 infection, intubation, and death. Nature Communications. 2020;11:5761. doi: 10.1038/s41467-020-19623-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision letter

Editor: John W Schoggins

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Acceptance summary:

This paper takes advantage of publicly available genomic and proteomic data to identify host proteins that may be exploited by SARS-CoV-2 and involved in regulation of COVID-19 susceptibility and severity. These studies provide a foundation for examining the potential of these host proteins to serve as therapeutic targets.

Decision letter after peer review:

Thank you for submitting your article "A proteome-wide genetic investigation identifies several SARS-CoV-2-exploited host targets of clinical relevance" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Jos van der Meer as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential Revisions (for the authors):

Please address each of the points raised by both reviewers (8 points for reviewer #1 and 4 points for reviewer #2).

Reviewer #1 (Recommendations for the authors):

One of few novel results is the identification of CD209 as a potential alternative SARS-CoV-2 receptor; CD209 is encoded by a gene on chromosome 19 but its concentrations were associated with the lead SNP rs8176719 within the ABO region on chromosome 9, which was strongly linked with several COVID-19 outcomes. This is a good example of a trans-pQTL. The authors mentioned that the interaction of CD209 with SARS-CoV-2 spike protein was recently proposed (but without providing a citation at this point, the citation was provided only much later in the paper). The authors demonstrated an interaction between SARS-CoV-2 spike protein and recombinant CD209 and in cells expressing CD209 on the cell surface. This part is interesting but very under-developed and under-discussed.

1. Specifically, is it suggested that rs8176719 is a pQTL for CD209 through the effect on ABO or through an independent effect? Mediation analysis could be applied to test this hypothesis and a biological mechanism of this association should be discussed. Both of these proteins are measured in soluble form in the blood of controls. How does this relate to the events that occur at the cell surface, presumably during SARS-CoV-2 infection?

2. P.5. Proteins with associated missense variants were flagged and excluded from any further downstream analyses on the basis that the missense variant(s) might influence aptamer binding.

SNP rs8176719 is a protein-affecting frame-shift variant within ABO, why wasn't ABO excluded from the analysis?

3. Table 1: The list of different approaches and their combinations requires a specific dictionary and a more detailed explanation of each approach, this would be a very hard part to understand for the general readership.

4. The statistical considerations: there are clearly many more trans-pQTL interactions tested compared to cis-pQTLs. How was it adjusted for in statistical analyses?

5. P5. Colocalization analysis settings assumed one independent signal in each region. Is this a valid assumption to make and how it might affect the analysis if there is more than one signal?

6. P5. "Genotype data from 10,000 randomly sampled UK Biobank participants was used as a reference panel" – reference panel for what?

7. Supplementary Tables: providing rs numbers in all tables would be very helpful.

8. Zhou et al. (PMID: 33633408) is referenced as both MedRxiv and Nat Med publication.

Reviewer #2 (Recommendations for the authors):

1) The authors should cite work by Katz et al. (http://dx.doi.org/10.1101/2020.06.09.20125690), which was the first study to suggest that ABO-mediated change in CD209 levels may play a role in COVID-19 pathogenesis

2) The ABO variation is referred to as rs8176719-TC, which the authors call a SNP. It is not clear what "TC" means and it is not clear why they call the indel a "SNP". It would be helpful if the authors provided more information on this indel variant (which is not a SNP), its consequences on the coding sequence, and linkage to GWAS SNPs.

3) The first phrase in the DC-SIGN section of the Results stating "The ABO signal (rs8176719-TC) overlaps several genes" is unclear and should be clarified.

4) Ref. 10 (medRxiv) is now published in Nature Medicine and the citation should be updated.

eLife. 2021 Aug 17;10:e69719. doi: 10.7554/eLife.69719.sa2

Author response


Reviewer #1 (Recommendations for the authors):

One of few novel results is the identification of CD209 as a potential alternative SARS-CoV-2 receptor; CD209 is encoded by a gene on chromosome 19 but its concentrations were associated with the lead SNP rs8176719 within the ABO region on chromosome 9, which was strongly linked with several COVID-19 outcomes. This is a good example of a trans-pQTL. The authors mentioned that the interaction of CD209 with SARS-CoV-2 spike protein was recently proposed (but without providing a citation at this point, the citation was provided only much later in the paper). The authors demonstrated an interaction between SARS-CoV-2 spike protein and recombinant CD209 and in cells expressing CD209 on the cell surface. This part is interesting but very under-developed and under-discussed.

We thank the reviewer for their supportive comments and interest in our findings related to CD209. We agree that this is among our more noteworthy novel results, and thus should have been more fully discussed and developed in the text. In our revised manuscript, we have updated each main section (introduction, results, and discussion) to better present these findings. These updates include references to the SARS-CoV-2 spike protein and the relevance of its receptors for viral attachment and entry, a fuller description of the experiments we carried out to test the interaction, and an extended discussion that connects our CD209 findings to other viruses previously studied in the context of both soluble CD209 binding and binding to CD209 as a receptor.

1. Specifically, is it suggested that rs8176719 is a pQTL for CD209 through the effect on ABO or through an independent effect? Mediation analysis could be applied to test this hypothesis and a biological mechanism of this association should be discussed. Both of these proteins are measured in soluble form in the blood of controls. How does this relate to the events that occur at the cell surface, presumably during SARS-CoV-2 infection?

We agree with the reviewer that a mediation analysis testing whether the effect of the ABO signal rs8176719 on COVID risk was mediated by CD209 would be a useful addition to the manuscript. However, as we point out in the paper (and illustrate in Figure 3), the ABO signal is highly pleiotropic and is associated with the abundance of several dozen circulating proteins. The pleiotropy of the ABO signal (and the limited number of cis-acting independent genetic signals associated with CD209) makes it difficult to tease out the specific contribution of CD209 to COVID risk using the various MR approaches that test genetic mediation (e.g. multivariable or two-step MR methods, which require large numbers of independent genetic signals). Nevertheless, various lines of evidence point to at least partial mediation of the ABO signal via CD209: extensive previous literature implicates CD209 as a known viral receptor (refs. 1–5); the ABO signal shows a stronger association with abundance of soluble CD209 than with other ABO-signal-associated proteins, as noted in our paper; and the binding experiments that we carried out provide evidence of interaction between soluble CD209 and the full-length SARS-CoV-2 spike protein. A complete and formal mediation analysis is warranted in future studies when larger and better-powered proteomic datasets become available.

With regard to the relationship between soluble and membrane-bound forms of CD209, in our discussion we have also added citations describing the link between the soluble CD209 protein isoform measured in our datasets and the membrane-bound isoform. From these we propose two possible mechanisms. First, because both isoforms correlate in expression, soluble CD209 expression levels could inform the levels of cell-surface CD209 available to facilitate viral entry. Second, the soluble CD209 itself may influence viral infection by competitively binding the virus, analogous to mechanisms previously shown for HIV and cytomegalovirus, both of which can also directly bind CD209 (refs. 1, 2).

2. P.5. Proteins with associated missense variants were flagged and excluded from any further downstream analyses on the basis that the missense variant(s) might influence aptamer binding.

SNP rs8176719 is a protein-affecting frame-shift variant within ABO, why wasn't ABO excluded from the analysis?

We thank the reviewer for raising this important point that, we believe, requires further clarification on our part. The logic we used is that only missense cis-pQTLs that were not cis-eQTLs would be excluded from analyses. The ABO signal rs8176719, although a frameshift variant, is a cis-eQTL across several eQTL datasets including GTEx and Blueprint (https://genetics.opentargets.org/variant/9_133257521_T_TC, eQTL tab under ‘Assigned genes’ section); this means that causal inference using rs8176719 as a genetic instrument remains valid even though the effect estimates may be biased. We revised parts of the ‘Evidence against aptamer binding artefacts’ section (P6-7) to better reflect this reasoning, mentioned this in the Results section (P9), and flagged ABO in Figure 2 and Figure 5 (where effect estimates were used) with an asterisk denoting that the effect estimates may be biased.

3. Table 1: The list of different approaches and their combinations requires a specific dictionary and a more detailed explanation of each approach, this would be a very hard part to understand for the general readership.

On reflection, we agree with the reviewer that the approaches used in each column of Table 1 require a description to be accessible to the general reader. We have provided an extra supplementary table (Supplementary File 9, mentioned in the legend of Table 1) that describes the approaches used in further detail.

4. The statistical considerations: there are clearly many more trans-pQTL interactions tested compared to cis-pQTLs. How was it adjusted for in statistical analyses?

Assuming the statistical considerations raised by the reviewer are related to the pan-MR analysis, we would like to direct the reviewer to Supplementary File 2, which shows the number of cis- vs. trans-pQTLs (selected by the GSMR algorithm) used in the pan-MR analysis across the 60 proteins that were associated with at least one of the seven COVID phenotypes at nominal significance. At nominal significance, there were 1,092 cis-pQTLs vs. 652 trans-pQTLs across 60 protein probes, and at 5% FDR, there were 265 cis-pQTLs vs. 240 trans-pQTLs across 9 protein probes.

Overall, our dual pan- and cis-MR approach resulted in 1,286 tests in pan-MR analyses and 2,042 tests in cis-MR analyses (the latter is higher than the number of pan-MR tests because of the less stringent p-value threshold used: p ≤ 1 × 10⁻⁵). These two numbers were used to adjust the 5% FDR threshold of significance in the pan- and cis-MR analyses, respectively. We have edited the ‘Mendelian randomization’ section (P6) to incorporate this information.
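
As a rough illustration of how a 5% FDR threshold depends on the number of tests, the sketch below applies the Benjamini-Hochberg step-up procedure with the two test counts quoted above. The choice of Benjamini-Hochberg, the function name `bh_threshold`, and the toy p-values are assumptions made for illustration; only the counts 1,286 and 2,042 come from the response.

```python
def bh_threshold(pvalues, n_tests, fdr=0.05):
    """Return the largest p-value that passes Benjamini-Hochberg, or None.

    n_tests is the total number of tests performed. It may exceed
    len(pvalues) when only the smallest p-values are at hand (for example,
    results pre-filtered at p < 0.05), because the ranks of the smallest
    p-values are unchanged by dropping the larger ones.
    """
    passing = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        # Step-up rule: p passes if it is below fdr * rank / n_tests.
        if p <= fdr * rank / n_tests:
            passing = p
    return passing


# Hypothetical p-values, not taken from the study.
toy_pvalues = [1e-6, 2e-5, 1e-4, 0.003, 0.04]

pan_mr_cutoff = bh_threshold(toy_pvalues, n_tests=1286)  # pan-MR test count
cis_mr_cutoff = bh_threshold(toy_pvalues, n_tests=2042)  # cis-MR test count
print(pan_mr_cutoff, cis_mr_cutoff)
```

With the same p-values, the larger test count gives a stricter cutoff, which is why each analysis is adjusted using its own number of tests.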

5. P5. Colocalization analysis settings assumed one independent signal in each region. Is this a valid assumption to make and how it might affect the analysis if there is more than one signal?

To keep our analyses simple, we used marginal association statistics to perform genetic colocalisation (clarified further in P6). So yes, we assumed one independent signal in each region. The caveat of this assumption is that we are likely to miss associations which are driven by secondary (weaker) causal variants. We note this limitation in our Discussion (P15):

“However, the genetic colocalisation tests used in our study assumed a single causal variant in each locus and will, therefore, result in higher false negative tests if there is more than one trait-associated causal variant”.
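
To make the single-causal-variant assumption concrete, the following minimal sketch computes colocalisation posterior probabilities from per-variant log approximate Bayes factors, following the widely used approximate Bayes factor formulation of colocalisation. It is an illustration only, not the pipeline used in the study: the function names, the prior values (`p1`, `p2`, `p12`), and the toy Bayes factors are assumptions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiffexp(a, b):
    """Numerically stable log(exp(a) - exp(b)), valid for a > b."""
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 for one region.

    labf1, labf2: per-variant log approximate Bayes factors for the two
    traits, in the same variant order (at least two variants).
    Each trait is allowed at most one causal variant in the region.
    """
    joint = [x + y for x, y in zip(labf1, labf2)]
    s1, s2, s12 = logsumexp(labf1), logsumexp(labf2), logsumexp(joint)
    log_h = {
        "H0": 0.0,                                   # no association
        "H1": math.log(p1) + s1,                     # trait 1 only
        "H2": math.log(p2) + s2,                     # trait 2 only
        # Two distinct causal variants: all variant pairs minus shared ones.
        "H3": math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),
        "H4": math.log(p12) + s12,                   # one shared variant
    }
    total = logsumexp(list(log_h.values()))
    return {h: math.exp(v - total) for h, v in log_h.items()}


# Toy region of five variants (hypothetical values).
shared = coloc_posteriors([0, 0, 10, 0, 0], [0, 0, 10, 0, 0])
distinct = coloc_posteriors([10, 0, 0, 0, 0], [0, 0, 0, 0, 10])
print(shared["H4"], distinct["H3"], distinct["H4"])
```

Because every hypothesis allows at most one causal variant per trait, a weaker secondary signal that is shared between traits is not modelled, which is the source of the false negatives noted in the quoted limitation.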

6. P5. "Genotype data from 10,000 randomly sampled UK Biobank participants was used as a reference panel" – reference panel for what?

We have edited the ‘Mendelian randomization’ section (P6) to clarify this:

“To select near-independent genetic instruments and account for linkage disequilibrium (LD) in the MR analyses, we used genotype data from 10,000 randomly sampled UK Biobank participants to create a reference LD matrix, which is ancestry-matched to the pQTL data we used.”
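
The role of a reference LD matrix in choosing near-independent instruments can be sketched as a greedy pruning step. This is a simplified illustration under stated assumptions, not the GSMR procedure itself: the function name `select_instruments`, the r² cutoff of 0.05, and the toy matrix are hypothetical.

```python
def select_instruments(pvalues, r2, r2_max=0.05):
    """Greedy LD pruning against a reference r-squared matrix.

    Variants are visited from most to least significant; a variant is kept
    only if its r-squared with every already-kept variant is <= r2_max.
    Returns the indices of the kept variants in the order they were chosen.
    """
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    kept = []
    for i in order:
        if all(r2[i][j] <= r2_max for j in kept):
            kept.append(i)
    return kept


# Toy example: variants 0 and 1 are in strong LD, as are variants 2 and 3.
toy_pvalues = [1e-12, 1e-10, 1e-8, 1e-9]
toy_r2 = [
    [1.00, 0.90, 0.00, 0.01],
    [0.90, 1.00, 0.02, 0.01],
    [0.00, 0.02, 1.00, 0.60],
    [0.01, 0.01, 0.60, 1.00],
]
instruments = select_instruments(toy_pvalues, toy_r2)
print(instruments)
```

The pruning is only as good as the LD estimates, which is why the reference panel should match the ancestry of the pQTL data.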

7. Supplementary Tables: providing rs numbers in all tables would be very helpful.

We used chr:position:ref_allele:alt_allele as the variant identifier because it was a more reliable means of variant identification than rsids. But, as suggested, we have now revised Supplementary File 3, 4, and 8 to include rsids alongside the variant identifiers for ease of look up. The revised supplementary tables should now have an ‘rsid’ column in all tables where the variant IDs (‘snp’ column) are present.
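
A minimal sketch of working with such identifiers is shown below: it parses a chr:position:ref_allele:alt_allele string, labels the variant type from the allele lengths, and adds an ‘rsid’ column next to the ‘snp’ column. The function names and the lookup dictionary are hypothetical; the example identifier for rs8176719 is the one that appears in the Open Targets Genetics URL cited in this response.

```python
def parse_variant_id(variant_id):
    """Split a chr:pos:ref:alt (or chr_pos_ref_alt) identifier into fields."""
    chrom, pos, ref, alt = variant_id.replace("_", ":").split(":")
    # Classify relative to the reference allele given in the identifier.
    if len(ref) == 1 and len(alt) == 1:
        kind = "SNV"
    elif len(alt) > len(ref):
        kind = "insertion"
    elif len(alt) < len(ref):
        kind = "deletion"
    else:
        kind = "MNV"
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt, "kind": kind}


def add_rsid(rows, rsid_lookup):
    """Return copies of rows with an 'rsid' column looked up from 'snp'."""
    return [dict(row, rsid=rsid_lookup.get(row["snp"], "")) for row in rows]


abo_signal = parse_variant_id("9_133257521_T_TC")
table = add_rsid(
    [{"snp": "9:133257521:T:TC"}],
    {"9:133257521:T:TC": "rs8176719"},  # hypothetical one-entry lookup
)
print(abo_signal["kind"], table[0]["rsid"])
```

Keeping the allele-level identifier as the key avoids ambiguity when an rsid maps to more than one alternate allele, while the added rsid column makes manual look-up easier.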

8. Zhou et al. (PMID: 33633408) is referenced as both MedRxiv and Nat Med publication.

We apologise for this duplication. We have now removed all references to the Zhou et al. preprint and replaced them with the reference to the publication (PMID: 33633408).

Reviewer #2 (Recommendations for the authors):

1) The authors should cite work by Katz et al. (http://dx.doi.org/10.1101/2020.06.09.20125690), which was the first study to suggest that ABO-mediated change in CD209 levels may play a role in COVID-19 pathogenesis

We agree with the reviewer that Katz et al. were the first to suggest a relationship between ABO and CD209 in their preprint. We would also like to note that, after the preprint, both Katz et al. and ourselves simultaneously published this discovery as a Correspondence in NEJM (https://www.nejm.org/doi/full/10.1056/NEJMc2025747), and it is this publication that was cited in our paper. The correspondence from Katz et al. included the text from their preprint (https://www.nejm.org/doi/suppl/10.1056/NEJMc2025747/suppl_file/nejmc2025747_sa2_appendix.pdf). Nevertheless, we have now mentioned the preprint in the Discussion section (P13).

2) The ABO variation is referred to as rs8176719-TC, which the authors call a SNP. It is not clear what "TC" means and it is not clear why they call the indel a "SNP". It would be helpful if the authors provided more information on this indel variant (which is not a SNP), its consequences on the coding sequence, and linkage to GWAS SNPs.

We regret this confusion. In the revised manuscript, we have changed rs8176719-TC to rs8176719-insC (or insertionC) to make the effect allele explicit, and we now describe the variant and its consequences on P9, where it is first mentioned. We have also re-labelled rs8176719 as an ABO signal rather than a ‘SNP’.

Furthermore, we previously used LDlink to annotate the functional consequences of variants and discovered that in many cases, including for the ABO variant, the variants were not properly annotated. In this revised version, we use gnomAD v2 variant effect prediction annotations from our portal Open Targets Genetics to annotate the functional consequence of all variants (revised Supplementary File 3 and 4).

3) The first phrase in the DC-SIGN section of the Results stating "The ABO signal (rs8176719-TC) overlaps several genes" is unclear and should be clarified.

We edited the text in that section (P9) and replaced "The ABO signal (rs8176719-TC) overlaps several genes" with “The ABO signal (rs8176719-insC) contributes to the determination of non-O blood groups and regulates circulating levels of both ABO and several non-ABO proteins”.

4) Ref. 10 (medRxiv) is now published in Nature Medicine and the citation should be updated.

As noted in Comment #8 from Reviewer #1, we have updated Zhou et al’s references to include only the Nature Medicine publication.

References

1. Geijtenbeek, T. B. et al. DC-SIGN, a dendritic cell-specific HIV-1-binding protein that enhances trans-infection of T cells. Cell 100, (2000).

2. Plazolles, N. et al. Pivotal advance: The promotion of soluble DC-SIGN release by inflammatory signals and its enhancement of cytomegalovirus-mediated cis-infection of myeloid dendritic cells. J. Leukoc. Biol. 89, (2011).

3. Yang, Z.-Y. et al. pH-dependent entry of severe acute respiratory syndrome coronavirus is mediated by the spike glycoprotein and enhanced by dendritic cell transfer through DC-SIGN. J. Virol. 78, 5642–5650 (2004).

4. Londrigan, S. L. et al. N-linked glycosylation facilitates sialic acid-independent attachment and entry of influenza A viruses into cells expressing DC-SIGN or L-SIGN. J. Virol. 85, 2990–3000 (2011).

5. Johnson, T. R., McLellan, J. S. & Graham, B. S. Respiratory syncytial virus glycoprotein G interacts with DC-SIGN and L-SIGN to activate ERK1 and ERK2. J. Virol. 86, 1339–1347 (2012).

Associated Data


    Data Citations

    1. Sun BB. 2018. Genomic atlas of the human plasma proteome. GWAS Catalog. GCST005806
    2. COVID-19 Host Genetics Initiative 2021. Mapping the human genetic architecture of COVID-19 by worldwide meta-analysis. COVID GWAS meta-analysis results. release 4

    Supplementary Materials

    Supplementary file 1. Definitions of COVID outcomes.
    elife-69719-supp1.xlsx (10.7KB, xlsx)
    Supplementary file 2. Summary of proteins prioritised by pan-Mendelian randomisation.
    elife-69719-supp2.xlsx (10.9KB, xlsx)
    Supplementary file 3. Pan-Mendelian randomisation outcomes at p<0.05, each association divided into cis- or trans-pQTLs.
    elife-69719-supp3.xlsx (241.6KB, xlsx)
    Supplementary file 4. Cis-Mendelian randomisation outcomes at p<0.05.
    elife-69719-supp4.xlsx (305.4KB, xlsx)
    Supplementary file 5. Evaluation of pan-Mendelian randomisation association of protein probes that have passed the 5% FDR.
    elife-69719-supp5.xlsx (13KB, xlsx)
    Supplementary file 6. Protein-wide association studies (at study-specific Bonferroni thresholds) of the ABO signal using three proteomic datasets (Sun et al., Emilsson et al., Suhre et al.).
    elife-69719-supp6.xlsx (16.7KB, xlsx)
    Supplementary file 7. Proteome-wide genetic colocalisation results.
    elife-69719-supp7.xlsx (4.5MB, xlsx)
    Supplementary file 8. Phenome-wide association study (p<0.05) from Open Targets Genetics portal for each colocalising variant.
    elife-69719-supp8.xlsx (310.7KB, xlsx)
    Supplementary file 9. Key to Table 1.
    elife-69719-supp9.xlsx (9.7KB, xlsx)
    Transparent reporting form

    Data Availability Statement

    Summary data used for genetic analyses are publicly available (Sun et al can be downloaded from GWAS catalog https://www.ebi.ac.uk/gwas/downloads/summary-statistics and COVID-19 HGI summary statistics can be downloaded from their website https://www.covid19hg.org/results/). Data generated from our study are provided in the supplementary files (pan-MR and cis-MR association results filtered at p < 0.05 and no filters applied to colocalisation results).

    The following previously published datasets were used:

    Sun BB. 2018. Genomic atlas of the human plasma proteome. GWAS Catalog. GCST005806

    COVID-19 Host Genetics Initiative 2021. Mapping the human genetic architecture of COVID-19 by worldwide meta-analysis. COVID GWAS meta-analysis results. release 4


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
