Abstract
Human organoids recapitulate the cell type diversity and function of their primary organs and hold tremendous potential for basic and translational research. Advances in single‐cell RNA sequencing (scRNA‐seq) technology and genome‐wide association studies (GWASs) have accelerated the biological and therapeutic interpretation of trait‐relevant cell types or states. Here, we constructed a computational framework to integrate atlas‐level organoid scRNA‐seq data, GWAS summary statistics, expression quantitative trait loci, and gene–drug interaction data for distinguishing critical cell populations and drug targets relevant to coronavirus disease 2019 (COVID‐19) severity. We found that 39 cell types across eight kinds of organoids were significantly associated with COVID‐19 outcomes. Notably, a subset of lung mesenchymal stem cells showed increased proximity to fibroblasts predisposed to repair COVID‐19‐damaged lung tissue. A brain endothelial cell subset exhibited significant associations with severe COVID‐19, and this subset showed a notable increase in cell‐to‐cell interactions with other brain cell types, including microglia. We repurposed 33 druggable genes, including IFNAR2, TYK2, and VIPR2, and their interacting drugs for COVID‐19 in a cell‐type‐specific manner. Overall, our results show that host genetic determinants make cell‐type‐specific contributions to COVID‐19 severity, and that identifying cell‐type‐specific drug targets may facilitate the development of effective therapeutics for severe COVID‐19 and its complications.
Human self‐organizing 3D culture systems recapitulate core features of human organ development and function and hold tremendous potential for basic and translational research. Ma and colleagues constructed a computational framework to integrate atlas‐level organoid single‐cell RNA sequencing (scRNA‐seq) data, genome‐wide association study summary statistics, expression quantitative trait loci, and gene–drug interaction data for distinguishing critical cell populations and drug targets relevant to severe coronavirus disease 2019 (COVID‐19). Together, these findings show that host genetic determinants make cell‐type‐specific contributions to COVID‐19 severity, and that identifying cell‐type‐specific drug targets may facilitate the development of effective therapeutics for severe COVID‐19 and its complications.

1. INTRODUCTION
Coronavirus disease 2019 (COVID‐19), caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2), is characterized by heterogeneous clinical manifestations ranging from asymptomatic infection to severe illness. 1 Multiple lines of evidence have demonstrated that an increasing number of severe COVID‐19 patients have significant extrapulmonary complications, 2 , 3 which worsen the condition of infected patients. Although vaccines have now been developed for preventing COVID‐19, it is unclear how long it will take to reach herd immunity, or whether novel mutations will enable SARS‐CoV‐2 to escape the protection conferred by current vaccines. 4 To date, there are still no specific antiviral drugs that target SARS‐CoV‐2 to alleviate established disease. 5 Thus, there is an urgent need to rapidly identify existing drugs that can be repurposed for the management of severe COVID‐19 and its complications.
Human organoids, self‐organizing three‐dimensional (3D) culture systems, recapitulate numerous core features of human organ development and biological function. Hence, these 3D in vitro structures hold tremendous potential as avatars for preclinical drug development and for interventional experiments that are difficult or impossible to carry out in human subjects. 6 Despite these capabilities, human organoids are biomimetic and heterogeneous model systems with complicated cell types and states, which are intractable to analyse with conventional technologies such as immunohistochemistry. Advances in single‐cell RNA sequencing (scRNA‐seq) provide an unprecedented opportunity to dissect the cellular and molecular heterogeneity of primary human organs/tissues. 7 , 8 Compared with transcriptome measurements from bulk samples, single‐cell sequencing methods not only resolve cell states and transcriptional regulatory programs in these 3D model systems at single‐cell resolution, but also provide insights into disease‐related processes and complex cellular interactions. 9 , 10 Since the COVID‐19 outbreak, many scRNA‐seq studies have demonstrated that numerous types of organoids, including lung, intestinal, kidney, brain, and choroid plexus organoids, enable investigation of the tropism of SARS‐CoV‐2 infection. 9 , 11
Genome‐wide association studies (GWASs) have been widely used for identifying significant genotype–phenotype associations for complex diseases or traits. 12 To date, several GWASs have reported that a large number of genetic variants show notable associations with COVID‐19 severity. 13 , 14 , 15 By integrating GWAS summary statistics and expression quantitative trait loci (eQTL) data, recent studies have identified several candidates as putative drug targets for treating COVID‐19. 4 , 16 , 17 Moreover, linking genome‐wide polygenic signals with single‐cell expression measurements from scRNA‐seq data has considerable potential to unveil critical cell types or subpopulations relevant to complex diseases. 18 Our and other recent studies 19 , 20 have identified numerous immune and lung cell types that are impacted by genetic variants associated with COVID‐19; for example, alveolar type 2 cells and CD8+ T cells in lung, 20 and CD16+ monocytes, megakaryocytes and memory CD8+ T cells in peripheral blood. 19 Nevertheless, these studies largely focused on predefined cell type annotations and therefore largely ignored the heterogeneity within cell types. To date, no atlas‐level analysis has combined scRNA‐seq data across multiple tissues and organs with GWAS summary statistics to systematically identify COVID‐19‐relevant cell populations and drug targets at single‐cell resolution.
In light of the vital role of human organoids in drug development, we collected and uniformly processed numerous scRNA‐seq datasets across 10 kinds of human organoids with more than 1 million cells, and developed a computational framework to integrate these human organoid scRNA‐seq data, GWAS summary statistics, eQTL data, and gene–drug interaction data for distinguishing critical cell types/subpopulations and drug targets relevant to severe COVID‐19. We found that numerous cell types across different human organoids were significantly associated with COVID‐19 severities. Notably, we showed that prioritizing COVID‐19‐relevant cell‐type‐specific gene–drug interacting pairs in lung mesenchymal stem cells (MSCs), intestinal tuft cells, and brain endothelial cells may help to repurpose drugs for treating severe COVID‐19 and its accompanying complications.
2. MATERIALS AND METHODS
2.1. Human organoids scRNA‐seq datasets
In this study, we collected and curated 93 independent scRNA‐seq datasets of 10 kinds of widely‐adopted human organoids (i.e., brain, lung, intestine, heart, eye, liver and bile duct, pancreas, kidney, and skin) spanning 1,159,206 cells with 62 main cell types from two widely used databases, the Gene Expression Omnibus (GEO) 21 and ArrayExpress. 22 Only datasets with publicly available raw reads (e.g., .SRA, .bam, or .fastq files) were included. We used a unified pipeline to conduct re‐alignment, quality control, and standard analysis in order to facilitate data integration and minimize batch effects (Figure S1). Human cancer‐derived organoid scRNA‐seq datasets were excluded from our current analyses. A common dictionary of gene symbols was used to annotate genes to allow comparison across samples and datasets, and unrecognized symbols were removed.
2.2. Human fetal scRNA‐seq datasets
To validate the reliability of the significant results obtained from human organoids, we also collected nine independent scRNA‐seq datasets covering eight kinds of human fetal organs (i.e., brain, lung, intestine, liver, kidney, eye, pancreas, and skin) across 48 samples from the GEO and ArrayExpress databases. As with the organoid scRNA‐seq data, we only included datasets with publicly available raw reads (e.g., .SRA, .bam, or .fastq files) and used the unified pipeline to carry out re‐alignment, quality control, and standard analysis (Figure S1). In total, there were 223,334 cells across all human fetal organs, ranging from 1745 to 63,020 cells per dataset.
2.3. scRNA‐seq data processing
We initially applied two widely‐used tools, SRA‐toolkit (version 3.0.5) 23 and bamtofastq (version 2.31.0), 24 to convert single‐cell transcriptomic profiles in .SRA and .bam format to .fastq format. CellRanger (version 6.1.2) 25 and STARsolo (version 2.7.10a) 26 were used to process human organoid or fetal scRNA‐seq data from the 10× Genomics and Drop‐seq sequencing platforms, respectively, to debarcode cells and generate a matrix of unique molecular identifiers (UMIs) for each sample. For both sequencing platforms, we used the human reference genome assembly hg38 27 to align reads tagged with a cell barcode and UMI. Subsequently, featureCounts (version 1.22.2) 28 was used for assigning tagged reads to corresponding genes, and SCANPY (version 1.9.1) 29 was used to filter out cells with <500 or >20,000 detectable genes, >30,000 expressed gene counts, or a mitochondrial rate >10%.
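The cell‐level filter described above can be illustrated with a minimal, library‐free sketch. It is not the SCANPY call used in the pipeline; the `CellQC` record, the `passes_qc` helper, and the toy barcodes are hypothetical names introduced only for illustration, and the thresholds are the ones stated in the text (boundary values are assumed to be retained).

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC summary (hypothetical record, for illustration only)."""
    barcode: str
    n_genes: int         # number of detectable genes
    total_counts: int    # total expressed gene (UMI) counts
    pct_mito: float      # percentage of counts from mitochondrial genes


def passes_qc(cell, min_genes=500, max_genes=20_000,
              max_counts=30_000, max_pct_mito=10.0):
    """Keep a cell unless it has <500 or >20,000 genes, >30,000 counts, or >10% mito."""
    return (min_genes <= cell.n_genes <= max_genes
            and cell.total_counts <= max_counts
            and cell.pct_mito <= max_pct_mito)


cells = [
    CellQC("AAAC", 1200, 5400, 3.2),     # passes all thresholds
    CellQC("AAAG", 350, 900, 1.0),       # too few detectable genes
    CellQC("AACT", 2500, 41_000, 4.0),   # too many counts
    CellQC("AAGT", 1800, 7000, 15.5),    # high mitochondrial rate
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

In practice the same thresholds would be applied to the per‐cell QC columns of the expression object rather than to hand‐built records.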
Moreover, we used the FindVariableFeatures() function in Seurat (version 4.3.0) 30 to select the top 2000 highly variable genes (HVGs), and employed NormalizeData() and ScaleData() in Seurat to normalize and scale the human organoid and fetal scRNA‐seq data. The Harmony (version 3.8) 31 tool was adopted to integrate samples and remove batch effects, and principal component analysis (PCA) was applied to obtain the top 30 principal components (PCs), which explain the most variance among the top 2000 HVGs selected in the preceding step. High‐quality cells were embedded into two dimensions by using uniform manifold approximation and projection (UMAP), and annotated to specific cell types using the transfer learning method scArches (version 0.5.1) 32 with manual validation.
2.4. GWAS summary data on COVID‐19‐related phenotypes
The COVID‐19 meta‐analytic GWAS summary statistics were downloaded from the official website of the COVID‐19 Host Genetics Initiative 33 (https://www.covid19hg.org/; COVID19‐hg GWAS meta‐analyses round 7, release date April 8, 2022). For the current investigation, we used three of these GWAS meta‐analyses, which included 81 independent studies containing mixed population ancestries (Figures S2 and S3 and Table S2). Most cohorts were of European ancestry. The three examined COVID‐19‐related phenotypes were: (1) very severe respiratory confirmed COVID‐19 (Very severe, file named A2_ALL_leave_23andme, n = 18,152 cases) versus population (n = 1,145,546 controls), (2) hospitalized COVID‐19 (Hospitalization, file named B2_ALL_leave_23andme, n = 44,986 cases) versus population (n = 2,356,386 controls), and (3) susceptibility to COVID‐19 (Susceptible, file named C2_ALL_leave_23andme, n = 159,840 cases) versus population (n = 2,782,977 controls).
As referenced in a previous study, 17 very severe COVID‐19 patients were defined as patients hospitalized primarily because of COVID‐19, with laboratory‐confirmed SARS‐CoV‐2 infection and death or respiratory support. Simple supplementary oxygen (e.g., 2 L min⁻¹ through nasal cannula) did not meet the definition of very severe status. Hospitalized COVID‐19 patients were defined as individuals hospitalized with laboratory‐confirmed SARS‐CoV‐2 infection, where hospitalization was due to COVID‐19‐relevant symptoms. Patients susceptible to COVID‐19 were defined as individuals with self‐reported infection, health‐record infection, or laboratory‐confirmed SARS‐CoV‐2 infection. In comparison, controls were defined as those individuals in the participating studies who did not meet the definition of cases. The meta‐GWAS summary datasets contained the p‐value for each single nucleotide polymorphism (SNP), the effect size on the log(OR) scale, the standard error of the effect size, the minor allele frequency (MAF), and the p‐value from Cochran's Q heterogeneity test. After stringent quality control, a total of 11,732,503, 12,030,868, and 14,335,927 genetic variants with MAF over 0.0001 and an imputation score (R²) of >0.6 were retained in the A2, B2, and C2 meta‐GWAS datasets, respectively. Results from the 23andMe cohort GWAS summary statistics were excluded from the investigation. The qqman 34 R package was used to draw Manhattan plots and quantile–quantile (QQ) plots.
2.5. Integration of GWAS summary statistics and scRNA‐seq data
To distinguish critical cell types/subpopulations through which genetic variants influence COVID‐19 severities, we applied our previously developed pathway‐based polygenic regression method, scPagwas (version 1.1.0), 35 to integrate GWAS summary data on three COVID‐19 outcomes with human organoid and fetal scRNA‐seq datasets. Initially, scPagwas annotates SNPs to their proximal genes (a default window size of 20 kb) of the corresponding pathways, which are based on the experimentally validated canonical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. 36 Then, scPagwas leverages the singular value decomposition algorithm to transform a scaled scRNA‐seq matrix into a pathway activity score (PAS) matrix. The projection of the features of genes in a given pathway $i$ onto the direction of the first principal component (PC1) eigenvector defines the PAS $s_{i,j}$ for pathway $i$ in cell $j$.
scPagwas assumes a priori that the effect sizes of SNPs in pathway $i$ follow the multivariate normal distribution $\beta_{S_i} \sim \mathrm{MVN}(0, \sigma_i^2 I)$, where $\sigma_i^2$ is the variance of effect sizes for SNPs in pathway $i$, and $I$ is the identity matrix. The notation $S_i$ indicates the set of SNPs within pathway $i$, and the notation $P_i$ indicates the set of genes in pathway $i$. The variance is estimated by using the linear weighted sum method:
$$\sigma_i^2 = \tau_0 + \tau_{i,j} \sum_{g \in P_i} \tilde{e}_{g,j}$$
where $\tau_0$ indicates an intercept term, $\tau_{i,j}$ indicates the coefficient for pathway $i$ in cell $j$, and $\tilde{e}_{g,j}$ is the expression level of each gene $g$ adjusted by the pathway activity in the given pathway $i$. scPagwas estimates $\tau_{i,j}$ by the following equation:
$$E\left[\hat{\beta}_{S_i,k}^{2}\right] = \left(\tau_0 + \tau_{i,j} \sum_{g \in P_i} \tilde{e}_{g,j}\right) \left(R_{S_i}^{2}\right)_{k,k}$$
where $(R_{S_i}^{2})_{k,k}$ represents the $k$th diagonal element of the matrix $R_{S_i}^{2}$ and $R$ denotes the linkage disequilibrium (LD) matrix. The 1000 Genomes Project Phase 3 Panel 27 is used to compute the LD among SNPs extracted from COVID‐19‐related GWAS summary statistics.
The genetically associated PAS (gPAS) for each pathway in a given cell is calculated by summing the product between estimated coefficient and weighted pathway activity. Then, trait‐relevant genes are prioritized by ranking the Pearson correlation coefficients between the expression of each gene and the sum of gPASs over all pathways in each cell across cells. The trait‐relevant score (TRS) for each individual cell is calculated using top 1000 trait‐relevant genes based on the AddModuleScore() function in Seurat. 30 scPagwas assesses the statistical significance of each cell by using the percent ranks of these trait‐relevant genes across individual cells. In addition, scPagwas is also used to infer COVID‐19‐relevant predefined cell types based on the block bootstrap method. 37 We only include the SNPs on autosomes with MAF >0.01. The major histocompatibility complex region (chr6: 25–35 Mbp) is removed because of the extensive LD in this genomic region. For more detailed information, please refer to the original paper. 35
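The gene‐prioritization step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the scPagwas implementation: the per‐cell summed gPAS vector is assumed to be already computed, `pearson` and `rank_trait_relevant_genes` are hypothetical helper names, and the gene symbols and values are toy data.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_trait_relevant_genes(expr, gpas_sum, top_n=1000):
    """Rank genes by correlation between expression and summed gPAS across cells.

    expr: dict mapping gene -> list of per-cell expression values.
    gpas_sum: list with the sum of gPASs over all pathways for each cell.
    """
    scores = {gene: pearson(values, gpas_sum) for gene, values in expr.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy data: four cells, three genes (all values are made up).
gpas_sum = [0.1, 0.4, 0.9, 1.5]
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],   # increases with summed gPAS
    "GENE_B": [4.0, 3.0, 2.0, 1.0],   # decreases with summed gPAS
    "GENE_C": [2.0, 2.0, 2.0, 2.0],   # constant across cells
}
top = rank_trait_relevant_genes(expr, gpas_sum, top_n=2)
print(top)
```

The resulting top‐ranked gene list is what would then be passed to a module‐scoring function to obtain a per‐cell TRS.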
2.6. Assessment of the heterogeneity of a given cell type relevant to COVID‐19
Following a previous study, 38 we adopt the Geary's C method 39 to measure the spatial autocorrelation of TRS across cells within a given cell type/subpopulation with regard to a cell–cell similarity matrix. The autocorrelation statistic is calculated as the following equation:
$$C = \frac{(n-1)\sum_{i}\sum_{j} w_{ij}\left(\mathrm{TRS}_i - \mathrm{TRS}_j\right)^2}{2W\sum_{i}\left(\mathrm{TRS}_i - \overline{\mathrm{TRS}}\right)^2}$$
where $n$ indicates the total number of cells within a given cell type/subpopulation, $\mathrm{TRS}_i$ indicates the TRS of cell $i$ (with mean $\overline{\mathrm{TRS}}$), $W = \sum_{i}\sum_{j} w_{ij}$, and $w_{ij}$ represents the weight between cells $i$ and $j$. First, the $k$ nearest neighbours (e.g., 5) are determined for each cell in the latent model. Subsequently, a Gaussian kernel applied to the distances between nearest neighbours is used to compute the weights. Higher weights are assigned to similar cells, and zero weights are assigned to distant cells. In this way, the Geary's C method provides a measure of how similar the TRS ranks are for neighbouring cells given a latent mapping. The value $C' = 1 - C$ is defined as the autocorrelation effect size, such that 1 indicates maximal autocorrelation and 0 intuitively indicates no autocorrelation. A value of $C'$ close to 1 indicates strong spatial autocorrelation, reflecting remarkable trait‐association heterogeneity across the given cell type or cell cluster. The VISION R package 38 is used to evaluate the heterogeneity of cells within three COVID‐19‐relevant cell types, namely lung MSCs, intestinal tuft cells, and brain endothelial cells, using default parameters.
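A minimal sketch of the Geary's C computation described above is given below. It is not the VISION implementation: the Gaussian kernel bandwidth `sigma`, the choice of `k`, the helper names, and the one‐dimensional toy coordinates are assumptions made for illustration, and the reported quantity is the effect size 1 − C.

```python
from math import dist, exp


def knn_gaussian_weights(coords, k, sigma=1.0):
    """Weight matrix: Gaussian kernel on the k nearest neighbours, zero elsewhere."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: dist(coords[i], coords[j]))[:k]
        for j in neighbours:
            d = dist(coords[i], coords[j])
            w[i][j] = exp(-(d * d) / (sigma * sigma))
    return w


def geary_c_prime(x, w):
    """Return 1 - C, where C is Geary's C of the values x under weights w."""
    n = len(x)
    mean_x = sum(x) / n
    total_w = sum(sum(row) for row in w)
    num = (n - 1) * sum(w[i][j] * (x[i] - x[j]) ** 2
                        for i in range(n) for j in range(n))
    den = 2.0 * total_w * sum((v - mean_x) ** 2 for v in x)
    return 1.0 - num / den


# Six cells forming two tight groups in a 1-D latent space (toy coordinates).
coords = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
w = knn_gaussian_weights(coords, k=2)

smooth_ranks = [1, 2, 3, 4, 5, 6]   # neighbouring cells have similar TRS ranks
mixed_ranks = [1, 6, 2, 5, 3, 4]    # neighbouring cells have dissimilar TRS ranks

c_smooth = geary_c_prime(smooth_ranks, w)
c_mixed = geary_c_prime(mixed_ranks, w)
print(c_smooth, c_mixed)
```

In this toy setting the smooth assignment yields the larger effect size, which matches the interpretation that values near 1 reflect structured, neighbourhood‐consistent TRS patterns.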
2.7. Transcriptome‐wide association analysis
To prioritize genes whose genetically regulated expression is relevant to COVID‐19, we perform an integrative genomics analysis incorporating GWAS summary statistics on three COVID‐19‐related phenotypes (release round 7) with eQTL data for 49 tissues from the GTEx Project (version 8) by using the S‐PrediXcan 40 method. S‐PrediXcan primarily leverages two linear regression models to analyse the association between predicted gene expression and COVID‐19‐related phenotypes:
$$Y = \alpha_1 + X_l \beta_l + \varepsilon_1$$
$$Y = \alpha_2 + T_g \gamma_g + \varepsilon_2$$
where $\alpha_1$ and $\alpha_2$ are intercepts, $\varepsilon_1$ and $\varepsilon_2$ are stochastic environmental error terms, $Y$ is the $n$‐dimensional phenotype vector for $n$ individuals, $X_l$ indicates the allelic dosage for SNP $l$ in $n$ individuals, $\beta_l$ indicates the effect size of SNP $l$, $T_g = \sum_{l \in \mathrm{Model}_g} w_{lg} X_l$ indicates the predicted expression calculated from $w_{lg}$ and $X_l$, in which $w_{lg}$ is generated by using the GTEx tissue‐specific eQTL dataset, and $\gamma_g$ is the effect size of $T_g$. The Z‐score (Wald statistic) of the association between predicted gene expression and COVID‐19‐related phenotypes can be written as:
$$Z_g \approx \sum_{l \in \mathrm{Model}_g} w_{lg}\,\frac{\hat{\sigma}_l}{\hat{\sigma}_g}\,\frac{\hat{\beta}_l}{\mathrm{se}(\hat{\beta}_l)}$$
where $\hat{\sigma}_g$ indicates the standard deviation of $T_g$, $\hat{\beta}_l$ represents the effect size from the GWAS on COVID‐19 with standard error $\mathrm{se}(\hat{\beta}_l)$, and $\hat{\sigma}_l$ indicates the standard deviation of $X_l$. For each COVID‐19‐related phenotype, the S‐PrediXcan‐based integration analysis is conducted for each of the 49 tissues.
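The summary‐statistics Z‐score can be illustrated with a short sketch. It assumes the standard S‐PrediXcan form, in which each SNP's GWAS z‐score is weighted by its prediction weight and by the ratio of the SNP standard deviation to the standard deviation of predicted expression; the function name `spredixcan_z`, its argument layout, and the numbers are hypothetical, and the SNP covariance matrix is assumed to come from a reference panel.

```python
from math import sqrt


def spredixcan_z(weights, beta, se, cov):
    """Gene-level Z-score from SNP-level summary statistics.

    weights: prediction weights w_lg for the SNPs in the gene's model.
    beta, se: GWAS effect sizes and their standard errors for the same SNPs.
    cov: SNP dosage covariance matrix (list of lists) from a reference panel.
    """
    n = len(weights)
    # Variance of predicted expression: w' * cov * w.
    var_g = sum(weights[a] * cov[a][b] * weights[b]
                for a in range(n) for b in range(n))
    sd_g = sqrt(var_g)
    z = 0.0
    for l in range(n):
        sd_l = sqrt(cov[l][l])            # standard deviation of SNP l
        z += weights[l] * (sd_l / sd_g) * (beta[l] / se[l])
    return z


# Single-SNP model: the gene-level Z reduces to the GWAS z-score (sign follows the weight).
z_pos = spredixcan_z([0.5], [0.1], [0.02], [[0.25]])
z_neg = spredixcan_z([-0.5], [0.1], [0.02], [[0.25]])
print(z_pos, z_neg)
```

With one SNP in the model, the gene‐level statistic equals the SNP's GWAS z‐score up to the sign of the weight, which is a convenient sanity check for any implementation.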
To enhance the power to distinguish potential causal genes, S‐MultiXcan 41 is adopted to meta‐analyse the substantial shared eQTLs across 49 GTEx tissues. By taking into account the correlation structure across multiple panels, the multivariate linear regression model of S‐MultiXcan is fitted as the following equation:
$$Y = \sum_{j=1}^{p} \tilde{T}_j\, g_j + e = \tilde{T} g + e$$
where $T_j = \sum_{l \in \mathrm{Model}_j} w_{lj} X_l$ indicates the predicted expression in tissue $j$, and $\tilde{T}_j$ indicates the standardization of $T_j$ to mean = 0 and SD = 1. $g_j$ indicates the effect size for the predicted gene expression in tissue $j$, $e$ represents a stochastic environmental error term with variance $\sigma_e^2$, and $p$ represents the number of chosen tissues. A gene with false discovery rate (FDR) ≤ 0.05 is considered significant.
2.8. Multi‐marker Analysis of GenoMic Annotation‐based gene‐level association analysis
To conduct gene‐level genetic association analyses of meta‐GWAS summary statistics on three phenotypes of COVID‐19, we apply the updated SNP‐wise Mean Model of the Multi‐marker Analysis of GenoMic Annotation (MAGMA). 42 Using this model, MAGMA computes a test statistic as follows:
$$T = \sum_{i=1}^{M} Z_i^2 = Z^{T} Z$$
where $M$ is the number of variants (e.g., SNP$_1$, SNP$_2$, …, SNP$_M$) in a given gene $g$. $N$ is the total number of genes annotated in the GWAS summary dataset. We assign a SNP to a given gene if the SNP is located in the gene body or within an extended ±20 kb region upstream or downstream of the gene. Notably, $Z_i = \Phi^{-1}(p_i)$, where $\Phi$ indicates the cumulative normal distribution function, and $p_i$ indicates the marginal p‐value for a specific SNP $i$. Moreover, the gene‐level converging model assumes $Z \sim \mathrm{MVN}(0, S)$, where $S$ is the LD matrix among SNPs. The LD matrix can be diagonalized and written as $S = Q A Q^{T}$, where $Q$ is an orthogonal matrix and $A = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_M)$ with $\lambda_m$ being the $m$th eigenvalue of $S$. The 1000 Genomes Project Phase 3 Panel 27 is adopted as a reference for calculating the LD matrix. $D$ indicates a random variable, where $D = A^{-0.5} Q^{T} Z$ and $D \sim \mathrm{MVN}(0, I_M)$. Thus, the sum of squared SNP Z‐statistics can be calculated:
$$T = Z^{T} Z = D^{T} A D = \sum_{m=1}^{M} \lambda_m D_m^2$$
where $D_m \sim N(0, 1)$ and $D_m^2 \sim \chi_1^2$. $T$ follows a mixture distribution of independent $\chi_1^2$ random variables. The Benjamini–Hochberg FDR method is used to correct for multiple testing, and a gene with FDR ≤ 0.05 is interpreted as significant.
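To make the mixture distribution concrete, the sketch below computes the SNP‐wise mean statistic from p‐values and approximates its null tail probability by Monte Carlo sampling from a weighted sum of independent chi‐square (1 df) variables. This is a swapped‐in illustration rather than MAGMA's numerical procedure: the function names are hypothetical, p‐values are assumed to be two‐sided (so |Z| is recovered from 1 − p/2), and the eigenvalues are assumed to be supplied from a separately computed LD matrix.

```python
import random
from statistics import NormalDist


def snpwise_mean_stat(p_values):
    """Sum of squared Z-statistics, assuming two-sided SNP p-values."""
    nd = NormalDist()
    z = [nd.inv_cdf(1.0 - p / 2.0) for p in p_values]
    return sum(v * v for v in z)


def mixture_chi2_pvalue(t_obs, eigenvalues, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(sum_m lambda_m * D_m^2 >= t_obs), D_m ~ N(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        t = sum(lam * rng.gauss(0.0, 1.0) ** 2 for lam in eigenvalues)
        if t >= t_obs:
            hits += 1
    return hits / n_draws


# Toy gene with two SNPs in LD (r = 0.5): the 2x2 LD matrix has eigenvalues 1.5 and 0.5.
t_gene = snpwise_mean_stat([0.01, 0.02])
p_gene = mixture_chi2_pvalue(t_gene, [1.5, 0.5], n_draws=20_000, seed=1)
print(t_gene, p_gene)
```

When the LD matrix is the identity, all eigenvalues equal 1 and the statistic reduces to an ordinary chi‐square with M degrees of freedom, which gives a simple check on the sampler.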
2.9. In silico permutation analysis
As referenced in previous methods, 19 , 43 , 44 an in silico permutation analysis with 100,000 random selections is leveraged for assessing the concordance of findings between the S‐MultiXcan and MAGMA analyses across three COVID‐19 outcomes. The notation $G_1$ represents the set of genes identified from the S‐MultiXcan analysis, and $G_2$ is the set of genes identified from the MAGMA analysis. At first, we count the overlapped genes between $G_1$ and $G_2$ ($N_{\mathrm{Observation}} = |G_1 \cap G_2|$). Then, we adopt the total genes in the MAGMA analysis as background genes ($G_{\mathrm{Background}}$). By randomly selecting the same number of genes as gene set $G_2$ from the background genes $G_{\mathrm{Background}}$, and repeating this 100,000 times ($N_{\mathrm{Total}}$), we count the overlapped genes between gene set $G_1$ and the randomly selected sample each time ($N_{\mathrm{Random}}$). We compute the empirically permuted p‐value as follows:
$$P = \frac{\#\{N_{\mathrm{Random}} \geq N_{\mathrm{Observation}}\}}{N_{\mathrm{Total}}}$$
and an empirical p‐value ≤ 0.05 is treated as significant. To measure the similarity between gene sets from the S‐MultiXcan and MAGMA analyses, we further leverage the Jaccard Similarity Index (JSI), 45 which is defined as the intersection size divided by the union size of both gene sets:
$$\mathrm{JSI}(G_1, G_2) = \frac{|G_1 \cap G_2|}{|G_1 \cup G_2|} = \frac{|G_1 \cap G_2|}{|G_1| + |G_2| - |G_1 \cap G_2|}$$
where $0 \leq \mathrm{JSI}(G_1, G_2) \leq 1$.
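The permutation procedure and the Jaccard index can be sketched as follows. This is a minimal illustration of the steps described above, with hypothetical function names and toy gene identifiers; the number of permutations is reduced in the example purely to keep it fast.

```python
import random


def jaccard(set_a, set_b):
    """Jaccard Similarity Index: intersection size divided by union size."""
    a, b = set(set_a), set(set_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def permutation_p(g1, g2, background, n_total=100_000, seed=0):
    """Empirical p-value for the overlap between gene sets g1 and g2.

    Each permutation draws len(g2) genes from the background and counts how
    often the overlap with g1 is at least as large as the observed overlap.
    """
    rng = random.Random(seed)
    g1, g2 = set(g1), set(g2)
    pool = sorted(background)          # fixed order keeps sampling reproducible
    n_observation = len(g1 & g2)
    hits = 0
    for _ in range(n_total):
        sample = rng.sample(pool, len(g2))
        if len(g1.intersection(sample)) >= n_observation:
            hits += 1
    return hits / n_total


# Toy example: 200 background genes, two sets of 20 genes sharing 10 genes.
background = [f"G{i}" for i in range(200)]
g1 = [f"G{i}" for i in range(0, 20)]
g2 = [f"G{i}" for i in range(10, 30)]

p_emp = permutation_p(g1, g2, background, n_total=2000, seed=42)
jsi = jaccard(g1, g2)
print(p_emp, jsi)
```

Using the MAGMA gene universe as the sampling pool, as the text specifies, keeps the null overlap distribution conditional on the genes that could actually have been detected.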
2.10. Functional enrichment analysis
To elucidate the biological functions of S‐MultiXcan‐ and MAGMA‐identified risk genes for COVID‐19 outcomes, we conduct functional enrichment analyses by using the WEB‐based Gene SeT AnaLysis Toolkit (WebGestalt, http://www.webgestalt.org/) 46 with default parameters based on the KEGG 36 and Gene Ontology (GO) databases. 47 The biological process category with redundant GO terms removed is used in the GO‐based functional enrichment analysis. Moreover, we also perform KEGG pathway enrichment analyses by using significantly up‐regulated genes in scPagwas‐identified positive cells among lung MSCs, intestinal tuft cells, and brain endothelial cells. The over‐representation algorithm is leveraged to compute the significance level for each enrichment analysis, and the Benjamini–Hochberg FDR method is applied for multiple testing correction.
2.11. LDSC analysis
The linkage disequilibrium score regression (LDSC, version 1.0.1) method 48 is used to evaluate the genetic correlations between each of three COVID‐19 phenotypes and each of 66 complex diseases/traits from 6 main disease categories, as well as 24 common tumor phenotypes (Table S2). Differences in genetic correlations are computed with a block jackknife method to compute their corresponding standard errors. The significant association threshold is set to p < 0.00025 (0.05/198) after stringent Bonferroni correction, and p < 0.05 is considered to be suggestively significant.
2.12. Cell‐type‐specific prioritization analysis of gene–drug interacting pairs for COVID‐19
To identify cell‐type‐specific drug targets relevant to severe COVID‐19, we developed a computational method, scDrugHunter (version 1.1.0), 49 to integrate multiple layers of omics evidence, including human organoid scRNA‐seq data, GWAS summary statistics on very severe COVID‐19, eQTL data from the GTEx project, 50 and gene–drug interactions from the Drug Gene Interaction database (DGIdb v4.2.0, https://www.dgidb.org/). 51 In reference to previous methods, 52 , 53 scDrugHunter employs multiple computational algorithms to extract 4D features, which include cell type specificity scores of genes, gene relevance scores (reflecting the relevance of genes for traits of interest in a given cellular context), gene significance scores (reflecting the association between genetically predicted expression levels of genes and traits of interest), and gene–drug interaction scores. scDrugHunter then ranks the gene‐specific scores for each feature in descending order within a particular cell type, scales the ranks, and uses a synthetic measures method 54 to combine the scaled ranks from the 4D features by computing the area of the patch in the Radar Chart for each gene–drug pair (called the single‐cell druggable gene score [scDGS]), according to the following equation:
$$\mathrm{scDGS} = \frac{1}{2}\sin\!\left(\frac{2\pi}{n}\right)\sum_{i=1}^{n} r_i\, r_{i+1}, \qquad r_{n+1} = r_1$$
where $r = (r_1, r_2, \ldots, r_n)$ is a ranking vector for a gene–drug pair, and $n$ is the number of extracted features (in this case, 4D features). The threshold of scDGS ≥ 120 with permutation p‐value ≤ 0.05 is employed to repurpose cell‐type‐specific gene–drug pairs associated with the trait of interest.
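The radar‐chart area underlying the scDGS can be illustrated with a short sketch. It uses the generic formula for the area of a polygon whose vertices lie on n equally spaced axes; the function name `radar_area`, the 0 to 10 scaling of the ranks, and the axis order are assumptions made for illustration and are not taken from scDrugHunter.

```python
from math import pi, sin


def radar_area(scaled_ranks):
    """Area of the radar-chart polygon spanned by values on n equally spaced axes."""
    n = len(scaled_ranks)
    if n < 3:
        raise ValueError("a radar chart polygon needs at least three axes")
    # Sum of products of adjacent axis values, wrapping from the last axis to the first.
    adjacent = sum(scaled_ranks[i] * scaled_ranks[(i + 1) % n] for i in range(n))
    return 0.5 * sin(2.0 * pi / n) * adjacent


# Four features per gene-drug pair, each scaled rank assumed to lie in [0, 10].
best_possible = radar_area([10, 10, 10, 10])
example_pair = radar_area([8, 9, 7, 10])
print(best_possible, example_pair)
```

Because the area depends on products of adjacent axes, a pair must score well on neighbouring features simultaneously to obtain a large value; note that the result also depends on the order in which the features are placed around the chart.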
2.13. Cell‐to‐cell interaction analysis
To uncover potential cell‐to‐cell interactions of intestinal tuft positive cells and brain endothelial positive cells with other cells, we leveraged the CellChat (version 1.6.0) 55 R package to infer cellular communications based on two intestinal and brain organoid scRNA‐seq datasets. 56 , 57 CellChat assesses the significance of ligand–receptor interactions among different cell types based on the expression of soluble agonists, soluble antagonists, and stimulatory and inhibitory membrane‐bound co‐receptors. By summing the probabilities of the ligand–receptor pairs within a given pathway, CellChat computes the communication probability for that pathway. The incoming (i.e., treating cells as targets) and outgoing (i.e., treating cells as sources) interaction strength for each cell type was calculated by counting the number of significant ligand–receptor pairs.
2.14. Statistical analysis
The Wilcoxon rank‐sum test is used to assess the significance of differences between positive cells and negative cells of lung MSCs, intestinal tuft cells, and brain endothelial cells. The hypergeometric test is used in KEGG‐pathway‐based and GO‐term‐based enrichment analyses to identify notable pathways and biological processes. 46 The Pearson correlation method is applied to compute the correlation coefficients of scPagwas TRSs 35 with scDRS TRSs, 58 genetic risk scores, pseudotimes, and fibroblast cell scores, respectively. The paired Student's t‐test is used to assess the difference in the number of ligand/receptor interactions with other cells between positive cells and negative cells in intestinal tuft and brain endothelial cells. The RISmed package (version 2.3.0) 59 is used to perform a PubMed search to retrieve reported evidence supporting the association between COVID‐19 and a given cell type or drug (see Supplementary Methods S1).
3. RESULTS
3.1. Computational framework of COVID‐19‐relevant cell types and drug repositioning
To facilitate data integration and minimize batch effects, we built a unified pipeline to conduct re‐alignment, quality control, and standard analysis of all human organoid (n = 1,159,206 cells) and fetal scRNA‐seq datasets (n = 223,334 cells; Figure S1 and Table S1). To distinguish critical cell types/subpopulations and repurpose potential drugs and interacting targets for the treatment of severe COVID‐19, we devised a computational framework to incorporate these organoid and fetal scRNA‐seq data and large‐scale meta‐GWAS summary statistics on three COVID‐19 phenotypes (i.e., very severe, hospitalized, and susceptible COVID‐19; Figures 1, S2, and S3, and Table S2). The framework has three main sections: (1) integrating GWAS summary statistics with human organoid scRNA‐seq datasets to genetically map trait‐relevant single‐cell landscapes for three COVID‐19 outcomes (Figure 1A); (2) combining GWAS summary statistics with eQTL data in the GTEx database to identify putative risk genes and critical pathways associated with COVID‐19 severities (Figure 1B); and (3) prioritizing cell‐type‐specific gene–drug interaction pairs for treating severe COVID‐19 and related complications at a fine‐grained resolution (Figure 1C).
FIGURE 1.

The workflow of integrative genomics analyses for coronavirus disease 2019 (COVID‐19)‐relevant drug repositioning. (A) Integration analysis of single‐cell transcriptomic profiles in the scHOB database with genome‐wide association study (GWAS) summary statistics on three COVID‐19 phenotypes. There were ~1.2 million cells from 10 kinds of human organoids (i.e., brain, eye, heart, lung, liver and bile duct, pancreas, kidney, intestine, and skin), and three GWAS datasets with more than 2 million samples downloaded from the COVID‐19 Host Genetics Initiative. (B) An increase in genetics‐risk pathways and comorbidities for COVID‐19 severities. (C) Prioritization of druggable genes and interacting drugs for treating COVID‐19 using the scDrugHunter method. Three COVID‐19‐relevant risk cell types (i.e., lung mesenchymal stem cells (MSCs), intestinal tuft cells, and brain endothelial cells) were used as representative examples for searching druggable genes and interacting drugs, and comparison analyses were performed to find cell‐type‐common and cell‐type‐specific druggable genes for severe COVID‐19. scDGS, single‐cell druggable gene score; SNP, single nucleotide polymorphism; TRS, trait‐relevant score; UMAP, uniform manifold approximation and projection.
3.2. Systematic integrative analysis for discerning COVID‐19‐relevant cell types
We initially applied the scPagwas‐based polygenic regression model 35 to incorporate genetic signals from GWAS summary statistics on three COVID‐19 outcomes with single‐cell transcriptomic profiles from 10 kinds of human organoid scRNA‐seq data for identifying critical cell types relevant to COVID‐19 severities. Among them, 39 cell types in 8 human organoids showed notable associations with at least one COVID‐19‐related phenotype (Figure 2A and Table S3). Notably, the results were highly consistent among very severe, hospitalized, and susceptible COVID‐19 (ρ = 0.99 and p = 2.58 × 10⁻⁶, ρ = 0.948 and p = 3.4 × 10⁻⁴; Figures 2B,C and S4). In lung organoids, mesenchymal stem cells (MSCs) were significantly enriched for all three COVID‐19 phenotypes (Figure 2A). Previous studies 60 , 61 have suggested that MSCs have substantial therapeutic potential to improve the outcomes of COVID‐19 patients by facilitating the repair of lung‐tissue injury and relieving acute pulmonary edema. Several recent clinical trials have been conducted to determine the positive effects of MSCs on the treatment of critically ill patients with coronavirus infection (Identifiers: NCT04898088, NCT04336254, and NCT04573270).
FIGURE 2.

Significant associations between human organoids cell types and coronavirus disease 2019 (COVID‐19) severities. (A) Summary of 39 significant cell types in eight kinds of human organoids for three COVID‐19 phenotypes. Bar plot represents the significant percent of each cell type in corresponding organoid with different single‐cell RNA sequencing (scRNA‐seq) datasets. (B) Correlation results of the number of significant cell types in eight human organoids between very severe COVID‐19 (x‐axis) and hospitalized COVID‐19 (y‐axis). (C) Correlation results of the number of significant cell types in eight human organoids between very severe COVID‐19 (x‐axis) and susceptible COVID‐19 (y‐axis). The Pearson correlation analysis was used to calculate the correlation coefficients (ρ). See also Table S3.
Six cell types in intestine organoids, including membranous cells, enterocytes, and tuft cells, were associated with three COVID‐19 phenotypes. Earlier studies have demonstrated that angiotensin‐converting enzyme 2 (ACE2) directly mediates SARS‐CoV‐2 entry into enterocytes in the gastrointestinal tract, 62 and COVID‐19 patients often show gastrointestinal symptoms including vomiting, abdominal pain, and diarrhoea. 63 For brain organoids, eight cell types, including endothelial cells and microglia, exhibited notable associations with severe COVID‐19. Previous evidence has documented that cerebral endothelial dysfunction may be the cause of increased rates of cerebrovascular pathology relevant to COVID‐19, 64 and that the severe cytokine storm experienced by severe COVID‐19 patients may induce microglial activation that leads to neurotoxicity. 65 In addition, seven cell types in eye organoids were significantly associated with very severe COVID‐19, including horizontal cells, rods, RPCs, and cones. Our recent study 66 has indicated that host genetic factors play critical roles in facilitating SARS‐CoV‐2 infection in ocular surface cells. For other organoids, we found that two cell types (nephron progenitor cells and differentiating nephron cells) in kidney organoids, three cell types (stellate cells, cholangiocytes, and hepatocytes) in liver organoids, three cell types including endothelial cells and alpha cells in pancreas organoids, and two cell types (endothelial cells and pluripotent cells) in heart organoids were significantly associated with COVID‐19 severities (Figure 2A).
For validation, we used the RISmed method, 59 which searches PubMed for reported evidence on the association between the trait of interest and a particular cell type. We counted the publications retrieved for each keyword pair of COVID‐19 and a specific cell type, and computed the correlation between these publication counts and the significant percent of each cell type identified by scPagwas. We found significantly or suggestively positive correlations between scPagwas‐identified cell‐type results and PubMed search results across three COVID‐19 phenotypes (Figure S5A–F). Moreover, to replicate the biological findings from human organoids, we applied the same regression model to integrate GWAS summary data on very severe COVID‐19 with human fetal scRNA‐seq data from multiple tissues. The aforementioned observations remained reproducible in the human fetal scRNA‐seq data (Figure S6). For example, lung MSCs, intestinal tuft and enterocyte cells, eye cone and horizontal cells, and brain endothelial cells and microglia were notably associated with very severe COVID‐19 in human fetal tissues. Taken together, these results provide new insights into the critical cell types through which genetic variants influence COVID‐19 severities.
3.3. Transcriptome‐wide association analysis identifies causal genes for three COVID‐19 outcomes
To identify putative causal genes for COVID‐19 severities, we applied the S‐MultiXcan method 41 to integrate GWAS summary statistics and eQTL datasets based on 49 GTEx tissues. We identified 243, 277, and 158 genes significantly associated with susceptible, hospitalized, and very severe COVID‐19, respectively (total N = 438 unique genes, FDR < 0.05; Figures 3A, S7 and Tables S4–S6). Many of these genes, including ACE2, SLC6A20, OAS3, CCR1, CXCR6, IFNAR2, IL10RB, and DPP9, have been reported to be associated with COVID‐19 susceptibility in previous studies. 15 , 19 , 67 , 68 , 69 , 70 , 71 By overlapping these three COVID‐19‐associated gene sets, we found 67 common genes whose genetically regulated expression potentially plays important roles in COVID‐19 initiation and progression (FDR < 0.05; Figures 3B, S8A, and Table S7).
FIGURE 3.

Risk genes and pathways associated with coronavirus disease 2019 (COVID‐19) severities. (A) Circos plot showing the results of the S‐MultiXcan‐based integrative analysis. The inner ring represents the 22 autosomal chromosomes (Chr1–22). In the outer ring, each circular symbol represents a specific gene, coloured by the statistical significance of the gene for very severe COVID‐19 (red marks false discovery rate (FDR) < 1E−05, orange marks 5.24E−10 ≤ p < 0.001, light blue marks 0.001 ≤ p ≤ 0.05, and dark blue marks p > 0.05). (B) Venn diagram showing the overlapped risk genes across three COVID‐19 phenotypes. (C) Protein–protein interaction network of 67 common risk genes based on the STRING database (v11.5, https://string‐db.org/). (D) Bar plot showing the counts of significant pathways enriched by using S‐MultiXcan‐identified risk genes in three COVID‐19 phenotypes. (E) Venn plot indicating the overlapped significant pathways across three COVID‐19 phenotypes. (F) Radar plot showing the significance level of 23 common pathways across three COVID‐19 phenotypes. The p‐value of each pathway was negatively log‐transformed (−Log2(p)) for visualization. See also Tables S4–S6.
Network enrichment analysis showed that 40 of the 67 common genes were significantly enriched in a protein–protein interaction (PPI) subnetwork (enrichment p = 9.1 × 10−15; Figures 3C and S8B,C), which is in line with the consensus that disease‐causing genes are more likely to interact with one another. 72 , 73 In S‐PrediXcan analyses of lung and blood, the tissues most relevant to SARS‐CoV‐2 infection, 280 of the 438 risk genes (63.93%) identified by the S‐MultiXcan‐based analyses were validated as relevant to at least one COVID‐19 outcome (p < 0.05; Figure S9 and Tables S8 and S9). Moreover, using MAGMA as an independent technique for validation (see Supplementary Methods S1), we found high consistency between the results of the MAGMA and S‐MultiXcan analyses for three COVID‐19 phenotypes (JSI = 0.28–0.31, empirical p < 1 × 10−5; Figures S10–S12 and Table S10).
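The consistency check between two gene‐level methods can be illustrated with a small sketch. It assumes that JSI denotes the Jaccard similarity index and that the empirical p‐value comes from comparing the observed overlap against randomly drawn gene sets of the same size; the function names, the background gene list, and the toy gene identifiers are all hypothetical and are not taken from the study.

```python
import random

def jaccard(a, b):
    """Jaccard similarity index: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def empirical_p(set_a, set_b, background, n_perm=1000, seed=0):
    """Share of random gene sets (same size as set_b) that overlap set_a
    at least as strongly as the observed set_b does."""
    rng = random.Random(seed)
    pool = sorted(background)  # fixed order keeps the random draws reproducible
    observed = jaccard(set_a, set_b)
    hits = sum(
        jaccard(set_a, rng.sample(pool, len(set_b))) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data with hypothetical gene identifiers (not results from the study).
background = [f"G{i}" for i in range(2000)]
method_1 = set(background[:100])    # genes flagged by a first method
method_2 = set(background[50:150])  # genes flagged by a second method
jsi, p = empirical_p(method_1, method_2, background)
print(f"JSI = {jsi:.3f}, empirical p = {p:.4f}")
```

With this construction, the smallest reportable p‐value is bounded by the number of permutations, so a threshold such as p < 1 × 10−5 would require at least 100 000 random draws.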
Furthermore, we performed pathway‐based enrichment analyses for the three S‐MultiXcan‐identified gene sets to identify critical pathways implicated in COVID‐19 severities. We observed that the number of significant pathways rose with increasing COVID‐19 severity (Figures 3D and S13A–C), which is consistent with the findings of an earlier study. 19 A large proportion of significant pathways (n = 23) were shared among susceptible, hospitalized, and very severe COVID‐19 (Figure 3E and Table S11). We also noticed that the significance level of these common pathways increased notably with increasing COVID‐19 severity (FDR < 0.05; Figure 3F). Consistently, most of these common pathways (87% = 20/23) remained significantly enriched when all 438 genes were used (FDR < 0.05; Figure S14 and Table S12). Several of these pathways, including cytokine–cytokine receptor interaction and chemokine signalling pathway, have been documented to be involved in COVID‐19 susceptibility in previous studies. 15 , 19 , 74 In sum, our integrative genomic analysis identifies 438 risk genes that are involved in critical biological pathways and show notable associations with COVID‐19 severities.
3.4. Genetic correlations between three COVID‐19 outcomes and complex diseases
Previous epidemiologic and clinical studies have documented that the clinical manifestations of COVID‐19 are heterogeneous, and many COVID‐19 cases are identified as having at least one comorbidity, including hypertension, diabetes, and other cerebrovascular, cardiovascular, and gastrointestinal complications, which may lead to poorer clinical outcomes. 75 Given the high genetic heritability of these putative complications, we calculated the genetic correlations of 66 diseases/traits from seven main disease categories with three COVID‐19 phenotypes using the LDSC method. 48 We found that 29 of them (43.94%), including anorexia nervosa, attention deficit hyperactivity disorder, multiple sclerosis, neuroticism, ischemic stroke, cognitive performance, hypertension, type 2 diabetes, and pulmonary embolism, exhibited significant genetic correlations with COVID‐19 severities (p < 0.05; Figure 4 and Table S13). In addition, we conducted a similar LDSC analysis to investigate the genetic correlations of COVID‐19 outcomes with 24 kinds of human cancers, and found only four cancers (i.e., thyroid gland cancer, melanoma, endometrial carcinoma, and endocrine carcinoma) showing significant correlations (Figure S15 and Table S14). Together, these results suggest that the genetic risk factors shared with these comorbidities may aggravate the severities of COVID‐19.
FIGURE 4.

LDSC analysis identifies the genetic correlations between three coronavirus disease 2019 (COVID‐19) phenotypes and complex diseases. Heatmap plot showing the genetic correlations, estimated using the LDSC method, between 66 diseases or traits from seven main disease categories (i.e., neuropsychiatric disorders, neurodegenerative disorders, cognitive‐related behaviours, cardiovascular diseases, autoimmune diseases, metabolic diseases, and respiratory diseases) and three COVID‐19 outcomes (i.e., very severe COVID‐19, hospitalized COVID‐19, and susceptible COVID‐19). The asterisk represents the significance of genetic correlation between COVID‐19 and complex disease. See also Table S13.
Given that the primary goal of this study was to characterize the context‐specific genetic aetiology of COVID‐19 severities, we concentrated the subsequent analyses on identifying severe COVID‐19‐relevant cell subpopulations across three main human organoids (i.e., lung, intestine, and brain), and used these 438 risk genes to reposition drug targets for treating severe COVID‐19 and related complications.
3.5. Identifying severe COVID‐19‐relevant cell subpopulations in lung organoids
Respiratory failure is the leading cause of death in severe COVID‐19 patients. 76 Studying pathologic cells associated with COVID‐19 in human lung organoids is important for exploring key features of viral biology and for drug repositioning. 77 Thus, we sought to identify severe COVID‐19‐relevant cell subpopulations by integrating GWAS summary statistics with human lung organoid scRNA‐seq data 78 using the scPagwas method. Among three main cell types, we found that the MSCs with higher TRSs exhibited striking enrichments in very severe COVID‐19 (Figures 5A–C and S16), recalling that MSCs were also associated with COVID‐19 severities in human fetal lung tissue (Figure S6). The proportion of scPagwas positive cells was prominently higher in MSCs (42.14%) than in the other two cell types (Figure 5D). Because the trait is binary, contrasting very severe COVID‐19 with the healthy population, scPagwas positive cells are expected to be associated with COVID‐19 severity, and scPagwas negative cells with the normal phenotype. Moreover, we used the recent cell‐scoring method, scDRS, 58 to re‐analyse the same data, and found that the results were remarkably consistent (ρ = 0.926, p < 2.2 × 10−16; Figure S17A,B).
FIGURE 5.

Identification of lung mesenchymal stem cells (MSCs) associated with coronavirus disease 2019 (COVID‐19) severities. (A) Uniform manifold approximation and projection (UMAP) projections of human lung organoid cells coloured by three annotated cell types. (B) UMAP embedding of all cells among three cell types in lung organoids coloured by the trait‐relevant scores (TRSs) for the phenotype of very severe COVID‐19. (C) Dotplot showing the significant associations of three cell types in lung organoids for very severe COVID‐19. Y‐axis indicates the log‐transformed p‐value (−Log10(p)), and x‐axis indicates the cell‐type‐level inference using the scPagwas method. (D) Bar plot showing the proportion of positive cells in three lung organoid cell types. (E) UMAP projections of lung MSCs coloured by five cell clusters. (F) UMAP plot showing the distribution of lung MSC positive cells and negative cells. The C′ value significantly lower than 1 indicates a high level of disease‐association heterogeneity across the set of cells (C′ value = 0.924, heterogeneity false discovery rate [FDR] = 3.332 × 10−4). (G) CytoTRACE differentiation continuum across the lung MSCs. The colour legend indicates the predicted degree of differentiation, ranging from more differentiated (blue) to less differentiated (red). (H) Unsupervised trajectory inference of lung MSC functional state transitions. Colour legend indicates the pseudotimes of individual cells calculated by using the Monocle2 method. (I,J) Visualization of the distribution of five cell clusters (I) and MSC positive cells (J) in the inferred trajectory. (K) Volcano plot showing significantly up‐regulated genes between MSC positive cells and negative cells. A two‐sided Wilcoxon test was used for assessing the significance. (L) Notably enriched pathways by 1142 up‐regulated genes in MSC positive cells. Colour legend represents the log‐transformed FDR value (−Log10(FDR)). (M) Chord diagram of scDrugHunter‐identified top 10 druggable genes and relevant interacting drugs for very severe COVID‐19 in lung MSCs. The width of each line is determined by the number of drugs (n = 1–5) known to interact with each gene. Genes are ordered by the degree of scDGS at the top of the diagram. See also Tables S15 and S16.
As shown in Figure 5E, MSCs were grouped into five cell clusters. Among the 9795 MSCs, scPagwas identified 4128 positive cells that are most relevant to severe COVID‐19 (Bonferroni‐corrected p < 0.05; Figure 5F). These severe COVID‐19‐relevant positive cells with higher TRSs were over‐represented in Clusters 0 and 1, whereas Cluster 2 exhibited the lowest TRSs (heterogeneous FDR = 3.332 × 10−4, C′ value = 0.924; Figures 5F and S18A,B), which is consistent with the results from the scDRS analysis (concordance rate = 88.25%; Figure S17C–E). Furthermore, the per‐cell genetic risk scores using the 438 COVID‐19‐relevant genes showed a notable correlation with scPagwas TRSs across all MSCs (p < 2.2 × 10−16; Figure S18C,D). Using CytoTRACE analysis 79 to predict the differentiation states of MSCs, we found that cells in Clusters 0 and 1 were predicted to be more differentiated than those in Cluster 2 (Figures 5G and S18E). In an unsupervised trajectory inference analysis, 80 MSC positive cells in Clusters 0 and 1 were largely distributed in the middle and end positions of the trajectory (Figure 5H–J). The pseudotimes of MSCs were positively correlated with corresponding TRSs (ρ = 0.664, p < 2.2 × 10−16; Figure S18F,G). Notably, the top branch‐dependent genes related to MSC positive cells were enriched in several critical biological processes relevant to lung and respiratory proliferation and growth (Figure S18H). Based on UMAP visualization, these top branch‐dependent genes, including FN1, VEGFA, EGFR, WNT5A, IGFBP5, and CDKN1A, were highly expressed in MSC positive cells compared with negative cells (Figure S19).
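The positive‐cell call above rests on a Bonferroni‐corrected threshold applied to per‐cell p‐values. A minimal sketch of that thresholding step is given below; how scPagwas derives the per‐cell p‐values is not covered, and the function names and toy p‐values are hypothetical. The last line only re‐derives the reported proportion from the counts given in the text.

```python
def call_positive_cells(p_values, alpha=0.05):
    """Flag cells whose raw p-value passes a Bonferroni-corrected threshold.

    Testing p < alpha / n is equivalent to testing (p * n) < alpha.
    """
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

def positive_proportion(flags):
    """Fraction of cells flagged as positive."""
    return sum(flags) / len(flags) if flags else 0.0

# Toy per-cell p-values (hypothetical, for illustration only).
toy_p = [1e-9, 0.2, 3e-4, 0.8, 1e-3]
flags = call_positive_cells(toy_p)
print(flags, positive_proportion(flags))

# Consistency check on the counts reported in the text: 4128 of 9795 MSCs.
print(round(100 * 4128 / 9795, 2))
```

The Bonferroni correction is conservative when thousands of cells are tested, which is one reason the positive set is best read as the cells with the strongest trait relevance rather than an exhaustive list.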
Recent evidence 81 suggested that the numbers of MSCs and fibroblasts increase, together with the proximity between these two cell types, as COVID‐19 progresses, which probably reflects a response to repair the damaged lung tissue. Thus, we sought to examine whether MSC positive cells have higher proximity with fibroblasts than negative cells. As expected, the fibroblast‐relevant cell state scores, obtained by collapsing the expression levels of fibroblast marker genes, were significantly higher among MSC positive cells than among negative cells (p < 2.2 × 10−16; Figure S20A–C). These results indicate that MSC positive cells tend to have differentiation potential that may facilitate the repair of COVID‐19‐induced lung‐tissue injury. Compared with negative cells, 1142 genes were significantly up‐regulated in MSC positive cells, such as FN1, VEGFA, IL1R1, TNFAIP6, and PHC2 (Figures 5K and S18I). FN1, known to be a driver of pulmonary fibrosis, was reported to be up‐regulated in COVID‐19 survivors. 82 Functionally, these up‐regulated genes were significantly over‐represented in 40 biological pathways (FDR < 0.05; Figure 5L and Table S15), including human papillomavirus infection, PI3K‐AKT signalling pathway, and JAK–STAT signalling pathway, recalling that many of them were strikingly enriched in the aforementioned genetics‐based pathway analyses (Figure 3F).
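The comparison of cell state scores between positive and negative cells can be sketched as follows. This is an illustration under stated assumptions: the score is taken as the mean expression of the marker genes in each cell, the test is a two‐sided Wilcoxon rank‐sum test with a normal approximation and no tie correction, and the marker names and expression values are hypothetical placeholders rather than the markers used in the study.

```python
import math

def state_score(expression, marker_genes):
    """Mean expression of the marker genes found in one cell's profile."""
    values = [expression[g] for g in marker_genes if g in expression]
    return sum(values) / len(values) if values else 0.0

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation without a tie correction, which is a
    simplification suited to illustration rather than publication.
    """
    n, m = len(x), len(y)
    # U counts the pairs in which a value from x exceeds a value from y.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, p

# Hypothetical marker names and toy expression profiles.
markers = ["MARKER_A", "MARKER_B"]
positive_cells = [{"MARKER_A": 2.0 + 0.1 * i, "MARKER_B": 1.5} for i in range(10)]
negative_cells = [{"MARKER_A": 0.5 + 0.1 * i, "MARKER_B": 0.2} for i in range(10)]

pos_scores = [state_score(cell, markers) for cell in positive_cells]
neg_scores = [state_score(cell, markers) for cell in negative_cells]
u, p = rank_sum_test(pos_scores, neg_scores)
print(f"U = {u}, p = {p:.2e}")
```

A rank‐based test is a natural choice here because single‐cell expression scores are rarely normally distributed.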
To prioritize critical gene–drug pairs, we applied the scDrugHunter method 49 to reposition MSC‐specific druggable genes and interacting drugs for treating severe COVID‐19. Among the 438 genetic risk genes (Table S8), we found that 98 genes (22.4%) were targeted by at least one known drug, and 15.3% of these 98 genes were documented to be targets for potential COVID‐19‐relevant drugs based on registers of clinical trials for COVID‐19, 4 which is notably higher than expected from random selections based on in silico permutation analysis (permuted p < 0.001; Figure S21A,B). Of note, 19 druggable genes with 117 targeting drugs yielded remarkably higher single‐cell druggable gene scores (scDGSs > 120 and FDR < 0.05) in lung MSCs for treating severe COVID‐19, including CCR1, TNFRSF4, PDE4A, and IFNAR2 (Figures 5M, S22A,B, and Table S16). Notably, we found that 12 of these interacting drugs, including IBUDILAST, ILOPROST, INTERFERON ALFA‐2B, and INTERFERON BETA‐1B, were tested in 60 double‐blind and placebo‐controlled clinical trials for the treatment of COVID‐19 (Clinicaltrials.gov; Figure S23A). Consistently, we performed evidence‐driven analysis using the RISmed method, 59 and found that a high proportion of these prioritized drugs have been associated with COVID‐19 (proportion = 42.74%; Figure S23B). Collectively, these results demonstrate that cell subsets of MSCs are highly relevant to severe COVID‐19, and these highlighted druggable genes potentially have therapeutic functions in MSCs for severe COVID‐19.
3.6. Discerning severe COVID‐19‐relevant cell subpopulations in intestine organoids
Although COVID‐19 primarily manifests as a pulmonary infection, it has significant extrapulmonary complications that damage other organ systems, including the intestinal tract. 83 Due to the extensive surface area of intestinal capillaries, intestinal epithelial cells are more likely to be infected by SARS‐CoV‐2 than cells of other extrapulmonary organs. 84 To understand the mechanism underlying severe COVID‐19‐associated intestinal injury, we performed an integrative analysis by incorporating the GWAS summary dataset and human intestinal organoid scRNA‐seq data. 56 Among the five cell types, we found that severe COVID‐19‐relevant cells with higher TRSs were mainly tuft cells (n = 2167 cells; Figures 6A–C, S24, and S25). At the cell‐type level, two cell types, tuft cells and membranous cells (M cells), demonstrated a significant association with severe COVID‐19 (Figure 6D), which is consistent with the results based on human fetal intestine tissue (Figure S6). This observation remained reproducible when the scDRS method 58 was applied to the same single‐cell dataset (ρ = 0.981, p < 2.2 × 10−16; Figure S26A,B). While tuft cells are chemosensory epithelial cells, they serve as the primary physiologic target of viral infection and drive an inflammatory adaptive immune response, which is classically correlated with allergy and parasitic infection. 85 , 86
FIGURE 6.

Discerning intestinal tuft cells relevant to coronavirus disease 2019 (COVID‐19) severities. (A) Uniform manifold approximation and projection (UMAP) projections of human intestine organoid cells coloured by five predefined cell types. (B) UMAP embedding of all cells among five cell types in intestine organoids coloured by the trait‐relevant scores (TRSs) for the phenotype of very severe COVID‐19. (C) Violin plot showing the TRSs in five cell types among intestine organoids. (D) Forest plot showing the associations of intestinal cell types with very severe COVID‐19. Effect parameter indicates the strength of association, and range specifies the empirical bounds of the 95% confidence interval. The p‐value of each cell type is shown in the right panel. (E) UMAP showing three cell clusters of intestinal tuft cells. (F) UMAP visualization of intestinal tuft cells coloured by TRSs. (G) Violin plot showing the TRSs in three cell clusters among intestinal tuft cells. (H) UMAP visualization of intestinal tuft cells coloured by tuft positive and negative cells. The C′ value significantly lower than 1 indicates a high level of disease‐association heterogeneity across the set of cells (C′ value = 0.812, heterogeneity false discovery rate [FDR] = 3.33 × 10−4). (I) Bar plot showing the proportion of positive cells in three cell clusters of intestinal tuft cells. (J) Boxplot showing a notable increase in cellular interactions of tuft positive cells with other cells among intestinal organoids compared with tuft negative cells. (K) Volcano plot showing significantly up‐regulated genes between tuft positive cells and negative cells. A two‐sided Wilcoxon test was used for assessing the significance. (L) Notably enriched pathways by 758 up‐regulated genes in tuft positive cells. X‐axis indicates the log‐transformed FDR value (−Log10(FDR)). (M) Dotplot showing the results of scDrugHunter‐identified 17 druggable genes and interacting drugs with high scDGS > 120 in intestinal tuft cells. See also Tables S17 and S18.
As shown in Figure 6E, tuft cells were grouped into three cell clusters. Among them, we found that severe COVID‐19‐associated genetic signals were highly enriched in Cluster 0 (heterogeneous FDR = 3.33 × 10−4, C′ value = 0.812, Figure 6F,G). Consistently, Clusters 0 and 2 had a higher proportion of positive cells relevant to COVID‐19 severities than Cluster 1 (Figure 6H,I), which is in concordance with the tuft positive cells identified using the scDRS method (concordance rate = 0.984; Figure S26C–E). Moreover, this result was also validated by using the per‐cell genetic risk scores of 438 COVID‐19‐relevant genes (p = 2.86 × 10−8; Figure S27A,B). Cellular communication analysis indicated that tuft positive cells had a significantly higher number of receptor–ligand interactions with other intestinal cell types than tuft negative cells (p = 0.0032; Figures 6J and S27C–E). For example, tuft positive cells showed relatively high communication with M cells, with 32 significant receptor–ligand interactions, including several unique interacting pairs such as WNT5A‐FZD5, SEMA3A‐(NRP1 + PLXNA3), and PTN‐SDC3 (Figure S27F,G).
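The agreement between two cell‐scoring methods can be summarized as a concordance rate. The sketch below assumes the rate is the fraction of cells given the same positive or negative label by both methods; the exact definition used in the study is not stated in this section, and the function name and toy labels are hypothetical.

```python
def concordance_rate(labels_a, labels_b):
    """Fraction of cells that receive the same label from two methods."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both methods must label the same cells")
    if not labels_a:
        raise ValueError("no cells to compare")
    agree = sum(a == b for a, b in zip(labels_a, labels_b))
    return agree / len(labels_a)

# Toy labels for eight cells (True = positive); values are illustrative only.
method_1_calls = [True, True, False, False, True, False, True, True]
method_2_calls = [True, True, False, True, True, False, True, False]
print(concordance_rate(method_1_calls, method_2_calls))
```

Because this measure counts agreement on negative cells as well as positive ones, it can be high even when the positive sets differ, so it is usefully read alongside the score correlation reported for the same comparison.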
By performing a differential expression analysis, we found that 758 genes showed significantly higher expression in tuft positive cells than in negative cells, including COL3A1, COL1A2, IFITM3, RPL10, VIM, and LGALS1 (Figure 6K). Extracellular matrix (ECM) genes such as COL3A1 and COL1A2 were reported to be up‐regulated in COVID‐19 microvessels and lung lower lobes. 87 , 88 , 89 Genetic variants in the interferon‐induced transmembrane protein 3 gene (IFITM3) have been demonstrated to be associated with SARS‐CoV‐2 infection and COVID‐19 severities. 90 Functionally, these highly expressed genes showed notable enrichments in several critical pathways, including ribosome, TNF signalling pathway, and relaxin signalling pathway (Figure 6L and Table S17), of which several have been reported to be implicated in COVID‐19. 19 For example, previous evidence has suggested that ribosomal proteins potentially play crucial roles in blocking viral replication by binding to the specific phosphoproteins for the host immune factors, 91 and the immunosuppression and low expression of ribosomal protein genes were related to the persistence of the viral infection in COVID‐19 patients. 92
Moreover, we repurposed tuft‐specific druggable genes and interacting drugs for treating severe COVID‐19 and intestinal comorbidities. Among the 438 genetic risk genes, we found that 17 druggable genes with 151 interacting drugs yielded higher scDGSs (> 120, and FDR < 0.05) in tuft cells for treating severe COVID‐19, including IL10RB, ICAM1, TYK2, SENP7, and VIPR2 (Figures 6M, S28, S29, and Table S18). Among these identified gene–drug pairs, 14 drugs, including PEGINTERFERON LAMBDA‐1A, TOFACITINIB, TADALAFIL, and PENTOXIFYLLINE, have been examined in 89 clinical trials for treating COVID‐19 patients (Clinicaltrials.gov; Figure S30A). Furthermore, the RISmed analysis consistently demonstrated that a large number of these identified drugs (n = 64) were relevant to the treatment of COVID‐19 (proportion = 42.38%; Figure S30B). Together, our results indicate that a subset of tuft cells exhibits notable associations with severe COVID‐19, and critical drug targets, including IL10RB, ICAM1, and VIPR2, are prioritized for treating severe COVID‐19 and concomitant intestinal symptoms.
3.7. Distinguishing severe COVID‐19‐relevant cell sub‐populations in brain organoids
Alongside respiratory and gastrointestinal symptoms, severe COVID‐19 patients often present with short‐ and long‐term neuropsychiatric symptoms and brain sequelae. 93 Brain organoids provide a promising tool for uncovering the pathophysiologic mechanisms and potential therapeutic options for neuropsychiatric complications of severe COVID‐19. 94 We leveraged the scPagwas method 35 to integrate the GWAS summary dataset on very severe COVID‐19 and human cerebral organoid scRNA‐seq data. 57 Among eight main cell types, we identified that both endothelial cells (p = 6.96 × 10−6) and microglia (p = 5.29 × 10−5), which yielded higher TRSs than other cell populations, were significantly associated with very severe COVID‐19 (Figure 7A–C), recalling that these two cell types were also associated with COVID‐19 severities in human fetal brain tissue (Figure S6). Consistently, these results were notably reproduced by using the scDRS method 58 in the same dataset (ρ = 0.98, p < 2.2 × 10−16; Figure S31A,B). Earlier studies 93 , 95 have indicated that SARS‐CoV‐2 invades the central nervous system via endothelial cells, resulting in inflammation, thrombi, and brain damage.
FIGURE 7.

Distinguishing brain endothelial cells that contribute risk to coronavirus disease 2019 (COVID‐19) severities. (A) Uniform manifold approximation and projection (UMAP) projections of all cells coloured by eight predefined cell types in human brain organoids. (B) UMAP embedding of all cells in brain organoids coloured by the TRSs for the phenotype of very severe COVID‐19. (C) Violin plot showing the TRSs in eight cell types among brain organoids. The significance level (p‐values) of associations of brain cell types with very severe COVID‐19 is shown in the top panel of the violin plot. (D) UMAP visualization of five cell clusters in brain endothelial cells. (E) UMAP plot highlighting the brain endothelial positive cells. The C′ value significantly lower than 1 indicates a high level of disease‐association heterogeneity across the set of cells (C′ value = 0.841, heterogeneity false discovery rate [FDR] = 3.33 × 10−4). (F) Bar plot showing the proportion of positive cells in five cell clusters of brain endothelial cells. (G) Volcano plot showing significantly up‐regulated genes between endothelial positive cells and negative cells. A two‐sided Wilcoxon test was used. (H) Notably enriched pathways by 341 up‐regulated genes in endothelial positive cells. Y‐axis indicates the log‐transformed FDR value (−Log10(FDR)), and x‐axis indicates the enrichment ratio of each pathway. (I) Scatter plot exhibiting the dominant senders (sources) and receivers (targets) in a 2D space. Y‐axis represents incoming interaction strength, and x‐axis represents outgoing interaction strength. The size of each node indicates the count of cellular interactions. (J) A notable increase in cellular interactions of endothelial positive cells with other cells among brain organoids compared with endothelial negative cells. (K) Dotplot exhibiting the results of scDrugHunter‐identified 18 druggable genes and interacting drugs with high scDGS > 120 in brain endothelial cells. See also Tables S19–S21.
Among endothelial cells, which formed five clusters, we identified 3443 positive cells that were significantly associated with very severe COVID‐19 (Bonferroni‐corrected p < 0.05, proportion = 56.6%, Figures 7D,E and S32). Remarkably heterogeneous associations between brain endothelial cells and severe COVID‐19 were uncovered (heterogeneous FDR = 3.33 × 10−4, C′ value = 0.841, Figure 7E). Of note, Clusters 0 and 3 exhibited a higher proportion of positive cells than other clusters (Figures 7F and S33A,B), which is in accordance with the results from the scDRS analysis (concordance rate = 0.74; Figure S28C–E). Compared with endothelial negative cells, we found that 341 genes, including NRP1, CEBPD, and EGR1, were significantly up‐regulated in positive cells (Figure 7G). The cell surface receptor neuropilin‐1 (NRP1) was reported to serve as an entry factor and potentiate SARS‐CoV‐2 infectivity, and its up‐regulated expression is critical in angiogenesis, viral entry, immune function, and axonal guidance. 96 , 97 Functionally, these highly up‐regulated genes were enriched in multiple critical pathways and biological processes, including PI3K‐AKT signalling pathway, focal adhesion, TNF signalling pathway, ECM–receptor interaction, and angiogenesis (Figures 7H and S33C and Tables S19 and S20).
To gain refined insights into endothelial positive cells, we conducted a cell‐to‐cell interaction analysis among cell populations in human brain organoids. In the aggregated cellular interaction network constructed from the count of receptor–ligand pairs, endothelial positive cells exhibited the highest incoming interaction strength of all cell types (Figure 7I). Compared with endothelial negative cells, endothelial positive cells showed a significant increase in cell‐to‐cell interactions with other brain cell types (p = 1.28 × 10−6; Figures 7J and S34A,B). By summarizing the communication probability among cellular interactions, we detected 25 significant ligand–receptor interactions of endothelial positive cells, including CXCL12‐CXCR4, FGF7‐FGFR1/2, PTN‐NCL, and MDK‐NCL (Figure S34C,D). For communication with microglia, three unique ligand–receptor pairs, MIF‐ACKR3, NAMPT‐INSR, and NAMPT‐(ITGA5 + ITGB1), were detected in endothelial positive cells but not in negative cells. SARS‐CoV‐2 infection can damage endothelial cells, leading to inflammation that further induces the activation of microglia, which may result in region‐ and neurotransmitter‐specific neuropsychiatric symptoms. 93 , 98 , 99 Collectively, our results indicate that both endothelial cells and microglia have considerable potential to contribute risk to severe COVID‐19.
Subsequently, the scDrugHunter method was used to discern brain endothelial cell‐specific druggable genes and interacting drugs for treating severe COVID‐19 and corresponding neuropsychiatric complications. Among these putative COVID‐19‐risk genes, we uncovered 18 druggable genes with 154 interacting drugs that obtained notably higher scDGSs (> 120, FDR < 0.05) in brain endothelial cells, including the top‐ranked genes PLEKHA4, LTF, ICAM1, and P4HA2 (Figures 7K, S35, S36, and Table S21). Of note, 16 of these prioritized drugs have been tested in 96 clinical trials for the treatment of COVID‐19 (Clinicaltrials.gov; Figure S37A). Consistently, the RISmed analysis indicated that 74 drugs have been associated with the treatment of COVID‐19 (48.05%; Figure S37B).
By performing a comparison analysis, we further found that seven druggable genes, IFNAR2, TYK2, VIPR2, PLEKHA4, PDE4A, P4HA2, and PTGFR, were common targets across the three COVID‐19‐relevant cell types of lung MSCs, intestinal tuft cells, and brain endothelial cells (Figure S38A). Eight druggable genes (COL11A2, SACM1L, HCN3, CA11, SLC22A4, CLK2, IMPG3, and SLC5A3) were specific to lung MSCs, four (DBP, CLK3, BGLAP, and THRA) were specific to intestinal tuft cells, and seven (CSF3, LTF, PSORS1C1, SPARC, CCR9, CPOX, and CYP3A43) were specific to brain endothelial cells. Collectively, we repurposed 33 putative druggable genes and 215 interacting drugs for the treatment of severe COVID‐19 and corresponding complications, and these 33 druggable genes were jointly enriched in a functional subnetwork (Figure S38B).
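The comparison across cell types amounts to set operations on the three druggable gene lists. The sketch below uses only the genes named in the paragraph above, so each list is partial: the seven common and 19 cell‐type‐specific genes account for 26 of the 33 druggable genes, and the remaining seven, shared by two of the three cell types, are not enumerated in the text and are therefore omitted. The variable and function names are hypothetical.

```python
# Genes named in the text as common to all three cell types.
COMMON = {"IFNAR2", "TYK2", "VIPR2", "PLEKHA4", "PDE4A", "P4HA2", "PTGFR"}

# Partial per-cell-type lists: common genes plus the cell-type-specific genes
# named in the text. Genes shared by exactly two cell types are not listed
# in the text and are left out of this illustration.
targets = {
    "lung_MSC": COMMON | {"COL11A2", "SACM1L", "HCN3", "CA11",
                          "SLC22A4", "CLK2", "IMPG3", "SLC5A3"},
    "intestinal_tuft": COMMON | {"DBP", "CLK3", "BGLAP", "THRA"},
    "brain_endothelial": COMMON | {"CSF3", "LTF", "PSORS1C1", "SPARC",
                                   "CCR9", "CPOX", "CYP3A43"},
}

def shared_by_all(gene_sets):
    """Genes present in every cell type."""
    return set.intersection(*gene_sets.values())

def specific_to(cell_type, gene_sets):
    """Genes present in one cell type and absent from all others."""
    others = set().union(*(g for c, g in gene_sets.items() if c != cell_type))
    return gene_sets[cell_type] - others

common = shared_by_all(targets)
print(sorted(common))
for cell_type in targets:
    print(cell_type, sorted(specific_to(cell_type, targets)))
print(len(set().union(*targets.values())))  # genes covered by this partial listing
```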
4. DISCUSSION
Multiple lines of evidence 18 , 19 , 35 , 58 , 59 have demonstrated that integrating scRNA‐seq data and polygenic risk signals from GWAS is a promising approach to uncover the cellular mechanisms through which these variants drive complex diseases. In this study, we sought to identify critical cell types/sub‐populations relevant to COVID‐19 severities by combining large‐scale GWAS summary statistics and human organoid single‐cell sequencing data. Crucially, 39 main cell types in eight kinds of organoids were identified to be associated with COVID‐19 severities. We further concentrated on unveiling the functions of COVID‐19‐relevant cell subpopulations across three main organoids of lung, intestine, and brain, which helps to characterize important features of viral biology and facilitates the identification of repurposable drug candidates against SARS‐CoV‐2 infection and its related comorbidities.
Although vaccines have been developed to prevent SARS‐CoV‐2 infection, no specific antiviral drug exists to mitigate the established disease of severe COVID‐19. 5 As developing a new drug can take up to a decade and incurs substantial cost, drug repurposing is an effective way to notably accelerate the development of therapeutic strategies for treating COVID‐19. 4 There are two main approaches, virus‐based and host‐based treatment options, to test candidate targets in clinical trials. Of them, the host‐based approaches target critical host factors that are used by SARS‐CoV‐2 for viral replication, or stimulate host innate antiviral responses. 100 The key to host‐based drug repurposing for the treatment of COVID‐19 is to distinguish the true host risk genes. GWAS‐identified disease risk genes are more likely to code for proteins that are ‘biopharmable’ or ‘druggable’ than the rest of the human genome. 101 In the present investigation, we leveraged integrative genomic analyses of large‐scale GWAS data and prioritized 438 COVID‐19‐relevant risk genes, including IFNAR2, CCR1, ICAM1, VIPR2, and IL10RB, which provide candidates in the search for genuine drug targets for COVID‐19 severities.
Despite the success of GWASs, nearly 90% of disease‐associated variants are located in non‐coding regions, which are enriched in cell‐type‐specific transcriptional regulatory elements relevant to disease risk. 102 , 103 , 104 Integration of GWAS summary data and eQTL data has been extensively used to discern novel candidate genes and yield functional insights into disease‐relevant pharmacological effects 4 , 15 , 16 ; however, few of these studies have considered the cell‐type‐specific effects of drug targets. Thus, in this study, we repositioned drugs and their interacting targets for treating severe COVID‐19 in a cell‐type‐specific context. Collectively, we identified 33 druggable genes and 215 interacting drugs as putative candidates for severe COVID‐19 and relevant complications. A large proportion of these drugs have been tested for the treatment of severe COVID‐19. For example, the FDA‐approved drugs INTERFERON ALFA‐2B and INTERFERON BETA‐1B exhibit agonist–receptor interactions with IFNAR2 and could be used alone or in conjunction with other antiviral drugs against COVID‐19 initiation and progression. 105 , 106
Several limitations of this study should be noted. First, the power of the cell‐type‐level integration analysis is limited by the lack of scRNA‐seq data with matched genetic information for each sample for discerning COVID‐19‐relevant cell types. To diminish the impact of this limitation, we incorporated a large‐scale GWAS summary dataset and human organoid scRNA‐seq data with a large number of cells, in line with previous studies. 7 , 18 , 19 , 35 , 58 Second, the identification of COVID‐19‐relevant cell types or subpopulations does not imply causality but may reflect indirect discovery of causal phenotype–cell associations, analogous to earlier studies. 19 , 58 Third, we removed the MHC region from all genomic analyses to reduce the influence of its complex genetic architecture and extensive LD, in parallel with previous studies. 7 , 19 However, it should be noted that COVID‐19‐relevant genetic signals in this locus might therefore be missed. Finally, we adopted a default strategy of linking SNPs to genes based on proximity within a 20 kb window. Other powerful strategies, including the enhancer–gene linking approaches from the Roadmap and Activity‐By‐Contact models, 18 , 107 can also be used to establish links between SNPs and genes.
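The proximity‐based SNP‐to‐gene linking and the MHC exclusion described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the gene and SNP records are toy values, the names `Gene`, `Snp`, `in_mhc`, and `link_snps_to_genes` are hypothetical, and the MHC boundaries (chr6:25–34 Mb) are a commonly used convention assumed here for illustration, because the exact coordinates are not stated in the text.

```python
from dataclasses import dataclass

WINDOW_BP = 20_000  # 20 kb window, as stated in the text

# ASSUMPTION: extended MHC boundaries (chr6:25-34 Mb); not taken from the paper.
MHC_CHROM, MHC_START, MHC_END = "6", 25_000_000, 34_000_000


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int


def in_mhc(snp: Snp) -> bool:
    """Return True if the SNP falls inside the assumed MHC interval."""
    return snp.chrom == MHC_CHROM and MHC_START <= snp.pos <= MHC_END


def link_snps_to_genes(snps, genes, window=WINDOW_BP):
    """Map each non-MHC SNP to every gene whose body, extended by
    `window` bp on both sides, contains the SNP position."""
    links = {}
    for snp in snps:
        if in_mhc(snp):
            continue  # MHC SNPs are dropped before linking
        hits = [
            g.name
            for g in genes
            if g.chrom == snp.chrom
            and g.start - window <= snp.pos <= g.end + window
        ]
        if hits:
            links[snp.rsid] = hits
    return links


# Toy records (hypothetical names and coordinates, for illustration only).
genes = [
    Gene("GENE_A", "21", 100_000, 130_000),
    Gene("GENE_B", "21", 140_000, 160_000),
    Gene("GENE_C", "6", 30_000_000, 30_010_000),
]
snps = [
    Snp("rs_toy1", "21", 85_000),     # within 20 kb upstream of GENE_A
    Snp("rs_toy2", "21", 135_000),    # within 20 kb of both GENE_A and GENE_B
    Snp("rs_toy3", "21", 500_000),    # far from any gene
    Snp("rs_toy4", "6", 30_005_000),  # inside the assumed MHC interval
]

links = link_snps_to_genes(snps, genes)
print(links)
```

In this toy example, one SNP can map to more than one gene when their windows overlap, and a SNP inside the excluded interval is never linked even though it lies within a gene. Both behaviours relate to the limitations noted above: a fixed window ignores distal regulatory effects, and excluding the MHC discards any signal located there.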
In summary, we provide systematic insights into the effects of host genetic factors on COVID‐19 initiation and progression in a cellular context and, for the first time, repurpose COVID‐19‐relevant cell‐type‐specific druggable targets and interacting drugs. Numerous critical cell types or subpopulations, including lung MSCs, intestinal tuft cells, and brain endothelial cells, contribute higher risk to COVID‐19 severity. The integration of human genetics, single‐cell transcriptomic data, and large‐scale compound resources should improve in silico pharmacology for drug repurposing, which will provide novel insights into therapy discovery and development for the pandemic.
AUTHOR CONTRIBUTIONS
Yunlong Ma and Jianzhong Su conceived and designed the study. Yunlong Ma, Yijun Zhou, Wei Dai, Fei Qiu, Chunyu Deng, Jingjing Li, Yaru Zhang, Dingping Jiang, Gongwei Zheng, Yinghao Yao, Haojun Sun, Shilai Xing, and Haijun Han contributed to management of data collection. Yunlong Ma, Fei Qiu, Yijun Zhou, Haojun Sun, and Yinghao Yao conducted bioinformatics analysis and data interpretation. Yunlong Ma, Jianzhong Su, Nan Wu, and Jia Qu wrote the article. All authors reviewed and approved the final article.
FUNDING INFORMATION
This study was funded by the National Natural Science Foundation of China (32200535 to Yunlong Ma and 61871294 and 82172882 to Jianzhong Su), the Scientific Research Foundation for Talents of the Wenzhou Medical University (KYQD20201001 to Yunlong Ma), the Natural Science Foundation of Zhejiang Province (LR19C060001 to Jianzhong Su), and the National High Level Hospital Clinical Research Funding (2022‐PUMCH‐C‐033 to Nan Wu).
CONFLICT OF INTEREST STATEMENT
The authors declare no competing interests.
Supporting information
Data S1. Supporting information.
Data S2. Supporting Figures.
Data S3. Supporting Tables.
Ma Y, Zhou Y, Jiang D, et al. Integration of human organoids single‐cell transcriptomic profiles and human genetics repurposes critical cell type‐specific drug targets for severe COVID‐19. Cell Prolif. 2024;57(3):e13558. doi: 10.1111/cpr.13558
Yunlong Ma, Yijun Zhou, Dingping Jiang, and Wei Dai contributed equally to this work.
Contributor Information
Yinghao Yao, Email: yaoyinghao@ojlab.ac.cn.
Jianzhong Su, Email: sujz@wmu.edu.cn.
DATA AVAILABILITY STATEMENT
All the GWAS summary statistics used in this study can be accessed from the official website (www.covid19hg.org/results). The GTEx eQTL data (version 8) were downloaded from the Zenodo repository (https://zenodo.org/record/3518299#.Xv6Z6igzbgl). All the human organoid scRNA‐seq data were downloaded from two databases, GEO (https://www.ncbi.nlm.nih.gov/gds) and ArrayExpress (https://www.ebi.ac.uk/biostudies/arrayexpress). We have assembled a comprehensive pan‐organoid single‐cell RNA‐seq dataset, which is available through the curated scHOB website (https://schob.su-lab.org/function/). The code to reproduce the results is available in a dedicated GitHub repository (https://github.com/mayunlong89/scHuman_organoids_COVID19). scDrugHunter is implemented as an R package and is available on GitHub (https://github.com/x-burner-ux/scDrugHunter).
REFERENCES
- 1. Wu Z, McGoogan JM. Characteristics of and important lessons from the coronavirus disease 2019 (COVID‐19) outbreak in China: summary of a report of 72314 cases from the Chinese Center for Disease Control and Prevention. Jama. 2020;323:1239‐1242. [DOI] [PubMed] [Google Scholar]
- 2. Xu L, Ma Y, Yuan J, et al. COVID‐19 quarantine reveals that behavioral changes have an effect on myopia progression. Ophthalmology. 2021;128(11):1652‐1654. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Zhang B, Zhang Z, Koeken VA, et al. Altered and allele‐specific open chromatin landscape reveals epigenetic and genetic regulators of innate immunity in COVID‐19. Cell Genomics. 2023;3(2):100232. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Gaziano L, Giambartolomei C, Pereira AC, et al. Actionable druggable genome‐wide mendelian randomization identifies repurposing opportunities for COVID‐19. Nat Med. 2021;27(4):668‐676. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Chen KG, Park K, Spence JR. Studying SARS‐CoV‐2 infectivity and therapeutic responses with complex organoids. Nat Cell Biol. 2021;23(8):822‐833. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Hirt CK, Booij TH, Grob L, et al. Drug screening and genome editing in human pancreatic cancer organoids identifies drug‐gene interactions and candidates for off‐label therapy. Cell Genomics. 2022;2(2):100095. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Xiang B, Deng C, Qiu F, et al. Single cell sequencing analysis identifies genetics‐modulated ORMDL3+ cholangiocytes having higher metabolic effects on primary biliary cholangitis. J Nanobiotechnol. 2021;19(1):1‐21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Li K, Ouyang M, Zhan J, Tian R. CRISPR‐based functional genomics screening in human‐pluripotent‐stem‐cell‐derived cell types. Cell Genomics. 2023;3(5):100300. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Han Y, Duan X, Yang L, et al. Identification of SARS‐CoV‐2 inhibitors using lung and colonic organoids. Nature. 2021;589(7841):270‐275. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Luo C, Liu H, Xie F, et al. Single nucleus multi‐omics identifies human cortical cell regulatory genome diversity. Cell Genomics. 2022;2(3):100107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11. Monteil V, Kwon H, Prado P, et al. Inhibition of SARS‐CoV‐2 infections in engineered human tissues using clinical‐grade soluble human ACE2. Cell. 2020;181(4):905‐913.e907. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Mallard TT, Linnér RK, Grotzinger AD, et al. Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross‐cutting genetic liabilities. Cell Genomics. 2022;2(6):100140. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13. Pairo‐Castineira E, Clohisey S, Klaric L, et al. Genetic mechanisms of critical illness in COVID‐19. Nature. 2021;591(7848):92‐98. [DOI] [PubMed] [Google Scholar]
- 14. Ellinghaus D, Degenhardt F, Bujanda L, et al. Genomewide association study of severe Covid‐19 with respiratory failure. N Engl J Med. 2020;383(16):1522‐1534. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Ma Y, Huang Y, Zhao S, et al. Integrative genomics analysis reveals a 21q22.11 locus contributing risk to COVID‐19. Hum Mol Genet. 2021;30(13):1247‐1258. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Zheng J, Zhang Y, Zhao H, et al. Multi‐ancestry mendelian randomization of omics traits revealing drug targets of COVID‐19 severity. EBioMedicine. 2022;81:104112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Zhou S, Butler‐Laporte G, Nakanishi T, et al. A neanderthal OAS1 isoform protects individuals of European ancestry against COVID‐19 susceptibility and severity. Nat Med. 2021;27(4):659‐667. [DOI] [PubMed] [Google Scholar]
- 18. Jagadeesh KA, Dey KK, Montoro DT, et al. Identifying disease‐critical cell types and cellular processes by integrating single‐cell RNA‐sequencing and human genetics. Nat Genet. 2022;54(10):1479‐1492. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Ma Y, Qiu F, Deng C, et al. Integrating single‐cell sequencing data with GWAS summary statistics reveals CD16+ monocytes and memory CD8+ T cells involved in severe COVID‐19. Genome Med. 2022;14(1):1‐21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Delorey TM, Ziegler CG, Heimberg G, et al. COVID‐19 tissue atlases reveal SARS‐CoV‐2 pathology and cellular targets. Nature. 2021;595(7865):107‐113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Clough E, Barrett T. The Gene Expression Omnibus database. Methods Mol Biol. 2016;1418:93‐110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Athar A, Füllgrabe A, George N, et al. ArrayExpress update–from bulk to single‐cell expression data. Nucleic Acids Res. 2019;47(D1):D711‐D715. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Leinonen R, Sugawara H, Shumway M; Collaboration INSD. The sequence read archive. Nucleic Acids Res. 2010;39(Suppl 1):D19‐D21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841‐842. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Zheng GX, Terry JM, Belgrader P, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8(1):1‐12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Dobin A, Davis CA, Schlesinger F, et al. STAR: ultrafast universal RNA‐seq aligner. Bioinformatics. 2013;29(1):15‐21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM. A global reference for human genetic variation. Nature. 2015;526(7571):68‐74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2013;30(7):923‐930. [DOI] [PubMed] [Google Scholar]
- 29. Wolf FA, Angerer P, Theis FJ. SCANPY: large‐scale single‐cell gene expression data analysis. Genome Biol. 2018;19:1‐5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating single‐cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol. 2018;36(5):411‐420. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Korsunsky I, Millard N, Fan J, et al. Fast, sensitive and accurate integration of single‐cell data with harmony. Nat Methods. 2019;16(12):1289‐1296. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. Lotfollahi M, Naghipourfar M, Luecken MD, et al. Mapping single‐cell data to reference atlases by transfer learning. Nat Biotechnol. 2022;40(1):121‐130. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33. The COVID‐19 host genetics initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS‐CoV‐2 virus pandemic. Eur J Hum Genet. 2020;28(6):715‐718. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34. Turner SD. qqman: an R package for visualizing GWAS results using QQ and Manhattan plots. Biorxiv 2014:005165.
- 35. Ma Y, Deng C, Zhou Y, et al. Polygenic regression uncovers trait‐relevant cellular contexts through pathway activation transformation of single‐cell RNA sequencing data. Cell Genom. 2023;3(9):100383. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27‐30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37. Finucane HK, Reshef YA, Anttila V, et al. Heritability enrichment of specifically expressed genes identifies disease‐relevant tissues and cell types. Nat Genet. 2018;50(4):621‐629. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38. DeTomaso D, Jones MG, Subramaniam M, Ashuach T, Ye CJ, Yosef N. Functional interpretation of single cell similarity maps. Nat Commun. 2019;10(1):4376. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39. Geary RC. The contiguity ratio and statistical mapping. Incorporated Stat. 1954;5(3):115‐146. [Google Scholar]
- 40. Barbeira AN, Dickinson SP, Bonazzola R, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun. 2018;9(1):1825. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41. Barbeira AN, Pividori M, Zheng J, Wheeler HE, Nicolae DL, Im HK. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 2019;15(1):e1007889. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42. de Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA: generalized gene‐set analysis of GWAS data. PLoS Comput Biol. 2015;11(4):e1004219. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43. Ma X, Wang P, Xu G, Yu F, Ma Y. Integrative genomics analysis of various omics data and networks identify risk genes and variants vulnerable to childhood‐onset asthma. BMC Med Genomics. 2020;13(1):123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44. Dong Z, Ma Y, Zhou H, et al. Integrated genomics analysis highlights important SNPs and genes implicated in moderate‐to‐severe asthma based on GWAS and eQTL datasets. BMC Pulm Med. 2020;20(1):270. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45. Chung NC, Miasojedow B, Startek M, Gambin A. Jaccard/Tanimoto similarity test and estimation methods for biological presence‐absence data. BMC Bioinformatics. 2019;20(15):1‐11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46. Wang J, Duncan D, Shi Z, Zhang B. WEB‐based GEne SeT AnaLysis toolkit (WebGestalt): update 2013. Nucleic Acids Res. 2013;41(Web Server issue):W77‐W83. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47. Gene Ontology Consortium: going forward. Nucleic Acids Res. 2015;43(Database issue):D1049‐D1056. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Bulik‐Sullivan BK, Loh P‐R, Finucane HK, et al. LD score regression distinguishes confounding from polygenicity in genome‐wide association studies. Nat Genet. 2015;47(3):291‐295. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Ma Y, Zhou Y, Su J. https://github.com/x-burner-ux/scDrugHunter. Github 2023.
- 50. Barbeira AN, Bonazzola R, Gamazon ER, et al. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biol. 2021;22(1):49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51. Freshour SL, Kiwala S, Cotto KC, et al. Integration of the drug–gene interaction database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 2021;49(D1):D1144‐D1151. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Fang H, De Wolf H, Knezevic B, et al. A genetics‐led approach defines the drug target landscape of 30 immune‐related traits. Nat Genet. 2019;51(7):1082‐1091. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53. Mountjoy E, Schmidt EM, Carmona M, et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait‐associated loci. Nat Genet. 2021;53(11):1527‐1533. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54. Borkowski B, Wiliński A, Szczesny W, Binderman Z. Mathematical analysis of synthetic measures based on radar charts. Math Model Anal. 2020;25(3):473‐489. [Google Scholar]
- 55. Jin S, Guerrero‐Juarez CF, Zhang L, et al. Inference and analysis of cell‐cell communication using CellChat. Nat Commun. 2021;12(1):1088. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56. Holloway EM, Wu JH, Czerwinski M, et al. Differentiation of human intestinal organoids with endogenous vascular endothelial cells. Dev Cell. 2020;54(4):516‐528.e517. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57. Cruceanu C, Dony L, Krontira AC, et al. Cell‐type‐specific impact of glucocorticoid receptor activation on the developing brain: a cerebral organoid study. Am J Psychiatry. 2022;179(5):375‐387. [DOI] [PubMed] [Google Scholar]
- 58. Zhang MJ, Hou K, Dey KK, et al. Polygenic enrichment distinguishes disease associations of individual cells in single‐cell RNA‐seq data. Nat Genet. 2022;54:1572‐1580. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59. Wang R, Lin D‐Y, Jiang Y. EPIC: inferring relevant cell types for complex traits by integrating genome‐wide association studies and single‐cell RNA sequencing. PLoS Genet. 2022;18(6):e1010251. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60. Zhu R, Yan T, Feng Y, et al. Mesenchymal stem cell treatment improves outcome of COVID‐19 patients via multiple immunomodulatory mechanisms. Cell Res. 2021;31(12):1244‐1262. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61. Xu R, Feng Z, Wang F‐S. Mesenchymal stem cell treatment for COVID‐19. EBioMedicine. 2022;77:103920. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62. Ding S, Liang TJ. Is SARS‐CoV‐2 also an enteric pathogen with potential fecal–oral transmission? A COVID‐19 virological and clinical review. Gastroenterology. 2020;159(1):53‐61. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63. Xu E, Xie Y, Al‐Aly Z. Long‐term gastrointestinal outcomes of COVID‐19. Nat Commun. 2023;14(1):983. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64. Sashindranath M, Nandurkar HH. Endothelial dysfunction in the brain: setting the stage for stroke and other cerebrovascular complications of COVID‐19. Stroke. 2021;52(5):1895‐1904. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65. Fullard JF, Lee H‐C, Voloudakis G, et al. Single‐nucleus transcriptome analysis of human brain immune response in patients with severe COVID‐19. Genome Med. 2021;13(1):1‐13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66. Yuan J, Fan D, Xue Z, Qu J, Su J. Co‐expression of mitochondrial genes and ACE2 in cornea involved in COVID‐19. Invest Ophthalmol Vis Sci. 2020;61(12):13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67. Horowitz JE, Kosmicki JA, Damask A, et al. Genome‐wide analysis provides genetic evidence that ACE2 influences COVID‐19 risk and yields risk scores associated with severe disease. Nat Genet. 2022;54(4):382‐392. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68. Pathak GA, Singh K, Miller‐Fleming TW, et al. Integrative genomic analyses identify susceptibility genes underlying COVID‐19 hospitalization. Nat Commun. 2021;12(1):1‐11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69. Roberts GH, Partha R, Rhead B, et al. Expanded COVID‐19 phenotype definitions reveal distinct patterns of genetic association and protective effects. Nat Genet. 2022;54(4):374‐381. [DOI] [PubMed] [Google Scholar]
- 70. Downes DJ, Cross AR, Hua P, et al. Identification of LZTFL1 as a candidate effector gene at a COVID‐19 risk locus. Nat Genet. 2021;53(11):1606‐1615. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71. Kousathanas A, Pairo‐Castineira E, Rawlik K, et al. Whole‐genome sequencing reveals host factors underlying critical COVID‐19. Nature. 2022;607(7917):97‐103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72. Ma Y, Li MD. Establishment of a strong link between smoking and cancer pathogenesis through DNA methylation analysis. Sci Rep. 2017;7(1):1‐13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73. Menche J, Sharma A, Kitsak M, et al. Uncovering disease‐disease relationships through the incomplete interactome. Science. 2015;347(6224):1257601. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74. Auwul MR, Rahman MR, Gov E, Shahjaman M, Moni MA. Bioinformatics and machine learning approach identifies potential drug targets and pathways in COVID‐19. Brief Bioinform. 2021;22(5):bbab120. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75. Guan W‐j, Liang W‐h, Zhao Y, et al. Comorbidity and its impact on 1590 patients with COVID‐19 in China: a nationwide analysis. Eur Respir J. 2020;55(5):2000547. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76. Zhou F, Yu T, Du R, et al. Clinical course and risk factors for mortality of adult inpatients with COVID‐19 in Wuhan, China: a retrospective cohort study. The Lancet. 2020;395(10229):1054‐1062. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77. Samuel RM, Majd H, Richter MN, et al. Androgen signaling regulates SARS‐CoV‐2 receptor levels and is associated with severe COVID‐19 symptoms in men. Cell Stem Cell. 2020;27(6):876‐889.e812. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78. Kathiriya JJ, Wang C, Zhou M, et al. Human alveolar type 2 epithelium transdifferentiates into metaplastic KRT5+ basal cells. Nat Cell Biol. 2022;24(1):10‐23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79. Gulati GS, Sikandar SS, Wesche DJ, et al. Single‐cell transcriptional diversity is a hallmark of developmental potential. Science. 2020;367(6476):405‐411. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80. Qiu X, Mao Q, Tang Y, et al. Reversed graph embedding resolves complex single‐cell trajectories. Nat Methods. 2017;14(10):979‐982. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81. Rendeiro AF, Ravichandran H, Bram Y, et al. The spatial landscape of lung pathology during COVID‐19 progression. Nature. 2021;593(7860):564‐569. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82. Li H, Li X, Wu Q, et al. Plasma proteomic and metabolomic characterization of COVID‐19 survivors 6 months after discharge. Cell Death Dis. 2022;13(3):1‐12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83. Sun J‐K, Liu Y, Zou L, et al. Acute gastrointestinal injury in critically ill patients with COVID‐19 in Wuhan, China. World J Gastroenterol. 2020;26(39):6087‐6097. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84. Zang R, Castro MFG, McCune BT, et al. TMPRSS2 and TMPRSS4 promote SARS‐CoV‐2 infection of human small intestinal enterocytes. Sci Immunol. 2020;5(47):eabc3582. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85. Wilen CB, Lee S, Hsieh LL, et al. Tropism for tuft cells determines immune promotion of norovirus pathogenesis. Science. 2018;360(6385):204‐208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86. Baldridge MT, Nice TJ, McCune BT, et al. Commensal microbes and interferon‐λ determine persistence of enteric murine norovirus infection. Science. 2015;347(6219):266‐269. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87. Jyothula SS, Peters A, Liang Y, et al. Fulminant lung fibrosis in non‐resolvable COVID‐19 requiring transplantation. EBioMedicine. 2022;86:104351. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88. Leng L, Ma J, Zhang P‐P, et al. Spatial region‐resolved proteome map reveals mechanism of COVID‐19‐associated heart injury. Cell Rep. 2022;39(11):110955. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89. Melms JC, Biermann J, Huang H, et al. A molecular single‐cell lung atlas of lethal COVID‐19. Nature. 2021;595(7865):114‐119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90. Li Y, Wei L, He L, Sun J, Liu N. Interferon‐induced transmembrane protein 3 gene polymorphisms are associated with COVID‐19 susceptibility and severity: a meta‐analysis. J Infect. 2022;84:825‐833. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91. Rofeal M, El‐Malek FA. Ribosomal proteins as a possible tool for blocking SARS‐COV 2 virus replication for a potential prospective treatment. Med Hypotheses. 2020;143:109904. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92. Yang B, Fan J, Huang J, et al. Clinical and molecular characteristics of COVID‐19 patients with persistent SARS‐CoV‐2 infection. Nat Commun. 2021;12(1):1‐13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93. Boldrini M, Canoll PD, Klein RS. How COVID‐19 affects the brain. JAMA Psychiatry. 2021;78(6):682‐683. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94. Ng J‐H, Sun A, Je HS, Tan E‐K. Unravelling pathophysiology of neurological and psychiatric complications of COVID‐19 using brain organoids. Neuroscientist. 2021;29:30‐40. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95. Meinhardt J, Radke J, Dittmayer C, et al. Olfactory transmucosal SARS‐CoV‐2 invasion as a port of central nervous system entry in individuals with COVID‐19. Nat Neurosci. 2021;24(2):168‐175. [DOI] [PubMed] [Google Scholar]
- 96. Cantuti‐Castelvetri L, Ojha R, Pedro LD, et al. Neuropilin‐1 facilitates SARS‐CoV‐2 cell entry and infectivity. Science. 2020;370(6518):856‐860. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97. Daly JL, Simonetti B, Klein K, et al. Neuropilin‐1 is a host factor for SARS‐CoV‐2 infection. Science. 2020;370(6518):861‐865. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98. Liddelow SA, Guttenplan KA, Clarke LE, et al. Neurotoxic reactive astrocytes are induced by activated microglia. Nature. 2017;541(7638):481‐487. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99. Varga Z, Flammer AJ, Steiger P, et al. Endothelial cell infection and endotheliitis in COVID‐19. The Lancet. 2020;395(10234):1417‐1418. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100. Wang J, Liu J, Luo M, et al. Rational drug repositioning for coronavirus‐associated diseases using directional mapping and side‐effect inference. Iscience. 2022;25(11):105348. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101. Reay WR, Cairns MJ. Advancing the use of genome‐wide association studies for drug repurposing. Nat Rev Genet. 2021;22(10):658‐671. [DOI] [PubMed] [Google Scholar]
- 102. Anderson AG, Rogers BB, Loupe JM, et al. Single nucleus multiomics identifies ZEB1 and MAFB as candidate regulators of Alzheimer's disease‐specific cis‐regulatory elements. Cell Genomics. 2023;3(3):100263. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103. Jeong R, Bulyk ML. Blood cell traits' GWAS loci colocalization with variation in PU.1 genomic occupancy prioritizes causal noncoding regulatory variants. Cell Genomics. 2023;3:100327. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104. Wu Y, Qi T, Wray NR, Visscher PM, Zeng J, Yang J. Joint analysis of GWAS and multi‐omics QTL summary statistics reveals a large fraction of GWAS signals shared with molecular phenotypes. Cell Genomics. 2023;3:100344. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105. Hung IF‐N, Lung K‐C, Tso EY‐K, et al. Triple combination of interferon beta‐1b, lopinavir–ritonavir, and ribavirin in the treatment of patients admitted to hospital with COVID‐19: an open‐label, randomised, phase 2 trial. Lancet. 2020;395(10238):1695‐1704. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106. Abdo‐Cuza A, Castellanos‐Gutiérrez R, Treto‐Ramirez J, et al. Safety and efficacy of intranasal recombinant human interferon alfa 2b as prophylaxis for COVID‐19 in patients on a hemodialysis program. J Renal Endocrinol. 2020;7(1):e05. [Google Scholar]
- 107. Dey KK, Gazal S, van de Geijn B, et al. SNP‐to‐gene linking strategies reveal contributions of enhancer‐related and candidate master‐regulator genes to autoimmune disease. Cell Genomics. 2022;2(7):100145. [DOI] [PMC free article] [PubMed] [Google Scholar]
