Abstract
The human transmembrane protease serine 2 (TMPRSS2) is involved in the priming of proteins of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and represents a possible target for COVID-19 therapy. The TMPRSS2 gene may be co-expressed with SARS-CoV-2 cell receptor genes angiotensin-converting enzyme 2 (ACE2) and Basigin (BSG), but only TMPRSS2 demonstrates tissue-specific expression in alveolar cells according to single-cell RNA sequencing data. Our analysis of the structural variability of the TMPRSS2 gene based on genome-wide data from 76 human populations demonstrates that a functionally significant missense mutation in exon 6/7 in the TMPRSS2 gene is found in many human populations at relatively high frequencies, with region-specific distribution patterns. The frequency of the missense mutation encoded by rs12329760, which has previously been found to be associated with prostate cancer, ranged between 10% and 63% and was significantly higher in populations of Asian origin compared with European populations. In addition to single-nucleotide polymorphisms, two copy number variants were detected in the TMPRSS2 gene. A number of microRNAs have been predicted to regulate TMPRSS2 and BSG expression levels, but none of them is enriched in lung or respiratory tract cells. Several well-studied drugs can downregulate the expression of TMPRSS2 in human cells, including acetaminophen (paracetamol) and curcumin. Thus, the interactions of TMPRSS2 with SARS-CoV-2, together with its structural variability, gene–gene interactions, expression regulation profiles, and pharmacogenomic properties, characterize this gene as a potential target for COVID-19 therapy.
Keywords: TMPRSS2, ACE2, BSG, COVID-19, SARS-CoV-2, SNV, expression, pharmacotranscriptomics
1. Introduction
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a pandemic of coronavirus disease (COVID-19) that has led to a global public health crisis. Infection of human cells with viral particles occurs through the binding of viral spike proteins to the receptors of the host cell and their subsequent priming by proteases. Angiotensin-converting enzyme 2 (ACE2) is considered to be the classic receptor for SARS-CoV-2, but there is evidence that the virus can also use the Basigin (BSG) receptor (CD147) [1]. The priming of viral proteins is carried out by the protease TMPRSS2. No specific therapy has yet been developed for SARS-CoV-2. However, blockers of all three proteins have been shown to prevent cell infection [2].
In addition to protein blocking, there are other mechanisms that can alter the expression levels of specific proteins or the affinity of their interactions with viral particles. Possible causes of expression differentiation include alterations of protein structure due to genetic variants: single nucleotide variants (SNVs), indels, copy number variations (CNVs), variants affecting regulatory regions (expression quantitative trait loci; eQTLs), and epigenetic regulation: methylation, microRNAs (miRNAs).
The TMPRSS2 gene in humans encodes a transmembrane protein of the serine protease family. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. The TMPRSS2 protein is involved in prostate carcinogenesis via overexpression of Erythroblast Transformation Specific transcription factors (ETS), such as ERG and ETV1, through gene fusion. TMPRSS2–ERG gene fusion, which is present in 40–80% of prostate cancers in humans, is a molecular subtype that has been associated with predominantly poor prognosis [3,4].
The TMPRSS2 protease proteolytically cleaves and activates glycoproteins of many viruses, including spike proteins of human coronaviruses 229E (HCoV-229E) and EMC (HCoV-EMC), and the fusion glycoproteins of Sendai virus, human metapneumovirus, and human parainfluenza 1, 2, 3, 4a, and 4b viruses [2,5,6,7,8,9,10]. Both the coronavirus responsible for the 2003 SARS outbreak in Asia (SARS-CoV) and SARS-CoV-2 are activated by TMPRSS2 and can thus be inhibited by TMPRSS2 inhibitors [2]. Here, we report the genetic variability of the TMPRSS2 gene in 76 human populations of North Eurasia in comparison with worldwide populations, and analyze the data with respect to the expression and regulation of TMPRSS2, its interactions with SARS-CoV-2 receptors, and its pharmacogenetic properties.
2. Materials and Methods
2.1. Structural Variability Data
Allele frequencies for worldwide populations were downloaded from the GnomAD database, which contains information on the frequencies of genomic variants from more than 120,000 exomes and 15,000 whole genomes [11]. These data were used to search for SNVs and indels in the TMPRSS2 gene. CNV data were obtained from the CNV Control Database [12].
Data on allele frequencies in 76 populations of North Eurasia were extracted from our own unpublished dataset of population genomics data obtained by genotyping using Illumina Infinium genome-wide microarrays. In brief, 1836 samples from 76 human populations were genotyped for 1,748,250 SNVs and indels using an Infinium Multi-Ethnic Global-8 kit. The populations represent various geographic regions of North Eurasia (Eastern Europe, Caucasus, Central Asia, Siberia, North-East Asia) and belong to various linguistic families (Indo-European, Altaic, Uralic, North Caucasian, Chukotko-Kamchatkan, Sino-Tibetan, Yeniseian). DNA samples were collected with informed consent and deposited in the DNA bank of the Research Institute for Medical Genetics, Tomsk National Medical Research Center, Tomsk, Russia, and the DNA bank of the Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences. The study was approved by the Ethical Committee of the Research Institute for Medical Genetics, Tomsk National Medical Research Center. Data on four missense mutations in the TMPRSS2 gene were extracted from the dataset. The CNV search was performed using a Markov model algorithm for high-resolution CNV detection in whole-genome SNP genotyping data, implemented in the PennCNV tool [13]. To determine the possible functional impact of the detected SNVs, the Polymorphism Phenotyping v2 (PolyPhen-2) tool was used [14]. PolyPhen-2 estimates the impact of an amino acid substitution on the stability and function of the protein using structural and evolutionary analyses. The tool classifies the mutation as probably damaging, possibly damaging, benign, or of unknown significance, based on a quantitative prediction score.
2.2. Bioinformatics Analysis of Gene Expression, miRNA Interactions, and Pharmacogenomics
Protein–protein interaction analysis of SARS-CoV-2-interacting proteins was carried out using the GeneMANIA and STRING web resources [15,16]. Single-cell RNA sequencing (RNA-seq) data were downloaded from the PanglaoDB database, which contains more than 1300 single-cell sequencing samples [17]. Lung single-cell RNA-seq data were obtained from the Sequence Read Archive [18] and processed in the R software environment using the Seurat package [19].
Analysis of the interactions of miRNAs with target proteins was performed using information from two databases: miRTarBase, which contains information from more than 8000 referenced sources about experimentally confirmed miRNA–protein interactions [20]; and miRPathDB, which contains experimentally confirmed and predicted miRNA–protein interactions [21]. Data on the differential expression of miRNAs in various cell cultures were downloaded from the database of the FANTOM5 project [22]. The DrugBank database [23] was searched for drugs that could change the expression levels of proteins.
3. Results and Discussion
3.1. Protein–Protein Interaction Networks of SARS-CoV-2-Interacting Genes
The protein–protein interaction networks obtained with two different tools, GeneMANIA and STRING (Figure 1 and Figure 2), demonstrated that TMPRSS2 was co-expressed with other SARS-CoV-2-interacting genes, although the two tools showed contradictory co-expression patterns. According to GeneMANIA, TMPRSS2 was co-expressed with BSG, whereas STRING indicated co-expression of ACE2 and TMPRSS2. Interestingly, BSG showed the maximum number of protein–protein interactions in both networks.
BSG, or extracellular matrix metalloproteinase inducer, also known as cluster of differentiation 147 (CD147), encoded by the BSG gene, is a transmembrane glycoprotein of the immunoglobulin superfamily and a determinant of the Ok blood group system. The BSG protein has an important role in targeting the monocarboxylate transporters (MCTs) SLC16A1, SLC16A3, SLC16A8, SLC16A11, and SLC16A12 to the plasma membrane by interaction with MCT molecules via its transmembrane and cytoplasmic domains.
BSG is involved in spermatogenesis, embryo implantation, neural network formation, and tumor progression. It stimulates adjacent fibroblasts to produce matrix metalloproteinases. BSG seems to be a receptor for oligomannosidic glycans and, according to in vitro experiments, can promote outgrowth of astrocytic processes [24,25,26]. BSG is also involved in tumor development, plasmodium invasion, and viral infection [27,28,29,30,31].
Previous data on SARS indicate that BSG has a functional role in facilitating SARS-CoV invasion of host cells, and CD147-antagonistic peptide-9 has a high binding rate to HEK293 cells and an inhibitory effect on SARS-CoV [32]. Based on the similarity of SARS-CoV and SARS-CoV-2, the function of BSG in invasion of host cells by SARS-CoV-2 can be assumed. The exact role of BSG in COVID-19 is still unknown; however, it was recently shown that CD147 may bind to the spike protein of SARS-CoV-2 [1]. Preliminary data from a small sample of COVID-19 patients demonstrated that meplazumab, a humanized anti-CD147 antibody, efficiently improved the recovery of patients with SARS-CoV-2 pneumonia and showed a favorable safety profile [33].
3.2. Expression of ACE2, BSG, and TMPRSS2 in Single Cells
Expression profiles of SARS-CoV-2-interacting genes in various tissues demonstrated that ACE2 had a high level of expression only in peritubular myoid cells of the testis (Figure 3). The highest expression levels of BSG were found in germ cells, endothelia of various localizations, fibroblasts, and some other cell types (Figure 4). TMPRSS2 showed high levels of expression in the prostate, intestines, and lungs (Figure 5).
In addition, the expression levels of these proteins were analyzed in a single sample (SRS2769051) of proximal stromal lung cells (Figure 6, Figure 7, Figure 8 and Figure 9). ACE2 had low levels of expression in pulmonary alveolar cells and fibroblasts. BSG was characterized by average levels of expression in fibroblasts and alveolar cells. Only the TMPRSS2 gene demonstrated tissue-specific expression in alveolar cells. Given the high specificity of the expression of TMPRSS2 in lung tissue, we further studied genomic and epigenomic properties that may affect its expression levels and the affinity of its interactions with viral particles.
3.3. SNV and Indel Variants of the TMPRSS2 Gene
According to information in the GnomAD database, 1025 SNVs and indels of various frequencies, functional impact, and localization have been described in the TMPRSS2 gene. This list includes 332 missense variants, 17 frameshifts, 64 splice site variants, 14 stop codon mutations, and three in-frame indels. Among frequent variants (minor allele frequency >0.01), there were only 13 intronic polymorphisms, five synonymous variants, and two missense mutations (rs12329760 and rs75603675). Both missense variants had high frequencies (24.8% and 35.0% in GnomAD, respectively) (Table 1).
Table 1.
Variant Type | MAF > 0.01 | MAF < 0.01 |
---|---|---|
3_prime_UTR_variant | 0 | 22 |
5_prime_UTR_variant | 0 | 5 |
frameshift_variant | 0 | 17 |
inframe_deletion | 0 | 1 |
inframe_insertion | 0 | 2 |
intron_variant | 13 | 409 |
missense_variant | 2 | 332 |
splice_acceptor_variant | 0 | 4 |
splice_donor_variant | 0 | 5 |
splice_region_variant | 0 | 54 |
stop_gained | 0 | 13 |
stop_lost | 0 | 1 |
synonymous_variant | 5 | 140 |
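The totals quoted for Table 1 can be cross-checked with a few lines of code. The sketch below simply transcribes the table into a dictionary (the name `TABLE1` and the helper `summarize` are illustrative, not part of the original analysis) and sums the two frequency classes.

```python
# Variant counts transcribed from Table 1 as (MAF > 0.01, MAF < 0.01).
# TABLE1 and summarize() are illustrative names, not from the original study.
TABLE1 = {
    "3_prime_UTR_variant": (0, 22),
    "5_prime_UTR_variant": (0, 5),
    "frameshift_variant": (0, 17),
    "inframe_deletion": (0, 1),
    "inframe_insertion": (0, 2),
    "intron_variant": (13, 409),
    "missense_variant": (2, 332),
    "splice_acceptor_variant": (0, 4),
    "splice_donor_variant": (0, 5),
    "splice_region_variant": (0, 54),
    "stop_gained": (0, 13),
    "stop_lost": (0, 1),
    "synonymous_variant": (5, 140),
}


def summarize(table):
    """Return the number of frequent, rare, and all variants in the table."""
    common = sum(frequent for frequent, _ in table.values())
    rare = sum(infrequent for _, infrequent in table.values())
    return {"common": common, "rare": rare, "total": common + rare}


summary = summarize(TABLE1)
print(summary)
```

The sum over both columns reproduces the 1025 variants reported for the gene, and the frequent class contains the 13 intronic, five synonymous, and two missense variants named in the text.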
The variant rs12329760 is a C to T substitution at position 589 of the gene, which leads to a change from valine to methionine at amino acid position 197 (exon 7) of transmembrane protease serine 2 isoform 1, or at position 160 (exon 6) of isoform 2. This mutation is predicted by PolyPhen-2 to be probably damaging, with a score of 0.997 (sensitivity: 0.41; specificity: 0.98). The T allele of the TMPRSS2 rs12329760 variant was positively associated with TMPRSS2-ERG fusion by translocation; it was also associated with an increased risk of prostate cancer in European and Indian populations [34,35]. The rs75603675 variant (C to A substitution at position 23, Gly8Val) has not been reported to be associated with prostate cancer or any other clinical condition. An interesting feature of both frequent missense variants was the difference in prevalence between European and Asian populations; rs12329760 was 15 percentage points more frequent in populations of East Asia (38%) than in European populations (23%). For rs75603675, the difference was even more pronounced: the minor allele reached 42% in European populations and about 1% in populations of East Asia. Regarding CNVs, the CNV Control Database contained only one deletion in the TMPRSS2 gene (one copy variant) with relatively low frequency (1.2%) (Table 2).
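The relationship between the nucleotide positions and the amino acid positions given above can be illustrated with a short sketch. It assumes that positions 589 and 23 are 1-based coordinates within the coding sequence, counted from the first base of the start codon; this is an interpretation of the text, and the function names are hypothetical.

```python
def codon_number(cds_position: int) -> int:
    """1-based codon (amino acid) index for a 1-based coding-sequence position."""
    if cds_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_position - 1) // 3 + 1


def base_in_codon(cds_position: int) -> int:
    """Position of the base inside its codon (1, 2, or 3)."""
    if cds_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_position - 1) % 3 + 1


# rs12329760: position 589 falls in codon 197 (Val197Met in isoform 1).
rs12329760_codon = codon_number(589)
# rs75603675: position 23 falls in codon 8 (Gly8Val).
rs75603675_codon = codon_number(23)
print(rs12329760_codon, base_in_codon(589))
print(rs75603675_codon, base_in_codon(23))
```

Under this assumption, position 589 is the first base of codon 197 and position 23 is the second base of codon 8, which agrees with the amino acid positions reported for the two missense variants.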
Table 2.
Region | N | Frequency % | Copy Number | Gene Name |
---|---|---|---|---|
Chr21:42857241-42863723 | 164 | 1.2195122 | 1 | TMPRSS2 |
The structural variability of the TMPRSS2 gene in relation to COVID-19 has recently been investigated by many research groups. In particular, Paniri et al. used various bioinformatics approaches to predict the functional consequences of TMPRSS2 SNPs with respect to susceptibility to SARS-CoV-2, the functional effects of SNPs on splicing, and the influence of polymorphisms on miRNA function. According to Paniri et al., rs12329760 showed the highest scores in various analyses and was considered deleterious by three tools, indicating its negative functional impact. In contrast, rs75603675 (G8D) was considered deleterious only by PolyPhen-2, whereas other tools such as Phyre2, GOR IV, and PSIPRED predicted that both variants would have functional effects on the secondary structure of the TMPRSS2 protein [36].
3.4. Frequency of Protein-Changing Allelic Variants of the TMPRSS2 Gene in Populations of North Eurasia
In order to study the population differentiation in TMPRSS2 functional variants in more detail, we searched for TMPRSS2 allele frequencies in our own unpublished data from 76 populations of North Eurasia, based on 1836 samples genotyped using genome-wide microarrays. Four missense mutations and two CNVs in the TMPRSS2 gene were found in our dataset. We compared the frequency of TMPRSS2 missense mutations in the North Eurasian populations (Table 3) with the worldwide data (Table 4). Three missense mutations (rs148125094, rs143597099, and rs201093031) were very rare variants, whereas rs12329760, which was previously shown to be associated with prostate cancer, was found with high frequency in all populations. Data on the second high-frequency missense variant in the TMPRSS2 gene according to the GnomAD database (rs75603675) were not available because of the absence of this SNP from the microarray used in our study. The minor allele of variant rs148125094 was found on only two chromosomes (total frequency 0.00054) in single heterozygous individuals from the Karelian and Abkhaz populations. The variant rs143597099 was present only in one heterozygote from the Veps population. The variant rs201093031 was found in North-East Asian Nivkh and Udege populations with a frequency of 7%, and in a single Tuvan individual from Siberia. The frequency of the probably damaging minor allele of the rs12329760 polymorphism ranged from 10% (in the Khvarshi population from Dagestan) to 63% (in Sagays Khakas). In general, the minor allele T had higher frequencies in Siberia and Central Asia (both around 35%), whereas the lowest frequencies of the damaging variant were found in North Caucasus (19%), Dagestan (22%), and Eastern Europe (29%). This distribution is consistent with the worldwide data, which demonstrate a much higher frequency of the minor allele in Asian populations (36–41%) than in Europeans (22–24%) (Table 4).
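The rare-variant frequencies quoted above follow from simple allele counting in a diploid sample. The sketch below is a minimal illustration that assumes every listed individual was successfully genotyped at the variant; the function name `allele_frequency` is illustrative.

```python
def allele_frequency(minor_allele_copies: int, n_individuals: int, ploidy: int = 2) -> float:
    """Minor allele frequency as copies of the allele over all chromosomes sampled."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return minor_allele_copies / (ploidy * n_individuals)


# rs148125094: two heterozygotes among 1836 samples (3672 chromosomes).
total_freq = allele_frequency(2, 1836)
# One heterozygote among 29 Karelians and one among 30 Abkhaz (sizes from Table 3).
karelian_freq = allele_frequency(1, 29)
abkhaz_freq = allele_frequency(1, 30)

print(f"{total_freq:.5f}")
print(f"{karelian_freq:.4f}")
print(f"{abkhaz_freq:.4f}")
```

With these inputs, the overall frequency rounds to 0.00054, and the per-population values round to 0.0172 and 0.0167, matching the entries for Karelians and Abkhaz in Table 3.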
Table 3.
Population | N | rs148125094 | rs143597099 | rs12329760 | rs201093031 |
---|---|---|---|---|---|
Eastern Europe | 419 | 0.0012 | 0.0012 | 0.2983 | 0.0000 |
Bashkirs Burzyan | 29 | 0.0000 | 0.0000 | 0.1552 | 0.0000 |
Bashkirs Perm | 15 | 0.0000 | 0.0000 | 0.3000 | 0.0000 |
Bashkirs Salavat | 15 | 0.0000 | 0.0000 | 0.3667 | 0.0000 |
Besermians | 16 | 0.0000 | 0.0000 | 0.1563 | 0.0000 |
Chuvash | 26 | 0.0000 | 0.0000 | 0.3077 | 0.0000 |
Karelians | 29 | 0.0172 | 0.0000 | 0.3966 | 0.0000 |
Komi | 30 | 0.0000 | 0.0000 | 0.3333 | 0.0000 |
Mari | 30 | 0.0000 | 0.0000 | 0.2500 | 0.0000 |
Mordvins Erzya | 16 | 0.0000 | 0.0000 | 0.3125 | 0.0000 |
Mordvins Moksha | 30 | 0.0000 | 0.0000 | 0.3000 | 0.0000 |
Mordvins Shoksha | 14 | 0.0000 | 0.0000 | 0.3214 | 0.0000 |
Russians | 33 | 0.0000 | 0.0000 | 0.3333 | 0.0000 |
Tatars Kazan | 30 | 0.0000 | 0.0000 | 0.3000 | 0.0000 |
Udmurts | 30 | 0.0000 | 0.0000 | 0.3000 | 0.0000 |
Udmurts Balezino | 28 | 0.0000 | 0.0000 | 0.3214 | 0.0000 |
Udmurts Sharkan | 18 | 0.0000 | 0.0000 | 0.2500 | 0.0000 |
Veps | 30 | 0.0000 | 0.0167 | 0.3333 | 0.0000 |
North Caucasus (excl. Dagestan) | 274 | 0.0018 | 0.0000 | 0.1989 | 0.0000 |
Abkhaz | 30 | 0.0167 | 0.0000 | 0.3000 | 0.0000 |
Adyghe | 10 | 0.0000 | 0.0000 | 0.1500 | 0.0000 |
Balkars | 50 | 0.0000 | 0.0000 | 0.1800 | 0.0000 |
Chechens | 27 | 0.0000 | 0.0000 | 0.2222 | 0.0000 |
Cherkess | 30 | 0.0000 | 0.0000 | 0.2167 | 0.0000 |
Ingush | 30 | 0.0000 | 0.0000 | 0.1500 | 0.0000 |
Karachays | 22 | 0.0000 | 0.0000 | 0.2045 | 0.0000 |
Mingrelians | 28 | 0.0000 | 0.0000 | 0.1607 | 0.0000 |
North Ossetians | 30 | 0.0000 | 0.0000 | 0.1833 | 0.0000 |
South Ossetians | 17 | 0.0000 | 0.0000 | 0.2059 | 0.0000 |
Dagestan | 538 | 0.0000 | 0.0000 | 0.2309 | 0.0000 |
Aghuls | 24 | 0.0000 | 0.0000 | 0.2292 | 0.0000 |
Akhvakhs | 24 | 0.0000 | 0.0000 | 0.3125 | 0.0000 |
Andis | 17 | 0.0000 | 0.0000 | 0.2353 | 0.0000 |
Archins | 24 | 0.0000 | 0.0000 | 0.3333 | 0.0000 |
Avars | 24 | 0.0000 | 0.0000 | 0.1875 | 0.0000 |
Bagulals | 23 | 0.0000 | 0.0000 | 0.3261 | 0.0000 |
Bezhtins | 22 | 0.0000 | 0.0000 | 0.2273 | 0.0000 |
Botlikhs | 16 | 0.0000 | 0.0000 | 0.1250 | 0.0000 |
Chamalals | 24 | 0.0000 | 0.0000 | 0.2083 | 0.0000 |
Dargins | 28 | 0.0000 | 0.0000 | 0.2321 | 0.0000 |
Ginukhs | 19 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
Gunzibians | 17 | 0.0000 | 0.0000 | 0.0294 | 0.0000 |
Karanogais | 19 | 0.0000 | 0.0000 | 0.2368 | 0.0000 |
Karatins | 24 | 0.0000 | 0.0000 | 0.3333 | 0.0000 |
Khvarshins | 15 | 0.0000 | 0.0000 | 0.1000 | 0.0000 |
Kumyks | 37 | 0.0000 | 0.0000 | 0.2703 | 0.0000 |
Laks | 24 | 0.0000 | 0.0000 | 0.3125 | 0.0000 |
Lezgins | 28 | 0.0000 | 0.0000 | 0.2037 | 0.0000 |
Nogais | 20 | 0.0000 | 0.0000 | 0.2750 | 0.0000 |
Rutuls | 22 | 0.0000 | 0.0000 | 0.1818 | 0.0000 |
Tabasarans | 21 | 0.0000 | 0.0000 | 0.2619 | 0.0000 |
Tindins | 18 | 0.0000 | 0.0000 | 0.2222 | 0.0000 |
Tsakhurs | 24 | 0.0000 | 0.0000 | 0.2292 | 0.0000 |
Tsez | 24 | 0.0000 | 0.0000 | 0.2708 | 0.0000 |
Central Asia | 128 | 0.0000 | 0.0000 | 0.3565 | 0.0000 |
Dungans | 23 | 0.0000 | 0.0000 | 0.4130 | 0.0000 |
Kazakh Junior Horde | 29 | 0.0000 | 0.0000 | 0.2931 | 0.0000 |
Kazakh Great Horde | 26 | 0.0000 | 0.0000 | 0.4423 | 0.0000 |
Kyrgyz | 28 | 0.0000 | 0.0000 | 0.3704 | 0.0000 |
Uzbeks | 22 | 0.0000 | 0.0000 | 0.2619 | 0.0000 |
Siberia | 404 | 0.0000 | 0.0000 | 0.3540 | 0.0013 |
Altaians Maymalar | 24 | 0.0000 | 0.0000 | 0.3958 | 0.0000 |
Altaians Kizhi | 25 | 0.0000 | 0.0000 | 0.3600 | 0.0000 |
Buryats Aginskoe | 23 | 0.0000 | 0.0000 | 0.4130 | 0.0000 |
Buryats Kurumkan | 28 | 0.0000 | 0.0000 | 0.3929 | 0.0000 |
Chulyms | 22 | 0.0000 | 0.0000 | 0.3636 | 0.0000 |
Evenks Yakutia | 28 | 0.0000 | 0.0000 | 0.2857 | 0.0000 |
Evenks Zabaykalsky Krai | 25 | 0.0000 | 0.0000 | 0.3200 | 0.0000 |
Kalmyks | 29 | 0.0000 | 0.0000 | 0.3103 | 0.0000 |
Kets | 15 | 0.0000 | 0.0000 | 0.3333 | 0.0000 |
Khakas Kachins | 26 | 0.0000 | 0.0000 | 0.4423 | 0.0000 |
Khakas Sagays | 29 | 0.0000 | 0.0000 | 0.6379 | 0.0000 |
Khanty Kazym | 30 | 0.0000 | 0.0000 | 0.1333 | 0.0000 |
Khanty Russkinskie | 26 | 0.0000 | 0.0000 | 0.2500 | 0.0000 |
Tomsk Tatars | 20 | 0.3250 | 0.0000 | 0.0000 | 0.3250 | 0.0000 |
Tuvans | 28 | 0.0000 | 0.0000 | 0.3036 | 0.0185 |
Yakuts | 26 | 0.0000 | 0.0000 | 0.4038 | 0.0000 |
North East Asia | 73 | 0.0000 | 0.0000 | 0.2671 | 0.0284 |
Chukchi | 25 | 0.0000 | 0.0000 | 0.3000 | 0.0000 |
Koryaks | 20 | 0.0000 | 0.0000 | 0.3500 | 0.0000 |
Nivkhs | 13 | 0.0000 | 0.0000 | 0.1538 | 0.0769 |
Udege | 15 | 0.0000 | 0.0000 | 0.2000 | 0.0714 |
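The regional rows of Table 3 appear to be pooled over their constituent populations. The paper does not state how pooling was done; the sketch below assumes a sample-size-weighted mean of the population frequencies, which reproduces the North East Asia value for rs12329760. The names `NORTH_EAST_ASIA` and `pooled_frequency` are illustrative.

```python
# (population, N, rs12329760 minor allele frequency) transcribed from Table 3.
NORTH_EAST_ASIA = [
    ("Chukchi", 25, 0.3000),
    ("Koryaks", 20, 0.3500),
    ("Nivkhs", 13, 0.1538),
    ("Udege", 15, 0.2000),
]


def pooled_frequency(rows):
    """Sample-size-weighted mean frequency (assumes all N individuals were genotyped)."""
    total_n = sum(n for _, n, _ in rows)
    weighted_sum = sum(n * freq for _, n, freq in rows)
    return weighted_sum / total_n


def unweighted_mean(rows):
    """Plain average of population frequencies, shown only for comparison."""
    return sum(freq for _, _, freq in rows) / len(rows)


pooled = pooled_frequency(NORTH_EAST_ASIA)
simple = unweighted_mean(NORTH_EAST_ASIA)
print(f"pooled: {pooled:.4f}, unweighted: {simple:.4f}")
```

The weighted value rounds to 0.2671, the figure listed for North East Asia, whereas the unweighted mean is lower (about 0.251) because the two smallest populations carry the lowest frequencies.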
Table 4.
Population | rs148125094 N | rs148125094 Frequency | rs12329760 N | rs12329760 Frequency | rs201093031 N | rs201093031 Frequency |
---|---|---|---|---|---|---|
European | 77147 | 0.0014 | 76846 | 0.2549 | 77117 | 0.0000 |
Finnish | 12560 | 0.0016 | 12544 | 0.3725 | 12561 | 0.0000 |
Estonian | 2416 | 0.0060 | 2394 | 0.3074 | 2406 | 0.0000 |
Southern European | 5805 | 0.0013 | 5778 | 0.1748 | 5802 | 0.0000 |
North-western European | 25402 | 0.0012 | 25348 | 0.2212 | 25391 | 0.0000 |
Other non-Finnish European | 16562 | 0.0012 | 16439 | 0.2286 | 16557 | 0.0000 |
Swedish | 13067 | 0.0010 | 13013 | 0.2722 | 13066 | 0.0000 |
Bulgarian | 1335 | 0.0007 | 1330 | 0.1970 | 1334 | 0.0000 |
South Asian | 15308 | 0.0007 | 15298 | 0.2477 | 15303 | 0.0000 |
Latino | 17718 | 0.0003 | 17705 | 0.1533 | 17697 | 0.0000 |
African | 12480 | 0.0002 | 12448 | 0.2918 | 12480 | 0.0000 |
Ashkenazi Jewish | 5185 | 0.0000 | 5163 | 0.1352 | 5179 | 0.0000 |
East Asian | 9196 | 0.0000 | 9188 | 0.3810 | 9193 | 0.0024 |
Japanese | 76 | 0.0000 | 76 | 0.4013 | 76 | 0.0000 |
Korean | 1909 | 0.0000 | 1909 | 0.3675 | 1909 | 0.0018 |
Other East Asian | 7211 | 0.0000 | 7203 | 0.3844 | 7208 | 0.0026 |
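The contrast between East Asian and European frequencies of rs12329760 in Table 4 can be examined with a two-proportion z-test. This is a sketch, not the test used in the study: it assumes that the N column counts individuals (so each contributes two alleles), treats alleles as independent draws, and ignores population substructure. Only the Python standard library is used.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int):
    """Two-sided z-test for the difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# rs12329760 rows from Table 4; allele counts assume N is a number of individuals.
east_asian_alleles = 2 * 9188
european_alleles = 2 * 76846

z_score, p_value = two_proportion_z(0.3810, east_asian_alleles, 0.2549, european_alleles)
print(f"z = {z_score:.1f}, p = {p_value:.3g}")
```

Because both samples are very large, even a modest difference in frequency gives a large test statistic; the practical relevance of the difference rests on its magnitude (about 13 percentage points in Table 4) rather than on the p-value alone.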
In addition, we detected CNVs in two samples. In the first case, an increase in the number of copies covering the entire gene was found in a Karanogai individual. The second CNV, affecting exons 3–7, was found in a single Kumyk individual.
Thus, potentially functionally significant variants in the TMPRSS2 gene were found in many human populations with relatively high frequencies, demonstrating region-specific distribution patterns. Both variants, the probably damaging SNV and the heterozygous deletion of the gene, may significantly affect the interactions of this human serine protease with viral spike proteins, thereby changing the efficacy of the priming of viral proteins by TMPRSS2. However, the roles of the TMPRSS2 gene and its variants in interactions with SARS-CoV-2 and in viral infectivity still need to be elucidated.
3.5. Regulation of Expression of TMPRSS2
3.5.1. eQTLs
According to the GTEx Analysis V8 database, the TMPRSS2 gene contains 136 eQTLs (including 60 downregulating and 76 upregulating variants) that significantly alter its expression in lung tissues (Table 5). However, in general, these eQTLs have only minor effects on gene expression. The average slope of the regression line (the value that characterizes the strength of an eQTL’s effect) was around 0.09 both for down- and upregulating variants. The strongest single variant could change the gene’s expression by 13%.
Table 5.
Direction | N SNP | Average MAF | Average Slope |
---|---|---|---|
down | 60 | 0.3722896 | −0.09795966 |
up | 76 | 0.4537386 | 0.09709619 |
3.5.2. miRNAs
According to the miRTarBase and miRPathDB databases, no experimentally proven miRNAs regulating TMPRSS2 have been detected. It is worth noting that the TMPRSS2 and BSG genes have the same predicted regulatory miRNAs.
The top 30 miRNAs predicted to regulate TMPRSS2 and BSG were analyzed for enrichment in various cell types using the FANTOM5 database. None of the top miRNAs was enriched in lung or respiratory tract cells, but three miRNAs showed slight expression in immune and endothelial cells (Table 6).
Table 6.
miRNA | Cell Ontology |
---|---|
hsa-miR-4476 | B cell |
hsa-miR-5187-3p | myeloid leukocyte |
hsa-miR-5187-3p | hematopoietic cell |
hsa-miR-7849-3p | endothelial cell |
hsa-miR-7849-3p | blood vessel endothelial cell |
hsa-miR-7849-3p | endothelial cell of vascular tree |
hsa-miR-7849-3p | neutrophil |
3.5.3. Pharmacotranscriptomics of TMPRSS2
According to the DrugBank database, nine drugs can reduce the level of expression of TMPRSS2. For five of them (acetaminophen/paracetamol, curcumin, cyclosporine, and ethinylestradiol), this effect has been clinically proved (Table 7). Information on the direction of the effect of estradiol is conflicting; in different experiments it has been found to either downregulate or upregulate TMPRSS2 expression.
Table 7.
Drug | Drug Groups | Change | References |
---|---|---|---|
Acetaminophen | Approved | downregulated | 21420995 |
Acyline | Investigational | downregulated | 17510436 |
Stanolone | Illicit Investigational | downregulated | 12711008 |
Stanolone | Illicit Investigational | upregulated | 20601956, 23708653 |
Estradiol | Approved Investigational Vet Approved | downregulated | 24758408 |
Estradiol | Approved Investigational Vet Approved | upregulated | 19619570 |
Curcumin | Approved Experimental Investigational | downregulated | 18719366, 22258452 |
Cyclosporine | Approved Investigational Vet Approved | downregulated | 20106945 |
Calcitriol | Approved Nutraceutical | upregulated | 21592394, 26485663 |
Entinostat | Investigational | upregulated | 26272509 |
Ethinylestradiol | Approved | downregulated | 18936297 |
Genistein | Investigational | downregulated | 15378649, 26865667 |
Metribolone | Experimental | downregulated | 12711008 |
Metribolone | Experimental | upregulated | 17010675, 21440447 |
Resveratrol | Investigational | downregulated | 18586690 |
Selenium | Approved Investigational Vet Approved | upregulated | 19244175 |
Testosterone | Approved Investigational | upregulated | 21592394 |
Tretinoin | Approved Investigational Nutraceutical | upregulated | 23830798 |
Valproic acid | Approved Investigational | upregulated | 23179753, 24383497, 26272509 |
Zoledronic acid | Approved | upregulated | 24714768 |
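Several drugs in Table 7 are listed with both directions of effect. A small sketch of how such conflicting records can be flagged is given below; the records are transcribed from the Drug and Change columns of Table 7, and the names `TABLE7` and `conflicting_drugs` are illustrative.

```python
from collections import defaultdict

# (drug, reported change in TMPRSS2 expression) transcribed from Table 7.
TABLE7 = [
    ("Acetaminophen", "downregulated"),
    ("Acyline", "downregulated"),
    ("Stanolone", "downregulated"),
    ("Stanolone", "upregulated"),
    ("Estradiol", "downregulated"),
    ("Estradiol", "upregulated"),
    ("Curcumin", "downregulated"),
    ("Cyclosporine", "downregulated"),
    ("Calcitriol", "upregulated"),
    ("Entinostat", "upregulated"),
    ("Ethinylestradiol", "downregulated"),
    ("Genistein", "downregulated"),
    ("Metribolone", "downregulated"),
    ("Metribolone", "upregulated"),
    ("Resveratrol", "downregulated"),
    ("Selenium", "upregulated"),
    ("Testosterone", "upregulated"),
    ("Tretinoin", "upregulated"),
    ("Valproic acid", "upregulated"),
    ("Zoledronic acid", "upregulated"),
]


def conflicting_drugs(records):
    """Return drugs reported with more than one direction of effect, sorted by name."""
    directions = defaultdict(set)
    for drug, change in records:
        directions[drug].add(change)
    return sorted(drug for drug, changes in directions.items() if len(changes) > 1)


conflicts = conflicting_drugs(TABLE7)
print(conflicts)
```

For the table as transcribed, the flagged drugs are estradiol, metribolone, and stanolone, so any use of these entries as evidence of downregulation should take the opposing reports into account.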
Two drugs from the list above (acetaminophen/paracetamol and curcumin) have also been considered as possible therapies for COVID-19 [37]. Acetaminophen is currently being discussed as a possible drug for the management of fever in patients with COVID-19. The ability of this drug to reduce the level of expression of TMPRSS2 may be an additional argument in favor of its use, compared with non-steroidal anti-inflammatory drugs. Curcumin, a widely used food supplement, has the predicted ability to block the main protease (Mpro) of SARS-CoV-2 [38] and may be studied further in relation to COVID-19 therapy. For ACE2, DrugBank describes only one drug that can change its expression level: pentanal, which enhances it. According to this database, expression of BSG is affected by eight drugs, five of which can reduce the level of the protein (Table 8). One of them, valproic acid, also affects the expression of TMPRSS2, although Table 7 lists its effect on TMPRSS2 as upregulation.
Table 8.
Drug | Drug Groups | Change | References |
---|---|---|---|
Amiodarone | Approved Investigational | upregulated | 19774075 |
Arsenic trioxide | Approved Investigational | downregulated | 23232515 |
Estradiol | Approved Investigational Vet Approved | upregulated | 19167446 |
Methotrexate | Approved | downregulated | 25339124 |
Quercetin | Experimental Investigational | upregulated | 21632981 |
Isotretinoin | Approved | downregulated | 20436886 |
Silicon dioxide | Approved | downregulated | 25895662 |
Valproic acid | Approved Investigational | downregulated | 23179753 |
4. Conclusions
The TMPRSS2 protein plays a crucial part in the process of SARS-CoV-2 activation in human cells. The gene encoding this protease demonstrates a high level of genetic variability, as well as having many variants that may regulate its expression levels. Although very few of the potentially functionally significant variants of this gene are of relatively high frequency, population-specific patterns of TMPRSS2 variability may contribute to some extent to the different viral infectivity of SARS-CoV-2 in populations of various geographic origins.
TMPRSS2 is probably co-expressed with SARS-CoV-2 receptors (ACE2 and BSG), but only the TMPRSS2 protease demonstrates tissue-specific expression in alveolar cells, the target cell type of SARS-CoV-2. Thus, TMPRSS2 is potentially the most promising target for COVID-19 therapy, based on its specific expression in lung, its important role in the process of cell infection, and its interactions with other proteins involved in the infection process. Several well-studied drugs can downregulate the expression of TMPRSS2 in human cells, including acetaminophen and curcumin. Both deserve close attention as possible anti-COVID-19 drugs, owing to their confirmed effects on TMPRSS2 expression, as well as the long history of their use, their known side effects, and their wide availability.
Author Contributions
A.Z. and V.S. designed the study and wrote the manuscript. A.Z. and A.M. (Anton Markov) performed bioinformatics analyses. N.K., V.K., S.L., N.E., M.D., N.M., A.S., O.S., E.K., A.M. (Andrey Marusin), M.S., I.K. and M.R. collected and prepared samples. N.K. and V.K. genotyped samples. A.Z., V.S. and A.M. (Anton Markov) revised the methodologies and the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding
This work was partially supported by the Russian Foundation for Basic Research (project # 18-29-13045).
Institutional Review Board Statement
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of Research Institute for Medical Genetics, Tomsk National Medical Research Center (protocol code 6 date 12.06.2020).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
Publicly available datasets were analyzed in this study. These data can be found here: https://panglaodb.se/, https://gnomad.broadinstitute.org/, https://fantom.gsc.riken.jp/5/, https://gtexportal.org/home/datasets.
Conflicts of Interest
The authors declare no conflict of interest.
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Wang K., Chen W., Zhou Y.S., Lian J.Q., Zhang Z., Du P., Gong L., Zhang Y., Cui H.Y., Geng J.J., et al. SARS-CoV-2 invades host cells via a novel route: CD147-spike protein. bioRxiv. 2020. doi: 10.1101/2020.03.14.988345v1.
- 2. Hoffmann M., Kleine-Weber H., Schroeder S., Krüger N., Herrler T., Erichsen S., Schiergens T.S., Herrler G., Wu N.H., Nitsche A., et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020;181:271–280. doi: 10.1016/j.cell.2020.02.052.
- 3.Pettersson A., Graff R.E., Bauer S.R., Pitt M.J., Lis R.T., Stack E.C., Martin N.E., Kunz L., Penney K.L., Ligon A.H., et al. The TMPRSS2: ERG rearrangement, ERG expression, and prostate cancer outcomes: A cohort study and meta-analysis. Cancer Epidemiol. Biomarkers Prev. 2012;21:1497–1509. doi: 10.1158/1055-9965.EPI-12-0042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Saramäki O.R., Harjula A.E., Martikainen P.M., Vessella R.L., Tammela T.L., Visakorpi T. TMPRSS2: ERG fusion identifies a subgroup of prostate cancers with a favorable prognosis. Clin. Cancer Res. 2008;14:3395–3400. doi: 10.1158/1078-0432.CCR-07-2051. [DOI] [PubMed] [Google Scholar]
- 5.Shulla A., Heald-Sargent T., Subramanya G., Zhao J., Perlman S., Gallagher T. A transmembrane serine protease is linked to the severe acute respiratory syndrome coronavirus receptor and activates virus entry. J. Virol. 2011;85:873–882. doi: 10.1128/JVI.02062-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Glowacka I., Bertram S., Müller M.A., Allen P., Soilleux E., Pfefferle S., Steffen I., Tsegaye T.S., He Y., Gnirss K., et al. Evidence that TMPRSS2 activates the severe acute respiratory syndrome coronavirus spike protein for membrane fusion and reduces viral control by the humoral immune response. J. Virol. 2011;85:4122–4134. doi: 10.1128/JVI.02232-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Bertram S., Dijkman R., Habjan M., Heurich A., Gierer S., Glowacka I., Welsch K., Winkler M., Schneider H., Hofmann-Winkler H., et al. TMPRSS2 activates the human coronavirus 229E for cathepsin-independent host cell entry and is expressed in viral target cells in the respiratory epithelium. J. Virol. 2013;87:6150–6160. doi: 10.1128/JVI.03372-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Abe M., Tahara M., Sakai K., Yamaguchi H., Kanou K., Shirato K., Kawase M., Noda M., Kimura H., Matsuyama S., et al. TMPRSS2 is an activating protease for respiratory parainfluenza viruses. J. Virol. 2013;87:11930–11935. doi: 10.1128/JVI.01490-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Shirato K., Kawase M., Matsuyama S. Middle East respiratory syndrome coronavirus infection mediated by the transmembrane serine protease TMPRSS2. J. Virol. 2013;87:12552–12561. doi: 10.1128/JVI.01890-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Heurich A., Hofmann-Winkler H., Gierer S., Liepold T., Jahn O., Pöhlmann S. TMPRSS2 and ADAM17 cleave ACE2 differentially and only proteolysis by TMPRSS2 augments entry driven by the severe acute respiratory syndrome coronavirus spike protein. J. Virol. 2014;88:1293–1307. doi: 10.1128/JVI.02202-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11. Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P., et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv. 2019:531210. doi: 10.1101/531210.
- 12. Koike A., Nishida N., Yamashita D., Tokunaga K. Comparative analysis of copy number variation detection methods and database construction. BMC Genet. 2011;12:29. doi: 10.1186/1471-2156-12-29.
- 13. Wang K., Li M., Hadley D., Liu R., Glessner J., Grant S.F., Hakonarson H., Bucan M. PennCNV: An integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 2007;17:1665–1674. doi: 10.1101/gr.6861907.
- 14. Adzhubei I., Jordan D.M., Sunyaev S.R. Predicting functional effect of human missense mutations using PolyPhen-2. Curr. Protoc. Human Genet. 2013;76:7–20. doi: 10.1002/0471142905.hg0720s76.
- 15. Zuberi K., Franz M., Rodriguez H., Montojo J., Lopes C.T., Bader G.D., Morris Q. GeneMANIA prediction server 2013 update. Nucleic Acids Res. 2013;41:W115–W122. doi: 10.1093/nar/gkt533.
- 16. Szklarczyk D., Gable A.L., Lyon D., Junge A., Wyder S., Huerta-Cepas J., Simonovic M., Doncheva N.T., Morris J.H., Bork P., et al. STRING v11: Protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:D607–D613. doi: 10.1093/nar/gky1131.
- 17. Franzén O., Gan L.M., Björkegren J.L. PanglaoDB: A web server for exploration of mouse and human single-cell RNA sequencing data. Database. 2019;2019 doi: 10.1093/database/baz046.
- 18. Leinonen R., Sugawara H., Shumway M., International Nucleotide Sequence Database Collaboration. The sequence read archive. Nucleic Acids Res. 2010;39:D19–D21. doi: 10.1093/nar/gkq1019.
- 19. Stuart T., Butler A., Hoffman P., Hafemeister C., Papalexi E., Mauck W.M., III, Hao Y., Stoeckius M., Smibert P., Satija R. Comprehensive integration of single-cell data. Cell. 2019;177:1888–1902. doi: 10.1016/j.cell.2019.05.031.
- 20. Huang H.Y., Lin Y.C.D., Li J., Huang K.Y., Shrestha S., Hong H.C., Tang Y., Chen Y.G., Jin C.N., Yu Y., et al. miRTarBase 2020: Updates to the experimentally validated microRNA–target interaction database. Nucleic Acids Res. 2020;48:D148–D154. doi: 10.1093/nar/gkz896.
- 21. Kehl T., Kern F., Backes C., Fehlmann T., Stöckel D., Meese E., Lenhof H.P., Keller A. miRPathDB 2.0: A novel release of the miRNA Pathway Dictionary Database. Nucleic Acids Res. 2020;48:D142–D147. doi: 10.1093/nar/gkz1022.
- 22. De Rie D., Abugessaisa I., Alam T., Arner E., Arner P., Ashoor H., Åström G., Babina M., Bertin N., Burroughs A.M., et al. An integrated expression atlas of miRNAs and their promoters in human and mouse. Nat. Biotechnol. 2017;35:872. doi: 10.1038/nbt.3947.
- 23. Wishart D.S., Feunang Y.D., Guo A.C., Lo E.J., Marcu A., Grant J.R., Sajed T., Johnson D., Li C., Sayeeda Z., et al. DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Res. 2018;46:D1074–D1082. doi: 10.1093/nar/gkx1037.
- 24. Manoharan C., Wilson M.C., Sessions R.B., Halestrap A.P. The role of charged residues in the transmembrane helices of monocarboxylate transporter 1 and its ancillary protein basigin in determining plasma membrane expression and catalytic activity. Mol. Membr. Biol. 2006;23:486–498. doi: 10.1080/09687860600841967.
- 25. Castorino J.J., Gallagher-Colombo S.M., Levin A.V., FitzGerald P.G., Polishook J., Kloeckener-Gruissem B., Ostertag E., Philp N.J. Juvenile cataract-associated mutation of solute carrier SLC16A12 impairs trafficking of the protein to the plasma membrane. Invest. Ophthalmol. Vis. Sci. 2011;52:6774–6784. doi: 10.1167/iovs.10-6579.
- 26. Rusu V., Hoch E., Mercader J.M., Tenen D.E., Gymrek M., Hartigan C.R., DeRan M., von Grotthuss M., Fontanillas P., Spooner A., et al. Type 2 diabetes variants disrupt function of SLC16A11 through two distinct mechanisms. Cell. 2017;170:199–212. doi: 10.1016/j.cell.2017.06.011.
- 27. Huang Q., Li J., Xing J., Li W., Li H., Ke X., Zhang J., Ren T., Shang Y., Yang H., et al. CD147 promotes reprogramming of glucose metabolism and cell proliferation in HCC cells by inhibiting the p53-dependent signaling pathway. J. Hepatol. 2014;61:859–866. doi: 10.1016/j.jhep.2014.04.035.
- 28. Zhao P., Zhang W., Wang S.J., Yu X.L., Tang J., Huang W., Li Y., Cui H.Y., Guo Y.S., Tavernier J., et al. HAb18G/CD147 promotes cell motility by regulating annexin II-activated RhoA and Rac1 signaling pathways in hepatocellular carcinoma cells. Hepatology. 2011;54:2012–2024. doi: 10.1002/hep.24592.
- 29. Lu M., Wu J., Hao Z.W., Shang Y.K., Xu J., Nan G., Li X., Chen Z.N., Bian H. Basolateral CD147 induces hepatocyte polarity loss by E-cadherin ubiquitination and degradation in hepatocellular carcinoma progress. Hepatology. 2018;68:317–332. doi: 10.1002/hep.29798.
- 30. Zhang M.Y., Zhang Y., Wu X.D., Zhang K., Lin P., Bian H.J., Qin M.M., Huang W., Wei D., Zhang Z., et al. Disrupting CD147-RAP2 interaction abrogates erythrocyte invasion by Plasmodium falciparum. Blood. 2018;131:1111–1121. doi: 10.1182/blood-2017-08-802918.
- 31. Castro A.P.V., Carvalho T.M., Moussatché N., Damaso C.R. Redistribution of cyclophilin A to viral factories during vaccinia virus infection and its incorporation into mature particles. J. Virol. 2003;77:9052–9068. doi: 10.1128/JVI.77.16.9052-9068.2003.
- 32. Chen Z., Mi L., Xu J., Yu J., Wang X., Jiang J., Xing J., Shang P., Qian A., Li Y., et al. Function of HAb18G/CD147 in invasion of host cells by severe acute respiratory syndrome coronavirus. J. Infect. Dis. 2005;191:755–760. doi: 10.1086/427811.
- 33. Bian H., Zheng Z.H., Wei D., Zhang Z., Kang W.Z., Hao C.Q., Dong K., Kang W., Xia J.L., Miao J.L., et al. Meplazumab treats COVID-19 pneumonia: An open-labelled, concurrent controlled add-on clinical trial. medRxiv. 2020 doi: 10.1101/2020.03.21.20040691.
- 34. Bhanushali A., Rao P., Raman V., Kokate P., Ambekar A., Mandva S., Bhatia S., Das B. Status of TMPRSS2–ERG fusion in prostate cancer patients from India: Correlation with clinico-pathological details and TMPRSS2 Met160Val polymorphism. Prostate Int. 2018;6:145–150. doi: 10.1016/j.prnil.2018.03.004.
- 35. FitzGerald L.M., Agalliu I., Johnson K., Miller M.A., Kwon E.M., Hurtado-Coll A., Fazli L., Rajput A.B., Gleave M.E., Cox M.E., et al. Association of TMPRSS2-ERG gene fusion with clinical characteristics and outcomes: Results from a population-based study of prostate cancer. BMC Cancer. 2008;8:230. doi: 10.1186/1471-2407-8-230.
- 36. Paniri A., Hosseini M.M., Akhavan-Niaki H. First comprehensive computational analysis of functional consequences of TMPRSS2 SNPs in susceptibility to SARS-CoV-2 among different populations. J. Biomol. Struct. Dyn. 2020:1–18. doi: 10.1080/07391102.2020.1767690.
- 37. Alhazzani W., Møller M.H., Arabi Y.M., Loeb M., Gong M.N., Fan E., Oczkowski S., Levy M.M., Derde L., Dzierba A., et al. Surviving Sepsis Campaign: Guidelines on the management of critically ill adults with Coronavirus Disease 2019 (COVID-19). Intensive Care Med. 2020:1–34. doi: 10.1097/CCM.0000000000004363.
- 38. Khaerunnisa S., Kurniawan H., Awaluddin R., Suhartati S., Soetjipto S. Potential inhibitor of COVID-19 main protease (Mpro) from several medicinal plant compounds by molecular docking study. Preprints. 2020 20944:1–14. Available online: https://www.preprints.org/manuscript/202003.0226/v1 (accessed on 10 June 2020).