Curr Opin Virol. 2023 Jan 18;58:101303. doi: 10.1016/j.coviro.2023.101303

The role of influenza-A virus and coronavirus viral glycoprotein cleavage in host adaptation

Miriam R Heindl 1, Eva Böttcher-Friebertshäuser 1
PMCID: PMC9847222  PMID: 36753938

Abstract

While receptor binding is well recognized as a factor in influenza-A virus (IAV) and coronavirus (CoV) host adaptation, the role of viral glycoprotein cleavage has not been studied in detail so far. Interestingly, recent studies suggest that host species may differ in their protease repertoire available for cleavage. Furthermore, it was shown for certain bat-derived CoVs that proteolytic activation provides a critical barrier to infect human cells. Understanding the role of glycoprotein cleavage in different species and how IAV and CoVs adapt to a new protease repertoire may allow evaluating the zoonotic potential and risk posed by these viruses. Here, we summarize the current knowledge on the emergence of a multibasic cleavage site (CS) in the glycoproteins of IAVs and CoVs in different host species. Additionally, we discuss the role of transmembrane serine protease 2 (TMPRSS2) in virus activation and entry and a role of neuropilin-1 in acquisition of a multibasic CS in different hosts.


Current Opinion in Virology 2023, 58:101303

This review comes from a themed issue on Adaptation of viruses to new host

Edited by Silke Stertz and Xander de Haan

For a complete overview of the section, refer to “Adaptation of viruses to new host (2023)”

https://doi.org/10.1016/j.coviro.2023.101303

1879–6257/© 2023 Elsevier B.V. All rights reserved.

Introduction

Influenza-A viruses (IAVs) and coronaviruses (CoVs) are pathogens with a broad host spectrum (Figure 1b, c) and a significant potential for zoonotic transmission. Both IAVs and CoVs are enveloped viruses and possess the major surface glycoproteins hemagglutinin (HA) and spike (S), respectively, that initiate infection by facilitating receptor binding and fusion of viral and cellular membranes 1, 2. HA and S are class-I viral fusion proteins and are synthesized as inactive precursors that must be cleaved post-translationally by a host cell protease to gain their fusion capacity. Cleavage exposes the fusion peptide (FP) and is essential for virus infectivity 2, 3, 4. The IAV HA has to be cleaved at one cleavage site (CS) to be primed for membrane fusion (Figure 1a). Proteolytic cleavage of CoV S has long been a puzzling question, but it is now appreciated that CoV S must be proteolytically primed to mediate membrane fusion and that several (if not all) CoV S proteins require sequential processing at two sites, S1/S2 and S2′ (Figure 1a, c) 2, 5, 6.

Figure 1.

Cleavage of IAV and CoV fusion proteins and CS motifs common in different host species. (a) IAV HA and CoV S protein are synthesized as precursors and need to be cleaved by host cell proteases to gain their fusion capacity. HA is cleaved at one distinct site immediately upstream of the FP, while S needs to be processed successively at two sites, S1/S2 and S2′, to expose the FP. RBD: receptor-binding domain, TMD: transmembrane domain. (b) Wild aquatic birds are the natural reservoir of IAV subtypes H1–H16 and provide a source of zoonotic transmission to a broad range of avian and mammalian hosts. Genome sequences of subtypes H17 and H18 have only been found in bats. LPAIV and mammalian IAVs possess a monobasic HA CS (R↓). LPAIV can convert into HPAIV via acquisition of a multibasic CS (RXR/KR↓) in poultry. (c) CoV are classified into four genera: alpha, beta, gamma, and delta CoV [75]. Beta-CoV are further divided into four lineages (A–D). CoVs are found in diverse mammalian and avian animal species. Bat and rodent CoV are suggested to serve as sources of alpha- and beta-CoVs, while wild bird CoV are sources of gamma and delta CoVs. The table compares the S1/S2 and S2′ CS motifs of human-pathogenic CoV originating from bats or rodents via potential intermediate hosts with different animal-derived CoV. MHV: murine hepatitis virus.

Single arginine or lysine residues designated as monobasic CS (R/K↓) are cleaved by trypsin-like proteases. Most IAVs, including low pathogenic avian influenza-A viruses (LPAIVs) and human IAVs, possess a monobasic CS. The type-II transmembrane serine protease 2 (TMPRSS2) has been identified as major IAV-activating protease in human airway cells [7]. In contrast, multibasic CSs of the consensus sequence R–X–R/K–R↓ are processed by ubiquitously expressed furin and related proprotein convertases. Highly pathogenic avian influenza-A viruses (HPAIVs) are activated at a multibasic CS, supporting systemic spread in poultry 8, 9.
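The two motif definitions above can be expressed as a simple pattern match. The following is a minimal Python sketch, not a predictor of actual cleavage: it classifies the residues ending at P1 (the residue immediately upstream of the scissile bond) as multibasic if they fit the strict R–X–R/K–R consensus, or monobasic if only P1 is R or K. The function name and the example tetrapeptides are illustrative assumptions, and real protease susceptibility also depends on structural context that a sequence pattern cannot capture.

```python
import re

# Strict furin consensus quoted in the text: R-X-R/K-R at positions P4-P1.
MULTIBASIC = re.compile(r"R.[RK]R$")
# Monobasic site: a single arginine or lysine at P1.
MONOBASIC = re.compile(r"[RK]$")


def classify_cleavage_site(upstream: str) -> str:
    """Classify residues ending at P1 (one-letter code, N- to C-terminal)."""
    seq = upstream.upper()
    if MULTIBASIC.search(seq):
        return "multibasic"
    if MONOBASIC.search(seq):
        return "monobasic"
    return "none"


if __name__ == "__main__":
    # Hypothetical example peptides, listed P4 to P1.
    for site in ("RKKR", "RSSR", "AAAA"):
        print(site, classify_cleavage_site(site))
```

Because the pattern enforces a basic residue at P2, it is deliberately conservative: a motif such as R–R–A–R, with alanine at P2, does not match the strict consensus and would need a relaxed R–X–X–R pattern to be flagged.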

Priming of CoV S is more complex than that of IAV HA but also more flexible, as two CSs appear to offer more possibilities for additional proteases. CoVs show a high variety in CS motifs with different combinations of mono-, di-, and multibasic motifs at the S1/S2 and S2′ sites (Figure 1c) [2]. It is believed that monobasic (and most likely dibasic) motifs are cleaved by TMPRSS2 and related proteases, whereas multibasic CSs are processed by furin. In cell cultures lacking expression of TMPRSS2 and other appropriate trypsin-like proteases, however, CoVs can use an alternative entry route via the late endosome, where S is cleaved by endosomal cathepsins 10, 11. Additionally, metalloproteases, including ADAM10, ADAM17, MMP-2, and MMP-9, have been shown to activate severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) S in vitro 11, 12, 13, 14.

Until recently, little attention was paid to the role of viral glycoprotein cleavage in adaptation of IAVs and CoVs to a new host. In general, it is assumed that orthologous proteases support virus activation in different host species and, therefore, no adaptation is necessary. TMPRSS2 from chicken, duck, swine, and nonhuman primates has been shown to support proteolytic activation of IAV in vitro 15•, 16, 17, 18. However, expression levels and tissue distribution as well as substrate specificity of TMPRSS2 have not been investigated in these species and its role in IAV activation remains to be demonstrated. Even less is known about furin orthologs in different species. It is believed that the important physiological role of furin leads to a highly conserved expression and activity in vertebrates and, therefore, also in virus activation. Studies in fruit bat cells, however, revealed that differences in subcellular localization or activity of furin orthologs may exist [19]. With the emergence of SARS-CoV-2, it has become more widely recognized that viral glycoprotein cleavage may play a role in CoV transmission and adaptation to new host species, primarily because SARS-CoV-2 has acquired a multibasic S1/S2 CS that is not found in the closely related SARS-CoV and BatRaTG13 (Figure 1c) 20, 21.

In this review, we focus on the current knowledge on the role of TMPRSS2 in IAV and CoV activation and beyond, the prevalence and emergence of a multibasic CS in IAV and CoVs in different host species, and neuropilin-1 (NRP1) as a host factor that may be involved in the emergence of a multibasic CS and tissue spreading.

Transmembrane serine protease 2: virus activation and beyond

TMPRSS2 was identified as an IAV HA-cleaving protease in 2006 and since then has been shown to activate the fusion proteins of a broad range of respiratory viruses, including influenza-B virus, human metapneumovirus, and various CoVs at monobasic CSs [4]. Lack of TMPRSS2 expression prevents multicycle replication and pathogenesis of IAVs as well as SARS-CoV, Middle East respiratory syndrome coronavirus (MERS-CoV), and SARS-CoV-2 in mice 22, 23, 24, 25. This demonstrates that TMPRSS2 is a promising drug target for therapeutic treatment [26]. Interestingly, multicycle replication of a chimeric MERS-CoV bearing the S protein of a MERS-like virus from Ugandan bats required addition of exogenous trypsin to human Caco-2 cells, whereas MERS-CoV replicated independently of trypsin in these cells, most likely due to S cleavage by TMPRSS2. The data indicated that S cleavage may be a critical barrier that needs to be overcome by CoVs to infect a new host [27].

Recent studies suggest that human and murine lung may differ in their protease repertoire available for HA cleavage. Activation of certain human H3N2 IAVs was independent of TMPRSS2 expression in mice, whereas TMPRSS2 was crucial for virus activation in primary human airway cells 6, 23, 24, 28. This may be due to expression of a larger number of appropriate proteases in murine lung and hence cleavage of H3 by a mouse-specific protease. The mouse degradome (complete set of proteases present in an organism) consists of more protease genes (672) compared with human (588) and chicken (460) degradomes [29]. Furthermore, differences in the substrate specificity of orthologous proteases from human and mouse have been described and might contribute to the observed differences in H3 activation 28, 30. Even though mice are not a natural host for IAV, the broader repertoire of trypsin-like proteases could play a role in CoV evolution in rodents.

TMPRSS2 was identified as major activating protease of LPAIV HA of almost all subtypes in human airway cells, suggesting that the transmission of IAV from avian species to humans does not require adaptation to a new protease repertoire [15]. Interestingly, infection studies in murine lung explants showed that the protease dependency can differ between human- and avian-derived HAs. While a virus with human-derived H3 replicated in lung explants of both TMPRSS2-deficient mice and wild-type animals, a virus containing a duck-derived H3 was not able to replicate in lung explants lacking TMPRSS2 expression [15]. However, this has not been analyzed for other HA subtypes. Further investigations are needed to conclude whether avian-derived HAs rely more on TMPRSS2 in proteolytic activation compared with human-derived HAs and if so, whether this affects avian IAV adaptation to new host species.

Some IAV HA subtypes such as H3, H4, LPAIV H5, H9, and H14 possess a basic amino acid in position P4 of the CS motif (K/R–X–X–R↓) that may facilitate activation by additional proteases. The type-II transmembrane serine protease matriptase/ST14, which is broadly expressed among epithelial tissues, preferentially cleaves substrates at an R–X–X–R motif 31, 32. Accordingly, H9 with an R–S–S–R but not a V–S–S–R CS was cleaved by matriptase in addition to TMPRSS2 in vitro, and virus activation by matriptase was associated with replication of H9N2 IAV in primary chicken embryo kidney cells [31].
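The distinction between the two H9 cleavage sites comes down to a single position. As a minimal sketch (the helper name is a hypothetical choice, and a basic P4 residue is only a necessary feature of the K/R–X–X–R motif, not proof of matriptase cleavage), the check can be written in Python as:

```python
def has_basic_p4(upstream: str) -> bool:
    """Return True if the residues ending at P1 fit K/R-X-X-R.

    `upstream` is a one-letter amino acid string whose last residue is P1.
    """
    seq = upstream.upper()
    # Need at least four residues to inspect P4; P1 must be arginine.
    return len(seq) >= 4 and seq[-4] in "KR" and seq[-1] == "R"


if __name__ == "__main__":
    # The two H9 motifs named in the text.
    for motif in ("RSSR", "VSSR"):
        print(motif, has_basic_p4(motif))
```

Here R–S–S–R satisfies the check and V–S–S–R does not, mirroring the reported difference in matriptase cleavage.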

Cleavage of IAV HA by TMPRSS2 takes place intracellularly before release of progeny virus from the infected cell [33]. In contrast, priming of CoV S by TMPRSS2 occurs upon virus entry and is believed to facilitate virus-membrane fusion at or close to the plasma membrane (early entry) 34, 35. In the absence of TMPRSS2, SARS-CoV-2 is taken up by endocytosis and entry occurs via fusion in late endosomes upon S2′ site cleavage by cathepsins (late entry) 10, 11. Using the early entry route may allow CoVs to avoid endosomal restriction by interferon-induced transmembrane (IFITM) proteins, which block viral membrane fusion by preventing hemifusion 36, 37. However, both antiviral and proviral effects have been described for IFITMs in CoV replication 37, 38, 39, 40•. Interestingly, TMPRSS2 has been shown to be beneficial for the virus in either way. Thus, in addition to avoiding endosomal restriction by IFITMs, TMPRSS2 expression was shown to switch IFITM3 activities at the plasma membrane toward enhancement of SARS-CoV-2 infection via yet unknown mechanisms [40].

Rather unexpectedly, the SARS-CoV-2 Omicron variants BA.1 and BA.2 were found to be less efficiently activated by TMPRSS2 compared with other variants in vitro and seem to favor an endosomal entry via cathepsins 41•, 42, 43. Altered TMPRSS2 usage is believed to contribute to the change in tissue tropism of Omicron, which replicates well in human nasal cells but shows significantly less replication in human lung cells compared with the Delta variant 41•, 42. Replication of Omicron BA.1 in nasal and lung cells of TMPRSS2-knockout mice, however, was significantly reduced compared with wild-type animals, indicating that TMPRSS2 is involved in virus activation in mice [44]. Whether or not there are discrepancies in utilizing TMPRSS2 for Omicron S priming in humans and mice remains open. Interestingly, Omicron lineage BA.5 shows increased TMPRSS2 usage when compared with BA.1 and BA.2 [45]. Thus, one may speculate that switching from the TMPRSS2-facilitated entry to a cathepsin-dependent entry route was disadvantageous (with respect to efficient S priming or avoiding restriction by IFITMs or both) and that ongoing evolution of Omicron shows a reversion to TMPRSS2 usage for S priming.

Origin of the multibasic cleavage site in coronaviruses

A multibasic CS is not necessarily linked to enhanced pathogenicity. Two out of four human coronaviruses (HCoVs) associated with common cold contain a multibasic CS at the S1/S2 junction, while the other two HCoVs, but also SARS-CoV that causes severe disease, possess two monobasic CSs (Figure 1c). Feline enteric coronaviruses (FECVs) possess a multibasic S1/S2 CS and can chronically infect cats for long periods. Feline infectious peritonitis (FIP) viruses that cause deadly FIP arise from FECVs by mutation. Interestingly, among other mutations, amino acid substitutions at the S1/S2 CS that modify the furin cleavage motif are observed upon development of FIP [46]. The avian infectious bronchitis virus (IBV) harbors a multibasic S1/S2 CS and some strains possess a tribasic S2′ CS motif. IBV causes a highly contagious respiratory disease in chickens. Mortality is usually low, but can vary depending on the strain [47].

The fact that SARS-CoV-2 acquired a multibasic S1/S2 CS due to insertion of four amino acids started a debate on whether the emergence of a furin-cleavable CS is specific to particular host species. Bats and rodents are a major natural reservoir of CoV and a source of zoonotic transmission to other host species. While a multibasic S1/S2 CS is common in rodent-derived CoVs (78%), it is rare in bat-derived CoVs (6%) [48]. SARS-CoV-2 probably originated from bats via a potential, yet unknown, intermediate host and probably involving recombination events between different CoVs 25, 38, 49. Thus, its multibasic S1/S2 CS is rather an exception for a bat-derived CoV.

Importantly, the multibasic CS of SARS-CoV-2 was identified as a critical determinant of virus transmission in ferrets. Loss of the multibasic S1/S2 CS attenuated SARS-CoV-2 and prevented transmission [50]. In agreement with that, deletions of the S1/S2 CS have been shown to arise naturally only at very low levels, and all variants of concern have retained the R–R–A–R motif at the S1/S2 junction 26, 50••. The underlying molecular mechanism remains to be determined. S1/S2 cleavage by furin occurs during egress and may allow for near-complete cleavage and thus efficient receptor binding upon infection of a new host cell. Notably, S1/S2 cleavage is required for angiotensin-converting enzyme 2 (ACE2) binding [51]. ACE2 binding, in turn, enables exposure of the S2′ site and subsequent cleavage by TMPRSS2 [52]. Of note, the P–R–R–A–R↓ motif is suboptimal for furin cleavage due to alanine in P2 position. Several SARS-CoV-2 variants acquired mutations (P681R/H) at P5 predicted to enhance furin cleavage 50••, 53. Whether this represents ongoing adaptation to humans warrants further investigation.

Interestingly, serial infection of Vero cells leads to a loss of the multibasic S1/S2 CS already at low passage numbers, indicating that the multibasic CS has a selective disadvantage for the virus in Vero cells. Loss of the multibasic CS is associated with lack of TMPRSS2 expression. Although it appears illogical that absence of the protease that cleaves at the S2′ site results in mutations at the S1/S2 site, a recent study suggested that it might support more efficient cleavage of the S1/S2 site by cathepsins upon endosomal uptake of the virus 54, 55. Intriguingly, passaging of the Vero-adapted SARS-CoV-2 with mutated S1/S2 CS resulted in prompt reversion to the original multibasic sequence in human Calu-3 airway cells or Caco-2 colon carcinoma cells both expressing TMPRSS2 [55]. The data indicate that SARS-CoV-2 is not critically dependent on TMPRSS2 and furin but able to adapt to another host protease repertoire. Nevertheless, furin and TMPRSS2 seem to be the best option in human cells.

The role of cathepsins in CoV S activation in natural infections is still under debate. Although cathepsins do not seem to be crucial for CoV activation in human cells, they may play an important role in virus activation in other host species. Interestingly, a number of bat-derived viruses (e.g. Hendra virus, Nipah virus, and Ebola virus) utilize endosomal cathepsins during their replication cycle 56, 57. A cathepsin-L orthologous protease has been shown to support Hendra virus F-protein activation in fruit bat cells [19]. Additionally, furin-like proteases were described in these cells and shown to support proteolytic activation of parainfluenza virus 5. However, differences in the response of fruit bat cells versus Vero cells to a potent furin inhibitor indicated that subtle differences in subcellular localization or activity of furin may exist between different mammalian species [19]. Whether expression or activity of cathepsins and furin orthologs drives the evolution of specific CS motifs in bats remains unknown, but should be investigated in future studies.

Neuropilin-1 and the multibasic cleavage site

Some viruses that require proteolytic activation of their envelope proteins by furin such as the herpesvirus Epstein–Barr virus or human T-cell lymphotropic virus 1 (HTLV-1) and HTLV-2 have been shown to utilize NRP1 as an additional entry factor [58]. NRP1 is a cell surface receptor with disseminated expression that plays important roles in growth factor signaling, vascular angiogenesis, axonal guidance, and immune function 59, 60. Intriguingly, NRP1 binds peptides with a multibasic sequence (R/K–X–X–R/K) at the C-terminal end (C-end rule) and executes cellular uptake via a mechanism similar to macropinocytosis 59, 61, 62. The C-terminus of the cleaved S1 subunit of SARS-CoV-2 also conforms to the C-end rule and can therefore interact with NRP1 (Figure 2a). Correspondingly, NRP1 was shown to enhance entry of SARS-CoV-2 into human cells but not that of a mutant virus lacking the multibasic CS 63•, 64•. Virus entry via NRP1 may compensate for the relatively low expression levels of ACE2 in the human respiratory tract and may facilitate infection of multiple organs and tissues, including neurons and endothelial cells 64•, 65•, 66. NRP1, not ACE2, was shown to mediate astrocyte infection by SARS-CoV-2 in brain organoids [65]. Additionally, binding of S1 to NRP1 was predicted to stimulate separation of S1 and S2 and thereby may increase virus infectivity [67]. Analysis of cells isolated from bronchoalveolar lavage fluid of COVID-19 patients showed that NRP1 was upregulated in SARS-CoV-2-infected cells [64]. Mutations at P5 and P7 position of the S1/S2 CS of SARS-CoV-2 variants, namely P681H (Alpha), P681R (Delta), or N679K + P681H (Omicron), enhance the overall basicity of the C-terminus of S1 and may affect NRP1 binding. Based on an in silico molecular docking analysis, Omicron S but not Delta S shows increased binding to NRP1 in comparison to Wuhan S [68].
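The C-end rule and the effect of the P5 substitutions can be illustrated with a short Python sketch. It takes the P5 to P1 pentapeptide P–R–R–A–R quoted in the text as the C-terminus of cleaved S1, applies the P681R and P681H substitutions at P5, and reports whether the terminus fits R/K–X–X–R/K together with a crude count of basic residues. The function names are hypothetical, counting histidine as basic is a simplifying assumption (its charge depends on pH), and the count is in no way a measure of NRP1 binding affinity.

```python
import re

# C-end rule as stated in the text: R/K-X-X-R/K at the very C-terminus.
CEND_RULE = re.compile(r"[RK]..[RK]$")


def conforms_to_cend_rule(c_terminus: str) -> bool:
    """True if the last four residues fit R/K-X-X-R/K."""
    return CEND_RULE.search(c_terminus.upper()) is not None


def basic_residue_count(peptide: str, basic: str = "RKH") -> int:
    """Count basic residues; treating H as basic is an assumption."""
    return sum(1 for aa in peptide.upper() if aa in basic)


if __name__ == "__main__":
    # P5-P1 of the S1/S2 site; the variants change only the P5 residue.
    s1_c_termini = {
        "reference (P at P5)": "PRRAR",
        "P681R": "RRRAR",
        "P681H": "HRRAR",
    }
    for name, peptide in s1_c_termini.items():
        print(name, conforms_to_cend_rule(peptide), basic_residue_count(peptide))
```

All three termini satisfy the C-end rule, since the rule only inspects P4 to P1; the substitutions at P5 change only the basic residue count, which fits the text's statement that they raise overall basicity rather than create the motif.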

Figure 2.

NRP1 acts as an alternative entry receptor for SARS-CoV-2 and HPAIV. (a) Both SARS-CoV and SARS-CoV-2 utilize ACE2 as entry receptor. Following S1/S2 cleavage, the multibasic motif at the C-terminus of S1 conforms to the C-end rule and is bound by NRP1 and taken up by the cell. Utilizing NRP1 as additional receptor may enhance entry into human respiratory cells and expand the tissue tropism of SARS-CoV-2. The monobasic motif of the C-terminus of SARS-CoV S1 does not facilitate binding to NRP1. (b) The multibasic motif at the C-terminus of HA1 of a HPAIV provides a NRP1 substrate, whereas HA1 of a LPAIV does not. The higher expression level of the natural NRP1 ligand Sema3a in duck-derived endothelial cells may block binding of HPAIV HA1 to NRP1 and prevent viral uptake, while lower endogenous expression of Sema3a in chicken endothelial cells does not compete with HPAIV HA binding to NRP1 and therefore promotes virus entry. Efficient uptake of HPAIV HA in chicken endothelial cells might support HPAIV genesis in poultry (L.E. Steele et al., abstract The Eighth ESWI Influenza Conference, Virtual Edition, 4–7 December 2021).

HA cleavage has long been recognized as the prime determinant of avian IAV pathogenicity in poultry. HPAIVs emerge from LPAIVs by acquisition of a multibasic CS due to insertion of multiple basic amino acids at the CS. Natural HPAIVs are restricted to subtypes H5 and H7 69, 70. Proteolytic activation by furin supports systemic spread of infection with often-fatal outcome. Replication of LPAIVs, on the other hand, is confined to epithelial cells of the respiratory and intestinal tract due to the restricted expression of appropriate trypsin-like proteases. HPAIVs mostly infect endothelial cells in chickens, whereas they are still epitheliotropic in ducks [71]. It is generally assumed that the restricted expression of trypsin-like HA-cleaving proteases serves as primary positive selection pressure for LPAIV to HPAIV conversion. Interestingly, HPAIV are rarely isolated from wild aquatic birds and are believed to emerge after introduction into poultry 70, 72, indicating that a multibasic HA CS provides an advantage in poultry that is not relevant in waterfowl. The C-terminus of the cleaved HA1 subunit of HPAIV also qualifies as a NRP1 substrate, while HA1 of LPAIV does not (Figure 2b). A recent study by Steele and Short suggests that NRP1 might be a trigger for HPAIV genesis in poultry (L.E. Steele et al., abstract The Eighth ESWI Influenza Conference, Virtual Edition, 4–7 December 2021). The data by Steele et al. indicate that a higher endogenous expression level of the natural NRP1 ligand semaphorin 3a (Sema3a) in duck versus chicken endothelial cells blocks binding and uptake of cleaved HA in duck cells, whereas HA is taken up efficiently in chicken-derived endothelial cells (Figure 2b). Thus, efficient binding of HPAIV HA1 to NRP1 on endothelial cells supporting systemic spread of infection occurs only in chicken but not in duck. This might provide the positive selection pressure for LPAIV to HPAIV conversion in chicken.

In contrast to chickens, vascular tropism is not a prominent feature of HPAIV pathogenesis in mammals including humans 71, 73. It will thus be of interest to investigate NRP1-dependent uptake of HPAIV HA into mammalian vascular endothelial cells. Notably, carboxypeptidase B has been shown to remove basic amino acids from the C-terminus of HA1 upon cleavage of HA0 into HA1 and HA2 7, 74. Carboxypeptidase B also eliminates the multibasic amino acid motif of the HA1 subunit of fowl plague virus, although not quantitatively, and therefore might interfere with recognition of HA1 as a NRP1 substrate. However, the biological significance of carboxypeptidase trimming of HA1 (and cleaved envelope proteins of other viruses) remains to be investigated in more detail.

Overall, these findings underline that a multibasic CS may be beneficial not only in facilitating virus activation by ubiquitously expressed furin, but also in promoting virus entry into a wide range of tissues and vascular spreading by using NRP1.

Concluding remarks

There are many open questions about the evolution and benefits of distinct CS motifs and protease usage in different IAV and CoV host species. Recent studies suggest that host differences may exist in the virus-activating protease repertoire. Whether these variations play a role in virus activation and provide a barrier that needs to be overcome by IAV and CoV in host adaptation remains to be further investigated. Importantly, viral glycoprotein cleavage has been recognized as a factor that is not only essential for virus infectivity but furthermore may promote virus infection via facilitating binding to additional receptors or by allowing the virus to avoid cellular restriction factors.

Understanding the role of glycoprotein cleavage and the proteases involved in different host species may allow evaluation of the zoonotic potential and risk posed by IAVs and CoVs. Much attention is devoted to bats and rodents in determining the role of viral glycoprotein processing in host adaptation of zoonotic viruses, particularly CoVs. However, bats and rodents harbor a number of other viruses, and comparative analyses of proteolytic activation of different viruses in these hosts might unravel basic mechanisms that drive the evolution of distinct CSs. Moreover, birds are an important host reservoir for IAV and CoV and should be included in future studies.

Conflict of interest statement

The authors declare no conflicts of interest.

Acknowledgements

This work was supported by grants from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) SFB 1021 (project B07) and by the LOEWE Center DRUID (project D1).

Data Availability

No data were used for the research described in the article.

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • of special interest

  •• of outstanding interest



Articles from Current Opinion in Virology are provided here courtesy of Elsevier
