Author manuscript; available in PMC: 2022 Dec 9.
Published in final edited form as: Chem. 2021 Oct 22;7(12):3393–3411. doi: 10.1016/j.chempr.2021.09.015

Mucin-mimetic glycan arrays integrating machine learning for analyzing receptor pattern recognition by influenza A viruses

Taryn M Lucas 1, Chitrak Gupta 2,3, Meghan O Altman 4, Emi Sanchez 1, Matthew R Naticchia 1, Pascal Gagneux 4,5, Abhishek Singharoy 2,3, Kamil Godula 1,5,*
PMCID: PMC8726012  NIHMSID: NIHMS1750944  PMID: 34993358

SUMMARY

Influenza A viruses (IAVs) exploit host glycans in airway mucosa for entry and infection. Detection of changes in IAV glycan-binding phenotype can provide early indication of transmissibility and infection potential. While zoonotic viruses are monitored for mutations, the influence of host glycan presentation on viral specificity remains obscured. Here, we describe an array platform which uses synthetic mimetics of mucin glycoproteins to model how receptor presentation and density in the mucinous glycocalyx may impact IAV recognition. H1N1 and H3N2 binding in arrays of α2,3- and α2,6-sialyllactose receptors confirmed their known sialic acid-binding specificities and revealed their different sensitivities to receptor presentation. Further, the transition of H1N1 from avian to mammalian cell culture improved the ability of the virus to recognize mucin-like displays of α2,6-sialic acid receptors. Support vector machine (SVM) learning efficiently characterized this shift in binding preference and may prove useful to study viral evolution to a new host.

Keywords: Influenza A, hemagglutinin, mucin, glycan array, receptor pattern, machine learning

eTOC blurb:

Glycan array comprising synthetic mucin-like displays of sialoglycans enables profiling of Influenza A virus interactions based on receptor structure and presentation. Machine learning can detect and predict changes in receptor pattern binding behavior of the viruses.

INTRODUCTION

The periodic emergence of new respiratory viruses capable of spreading across the human population continues to exact a significant toll on human life and the global economy. The novel coronavirus, SARS-CoV-2, which is responsible for the ongoing global COVID-19 pandemic,1 provides a stark example of the risks that zoonotic virus adaptation poses to our society. Other animal pathogens, such as avian Influenza A viruses (IAVs), continuously pose a threat of crossing to human hosts and require close monitoring.2 Many respiratory viruses, including IAVs, utilize specific glycan receptors on airway epithelial cells to initiate entry and replication.3 Characterization of the glycan-binding phenotype of IAVs may provide an early indicator of increased infection potential.4,5,6

IAVs carry two types of glycoproteins in their viral coat with specificity for terminal sialic acid modifications on cell surface glycans – the receptor-binding hemagglutinins (HAs) and the receptor-destroying neuraminidases (NAs).7,8,9 The configuration of the sialic acid glycosidic bond linkage to the underlying glycans in glycoproteins and glycolipids plays a prominent role in defining IAV host specificity (Fig 1A). According to the prevailing paradigm,4,10 avian viruses preferentially recognize α2,3-linked sialic acids abundant in the gastrointestinal tract of birds, while human IAVs have affinity for α2,6-sialosides expressed on lung epithelial cells in our upper airways. A switch in HA specificity from α2,3- to α2,6-linked sialic acids is associated with increased infection and transmission in humans.4,11,12

Figure 1.


Machine learning-enabled glycomimetic array platform for assessing receptor pattern recognition by influenza A viruses (IAVs). A) IAVs begin their infection cycle by binding to sialylated host glycans, but these receptors are also present on mucins which have a proposed protective function. Avian and human IAVs show distinct preferences for the binding of α2,3- and α2,6-sialoglycan receptors. B) Glycopolymers, which mimic the architecture and composition of mucins, were used to build models of the mucinous glycocalyx on microarrays. A support vector machine (SVM) learning algorithm enabled analysis of viral binding response to changing receptor patterns in the synthetic glycocalyx arrays.

Glycomics screens13,14 and cell-based studies using glycosylation mutants15,16 have revealed that, in addition to a particular sialic acid linkage configuration, IAVs can also discriminate between distinct glycan classes and glycoconjugate types (i.e., N- and O-glycosylated proteins and glycolipids). Spatial combinations of these sialylated glycoconjugates give rise to three-dimensional, hierarchically organized receptor patterns in the host cell glycocalyx that determine the specificity and avidity of IAV binding. Glycan arrays, which present ensembles of chemically defined glycans printed and immobilized on glass substrates, are routinely used to analyze viral HA-receptor specificity.17,18,19,20,21 However, a recent cross-comparison between glycan composition of ex vivo human lung and bronchus tissues with glycan array binding data pointed to a limited ability of the arrays to predict infection events.14 This indicates that the current platforms may not accurately recapitulate the receptor presentation in the glycocalyx environment as encountered by viruses at the mucosal epithelium.

The mucosal epithelial cell glycocalyx is dominated by membrane-tethered mucins (MUCs), which are large, heavily glycosylated proteins projecting tens to hundreds of nanometers above the cell surface (Fig 1A).22,23 Mucins carry primarily, but not exclusively, O-glycans linked to tracks of serine and threonine residues within the core protein. As much as 80% of mucin mass derives from glycans, giving these glycoproteins an extended semi-flexible bottlebrush form.24 The O-glycans in mucins are frequently terminated with sialic acids that can serve as IAV receptors;25 however, epithelial mucins are believed to primarily provide protection against infection. Mucins can serve as decoys, which shed from the cell surface upon virus binding,22 or assemble into dense extended glycoprotein brushes that restrict virus access to apical membrane receptors and interfere with internalization.26,27 Due to the prominence of these extended glycoproteins within the glycocalyx and their extensive modification with sialic acids, mucins are most likely the first, and likely non-productive, site of virus attachment in its infection cycle.28 Interestingly, the IAV subtype, H1N1, was found to colocalize with some (i.e., MUC1) but not other (i.e., MUC13 and MUC16) mucins on the surfaces of A549 lung epithelial cells,29 revealing a preference of the virus for distinct mucin family members within the same cell and produced by a shared glycosylation machinery. The type of mucin and its presentation at the cell surface is likely to influence initial viral interactions at the epithelium and determine the course of infection. A more complete understanding of IAV interactions at the mucosal glycocalyx may, thus, provide a more accurate assessment of the potential of IAVs to infect human hosts.

Here, we report the development of an array platform, in which synthetic mucin-mimetics are used to model the mucosal epithelial cell glycocalyx, to evaluate receptor pattern recognition by IAVs (Fig 1B). We applied support vector machine (SVM) learning to identify and analyze effects of variations in glycan receptor type, mucin mimetic valency, nanoscale dimensions, and crowding in the glycocalyx models on shifts in the binding specificity of H1N1 and H3N2 IAV strains. We found that mucin-like polyvalent presentations of α2,3- and α2,6-sialoglycans and the surface crowding of the glycoconjugates differentially impacted adhesion of the viruses, consistent with the proposed protective functions of mucins in the airway epithelium. The mucin mimetic arrays also revealed an evolution of receptor pattern recognition by H1N1 produced in avian or mammalian cells, which could be characterized through machine learning.

RESULTS AND DISCUSSION

Construction of glycopolymers for mucin-like glycan receptor presentation.

To model the mucinous glycocalyx environment in glycan arrays, we have devised a method for generating synthetic glycopolymers (GPs) that replicate key structural features of mucins (i.e., polyvalent glycans displayed along extended linear polypeptide chains) while allowing for tuning of the mimetic size and glycosylation pattern (Fig 2A). Using the reversible addition-fragmentation chain transfer (RAFT) polymerization, we have generated a collection of mucin mimetics of increasing length glycosylated with α2,3- and α2,6-sialyllactose trisaccharides (α2,3-SiaLac and α2,6-SiaLac) as model avian and human IAV receptors, respectively. The polymers were terminated with an azide functionality and used either as soluble probes or covalently grafted on cyclooctyne-coated glass via the strain-promoted alkyne-azide cycloaddition (SPAAC) reaction to produce a mucin-like glycocalyx display.30 A tetramethylrhodamine (TAMRA) fluorophore was appended to the opposing chain end to allow for characterization of mucin mimetic density on the arrays.

Figure 2.


Generation of mucin mimetic probes. A) Fluorescently labeled azide-terminated short (S), medium (M) and long (L) mucin-mimetic glycopolymers GP were generated via RAFT polymerization. B) Size exclusion chromatography (SEC) analysis of the polymeric precursors P. C) The naming scheme for the GPs indicates the polymer backbone length (S-, M-, and L-), sialic acid linkage type (superscripts 3 and 6, and Ø designate α2,3-SiaLac, α2,6-SiaLac, and Lac, respectively), and glycan valency (final subscript).

The mucin mimetic synthesis began with the polymerization of a Boc-protected N-methylaminooxypropyl acrylamide monomer (1) in the presence of a chain transfer agent (CTA, 2) and a radical initiator (AIBN) to generate a set of azide-terminated short (S, DP ~ 150), medium (M, DP ~ 200) and long (L, DP ~ 300) polymeric precursors, P (Fig 2A). Size exclusion chromatography analysis (SEC, Fig 2B and Table S2) confirmed good control over the target molecular weight (Mw) and dispersity (Đ) of the polymers. Next, the trithiocarbonate end groups in polymers P were removed by aminolysis and the newly exposed thiol groups were capped with TAMRA-maleimide (Fig 2A and Scheme S1). The fluorophore labeling efficiency was determined for each polymer by UV-Vis spectrometry and ranged from 6 to 30% (Table S3). Side-chain Boc-group deprotection in the presence of phenol and trimethylsilyl chloride (TMSCl)31 followed by conjugation of the released N-methylaminooxy groups with reducing glycans under acidic conditions completed the synthesis of the mucin mimetic glycopolymers GP (Fig 2A).

The mucin mimetic library comprised 27 short (S), medium (M) and long (L) sialylated glycopolymers, 3GP and 6GP, decorated with increasing amounts of the trisaccharides, α2,3-SiaLac and α2,6-SiaLac, respectively (Fig 2C and Table S3). In addition, we generated 11 control polymers lacking sialic acid modifications (ØGP) displaying only the lactose disaccharide (Lac, Fig 2C and Table S3). The extent of glycosylation for all polymers was determined by 1H NMR spectroscopy (Data S1) and varied according to glycan structure. Treatment with 1.1 equiv. of glycan per polymer sidechain was sufficient to achieve maximum polymer glycosylation of ~ 70% for Lac and ~ 45% for the negatively charged α2,3-SiaLac and α2,6-SiaLac (Table S3). The use of sub-stoichiometric amounts of glycans enabled tuning of glycan valency in the mucin mimetics (Fig 2C and Fig S1). The mucin mimetic lengths (l) were estimated to range from ~ 8 nm to 12 nm according to their DP using a method by Miura et al. for calculating theoretical end-to-end distances in sialylated glycopolymers (Fig 2B and Equation S1).32
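The naming scheme and the nominal degrees of polymerization quoted above can be combined to estimate what fraction of side chains carries a glycan in a given probe. The Python sketch below is illustrative only: the flattened name format (e.g., `S-3GP85`), the function names, and the use of the approximate DP values (150, 200, 300) as exact chain lengths are assumptions, not part of the reported method.

```python
import re

# Approximate degrees of polymerization of the precursors P, as quoted in the text.
NOMINAL_DP = {"S": 150, "M": 200, "L": 300}
GLYCANS = {"3": "α2,3-SiaLac", "6": "α2,6-SiaLac", "Ø": "Lac"}

# Assumed flattened name format: <length>-<linkage>GP<valency>, e.g. "S-3GP85".
_NAME = re.compile(r"^([SML])-([36Ø])GP(\d+)$")

def parse_gp_name(name):
    """Split a probe name such as 'S-3GP85' into (length, glycan, valency)."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized glycopolymer name: {name!r}")
    length, linkage, valency = match.groups()
    return length, GLYCANS[linkage], int(valency)

def glycosylation_fraction(name):
    """Estimate the glycosylated fraction of side chains as valency / nominal DP."""
    length, _, valency = parse_gp_name(name)
    return valency / NOMINAL_DP[length]

for probe in ("S-6GP65", "L-6GP140", "M-ØGP110"):
    print(probe, parse_gp_name(probe), round(glycosylation_fraction(probe), 2))
```

Because the DP values are only approximate, the resulting fractions should be read as rough estimates rather than measured degrees of glycosylation.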

The oligomeric plant lectins, WGA and SNA, show distinct binding behavior in mucin-like receptor displays.

In the airways, cell surface-associated mucins are organized into a dense, brush-like glycocalyx, which projects tens to hundreds of nanometers above the epithelial cell surface.22 To gain insights into glycan receptor recognition by proteins and pathogens at the mucosal interface, we modeled the mucinous glycocalyx in arrays by printing mucin mimetic glycopolymers GP on cyclooctyne-functionalized glass (Fig 3A). In addition to varying the structure and glycosylation of the glycopolymer probes, we also modulated their surface crowding by increasing their concentration (cGP) from 1 to 10 μM in the printing buffer (PBS supplemented with 0.05% Tween-20, pH = 7.4). The fluorescent TAMRA labels introduced synthetically into the probes were used to establish their surface grafting efficiency (Fig S2A) and the overall glycan receptor density (Fig S2C) for each polymer condition. The printing conditions yielded spots of uniform morphology (Fig 3A) with a linear increase in polymer density across the employed concentrations regardless of polymer size or glycosylation (Fig S3 and Fig S4).

Figure 3.


Construction and validation of mucin mimetic arrays. A) Representative composite images of density-variant arrays of fluorescent mucin-mimetic glycopolymers GP (TAMRA, green) probed with DyLight649-labeled SNA and WGA lectins. Each condition is represented as a duplicate. Full array scans are provided in Fig S5. B) Binding isotherms and associated apparent surface dissociation constants (KD,surf) for binding of WGA to medium-sized mucin mimetics M-3GP50-100 with increasing α2,3-SiaLac valency printed at low surface density (cGP = 1 μM, ***p <0.0005 or greater). C) Binding responses of WGA and SNA to increasing glycan receptor density on the array. The dimeric WGA lectin binding is directly proportional to glycan density, whereas the tetrameric SNA lectin exhibits a more complex binding pattern. Insets show graphical representations of the lectin oligomeric state and the orientation of sialic acid binding sites based on crystallographic data analysis (Figs S8 and S9).

To confirm selective recognition of the mucin mimetics based on the structure of their pendant glycans, the arrays were probed with DyLight649-labeled lectins wheat germ agglutinin (WGA) and Sambucus nigra agglutinin (SNA) (Fig 3A and Fig S5). WGA, which primarily recognizes GlcNAc but has often been used to indicate the presence of α2,3-linked sialic acids, is specific for α2,3-SiaLac polymers on our arrays,33,34 while SNA binds exclusively to the polymers containing the α2,6-linked isomer.35 To obtain quantitative assessment of lectin binding to the mucin mimetics, we probed the arrays with increasing concentrations of the lectins to establish binding isotherms and extract apparent surface dissociation constants (KD,surf) (Fig 3B and Fig S6). WGA binding to the medium sized α2,3-SiaLac polymers, M-3GP50-110, printed at low surface density (cGP = 1 μM) showed valency-dependent binding with autoinhibition at the highest valencies caused by glycan crowding on the polymer backbone. This behavior is frequently observed for lectin binding to glycopolymer probes in solution.36 The low polymer printing concentration produced probe spacing on the array surface that allowed for the measurement of lectin binding responses to the underlying glycoconjugate architecture. Increasing the concentration of the polymers resulted in denser mucin mimetic arrays, attenuated WGA responsiveness to α2,3-SiaLac valency of the individual probes, and increased overall avidity of the dimeric lectin toward the receptor display (Fig S6 and Table S4). Our attempts to establish similar binding profiles for SNA were not successful due to protein aggregation at concentrations needed to reach saturation binding (Fig S7).
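One common way to extract an apparent dissociation constant from isotherms of this kind is to fit a one-site Langmuir model, F = Fmax · c / (KD,surf + c). The text does not state which binding model or software was used, so the following Python sketch is an assumption-laden illustration: the function names are hypothetical, the fitting strategy (a repeatedly refined log-spaced grid search) is chosen only to avoid external dependencies, and the data are synthetic.

```python
import math

def _profile(kd, conc, signal):
    """For a fixed KD, return (sum of squared residuals, best-fit Fmax)."""
    g = [c / (kd + c) for c in conc]  # fractional occupancy at each concentration
    fmax = sum(f * gi for f, gi in zip(signal, g)) / sum(gi * gi for gi in g)
    sse = sum((f - fmax * gi) ** 2 for f, gi in zip(signal, g))
    return sse, fmax

def fit_langmuir(conc, signal, n_grid=200, n_zoom=4):
    """Fit F = Fmax * c / (KD + c) by a refined log-spaced grid search over KD."""
    lo = min(c for c in conc if c > 0) / 100.0
    hi = max(conc) * 100.0
    kd_best = fmax_best = None
    for _ in range(n_zoom):
        step = (math.log(hi) - math.log(lo)) / (n_grid - 1)
        grid = [math.exp(math.log(lo) + k * step) for k in range(n_grid)]
        results = [_profile(kd, conc, signal) for kd in grid]
        i = min(range(n_grid), key=lambda k: results[k][0])
        kd_best, fmax_best = grid[i], results[i][1]
        # Zoom into a bracket two grid points wide on either side of the best KD.
        lo, hi = grid[max(i - 2, 0)], grid[min(i + 2, n_grid - 1)]
    return kd_best, fmax_best

# Synthetic, noise-free isotherm in arbitrary units (not data from the study).
lectin_conc = [1.0, 5.0, 10.0, 50.0, 100.0, 500.0]
fluorescence = [1000.0 * c / (40.0 + c) for c in lectin_conc]
kd_surf, f_max = fit_langmuir(lectin_conc, fluorescence)
print(f"KD,surf ~ {kd_surf:.2f}, Fmax ~ {f_max:.1f}")
```

A one-site model cannot describe the autoinhibition seen at the highest valencies or the aggregation reported for SNA, so a fit like this would only be meaningful where the isotherm actually saturates.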

Collecting thermodynamic binding data for each lectin-probe combination in the array can be time consuming and may not be possible for some lectins, as was the case for SNA. Simplified plots of lectin binding in response to changing relative sialoglycan density in the arrayed glycopolymer spots provide a convenient way to discern different binding modes of the proteins. Using this analysis, we observed that the binding response of WGA to changing glycan density was generally linear, while SNA showed a less correlated binding pattern indicative of contributions from higher-order binding interactions, such as crosslinking of neighboring glycopolymers on the array (Fig 3C). Analysis of crystallographic data for WGA and SNA provides a structural basis for their differences in crosslinking capacity (Fig S8 and Fig S9). WGA exists as a dimer with two sialic acid binding domains separated by 3.9 nm and positioned on the same face of the protein.34 This arrangement reasonably favors WGA binding to glycans presented on the same mucin mimetic and may be responsible for the largely linear relationship between receptor density and lectin binding response. By contrast, SNA can exist as a monomer, dimer, or tetramer.37 Each monomer contains two glycan binding sites that are directed outward on opposite edges of the protein.38 The various oligomeric states and the orientation of the binding sites make SNA more likely to engage and crosslink multiple glycoconjugates on the surface, producing the more complex binding behavior observed on the mucin mimetic arrays.
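One simple way to put a number on how linear such a density-response plot is would be the Pearson correlation between relative glycan density and lectin signal. In the sketch below, relative glycan density is approximated as glycan valency multiplied by the relative polymer surface density; this definition, the function names, and all numbers are illustrative assumptions rather than the analysis used in the study.

```python
import math

def relative_glycan_density(valency, polymer_density):
    """Assumed definition: glycans per polymer times relative polymer surface density."""
    return valency * polymer_density

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical spots: (valency, relative polymer density, lectin signal).
spots = [(50, 1.0, 510.0), (80, 1.0, 790.0), (105, 1.0, 1060.0),
         (50, 2.0, 980.0), (80, 2.0, 1630.0), (105, 2.0, 2080.0)]
density = [relative_glycan_density(v, d) for v, d, _ in spots]
signal = [s for _, _, s in spots]
print(f"r = {pearson_r(density, signal):.3f}")
```

A coefficient near 1 would correspond to the proportional response described for WGA, whereas a noticeably lower value would flag the kind of less correlated behavior described for SNA.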

H1N1 PR8 virus shows linkage-specific differences in binding to mucin-like sialoglycan presentations.

Pathogens, which utilize oligomeric lectins and adhesins for binding to cell-surface glycan receptors, may be sensitive to the presentation of glycan receptors at the mucosal barrier.39 We examined the binding of the H1N1 (A/Puerto Rico/8/1934 or PR8) virus to different presentations of sialoglycan receptors in our mucin mimetic platform (Fig 4A). The PR8 strain is a well-characterized, laboratory-adapted human IAV strain, which has the ability to recognize both avian and human sialic acid receptor structures.40,41 As such, it provides a useful model for assessing how receptor presentation may affect viral binding and selectivity.

Figure 4.


H1N1 EGG binding to mucin-mimetic displays of sialoglycan receptors. A) Red blood cell agglutination assays and array screens were used to probe the interactions of H1N1 produced in embryonated chicken eggs (H1N1 EGG) with soluble and surface bound mucin mimetics. B) Inhibitory activity, Ki, of soluble glycan receptors α2,3-SiaLac and α2,6-SiaLac and mucin mimetic glycopolymers GP in RBC agglutination assays expressed as the minimal ligand concentration needed to prevent cell aggregation. The experimental images are included in Figure S10. C) Representative composite images and bar graph representation of H1N1 EGG virus (red) binding to medium-sized mucin mimetics M-GP (green) printed at low surface density (cGP = 1 μM) according to glycan receptor valency. Each array condition is represented as a duplicate and full array images are included in Figures S11, S12, and S14. Values and error bars represent averages and standard deviations of experiments from 9 different arrays. Significance is based on viral binding to the Lac polymer control M-ØGP110 (black dashed line). E) H1N1 EGG binding to mucin mimetics of increasing length printed at low surface density (cGP = 1 μM). Values and error bars represent averages and standard deviations of experiments from 6 different arrays. Significance was determined against Lac polymer control L-ØGP165 (black dashed line). F) H1N1 EGG binding response to increasing crowding of mucin mimetics of all three lengths on the array surface. Values and error bars represent averages and standard deviations of experiments from 6 different arrays. (*p<0.05, **p<0.005, and ***p<0.0001)

H1N1, which was propagated in embryonated chicken eggs and henceforth labeled as H1N1 EGG, bound both receptor types in their soluble monovalent form, with a ~ 4-fold preference for the α2,3-SiaLac isomer, as determined in red blood cell (RBC) agglutination inhibition assays (Ki2,3 = 13 μM vs Ki2,6 = 50 μM, Fig 4B, Fig S10, and Table S5). The array binding data mirrored this preference, while providing additional insights into the effects of receptor presentation on viral interactions (Fig 4C–F). H1N1 EGG virus binding to the medium size α2,3-SiaLac mucin mimetics M-3GP50-110 immobilized at low surface densities (cGP = 1 μM) indicated enhanced viral capture with increasing receptor valency, with a valency threshold for binding above 50 α2,3-SiaLac residues and a plateau at ~ 80 glycans per polymer (Fig 4D and Fig S11). Shortening the polymer length while maintaining a high receptor valency above 80 (S-3GP85) had no negative effect on viral capture (Fig 4E and Fig S12). We observed some decrease in binding to the longest mucin mimetic L-3GP140 compared to M-3GP105 despite its higher valency, presumably due to its increased chain conformational flexibility. RBC hemagglutination inhibition assays with soluble α2,3-SiaLac polymers 3GP confirmed the observed valency-dependent binding trend for H1N1 EGG (Ki,3GP = 313 nM – 1.25 μM, Fig S10, and Table S5) and were consistent with prior reports using similar multivalent glycopolymers.32 In contrast to the behavior of the arrayed mucin mimetics, increasing the polymer length resulted in a more effective inhibition of RBC agglutination by H1N1 EGG in solution (Ki,S-3GP = 625 nM vs Ki,L-3GP = 78 nM, Fig S10). It should be noted that the increase in inhibitory capacity of the glycopolymers compared to the monovalent receptor can be accounted for based on glycan valency and concentration alone, rather than avidity enhancements due to multivalency. In the case of the short polymer S-3GP85, when the total amount of glycan on the polymer is taken into account, the per glycan inhibitory activity (Ki,S-3GP × α2,3-SiaLac valency = 53 μM) was effectively reduced compared to the free α2,3-SiaLac (Ki,α2,3-SiaLac = 13 μM). Lactose glycopolymers, ØGP, lacking sialic acids served as negative controls in both assays (Fig 4 and Fig S10).
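The per-glycan comparison above is simple arithmetic: the polymer Ki is multiplied by the glycan valency to give the equivalent glycan concentration. A short Python check using the values quoted in the text (the helper name is hypothetical):

```python
def per_glycan_ki_uM(polymer_ki_nM, valency):
    """Convert a polymer Ki (nM) into the equivalent glycan concentration (μM)."""
    return polymer_ki_nM * valency / 1000.0

ki_free_23_uM = 13.0  # monovalent α2,3-SiaLac, from the text
ki_free_26_uM = 50.0  # monovalent α2,6-SiaLac, from the text

ki_s3gp85 = per_glycan_ki_uM(625.0, 85)  # S-3GP85, Ki = 625 nM, valency 85
print(f"S-3GP85 per-glycan Ki: {ki_s3gp85:.1f} μM")
print(f"Relative to free α2,3-SiaLac: {ki_s3gp85 / ki_free_23_uM:.1f}-fold weaker")
print(f"Monovalent α2,3 vs α2,6 preference: {ki_free_26_uM / ki_free_23_uM:.1f}-fold")
```

On a per-glycan basis the short polymer is therefore a weaker inhibitor than the free trisaccharide, which is why the apparent gain in potency is attributed to glycan concentration rather than to a multivalency effect.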

Glycopolymers carrying the α2,6-SiaLac receptor (M-6GP50-105) showed only a limited ability to engage H1N1 and required glycan valency above 90 to reach binding above background (Fig 4D and Fig S11). Extending the length of the mucin mimetic, again, resulted in a decrease in viral capture (Fig 4E and Fig S12). All of the α2,6-SiaLac polymers failed to inhibit RBC hemagglutination by the virus over the range of tested polymer concentrations (Ki,6GP > 5 μM or 325 μM with respect to α2,6-SiaLac, Fig S10). Considering that monovalent α2,6-SiaLac can prevent RBC agglutination (Kiα2,6 = 50 μM, Fig 4B and S10), it appears that binding of H1N1 EGG to this glycan receptor is disfavored in the polyvalent glycopolymer presentation.

High levels of mucin expression on the surfaces of epithelial cells produce a dense glycoprotein brush, which has been proposed to restrict IAV access to membrane receptors necessary for infection.26,27 To examine the effects of polymer size and density on viral adhesion, we modelled glycocalyx crowding in our arrays by increasing the printing concentration of the mucin mimetics. We assayed H1N1 EGG binding to maximally glycosylated mucin mimetics of all three lengths arrayed at concentrations of 1, 5, and 10 μM (Fig 4F and Fig S12). The virus retained its overall preference for the α2,3-SiaLac probes across all surface densities; however, increased crowding of the polymers led to attenuated viral adhesion, which became more pronounced with increasing mucin mimetic length. Crowding of the α2,6-SiaLac glycopolymers both enhanced (S-6GP65) and inhibited (L-6GP140) viral adhesion depending on polymer length (Fig 4F). Our data show that, while the H1N1 EGG virus can utilize the less preferred α2,6-SiaLac receptors when presented in surface displays on short mucin mimetic scaffolds, increasing the length and density of the conjugates generally negatively impacted viral adhesion regardless of receptor type. Such negative impact of increasing receptor density was previously reported for the binding of nanoparticles bearing recombinant HA proteins to sialoglycans in supported lipid bilayers.42 Thus, crowding of mucins in the glycocalyx may not only shield underlying glycan receptors from the virus,26 but also limit viral adhesion to the heavily sialylated mucins themselves.

The observed differential H1N1 EGG binding to the mucin-like receptor displays according to sialic acid linkage type supports the distinct functions of secreted and membrane bound mucins comprising the airway mucosal barrier.26,43 Therein, secreted mucins produced by goblet cells and presenting primarily α2,3-sialic acid modifications serve as decoy receptors for viral capture and clearance. By contrast, the membrane-tethered mucins produced by epithelial cells display α2,6-linked sialic acid receptors and are thought to limit viral adhesion. The binding of H1N1 EGG to polyvalent α2,3-SiaLac mucin mimetics but not the α2,6-SiaLac analogs and the sensitivity of the virus to surface crowding of mucin mimetics carrying both receptor types would provide a rationale for the synergistic but mechanistically distinct functions of secreted and surface-bound mucins in limiting viral infection.

Support vector machine (SVM) learning-enabled analysis of receptor pattern recognition by influenza A viruses.

While the focused analysis of H1N1 EGG binding to some of the key features of glycan presentation in the mucin mimetic array (e.g., polymer size, valency, or surface density, Fig 4) was informative, the multidimensionality of receptor presentation on the array makes comprehensive assessment challenging and time consuming. For SNA and WGA, plots of lectin capture on the array according to glycan abundance, regardless of polymer structure or brush density, revealed qualitative differences in the lectin binding behavior (Fig 3C). Similar scatterplot representations can highlight major differences in receptor pattern recognition by different IAVs, as shown in Figure 5A for H1N1 EGG and the H3N2 A/Aichi/2/68 strain produced in Madin-Darby Canine Kidney (MDCK) cells, H3N2 MDCK.

Figure 5.


Analysis of receptor pattern recognition by H1N1 and H3N2 strains. A) Scatterplots of binding responses to changing glycan receptor density in mucin mimetic arrays for H1N1 EGG and H3N2 MDCK viruses. B) Workflow for creating SVM self-models using viral binding responses and receptor display parameters (i.e., glycan type, valency and spacing on mucin mimetic, mucin mimetic density and concentration in printing buffer) in mucin mimetic arrays. C) SVM analysis of H1N1 EGG and H3N2 MDCK binding in mucin mimetic arrays. Predicted binding events for each virus to α2,3-SiaLac and α2,6-SiaLac are shown in red and blue, respectively. Non-binding events are shown in gray. Color intensity indicates the frequency of the predicted binding events according to the valency and surface density of the mucin mimetics. H1N1 EGG recognizes both α2,3- and α2,6-SiaLac with preference for low surface densities of mucin mimetics. H3N2 MDCK is specific for α2,6-SiaLac glycans and its binding does not diminish with increasing mucin mimetic density on the array.

Noting that molecular recognition is a multi-dimensional problem, we have applied an SVM learning approach (Fig 5B) to resolve the receptor patterns that are best recognized by the viruses. We leveraged the fact that SVMs are known to predict accurate relationships from semi- or unstructured data.44 First, we established a binding threshold for the virus based on the background signal from the control Lac polymers (Fig S13). Then, we randomly selected a portion (67%) of the array binding data for each virus to train the SVM using a 5-dimensional parameter space of glycan type, glycan valency, glycan spacing on polymer, glycan density on the array, and polymer printing concentration (Fig 5B). In combination, these parameters defined additional features of our mucin mimetic receptor displays, such as glycopolymer length (via glycan valency and glycan spacing on polymer) and glycopolymer density on the array surface (via glycan valency, glycan spacing on polymer, and glycan density on array). We used the remaining portion of the binding data set (33%) to test the accuracy, recall, and precision of the model (Fig S14 and Fig S15). This model predicted binding of H1N1 EGG to both receptor types, with the majority of binding events occurring toward lower glycan valencies and polymer surface densities (Fig 5C, left), consistent with our manual analysis of the array data for this virus (Fig 4). The prediction plots in Figure 5C show the parameters of receptor presentation that most influenced viral binding (i.e., glycan receptor type, mucin mimetic valency and surface density). Additional parameters, such as polymer size and glycan spacing on the polymer, also contribute to viral recognition and can be analyzed. We confirmed that the performance of the model was similar when the training and testing data sets represented either separate array experiments or were selected randomly from combined data across multiple experiments (Fig S16).
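The workflow described above (background threshold, 67/33 split, SVM training, and scoring by accuracy, precision, and recall) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not specify the kernel, software, or how the threshold was computed, so the sketch assumes a linear SVM trained by batch subgradient descent on the hinge loss, a cutoff of mean + 3 SD of the control signal, features pre-scaled to [0, 1], and entirely synthetic data. All function names are hypothetical.

```python
import random

def binding_threshold(control_signals, n_sd=3.0):
    """Background cutoff from Lac-control spots: mean + n_sd * SD (n_sd is assumed)."""
    n = len(control_signals)
    mean = sum(control_signals) / n
    sd = (sum((s - mean) ** 2 for s in control_signals) / (n - 1)) ** 0.5
    return mean + n_sd * sd

def split_67_33(rows, seed=0):
    """Shuffle and split rows into ~67% training and ~33% test subsets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    k = round(0.67 * len(rows))
    return rows[:k], rows[k:]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Batch subgradient descent on the L2-regularized hinge loss; labels are +1/-1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            # Only points inside the margin contribute to the hinge-loss subgradient.
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1.0:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(model, x):
    """Return +1 (binding) or -1 (non-binding) for one feature vector."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1

def scores(y_true, y_pred):
    """Return (accuracy, precision, recall) for the binding (+1) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == -1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == -1 for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Synthetic stand-in for array data: five features scaled to [0, 1] plus a signal.
rng = random.Random(1)
controls = [rng.gauss(100.0, 10.0) for _ in range(20)]  # toy Lac-control background
cutoff = binding_threshold(controls)
rows = []
for _ in range(60):
    # x = [glycan type, valency, spacing, glycan density, printing concentration]
    x = [float(rng.random() < 0.5)] + [rng.random() for _ in range(4)]
    signal = 100.0 + 800.0 * x[0] * x[1] + rng.gauss(0.0, 10.0)  # arbitrary toy response
    rows.append((x, 1 if signal > cutoff else -1))
train, test = split_67_33(rows)
model = train_linear_svm([x for x, _ in train], [y for _, y in train])
acc, prec, rec = scores([y for _, y in test], [predict(model, x) for x, _ in test])
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

A nonlinear kernel may well be needed to capture interactions between parameters such as valency and surface density; the linear model here is chosen only to keep the example self-contained.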

The receptor binding preferences of other IAV strains on the mucin-mimetic array can be rapidly analyzed using the machine learning approach. The application of the SVM to the array binding for H3N2 MDCK correctly predicted the specificity of the virus for α2,6-sialoglycans,10,45 and, newly, identified its better ability to utilize increasingly dense glycopolymer displays (Fig 5C). Manual analysis of the array binding data confirmed these predictions (Fig S17 and Fig S18), demonstrating the general applicability of the supervised models for analyzing receptor pattern recognition by IAVs.

H1N1 PR8 propagation in mammalian cells enhances interactions with α2,6-sialoglycans in mucin-like displays.

Having established the SVMs as an effective method to rapidly identify preferred receptor display parameters for viral binding in the mucin mimetic arrays, we set out to explore how the glycan binding phenotype of the virus may change depending on the host in which it is propagated. Such information may enhance existing viral surveillance by either eliminating glycan-binding phenotype artifacts, which can be introduced during propagation of field-isolated viruses in the laboratory, or by establishing specific binding phenotype features associated with enhanced human transmission.

We performed a comparative solution and array binding analysis between H1N1 EGG and the same virus produced in MDCK cells (H1N1 MDCK). The MDCK cell line is a commonly used mammalian system for the propagation of IAVs in the laboratory. The binding of H1N1 MDCK to the soluble monovalent α2,3-SiaLac and α2,6-SiaLac, as measured via RBC hemagglutination inhibition, remained unchanged from that of H1N1 EGG (Fig 4B). However, H1N1 MDCK exhibited improved ability to engage mucin mimetics carrying the α2,6-SiaLac receptors both on arrays (Fig 6A, Fig S19, and Fig S20) and in their soluble form (Ki,S-6GP65 = 938 nM, Fig 4B). The increased ability to recognize α2,6-sialoglycans is expected for H1N1 propagated in MDCK cells, as the cells in the allantoic fluid of embryonated chicken eggs used for viral propagation display mainly α2,3-linked sialic acids and the surfaces of MDCK cells are populated by receptors with both linkages.46 During culture in MDCK cells, viruses can undergo selection for both receptor types, giving rise to a virus population with an altered sialoglycan binding phenotype.11,46 However, our hemagglutination inhibition assays indicate that the altered glycan-binding phenotype of H1N1 MDCK does not arise from changes in HA affinity for the individual monovalent receptors.

Figure 6.

Analysis of changes in receptor pattern recognition by IAVs produced in avian and mammalian cells. A) Scatterplots of binding responses to changing glycan receptor density in mucin mimetic arrays for H1N1 MDCK. B) The H1N1 MDCK and H1N1 EGG SVM models can be applied to the MDCK array binding data to generate “self” and “cross”-predictions, respectively, that can be used to determine lost, conserved, and gained interactions. C) The SVM identified better utilization of increasingly crowded mucin mimetic displays by H1N1 upon transition from avian to mammalian cell culture.

Next, we adapted the SVM analysis to identify viral binding features for H1N1 that are conserved between viruses produced in embryonated eggs and those propagated in MDCK cells. We trained and validated a new SVM model for H1N1 MDCK as described above. The H1N1 EGG and H1N1 MDCK models were able to accurately produce predictions of binding and non-binding events for each respective virus in the mucin mimetic array (Fig S14C and Fig S15C). We termed these “self-predictions”. The EGG model was then applied to the array binding data for H1N1 MDCK to produce an EGG-to-MDCK “cross-prediction” that was compared with the H1N1 MDCK “self-prediction” results (Fig. 6B and Fig S21). The binding events correctly predicted by both models were termed “conserved in MDCK”. These refer to receptor presentations that are recognized by H1N1 regardless of whether it was produced in avian or mammalian cells (Fig S21). The interactions that were correctly predicted as binding by the H1N1 MDCK model but deemed non-binding by the H1N1 EGG model, i.e., interactions that were absent in the EGG-to-MDCK “cross-prediction”, were termed “gained in MDCK”. These refer to interactions with glycan receptor patterns in the array that were absent or in primarily nonbinding regions for the H1N1 virus produced in eggs but emerged when the virus was propagated in MDCK cells. Finally, the non-binders that were correctly predicted by the H1N1 MDCK model but were predicted to be binders by the H1N1 EGG model, i.e., interactions that were absent in H1N1 MDCK but were anticipated to occur based on the EGG-to-MDCK prediction, were termed “lost in MDCK”. These refer to glycan patterns recognized by the H1N1 virus produced in eggs but lost in mammalian cell culture.
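The comparison of self- and cross-predictions described above can be sketched in a few lines of Python. This is only an illustration of the labeling logic as stated in the text; the function name `compare_predictions`, its argument names, and the catch-all label "other" are hypothetical and are not taken from the published code.

```python
import numpy as np

def compare_predictions(observed, self_pred, cross_pred):
    """Label each array spot as conserved, gained, lost, or other.

    observed   : measured H1N1 MDCK outcome (1 = binder, 0 = non-binder)
    self_pred  : prediction of the model trained on H1N1 MDCK data
    cross_pred : prediction of the model trained on H1N1 EGG data
    """
    observed = np.asarray(observed)
    self_pred = np.asarray(self_pred)
    cross_pred = np.asarray(cross_pred)

    # Only spots that the MDCK self-model predicts correctly are categorized.
    self_correct = self_pred == observed
    labels = np.full(observed.shape, "other", dtype=object)

    # Binding predicted by both models: recognized regardless of host.
    labels[self_correct & (observed == 1) & (cross_pred == 1)] = "conserved"
    # Binding that the EGG model does not anticipate.
    labels[self_correct & (observed == 1) & (cross_pred == 0)] = "gained"
    # Non-binding that the EGG model expected to be binding.
    labels[self_correct & (observed == 0) & (cross_pred == 1)] = "lost"
    return labels

# Toy example with invented values
labels = compare_predictions(
    observed=[1, 1, 0, 0, 1],
    self_pred=[1, 1, 0, 0, 0],
    cross_pred=[1, 0, 1, 0, 1],
)
```

In this toy example the five spots are labeled conserved, gained, lost, other, and other. The last two correspond to a non-binder on which both models agree and a spot that the self-model mispredicts.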

As shown in Figure 6C, the conserved and gained interactions predicted by the SVM algorithm occur in distinct clusters, with most of the interaction gain happening at higher polymer valencies or densities. The predictions are in agreement with our agglutination inhibition data and preliminary manual array analysis (Fig 4 and Fig S19), pointing to improved ability of the H1N1 MDCK virus to engage mucin mimetic displays, both as individual polymers in solution and in ensembles on arrays. For α2,6-SiaLac glycopolymers, sensitivity of the virus to receptor crowding persisted in H1N1 MDCK, as the majority of binding gains resulted either from higher valency polymers grafted at low densities or from surface crowding of low valency mimetics (Fig 6C and Fig S22). Likewise, the SVM analysis identified stronger interactions of H1N1 MDCK with high densities of α2,3-SiaLac receptors in the mucin-mimetic array (Fig 6C). The predictions did not identify loss of any interactions present in the H1N1 EGG upon transition into the mammalian culture system. To validate this method, we compared the results of the cross-prediction with experimental data from H1N1-EGG. We confirmed that the interactions identified as conserved and gained in H1N1 MDCK by the SVM also correlated with binding and non-binding events, respectively, observed for H1N1-EGG in the array. We observed a high level of agreement between the cross-predictions and experimental data in the low and high receptor valency-density regions on the array, where the conserved and gained interactions are well defined (Fig S22).

We note that an analysis of conservation, gain or loss of binding is made possible by application of the 5-dimensional SVM, wherein the algorithm learns the “rules-of-binding” from a set of interaction patterns (defined by glycan valency, density, polymer spacing and printing concentration data), which is then compared to another dataset to bring out the similarities and differences in the interactions. Arriving at these conclusions would be impossible using only chemical intuition or simple visualization of data because the interaction patterns are not apparent when the data are plotted with equal weights on all five dimensions. Using between 200 and 2000 iterations, the SVM finds linear combinations of the parameters in this 5-dimensional space, reweighting the original dimensions until the classes are maximally separated and their underlying data features are segregated, to distinguish between binding and non-binding events.

It is unlikely that such combinations can be found manually, or using unsupervised clustering schemes, making SVM a natural choice to create an “intelligent” space that correlates binding rules to linear combinations of glycan valency and spacing on the mimetics and their density on the array surface. In contrast to popular simple clustering algorithms, such as K-means, which are limited by the number of dimensions, the strength of SVM is that it does not impose the ground-truth or prior knowledge of the binding patterns to learn the rules of binding.47 Rather, SVM infers both the rule as well as the outcome directly from higher-dimensional data sets. Although we could have employed more sophisticated algorithms, such as neural networks, the use of a discrete binary classifier in this study (binder = 1 and non-binder = 0) made SVM a more convenient choice. SVMs are not subject to local optima; thus, they handle high-dimensional data well using one to two classifiers, as is the case here.

The improved range of binding interactions of H1N1 MDCK with α2,6-linked sialic acids is expected based on the higher expression of these glycans in the mammalian cells. Even in absence of Muc gene expression in MDCK cells,48 the observed changes in receptor pattern recognition for both glycans do not stem from altered affinity of its HA toward the monovalent glycans in solution (Fig 4B). The HA protein carries several glycosylation sites near the sialic acid binding region and the addition of glycans can influence binding13 and may give rise to the observed changes in receptor pattern recognition.

We performed PAGE analysis of the viral proteins before and after PNGase treatment to remove N-linked glycans. The glycosylated form of HA1 fragment from the egg-grown H1N1 virus had lower molecular weight than that of the MDCK virus (Fig S23). This difference in gel mobility was eliminated by PNGase treatment, which catalyzes the removal of N-linked glycans. This points to a more extensive glycosylation of the MDCK cell-derived virus HA proteins and is consistent with previously observed differences in the glycosylation, but not in the primary amino acid sequence, of HAs from isolated duck H1N1 viruses propagated in MDCK cells and egg cultures.49 Whether the relationship between the changes in HA glycosylation and the altered receptor pattern recognition of the virus observed in the current array study is causative or correlative is yet to be determined. Nonetheless, it presents one possible explanation for the altered binding behavior of the viruses. Such differences in glycosylation may reflect the influence of species, cell type or a combination of both on receptor pattern recognition by viruses and contribute to their emerging ability to cross between species. Similar analysis of glycan binding changes upon transition from egg culture to MDCK cells was not possible for the H3N2 virus used in this study, which, due to its restricted receptor specificity, does not propagate well in embryonated eggs without undergoing mutations in its HA sequence.50 However, we anticipate that the array can help identify other viruses with glycan-binding phenotype shifts occurring in response to adaptation to a new host.

CONCLUSIONS

The glycocalyx exists as a complex assortment of membrane-tethered and secreted glycoconjugates that serve different roles in multicellular identity, function, and pathogen invasion. In this work, we aimed to model the interactions of IAVs with the mucin glycoprotein components of the mucosal barrier by generating mucin-mimetic glycopolymers with tunable sizes and compositions displaying α2,3- and α2,6-linked sialyllactose glycans as prototypes for the avian and human receptors for IAVs. RBC hemagglutination inhibition assays with soluble forms of the probes revealed an enhancement in selectivity of H1N1 PR8 viruses produced in embryonated chicken eggs from ~ 4-fold preference for binding of α2,3-SiaLac in its soluble monomeric form to more than 60-fold for the polyvalent receptor displays. This differential arose from selectively disfavored binding of the virus to the α2,6-SiaLac glycans in the mucin-mimetics rather than increased avidity toward the α2,3-SiaLac polymers. Systematic evaluation of H1N1 EGG capture on arrays of immobilized mucin mimetics enabled by an SVM learning algorithm showed that the virus bound surface displays of both receptor types but was attenuated at high polymer densities. The receptor pattern recognition changed when the virus was propagated in MDCK cells toward improved utilization of α2,6-SiaLac mucin mimetics in both soluble and immobilized forms, and a lower sensitivity to surface crowding of the α2,3-SiaLac glycopolymers.

Our findings are consistent with the proposed protective functions of soluble mucins at the airway epithelium, which present primarily α2,3-linked receptors and prevent infection through viral capture and clearance. Newly, our observations that the presentation of α2,6-linked sialoglycans on linear polyvalent scaffolds and the arrangement of such conjugates in increasingly dense surface ensembles disfavor the binding of H1N1 produced in avian cells also provide support for the role of membrane-associated mucins in limiting viral adhesion and infection at the epithelial cell surface. The improved binding of H1N1 viruses produced in mammalian cells to the mucin-mimetic displays of α2,6-linked sialoglycans was not accompanied by changes in the affinity or selectivity of their HA proteins toward the individual glycan receptors. While the basis for the differences in receptor pattern recognition needs to be further investigated, our studies show that the mucin probes and their arrays may serve as useful tools to investigate viral interactions at the mucosal barrier and the evolution of their glycan receptor-binding phenotype.

The array and the SVM analysis can be extended to other IAV strains. The platform successfully predicted the α2,6-sialoglycan specificity of the H3N2 A/Aichi/2/68 virus and identified its better ability to utilize polyvalent and high-density mucin-like displays of the human receptor prototype, α2,6-SiaLac. The modularity of the mucin mimetic synthesis and the ease with which glycans can be incorporated into these materials51 should be well suited for the introduction of more complex and biologically relevant glycans into the array, including mucin type O-glycans. Of particular interest for future array development are glycans with linear or branched core structures variously modified with extended N-acetyllactosamine disaccharide repeats and presenting fucosylation or sulfation motifs, as they continue to emerge from glycomics13,14,28,52 and genetic screens16,53 as relevant to IAV recognition and infection. Machine learning, as employed in this study, is well poised to support such an increase in the dimensionality of parameter space on the array and can be expanded to non-binary classification for analysis. This may allow future inclusion of multiple continuous classifiers, such as the strength and kinetics of a binding interaction between a virus and a receptor pattern, to construct more sophisticated neural networks that reveal higher-order relationships between pathogens and their hosts.

EXPERIMENTAL PROCEDURES

RESOURCE AVAILABILITY

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Kamil Godula (kgodula@ucsd.edu).

Materials availability

This study did not generate new unique reagents. Glycopolymers and microarrays can be made available through contact with the lead contact. Influenza virus A/PR/8/34 (H1N1) is available from ATCC using code VR-1469.

Data and code availability

All SVM data sets and code are available on Github and can be accessed by the following link: https://github.com/SingharoyLab/H1N1_host_interaction.

A detailed list of chemical and biological reagents including their sources and catalog numbers can be found as Table S1 in the Supporting Information.

Glycopolymer synthesis and characterization.

Fluorescently labeled glycopolymers end-functionalized with an azide (GP) used in this study and their polymeric precursors (P) were prepared using the RAFT polymerization method according to previously published procedures54 and are summarized in Figure 2 and described in detail in SI.

Cell culture.
Madin-Darby Canine Kidney (MDCK) cells:

MDCK cells were cultured in Dulbecco’s modified Eagle’s medium supplemented with 10% FBS, 100 U/mL penicillin, and 100 U/mL streptomycin.

Viral culture.

Influenza virus strain A/PR/8/34 (H1N1, ATCC VR-1469) was purchased from ATCC and propagated in MDCK cells that were transferred to DMEM medium supplemented with 0.2% BSA fraction V, 25 mM HEPES buffer, 2 μg/mL TPCK-trypsin, and 1% penicillin/streptomycin (“DMEM-TPCK” media). The same strain was used for viral production in embryonated chicken eggs. Briefly, fertilized chicken eggs were obtained and stored at 37 °C. When the embryos were 10 days old (assessed by candling), virus was injected through small holes made in the shell over the air sac. The holes were covered with parafilm and the eggs were placed back at 37 °C. After two days, the eggs were chilled to prepare for harvest. The eggshells were cut open above the air sac and the allantoic fluid was carefully collected into centrifuge tubes without rupturing the yolk. The tubes were centrifuged to pellet any debris and the supernatant containing virus was aliquoted into 1 mL cryovials and stored at −80 °C.

Viral titers.

Turkey red blood cells were purchased from Lampire and a 1% solution was used to determine viral titers via the hemagglutination test. MDCK cells were used to determine the 50% tissue culture infective dose (TCID50) using the Spearman-Karber method.
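For readers unfamiliar with the Spearman-Karber estimate, a minimal sketch of one common form of the calculation is given below. The text does not specify the dilution scheme or the number of wells used, so the function name, its arguments, and the example values are assumptions made purely for illustration.

```python
def spearman_karber_log10_endpoint(log10_first_dilution, log10_step, fraction_positive):
    """Return log10 of the 50% endpoint dilution (one common Spearman-Karber form).

    log10_first_dilution : log10 of the least dilute sample (e.g. -1 for 10^-1)
    log10_step           : log10 of the dilution factor (1 for 10-fold steps)
    fraction_positive    : fraction of infected wells at each dilution,
                           ordered from least to most dilute
    """
    # The estimate assumes the series runs from all wells infected to none infected.
    if fraction_positive[0] != 1.0 or fraction_positive[-1] != 0.0:
        raise ValueError("series must span 100% to 0% positive wells")
    return log10_first_dilution - log10_step * (sum(fraction_positive) - 0.5)

# Hypothetical 10-fold series from 10^-1 to 10^-5 (values invented for illustration)
log10_endpoint = spearman_karber_log10_endpoint(-1, 1, [1.0, 1.0, 1.0, 0.5, 0.0])
```

For these invented values the endpoint is -4.0, that is, a 10^-4 dilution infects half of the wells, which corresponds to 10^4 TCID50 per inoculum volume.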

Hemagglutination Inhibition.

Glycopolymer solutions in PBS (25 μL, ranging from 20 μM to 20 nM by 2-fold dilutions) were added to a 96 well plate. The last well in each row was used as a PBS control and did not contain glycopolymer or virus. An equal volume (25 μL) of H1N1 diluted to HAU=4 (turkey RBCs) was added to each well and allowed to incubate at room temperature. After a ½ hr incubation, 50 μL of 1% turkey RBCs in PBS were added to all of the wells. The Ki value was read after a ½ hr as the lowest concentration of polymer to inhibit hemagglutination.
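The dilution scheme and endpoint reading can be illustrated with a short sketch. It assumes 11 two-fold dilution wells per row (which takes 20 μM down to roughly 20 nM and leaves the twelfth well for the PBS control) and reports the concentration of the glycopolymer solution as added to the well. Both points are assumptions, as are the function names.

```python
def twofold_series(start_nM=20000.0, n_wells=11):
    """Glycopolymer concentrations (nM) for a 2-fold dilution series, high to low."""
    return [start_nM / 2**i for i in range(n_wells)]

def read_ki(concentrations_nM, inhibited):
    """Lowest concentration that still inhibits hemagglutination, or None.

    concentrations_nM : series ordered from highest to lowest concentration
    inhibited         : True where hemagglutination was inhibited in that well
    """
    ki = None
    for conc, ok in zip(concentrations_nM, inhibited):
        if not ok:
            break  # stop at the first well where agglutination is observed
        ki = conc
    return ki

series = twofold_series()
# Hypothetical readout: the five most concentrated wells inhibit agglutination
ki = read_ki(series, [True] * 5 + [False] * 6)
```

With these settings the series ends at 19.53125 nM, and the hypothetical readout gives an endpoint of 1250 nM.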

Array construction.

Microarrays were printed on cyclooctyne-coated glass slides as previously described54 using a sciFLEXARRAYER S3 printer (Scienion) following passivation with a 1% BSA/0.1% Tween-20 solution in PBS for 1 hour. Polymer solutions were diluted in printing buffer (0.005% Tween-20 in PBS) to concentrations of 1, 5, and 10 μM polymer and printed in replicates of six at a humidity of 70%. Following an overnight reaction at 4°C, excess polymer was removed by vigorous washing in 0.1% Triton X/PBS solution. The slides were then imaged on an Axon GenePix4000B scanner (Molecular Devices) at the highest PMT possible without saturation of pixels. The GenePix software was used to calculate the relative polymer density, by dividing the fluorescent intensity at 532 nm by the labeling efficiency for each polymer (obtained through UV-Vis measurements) and the spot area (calculated from the spot diameter generated by the software). To obtain relative glycan density, the polymer density was multiplied by the glycan valency for each polymer (attained through NMR integration).
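The density calculation described above amounts to two simple ratios. The sketch below assumes circular spots and arbitrary units. The function and argument names are hypothetical, and the numbers in the example are invented.

```python
import math

def relative_polymer_density(intensity_532, labeling_efficiency, spot_diameter):
    """Fluorescence at 532 nm divided by labeling efficiency and spot area."""
    spot_area = math.pi * (spot_diameter / 2) ** 2  # assumes a circular spot
    return intensity_532 / labeling_efficiency / spot_area

def relative_glycan_density(polymer_density, glycan_valency):
    """Polymer density multiplied by the number of glycans per polymer."""
    return polymer_density * glycan_valency

# Invented example values: intensity 1000, labeling efficiency 0.5,
# spot diameter 100, glycan valency 65
polymer_density = relative_polymer_density(1000.0, 0.5, 100.0)
glycan_density = relative_glycan_density(polymer_density, 65)
```

Because the quantities are relative, they are meaningful only for comparing spots imaged under the same scanner settings.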

Array binding assays.

Prior to array binding assays, subarrays were blocked with 3% BSA solution in PBS for 1 hr. For lectin binding, subarrays were washed three times with lectin binding buffer (0.005% Tween-20 in PBS with 0.1 mM CaCl2, MnCl2, and MgCl2). Dylight labeled SNA and WGA were diluted in the lectin binding buffer and incubated on the array for 1 hour in the dark. After washing with binding buffer, 0.1% Tween-20 solution in PBS, and rinsing with MilliQ water, the slides were imaged at the highest PMT possible without producing saturated pixels. For H1N1 binding, subarrays were washed with 1% BSA/PBST three times following passivation. H1N1 was diluted in 1% BSA/PBST and incubated on the array for 1 hr. The slide was washed two times with 1% BSA/PBST, and then fixed for 20 min with 4% PFA in PBS. To visualize H1N1 binding, a 1:500 dilution of anti-HA in 1% BSA/PBST was incubated on the array for 1 hour, followed by an hour incubation in the dark of a 1:500 dilution of anti-rabbit-AF647 antibody. The subarrays were washed two more times with 1% BSA/PBST, two times with the 0.1% Tween-20 solution in PBS, rinsed with MilliQ water and imaged at the highest PMT possible without producing saturated pixels.

Machine learning workflow.

The machine learning workflow is demonstrated in Figure S21. The H1N1 EGG, H3N2, and H1N1 MDCK data sets contained 8373, 1922, and 1273 data points, respectively. Given the size of our data, the feature space (4-5 variables), and the category of problem (binary classification), we chose to use the support vector machine (SVM) algorithm for this work. SVM is ideally suited for such binary classification tasks. By comparison, more sophisticated algorithms like random forest show ideal performance with larger data sets and/or more complex features. Use of random forest in place of SVM did not show any improvements (data not shown). As a preprocessing step, all negative viral fluorescent intensities (which resulted from background subtraction in the absence of viral binding) were adjusted to zero to indicate the lack of viral binding. The fluorescence intensities were then normalized over the entire data set in the range of [0,1]. These continuous data were converted to categorical (“binders”/“nonbinders”) based on a cutoff fluorescence that was determined from the distribution of fluorescence intensities of lactose samples, which served as negative controls. Next, the features (glycan type, valency, and polymer density) were scaled to a range of [0,1] in order to avoid bias from higher values. The only categorical feature (glycan type) was transformed into numeric form by mapping to a two-dimensional space where α2,3-SiaLac is represented by (1,0) and α2,6-SiaLac by (0,1). This data set was then split into a training and a test set, where the training set contained 67% of the data. Each experiment containing 6 fluorescence measurements was individually split between training and test sets. The SVM algorithm was used for learning from the training data, and predictions were validated against the test data. We confirmed that the performance of the model was similar when the data were split across all experiments for creating training and test sets (Fig S16).
Training and testing were performed separately for H1N1 EGG, H3N2, and H1N1 MDCK data. Convergence was reached within 2000 iterations for H1N1 EGG data and 200 iterations for H3N2 and H1N1 MDCK data (Fig S14). Next, the algorithm trained on egg-virus data was used to predict results for MDCK-virus data, and the prediction results from this so-called “cross-model” were compared with the results obtained from the model trained on MDCK-virus (the “self-model”). Data were prepared in Python using the pandas, numpy, and scipy packages. SVM was performed using Python’s scikit-learn package.55
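The preprocessing and training steps can be sketched with pandas and scikit-learn as follows. This is not the published code: the data are synthetic, the column names and the cutoff value are hypothetical, and the choice of `SVC` with a linear kernel is an assumption. The sketch also splits the data across all measurements rather than within each experiment.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-in for array data; every value here is invented.
df = pd.DataFrame({
    "glycan": rng.choice(["a23", "a26"], size=n),
    "valency": rng.uniform(10, 200, size=n),
    "density": rng.uniform(0.1, 10, size=n),
})
df["intensity"] = (
    50 * (df["glycan"] == "a23")
    + 0.3 * df["valency"]
    - 4 * df["density"]
    + rng.normal(0, 5, size=n)
)

# 1. Negative background-subtracted intensities are set to zero.
df["intensity"] = df["intensity"].clip(lower=0)

# 2. Intensities are normalized to [0, 1] over the whole data set.
span = df["intensity"].max() - df["intensity"].min()
df["intensity"] = (df["intensity"] - df["intensity"].min()) / span

# 3. Binder / non-binder labels from a cutoff. The value 0.3 is hypothetical;
#    in the study the cutoff came from the lactose negative controls.
cutoff = 0.3
y = (df["intensity"] > cutoff).astype(int).to_numpy()

# 4. Glycan type is one-hot encoded as (1,0) / (0,1); numeric features
#    are scaled to [0, 1].
X_cat = pd.get_dummies(df["glycan"])[["a23", "a26"]].to_numpy(dtype=float)
X_num = MinMaxScaler().fit_transform(df[["valency", "density"]])
X = np.hstack([X_cat, X_num])

# 5. 67% of the data is used for training, the remainder for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.67, random_state=0, stratify=y
)

# 6. Train the SVM on the training set and validate on the held-out test set.
model = SVC(kernel="linear")
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

A model trained in this way on one data set can then be applied to the features of another with `model.predict`, which is the basis of the cross-prediction described above.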

Bigger picture.

Influenza A exploits sialic acid glycans on host cells to gain entry during infection. Mucosal tissues are protected against viral assault by a physical barrier of soluble and cell surface mucin glycoproteins, which are, paradoxically, also heavily decorated with sialic acids. To explore protective functions of mucins, we examined binding of H1N1 and H3N2 in mucin mimetic arrays modeling the structure and molecular crowding of the mucosal glycocalyx. Machine learning analysis showed that displaying human-like sialic acid receptors on mucin-mimetic scaffolds disfavored viral binding. Adhesion was further decreased by crowding on the array, providing insight into the possible role of membrane mucins in limiting viral association. H1N1 propagated in mammalian cells showed greater binding to denser receptor displays than when grown in eggs. This platform may thus find utility as a surveillance tool for assessing changes in receptor binding phenotype of circulating viruses.

HIGHLIGHTS.

  • Mucin-mimetic array profiles viral binding based on glycan receptor presentation.

  • Support vector machine (SVM) learning can detect and predict viral binding patterns.

  • Organization of human sialoglycans in mucin-type displays disfavors viral binding.

  • H1N1 better utilizes mucin-like receptor displays when produced in mammalian cells.

ACKNOWLEDGMENTS

We thank the UCSD Glycobiology Research and Training Center for access to tissue culture facilities and analytical instrumentation, Dr. Christopher Fisher for his help with virus culture and characterization methods, and the ASU Research Computing and their Agave computing cluster for the computational resources. The authors also wish to thank Prof. Mia Huang (TSRI) for her technical advice and valuable insights over the course of this research. This work was supported in part by the NIH Director’s New Innovator Award (NICHD: 1DP2HD087954-01), the NIH Director’s Glycoscience Common Fund (1R21AI129894-01) and the Gordon and Betty Moore Foundation via a Scialog grant (#9162.07). KG was supported by the Alfred P. Sloan Foundation (FG-2017-9094) and the Research Corporation for Science Advancement via the Cottrell Scholar Award (grant #24119). TML was supported by the Chemistry-Biology Interface training program (NIGMS: 5T32GM112584-03). PG and MA are supported by the G. Harold and Leyla Y. Mathers Charitable Foundation. ES was supported by the UCSD Paths Program and the Undergraduate Summer Research Award through the UCSD Division of Physical Sciences.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

SUPPORTING INFORMATION

Supplemental information can be found online at https://doi.org/10.1016/j.chempr.2021.09.015.

DECLARATION OF INTERESTS

The authors declare no competing interests.

INCLUSION AND DIVERSITY

One or more of the authors of this paper self-identifies as an underrepresented ethnic minority in science. One or more of the authors of this paper self-identifies as a member of the LGBTQ+ community.

REFERENCES

  • 1. Keni R, Alexander A, Nayak PG, Mudgal J, and Nandakumar K (2020). COVID-19: emergence, spread, possible treatments, and global burden. Front. Public Health 8. DOI: 10.3389/fpubh.2020.00216.
  • 2. Long JS, Mistry B, Haslam SM, Barclay W (2019). Host and viral determinants of influenza A virus species specificity. Nat. Rev. Microbiol. 17, 67–81.
  • 3. Lipsitch M, Barclay W, Raman R, Russell CJ, Belser JA, Cobey S, Kasson PM, Lloyd-Smith JO, Maurer-Stroh S, Riley S, Beauchemin CA, Bedford T, Friedrich TC, Handel A, Herfst S, Murcia PR, Roche B, Wilke CO, Russell CA (2016). Viral factors in influenza pandemic risk assessment. eLIFE 5, e18491. DOI: 10.7554/eLife.18491.
  • 4. Suzuki Y (2005). Sialobiology of influenza: molecular mechanism of host range variation of influenza viruses. Biol. Pharm. Bull. 28, 399–408. DOI: 10.1248/bpb.28.399.
  • 5. Raman R, Tharakaraman K, Shriver Z, Jayaraman A, Sasisekharan V, Sasisekharan R (2014). Glycan receptor specificity as a useful tool for characterization and surveillance of influenza A virus. Trends Microbiol. 22, 632–641. DOI: 10.1016/j.tim.2014.07.002.
  • 6. Air GM (2014). Influenza virus-glycan interactions. Cur. Opin. Virol. 7, 128–133. DOI: 10.1016/j.coviro.2014.06.004.
  • 7. Skehel JJ and Wiley DC (2000). Receptor binding and membrane fusion in virus entry: the influenza hemagglutinin. Annu. Rev. Biochem. 69, 531–569. DOI: 10.1146/annurev.biochem.69.1.531.
  • 8. McAuley JL, Gilbertson BP, Trifkovic S, Brown LE, and McKimm-Breschkin JL (2019). Influenza virus neuraminidase structure and functions. Front. Microbiol. 10 (39). DOI: 10.3389/fmicb.2019.00039.
  • 9. Gaymard A, Le Briand N, Frobert E, Lina B, Escuret V (2016). Functional balance between neuraminidase and haemagglutinin in influenza viruses. Clinical Microbiology and Infection 22 (12), 975–983. DOI: 10.1016/j.cmi.2016.07.007.
  • 10. Rogers GN, and Paulson JC (1983). Receptor determinants of human and animal influenza virus isolates: differences in receptor specificity of the H3 hemagglutinin based on the species of origin. Virology 127, 361–373.
  • 11. Couceiro JNSS, Paulson JC, and Baum LG (1993). Influenza strains selectively recognize sialyloligosaccharides on human respiratory epithelium; the role of the host cell in selection of hemagglutinin receptor specificity. Virus Research 29, 155–165.
  • 12. Rogers GN, Paulson JC, Daniels RS, Skehel JJ, Wilson IA, and Wiley DC (1983). Single amino acid substitutions in influenza haemagglutinin change receptor binding specificity. Nature 304, 76–78. DOI: 10.1038/304076a0.
  • 13. Peng W, de Vries RP, Grant OC, Thompson AJ, McBride R, Tsogtbaatar B, Lee PS, Razi N, Wilson IA, Woods RJ, and Paulson JC (2017). Recent H3N2 viruses have evolved specificity for extended, branched human-type receptors, conferring potential for increased avidity. Cell Host Microbe 21, 23–34. DOI: 10.1016/j.chom.2016.11.004.
  • 14. Walther T, Karamanska R, Chan RWY, Chan MCW, Jia N, Air G, Hopton C, Wong MP, Dell A, Malik Peiris JS, Haslam SM, and Nicholls JM (2013). Glycomic analysis of human respiratory tract tissues and correlation with influenza virus infection. PLoS Pathog. 9 (3), e1003223. DOI: 10.1371/journal.ppat.1003223.
  • 15. Chu VC, and Whittaker GR (2004). Influenza virus entry and infection require host cell N-linked glycoprotein. Proc. Natl. Acad. Sci. USA 101 (52), 18153–18158. DOI: 10.1073/pnas.0405172102.
  • 16. Narimatsu Y, Joshi HJ, Nason R, Van Coillie J, Karlsson R, Sun L, Ye Z, Chen YH, Schjoldager KT, Steentoft C, Furukawa S, Bensing BA, Sullam PM, Thompson AJ, Paulson JC, Büll C, Adema GJ, Mandel U, Hansen L, Bennett EP, Varki A, Vakhrushev SY, Yang Z, and Clausen H (2019). An atlas of human glycosylation pathways enables display of the human glycome by gene engineered cells. Mol. Cell 75 (2), 394–407. DOI: 10.1016/j.molcel.2019.05.017.
  • 17. Stevens J, Blixt O, Glaser L, Taubenberger JK, Palese P, and Paulson JC (2006). Glycan microarray analysis of the hemagglutinins from modern and pandemic influenza viruses reveals different receptor specificities. J. Mol. Biol. 355, 1143–1155. DOI: 10.1016/j.jmb.2005.11.002.
  • 18. Stevens J, Blixt O, Tumpey TM, Taubenberger JK, Paulson JC, and Wilson IA (2006). Structure and receptor specificity of the hemagglutinin from an H5N1 influenza virus. Science 312, 404–410. DOI: 10.1126/science.1124513.
  • 19. Liao H-Y, Hsu C-H, Wang S-C, Liang C-H, Yen H-Y, Su C-Y, Chen C-H, Jan J-T, Ren C-T, Chen C-H, Cheng T-J, Wu C-Y, and Wong C-H (2010). Differential receptor binding affinities of influenza hemagglutinins on glycan arrays. J. Am. Chem. Soc. 132 (42), 14849–14856. DOI: 10.1021/ja104657b.
  • 20. Bradley KC, Jones CA, Tompkins SM, Tripp RA, Russell RJ, Gramer MR, Heimburg-Molinaro J, Smith DF, Cummings RD, and Steinhauer DA (2011). Comparison of the receptor binding properties of contemporary swine isolates and early human pandemic H1N1 isolates. Virology 413 (2), 169–182. DOI: 10.1016/j.virol.2011.01.027.
  • 21. Gulati S, Smith DF, Cummings RD, Couch RB, Griesemer SB, St. George K, Webster RG, and Air GM (2013). Human H3N2 influenza viruses isolated from 1968 to 2012 show varying preference for receptor substructures with no apparent consequences for disease or spread. PLoS ONE 8 (6), e66325. DOI: 10.1371/journal.pone.0066325.
  • 22. Linden SK, Sutton P, Karlsson NG, Korolik V, and McGuckin MA (2008). Mucins in the mucosal barrier to infection. Mucosal Immunol. 1 (3), 183–197. DOI: 10.1038/mi.2008.5.
  • 23. Möckl L (2020). The emerging role of the mammalian glycocalyx in functional membrane organization and immune system regulation. Front Cell Dev Biol. 15; 253. DOI: 10.3389/fcell.2020.00253.
  • 24. Hang HC, and Bertozzi CR (2005). The chemistry and biology of mucin-type O-linked glycosylation. Bioorganic & Medicinal Chemistry 13 (17), 5021–5034. DOI: 10.1016/j.bmc.2005.04.085.
  • 25. Mayr J, Lau K, Lai JCC, Gagarinov IA, Shi Y, McAtamney S, Chan RWY, Nichols J, von Itzstein M, and Haselhorst T (2018). Unravelling the role of O-glycans in influenza A virus infection. Sci. Rep. 8, 16382. DOI: 10.1038/s41598-018-34175-3.
  • 26. Delaveris CS, Webster ER, Banik SM, Boxer SG, and Bertozzi CR (2020). Membrane-tethered mucin-like polypeptides sterically inhibit binding and slow fusion kinetics of influenza A virus. Proc. Natl. Acad. Sci. USA 117 (23), 12643–12650.
  • 27. Button B, Cai L-H, Ehre C, Kesimer M, Hill DB, Sheehan JK, Boucher RC, and Rubinstein M (2012). A periciliary brush promotes the lung health by separating the mucus layer from airway epithelia. Science 337, 937–944.
  • 28. Byrd-Leotis L, Liu R, Bradley KC, Lasanajak Y, Cummings SF, Song X, Heimburg-Molinaro J, Galloway SE, Culhane MR, Smith DF, Steinhauer DA, and Cummings RD (2014). Shotgun glycomics of pig lung identifies natural endogenous receptors for influenza viruses. Proc. Natl. Acad. Sci. USA 111, E2241–E2250.
  • 29. McAuley JL, Corcilius L, Tan H-X, Payne RJ, McGuckin MA, and Brown LE (2017). The cell surface mucin MUC1 limits the severity of influenza A virus infection. Mucosal Immunology 10, 1581–1593.
  • 30. Agard NJ, Prescher JA, and Bertozzi CR (2004). A strain-promoted [3 + 2] azide–alkyne cycloaddition for covalent modification of biomolecules in living systems. J. Am. Chem. Soc. 126 (46), 15046–15047. DOI: 10.1021/ja044996f.
  • 31.Kaiser E Sr., Picart F, Kubiak T, Tam JP, and Merrifield RB (1993). Selective deprotection of the N-tert-butyloxycarbonyl group in solid phase peptide synthesis with chlorotrimethylsilane and phenol. J. Org. Chem 58, 5167–5175 [Google Scholar]
  • 32. Nagao M, Fujiwara Y, Matsubara T, Hoshino Y, Sato T, and Miura Y (2017). Design of glycopolymers carrying sialyl oligosaccharides for controlling the interaction with the influenza virus. Biomacromolecules 18 (12), 4385–4392. DOI: 10.1021/acs.biomac.7b01426.
  • 33. Armstrong GD, Howard LA, and Peppler MS (1988). Use of glycosyltransferases to restore pertussis toxin receptor activity to asialoagalactofetuin. J. Biol. Chem. 263, 8677–8684.
  • 34. Wright CS, and Jaeger J (1993). Crystallographic refinement and structure analysis of the complex of wheat germ agglutinin with a bivalent sialoglycopeptide from glycophorin A. J. Mol. Biol. 232, 620–638.
  • 35. Shibuya N, Goldstein IJ, Broekaert WF, Nsimba-Lubaki M, Peeters B, and Peumans WJ (1987). The elderberry (Sambucus nigra L.) bark lectin recognizes the Neu5Ac(alpha 2-6)Gal/GalNAc sequence. J. Biol. Chem. 262, 1596–1601.
  • 36. Sigal GB, Mammen M, Dahmann G, and Whitesides GM (1996). Polyacrylamides bearing pendant α-sialoside groups strongly inhibit agglutination of erythrocytes by influenza virus: the strong inhibition reflects enhanced binding through cooperative polyvalent interactions. J. Am. Chem. Soc. 118, 3789–3800.
  • 37. Van Damme EJM, Peumans WJ, Barre A, and Rougé P (1998). Plant lectins: a composite of several distinct families of structurally and evolutionary related proteins with diverse biological roles. Critical Reviews in Plant Sciences 17, 575–692.
  • 38. Maveyraud L, Niwa H, Guillet V, Svergun DI, Konarev PV, Palmer RA, Peumans WJ, Rouge P, Van Damme EJ, Reynolds CD, and Mourey L (2009). Structural basis for sugar recognition, including the Tn carcinoma antigen, by the lectin SNA-II from Sambucus nigra. Proteins 75, 89–103.
  • 39. Raman R, Tharakaraman K, Sasisekharan V, and Sasisekharan R (2016). Glycan-protein interactions in viral pathogenesis. Current Opinion in Structural Biology 40, 153–162. DOI: 10.1016/j.sbi.2016.10.003.
  • 40. Meng B, Marriott AC, and Dimmock NJ (2010). The receptor preference of influenza viruses. Influenza and Other Respiratory Viruses 4 (3), 147–153. DOI: 10.1111/j.1750-2659.2010.00130.x.
  • 41. Koerner I, Matrosovich MN, Haller O, Staeheli P, and Kochs G (2012). Altered receptor specificity and fusion activity of the haemagglutinin contribute to high virulence of a mouse-adapted influenza A virus. Journal of General Virology 93, 970–979. DOI: 10.1099/vir.0.035782-0.
  • 42. Iorio DD, Verheijden ML, van der Vries E, Jonkheijm P, and Huskens J (2019). Weak multivalent binding of influenza hemagglutinin nanoparticles at a sialoglycan-functionalized supported lipid bilayer. ACS Nano 13, 3413–3423.
  • 43. Zanin M, Baviskar P, Webster R, and Webby R (2016). The interaction between respiratory pathogens and mucus. Cell Host Microbe 19, 159–168.
  • 44. Uddin S, Khan A, Hossain ME, and Moni MA (2019). Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making 19, 281.
  • 45. Suzuki Y, Nagao Y, Kato H, Suzuki T, Masumoto M, and Murayama J (1987). The hemagglutinins of the human influenza viruses A and B recognize different receptor microdomains. Biochimica et Biophysica Acta 903, 417–424.
  • 46. Ito T, Suzuki Y, Takada A, Kawamoto A, Otsuki K, Masuda H, Yamada M, Suzuki T, Kida H, and Kawaoka Y (1997). Differences in sialic acid-galactose linkages in the chicken egg amnion and allantois influence human influenza virus receptor specificity and variant selection. J Virol. 71(4), 3357–3362.
  • 47. Yao Y, Liu Y, Yu Y, Xu X, Lv W, Li Z, and Chen X (2013). K-SVM: An effective SVM algorithm based on K-means clustering. Journal of Computers 8, 2632–2639.
  • 48. Hudson MJ, Stamp GW, Chaudhary KS, Hewitt R, Stubbs AP, Abel D, and Lalani EN (2001). Human MUC1 mucin: a potent glandular morphogen. J Pathol. 194, 373–383.
  • 49. Inkster MD, Hinshaw VS, and Schulze IT (1993). The hemagglutinins of duck and human H1 influenza viruses differ in sequence conservation and in glycosylation. J Virol. 67(12), 7436–7443.
  • 50. Li J, Liu S, Gao Y, Tian S, Yang Y, and Ma N (2021). Comparison of N-linked glycosylation on hemagglutinins derived from chicken embryos and MDCK cells: a case of the production of a trivalent seasonal influenza vaccine. Applied Microbiology and Biotechnology 105, 3559–3572.
  • 51. Cohen M, Fisher CJ, Huang ML, Lindsay LL, Plancarte M, Boyce WM, Godula K, and Gagneux P (2016). Capture and characterization of influenza A virus from primary samples using glycan bead arrays. Virology 493, 128–135.
  • 52. Byrd-Leotis L, Jia N, Dutta S, Trost JF, Gao C, Cummings SF, Braulke T, Müller-Loennies S, Heimburg-Molinaro J, Steinhauer DA, and Cummings RD (2019). Influenza binds phosphorylated glycans from human lung. Sci Adv. 5, eaav2554.
  • 53. Nason R, Büll C, Konstantinidi A, Sun L, Ye Z, Halim A, Du W, Sørensen DM, Durbesson F, Furukawa S, Mandel U, Joshi HJ, Dworkin LA, Hansen L, David L, Iverson TI, Bensing BA, Sullam PM, Varki A, de Vries E, de Haan CAM, Vincentelli R, Henrissat B, Vakhrushev SY, Clausen H, and Narimatsu Y (2021). Display of the human mucinome with defined O-glycans by gene engineered cells. Nature Communications 12, 4070.
  • 54. Huang ML, Smith RAA, Trieger GW, and Godula K (2014). Glycocalyx remodeling with proteoglycan mimetics promotes neural specification in embryonic stem cells. J. Am. Chem. Soc. 136 (30), 10565–10568.
  • 55. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, and Duchesnay E (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830.

Associated Data

Supplementary Materials

MMC1

Data Availability Statement

All SVM data sets and code are available on GitHub at https://github.com/SingharoyLab/H1N1_host_interaction.

A detailed list of chemical and biological reagents including their sources and catalog numbers can be found as Table S1 in the Supporting Information.

Glycopolymer synthesis and characterization.

Fluorescently labeled glycopolymers end-functionalized with an azide (GP) and their polymeric precursors (P) were prepared by RAFT polymerization according to previously published procedures54. They are summarized in Figure 2 and described in detail in the SI.

Cell culture.

Madin-Darby Canine Kidney (MDCK) cells:

MDCK cells were cultured in Dulbecco’s modified Eagle’s medium supplemented with 10% FBS, 100 U/mL penicillin, and 100 U/mL streptomycin.

Viral culture.

Influenza virus strain A/PR/8/34 (H1N1, ATCC VR-1469) was purchased from ATCC and propagated in MDCK cells that were transferred to DMEM medium supplemented with 0.2% BSA fraction V, 25 mM HEPES buffer, 2 μg/mL TPCK-trypsin, and 1% penicillin/streptomycin (“DMEM-TPCK” medium). The same strain was used for viral production in embryonated chicken eggs. Briefly, fertilized chicken eggs were obtained and stored at 37 °C. When the embryos were 10 days old (assessed by candling), virus was injected through small holes made in the shell over the air sac. The holes were covered with parafilm and the eggs were returned to 37 °C. After two days, the eggs were chilled in preparation for harvest. The eggshells were cut open above the air sac and the allantoic fluid was carefully collected into centrifuge tubes without rupturing the yolk. The tubes were centrifuged to pellet debris, and the virus-containing supernatant was aliquoted into 1 mL cryovials and stored at −80 °C.

Viral titers.

Turkey red blood cells were purchased from Lampire, and a 1% solution was used to determine viral titers via the hemagglutination test. MDCK cells were used to determine the 50% tissue culture infective dose (TCID50) using the Spearman-Kärber method.
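The Spearman-Kärber endpoint estimate named above can be illustrated with a short Python sketch. The plate layout (ten-fold dilutions, eight wells per dilution) and the well counts are hypothetical values chosen for illustration, not data from this study, and the function assumes the series spans from fully infected to fully uninfected wells.

```python
def spearman_karber_log10_titer(log10_first_dilution, log10_step, positives, wells):
    """Estimate log10 of the 50% endpoint titer (per inoculum volume).

    log10_first_dilution: log10 of the reciprocal of the first dilution
                          (e.g. 1.0 for a 10^-1 dilution).
    log10_step:           log10 of the dilution factor (1.0 for ten-fold).
    positives:            number of infected wells at each successive dilution.
    wells:                number of wells inoculated per dilution.
    """
    proportions = [p / wells for p in positives]
    # The estimate is only valid if the series brackets the endpoint.
    if proportions[0] != 1.0 or proportions[-1] != 0.0:
        raise ValueError("series must run from 100% to 0% infected wells")
    return log10_first_dilution - log10_step / 2 + log10_step * sum(proportions)


# Hypothetical plate: dilutions 10^-1 .. 10^-5, eight wells each.
example_positives = [8, 8, 8, 4, 0]
log10_tcid50 = spearman_karber_log10_titer(1.0, 1.0, example_positives, 8)
print(f"log10 TCID50 per inoculum volume: {log10_tcid50:.2f}")
```

In this hypothetical plate half of the wells are infected at the 10^-4 dilution, so the estimate coincides with that dilution.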

Hemagglutination Inhibition.

Glycopolymer solutions in PBS (25 μL, ranging from 20 μM to 20 nM in 2-fold dilutions) were added to a 96-well plate. The last well in each row was used as a PBS control and contained neither glycopolymer nor virus. An equal volume (25 μL) of H1N1 diluted to HAU = 4 (turkey RBCs) was added to each glycopolymer-containing well and incubated at room temperature. After 30 min, 50 μL of 1% turkey RBCs in PBS was added to all of the wells. The Ki value was read after a further 30 min as the lowest polymer concentration that inhibited hemagglutination.
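As a rough illustration of the dilution series and the Ki readout, the following Python sketch builds a 2-fold series and picks the lowest inhibiting concentration. The use of eleven glycopolymer wells per row (ten halvings from 20 μM, leaving the twelfth well for the PBS control) and the example readout are assumptions made for illustration only.

```python
def twofold_series(top_conc_um, n_wells):
    """Concentrations (in uM) of a 2-fold dilution series, highest first."""
    return [top_conc_um / 2 ** i for i in range(n_wells)]


def read_ki(concs_um, inhibited):
    """Lowest concentration that still inhibits hemagglutination.

    concs_um:  concentrations in descending order.
    inhibited: one boolean per well, True if hemagglutination was inhibited.
    Returns None if even the highest concentration fails to inhibit.
    """
    ki = None
    for conc, ok in zip(concs_um, inhibited):
        if not ok:
            break  # inhibition is lost from this dilution onward
        ki = conc
    return ki


# Assumed layout: 11 glycopolymer wells, 20 uM down to about 20 nM.
concs = twofold_series(20.0, 11)
# Hypothetical readout: inhibition in the first five wells only.
readout = [True] * 5 + [False] * 6
print(f"lowest concentration: {concs[-1] * 1000:.1f} nM")
print(f"Ki (hypothetical readout): {read_ki(concs, readout)} uM")
```

Ten successive halvings of 20 μM give about 19.5 nM, which matches the stated lower bound of roughly 20 nM.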

Array construction.

Microarrays were printed on cyclooctyne-coated glass slides as previously described54 using a sciFLEXARRAYER S3 printer (Scienion) following passivation with a 1% BSA/0.1% Tween-20 solution in PBS for 1 hour. Polymer solutions were diluted in printing buffer (0.005% Tween-20 in PBS) to concentrations of 1, 5, and 10 μM polymer and printed in replicates of six at a humidity of 70%. Following an overnight reaction at 4 °C, excess polymer was removed by vigorous washing in 0.1% Triton X/PBS solution. The slides were then imaged on an Axon GenePix4000B scanner (Molecular Devices) at the highest PMT setting possible without saturation of pixels. The GenePix software was used to calculate the relative polymer density by dividing the fluorescence intensity at 532 nm by the labeling efficiency of each polymer (obtained through UV-Vis measurements) and by the spot area (calculated from the spot diameter generated by the software). To obtain the relative glycan density, the polymer density was multiplied by the glycan valency of each polymer (obtained through NMR integration).
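The density calculation described above amounts to two short formulas, sketched here in Python. The function names, the assumption of circular spots, and all numeric inputs are illustrative and not taken from the study.

```python
import math


def spot_area(diameter_um):
    """Area of a spot, assumed circular, from its diameter."""
    return math.pi * (diameter_um / 2.0) ** 2


def relative_polymer_density(f532, labeling_efficiency, diameter_um):
    """Fluorescence at 532 nm divided by labeling efficiency and spot area."""
    return f532 / (labeling_efficiency * spot_area(diameter_um))


def relative_glycan_density(polymer_density, glycan_valency):
    """Polymer density multiplied by the number of glycans per polymer."""
    return polymer_density * glycan_valency


# Hypothetical spot: intensity 12000 a.u., 0.8 fluorophores per chain,
# 100 um diameter, 60 glycans per polymer.
polymer_density = relative_polymer_density(12000.0, 0.8, 100.0)
glycan_density = relative_glycan_density(polymer_density, 60)
print(f"relative polymer density: {polymer_density:.3f}")
print(f"relative glycan density:  {glycan_density:.3f}")
```

Both values are relative quantities, so they are meaningful only for comparing spots imaged under the same scanner settings.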

Array binding assays.

Prior to array binding assays, subarrays were blocked with 3% BSA solution in PBS for 1 hr. For lectin binding, subarrays were washed three times with lectin binding buffer (0.005% Tween-20 in PBS with 0.1 mM CaCl2, MnCl2, and MgCl2). DyLight-labeled SNA and WGA were diluted in the lectin binding buffer and incubated on the array for 1 hour in the dark. After washing with binding buffer and with 0.1% Tween-20 solution in PBS, and rinsing with MilliQ water, the slides were imaged at the highest PMT setting possible without producing saturated pixels. For H1N1 binding, subarrays were washed with 1% BSA/PBST three times following passivation. H1N1 was diluted in 1% BSA/PBST and incubated on the array for 1 hr. The slide was washed two times with 1% BSA/PBST and then fixed for 20 min with 4% PFA in PBS. To visualize H1N1 binding, a 1:500 dilution of anti-HA in 1% BSA/PBST was incubated on the array for 1 hour, followed by a 1-hour incubation in the dark with a 1:500 dilution of anti-rabbit-AF647 antibody. The subarrays were washed two more times with 1% BSA/PBST and two times with the 0.1% Tween-20 solution in PBS, rinsed with MilliQ water, and imaged at the highest PMT setting possible without producing saturated pixels.

Machine learning workflow.

The machine learning workflow is shown in Figure S21. The H1N1 EGG, H3N2, and H1N1 MDCK data sets contained 8373, 1922, and 1273 data points, respectively. Given the size of our data, the feature space (4-5 variables), and the category of problem (binary classification), we chose the support vector machine (SVM) algorithm for this work. SVM is well suited for such binary classification tasks, whereas more sophisticated algorithms such as random forest perform best with larger data sets and/or more complex features. Use of random forest in place of SVM did not show any improvement (data not shown). As a preprocessing step, all negative viral fluorescence intensities (which resulted from background subtraction in the absence of viral binding) were set to zero to indicate the lack of viral binding. The fluorescence intensities were then normalized over the entire data set to the range [0,1]. These continuous data were converted to categorical labels (“binders”/“nonbinders”) based on a cutoff fluorescence determined from the distribution of fluorescence intensities of lactose samples, which served as the negative control. Next, the features (glycan type, valency, and polymer density) were scaled to a range of [0,1] in order to avoid bias from higher values. The only categorical feature (glycan type) was transformed into numeric form by mapping to a two-dimensional space where α2,3-SiaLac is represented by (1,0) and α2,6-SiaLac by (0,1). The data set was then split into a training set and a test set, with the training set containing 67% of the data. Each experiment containing 6 fluorescence measurements was individually split between training and test sets. The SVM algorithm was used to learn from the training data, and predictions were validated against the test data. We confirmed that the performance of the model was similar when the data were split across all experiments to create the training and test sets (Fig S16).
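The preprocessing steps described above can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' code (which is available in the GitHub repository cited in the Data Availability Statement). In particular, the cutoff rule (mean plus three standard deviations of the lactose controls), the glycan label strings, and all function names are assumptions, since the text only states that the cutoff was derived from the control distribution.

```python
import random
from statistics import mean, stdev

# Two-dimensional encoding of the categorical glycan feature (from the text).
GLYCAN_ONE_HOT = {"a2,3-SiaLac": (1, 0), "a2,6-SiaLac": (0, 1)}


def min_max(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def clip_and_normalize(intensities):
    """Set negative background-subtracted intensities to zero, then scale."""
    return min_max([max(0.0, v) for v in intensities])


def cutoff_from_controls(control_values, n_sd=3.0):
    """ASSUMED rule: mean + n_sd standard deviations of lactose controls."""
    return mean(control_values) + n_sd * stdev(control_values)


def label_binders(normalized, cutoff):
    """1 = binder, 0 = nonbinder."""
    return [1 if v > cutoff else 0 for v in normalized]


def split_experiment(indices, train_fraction=0.67, seed=0):
    """Split the replicates of one experiment into training and test indices."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical experiment with six replicate measurements.
raw = [-5.0, 0.0, 50.0, 100.0, 80.0, 20.0]
normalized = clip_and_normalize(raw)
labels = label_binders(normalized, cutoff=0.25)
train_idx, test_idx = split_experiment(range(len(raw)))
print(normalized, labels, train_idx, test_idx)
```

With six replicates per experiment, a 67% training fraction places four measurements in the training set and two in the test set.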
Training and testing were performed separately for the H1N1 EGG, H3N2, and H1N1 MDCK data. Convergence was reached within 2000 iterations for the H1N1 EGG data and within 200 iterations for the H3N2 and H1N1 MDCK data (Fig S14). Next, the algorithm trained on egg-virus data was used to predict results for MDCK-virus data, and the predictions from this so-called “cross-model” were compared with the results obtained from the model trained on MDCK-virus data (the “self-model”). Data were prepared in Python using the pandas, NumPy, and SciPy packages. SVM was performed using Python’s scikit-learn package.55
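The comparison between the “self-model” and the “cross-model” can be sketched with scikit-learn as shown below. The data here are synthetic and generated only so that the example runs. The kernel choice, the data-generating rule, and the variable names are assumptions and do not reproduce the study's data sets or settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_synthetic(n, glycan_shift, seed):
    """Synthetic stand-in for one virus data set (NOT real measurements).

    Features: one-hot glycan type (2 columns), valency, polymer density,
    all already scaled to [0, 1]. glycan_shift makes a2,6 binding easier.
    """
    rng = np.random.default_rng(seed)
    is_a26 = rng.integers(0, 2, size=n)
    valency = rng.random(n)
    density = rng.random(n)
    X = np.column_stack([1 - is_a26, is_a26, valency, density])
    y = ((valency + density + glycan_shift * is_a26) > 1.0).astype(int)
    return X, y


X_egg, y_egg = make_synthetic(600, glycan_shift=0.0, seed=0)
X_mdck, y_mdck = make_synthetic(300, glycan_shift=0.4, seed=1)

# Hold out part of the MDCK-like data for evaluating both models.
X_train, X_test, y_train, y_test = train_test_split(
    X_mdck, y_mdck, train_size=0.67, random_state=0, stratify=y_mdck
)

# Self-model: trained and tested on MDCK-like data.
self_model = SVC(kernel="rbf").fit(X_train, y_train)
# Cross-model: trained on egg-like data, tested on MDCK-like data.
cross_model = SVC(kernel="rbf").fit(X_egg, y_egg)

self_acc = self_model.score(X_test, y_test)
cross_acc = cross_model.score(X_test, y_test)
print(f"self-model accuracy:  {self_acc:.2f}")
print(f"cross-model accuracy: {cross_acc:.2f}")
```

A gap between the two accuracies on the same held-out set indicates that the binding rule learned from one virus population does not transfer fully to the other, which is the kind of shift the cross-model comparison is meant to expose.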

RESOURCES