Table 3.
Developing a model predicting viral glycan-binding behavior
| Model | Train MSE | Test MSE |
|---|---|---|
| Fully connected | 0.8508 | 0.8753 |
| Language model (SweetTalk) | 0.8253 | 0.8726 |
| Graph model (SweetNet) | 0.7455<sup>a</sup> | 0.7352<sup>a</sup> |

Each model combined a recurrent neural network that analyzed the protein sequences of viral hemagglutinin with one of three glycan components: a fully connected neural network using the counts of mono-, di-, and trisaccharides as input (“Fully connected”); a SweetTalk-based glycan language model; or a SweetNet-based graph convolutional neural network (GCNN). All models were trained to predict the Z-score-transformed glycan binding of hemagglutinin from various influenza virus strains. Values are mean squared errors (MSEs) averaged over five independent training runs (N = 5), reported for both the training set and the independent test set.

<sup>a</sup> The superior (lowest) value for each metric.
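
The two quantities behind this table, the Z-score transformation of the binding values and the MSE averaged over repeated runs, can be illustrated with a minimal sketch. The function names are hypothetical, the use of the population standard deviation is an assumption, and the source does not state over which grouping of measurements the Z scores were computed; this is not the authors' code.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize values to zero mean and unit standard deviation.

    Assumption: the population standard deviation is used; the source
    does not specify this detail.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize constant values")
    return [(v - mu) / sigma for v in values]


def mse(predicted, observed):
    """Mean squared error between predicted and observed values."""
    if len(predicted) != len(observed):
        raise ValueError("inputs must have the same length")
    return mean((p - o) ** 2 for p, o in zip(predicted, observed))


def average_mse(per_run_mse):
    """Average the MSEs from independent training runs (N = 5 in the table)."""
    return mean(per_run_mse)
```

Because the targets are standardized, a model that always predicts the mean of the Z-scored data it is evaluated on has an MSE of 1 on that data, which gives a rough reference point for the values in the table.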