[Preprint]. 2024 May 23:2024.05.15.594334. [Version 2] doi: 10.1101/2024.05.15.594334

Table 1 –

A model for glycan-glycosite matching was developed to predict permissible glycans on a glycosylation site. The modules analyzing the glycosite-flanking protein sequence and additional spatially proximal amino acids were recurrent neural networks. The module analyzing glycans was either a fully connected neural network using GlyCompare substructure features as input (GlyCompare), a glycan-based language model (SweetTalk), or a graph convolutional neural network (SweetNet). We further tested the effect of stochastic weight averaging (SWA) on model performance. “-Spatial” denotes removing the information about spatially proximal amino acids from the model input, while “+Whole” denotes adding the whole protein sequence as an additional model input. Results are the mean accuracy and the mean area under the receiver operating characteristic curve (ROC AUC) on our test set across five independent training runs.

| Metric | GlyCompare | SweetTalk | SweetNet | SweetNet SWA | SweetNet SWA -Spatial | SweetNet SWA +Whole |
|---|---|---|---|---|---|---|
| Accuracy | 0.763 | 0.799 | 0.831 | 0.875 | 0.861 | 0.879 |
| ROC AUC | 0.823 | 0.871 | 0.894 | 0.929 | 0.920 | 0.930 |
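
The two metrics in Table 1 can be illustrated with a minimal sketch of how accuracy and ROC AUC might be computed per run and then averaged over independent training runs. This is not the authors' evaluation code. It assumes that each glycan-glycosite pair carries a binary label (1 = permissible, 0 = not) and a predicted score, and that predictions are thresholded at 0.5 for accuracy. The function names and toy data are hypothetical.

```python
from statistics import mean


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the binary label.

    The 0.5 threshold is an assumption made for this sketch.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive example
    outscores a random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_runs(runs):
    """Average both metrics over runs; each run is a (labels, scores) pair."""
    return {
        "accuracy": mean(accuracy(y, s) for y, s in runs),
        "roc_auc": mean(roc_auc(y, s) for y, s in runs),
    }


# Toy data (hypothetical): the same four test pairs scored by two runs.
labels = [1, 0, 1, 0]
runs = [
    (labels, [0.9, 0.2, 0.6, 0.4]),
    (labels, [0.8, 0.7, 0.3, 0.1]),
]
summary = summarize_runs(runs)
print(summary)
```

The pairwise formulation of ROC AUC is quadratic in the number of examples, which is fine for illustration. On a real test set, a rank-based implementation from an established library would be the more practical choice.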