Bioinformatics. 2022 Jun 27;38(Suppl 1):i238–i245. doi: 10.1093/bioinformatics/btac256

Table 8.

Zero-shot and trained prediction performance on specific classes with more than 100 annotations

| Ontology | Term | Name | AUC (test) | AUC (all) | AUC (trained) | AUC (trained mlp) |
|---|---|---|---|---|---|---|
| mf | GO:0001227 | DNA-binding transcription repressor activity, RNA polymerase II-specific | 0.257 | 0.405 | 0.932 | 0.926 |
| mf | GO:0001228 | DNA-binding transcription activator activity, RNA polymerase II-specific | 0.574 | 0.699 | 0.948 | 0.944 |
| mf | GO:0003735 | Structural constituent of ribosome | 0.400 | 0.194 | 0.940 | 0.942 |
| mf | GO:0004867 | Serine-type endopeptidase inhibitor activity | 0.972 | 0.967 | 0.985 | 0.984 |
| mf | GO:0005096 | GTPase activator activity | 0.847 | 0.870 | 0.938 | 0.960 |
| bp | GO:0000381 | Regulation of alternative mRNA splicing, via spliceosome | 0.855 | 0.865 | 0.906 | 0.886 |
| bp | GO:0032729 | Positive regulation of interferon-gamma production | 0.870 | 0.919 | 0.932 | 0.906 |
| bp | GO:0032755 | Positive regulation of interleukin-6 production | 0.719 | 0.819 | 0.884 | 0.873 |
| bp | GO:0032760 | Positive regulation of tumor necrosis factor production | 0.861 | 0.906 | 0.925 | 0.867 |
| bp | GO:0046330 | Positive regulation of JNK cascade | 0.855 | 0.894 | 0.904 | 0.916 |
| bp | GO:0051897 | Positive regulation of protein kinase B signaling | 0.772 | 0.864 | 0.888 | 0.915 |
| bp | GO:0120162 | Positive regulation of cold-induced thermogenesis | 0.637 | 0.789 | 0.738 | 0.835 |
| cc | GO:0005762 | Mitochondrial large ribosomal subunit | 0.889 | 0.975 | 0.874 | 0.916 |
| cc | GO:0022625 | Cytosolic large ribosomal subunit | 0.898 | 0.969 | 0.893 | 0.849 |
| cc | GO:0042788 | Polysomal ribosome | 0.858 | 0.950 | 0.889 | 0.780 |
| cc | GO:1904813 | Ficolin-1-rich granule lumen | 0.653 | 0.782 | 0.792 | 0.900 |
| Average | | | 0.745 | 0.804 | 0.898 | 0.900 |

Note: Evaluation measures are class-centric. AUC(test) is the zero-shot performance on the test set, i.e. neither the class nor the protein was seen during model training. AUC(all) is the zero-shot performance on all proteins, i.e. the class was never seen during training, but the model has seen the proteins (annotated with other classes). AUC(trained) and AUC(trained mlp) are the performance of the DeepGOZero and MLP models, respectively, on the test set when trained with the class (i.e. the test proteins were not seen, but other proteins annotated with the class were used during training).
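
As an illustration of what a class-centric AUC measures, the following minimal sketch computes the ROC AUC for a single class from per-protein prediction scores. It is not the authors' evaluation code: the function name `class_centric_auc` and the toy scores and labels are hypothetical, and the pairwise formulation (the fraction of positive/negative protein pairs that are ranked correctly, with ties counted as one half) is just one standard way to obtain the area under the ROC curve.

```python
from typing import Sequence


def class_centric_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC for one class (hypothetical helper, not from the paper).

    scores: predicted score of each protein for this class
    labels: 1 if the protein is annotated with the class, else 0
    Returns the fraction of (positive, negative) protein pairs in which the
    positive protein receives the higher score; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (assumed values): five proteins scored against one GO class.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
auc = class_centric_auc(scores, labels)

# Reversing the ranking gives the complementary value, 1 - auc.
inverted_auc = class_centric_auc([-s for s in scores], labels)
```

Under this definition, a value of 0.5 corresponds to a random ranking, and a value below 0.5 means that proteins annotated with the class tend to be ranked below unannotated ones.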