2021 May 11;7:e526. doi: 10.7717/peerj-cs.526

Table 1. Text methods on Cora for node classification (micro-F1; values lie between 0 and 1, and higher is better). Columns give the percentage of labeled nodes used for training.

Method \ % Labels       5%            10%           30%           50%
BoW                     0.63 ± 0.01   0.68 ± 0.01   0.76 ± 0.01   0.78 ± 0.01
TF-IDF                  0.35 ± 0.01   0.49 ± 0.01   0.70 ± 0.01   0.76 ± 0.01
LDA                     0.49 ± 0.01   0.57 ± 0.01   0.60 ± 0.01   0.61 ± 0.01
SBERT pretrained        0.57 ± 0.01   0.61 ± 0.01   0.68 ± 0.01   0.70 ± 0.01
Word2Vec pretrained     0.34 ± 0.01   0.44 ± 0.01   0.59 ± 0.01   0.63 ± 0.01
Word2Vec (d = 300)      0.64 ± 0.01   0.68 ± 0.01   0.70 ± 0.01   0.71 ± 0.01
Word2Vec (d = 64)       0.65 ± 0.01   0.68 ± 0.01   0.70 ± 0.01   0.72 ± 0.01
Doc2Vec pretrained      0.54 ± 0.01   0.61 ± 0.00   0.65 ± 0.01   0.67 ± 0.01
Doc2Vec (d = 300)       0.49 ± 0.01   0.58 ± 0.01   0.66 ± 0.01   0.68 ± 0.01
Doc2Vec (d = 64)        0.50 ± 0.02   0.58 ± 0.01   0.65 ± 0.00   0.67 ± 0.01
Sent2Vec pretrained     0.63 ± 0.02   0.69 ± 0.01   0.74 ± 0.01   0.77 ± 0.01
Sent2Vec (d = 600)      0.68 ± 0.02   0.72 ± 0.01   0.75 ± 0.01   0.77 ± 0.01
Sent2Vec (d = 64)       0.68 ± 0.02   0.72 ± 0.01   0.75 ± 0.01   0.77 ± 0.01
Ernie pretrained        0.43 ± 0.01   0.52 ± 0.01   0.62 ± 0.01   0.65 ± 0.01

Note: The best values with respect to confidence intervals are highlighted in bold.
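
For readers unfamiliar with the metric in Table 1, the following is a minimal sketch of how a micro-averaged F1 score can be computed for single-label node classification. It is not the authors' evaluation code: the function name `micro_f1` and the toy labels are hypothetical, chosen only for illustration. True positives, false positives, and false negatives are pooled over all classes before F1 is computed once.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # predicted this class, and it is correct
            elif p == label and t != label:
                fp += 1  # predicted this class, but the node belongs elsewhere
            elif p != label and t == label:
                fn += 1  # node of this class was assigned to another class
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example with made-up labels (not taken from Cora).
y_true = ["theory", "nn", "nn", "rl", "theory"]
y_pred = ["theory", "nn", "rl", "rl", "nn"]
score = micro_f1(y_true, y_pred)
```

In this toy example three of the five predictions are correct, so `score` is 0.6. When every node has exactly one true and one predicted class, as here, micro-F1 coincides with plain accuracy.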