2022 Oct 25;11:229. doi: 10.1186/s13643-022-02082-4

Table 1.

Hyperparameter search space for convolutional neural networks and support vector machines

| Hyperparameter | Values checked | Chosen value |
| --- | --- | --- |
| For all models | | |
| Sampling ratio (non-CRT:CRT) | (1411:589), (2411:589), (3411:589), (4411:589) | 3411:589 |
| Class weights (non-CRT:CRT) | (1:1), (1:5), (0.59:3.4), (1:17), (1:20) | 0.59:3.4 |
| Metric | AUROC | AUROC |
| Convolutional neural network (Word2Vec) | | |
| Max length of each abstract | 100, 150, 200, 250, 300, 350 | 300 |
| Batch size (distribution) | Uniform distribution (10, 30) | 11 |
| Learning rate (distribution) | Uniform distribution (0.0005, 0.005) | 0.0047 |
| Dropout rate (distribution) | Uniform distribution (0.1, 0.5) | 0.29 |
| Number of filters (distribution) | Uniform distribution (64, 1526) | 923 |
| Kernel size (distribution) | Uniform distribution (3, 12) | 8 |
| Number of epochs (distribution) | Uniform distribution (3, 20) | 7 |
| Constraint applied to the kernel matrix (distribution) | 1, 1.5, 2, 2.5, 3 | 2 |
| Optimizer (distribution) | Adadelta, Adam | Adam |
| Embedding | Skip-gram; CBOW | Skip-gram |
| Embedding dimensions | 50, 100, 200, 300 | 100 |
| Number of embedding iterations | 5, 10, 15, 20 | 10 |
| Loss | Binary cross-entropy | Binary cross-entropy |
| Convolutional neural network (FastText) | | |
| Max length of each abstract | 100, 150, 200, 250, 300, 350 | 300 |
| Batch size (distribution) | Uniform distribution (10, 30) | 16 |
| Learning rate (distribution) | Uniform distribution (0.0005, 0.005) | 0.0026 |
| Dropout rate (distribution) | Uniform distribution (0.1, 0.5) | 0.47 |
| Number of filters (distribution) | Uniform distribution (64, 1526) | 532 |
| Kernel size (distribution) | Uniform distribution (3, 12) | 11 |
| Number of epochs (distribution) | Uniform distribution (3, 20) | 14 |
| Constraint applied to the kernel matrix (distribution) | 1, 1.5, 2, 2.5, 3 | 2 |
| Optimizer (distribution) | Adadelta, Adam | Adam |
| Embedding | Skip-gram; CBOW | Skip-gram |
| Embedding dimensions | 50, 100, 200, 300 | 100 |
| Number of embedding iterations | 5, 10, 15, 20 | 10 |
| Loss | Binary cross-entropy | Binary cross-entropy |
| Support vector machines | | |
| Kernel | Linear, polynomial, sigmoid, or radial basis function | Radial basis function |
| Kernel coefficient | 1, 0.1, 0.01, 0.001, 0.0001 | 0.001 |
| Regularization parameter | 1, 10, 100, 1000 | 100 |
| Ngrams | 1, 1 to 2, 1 to 3, 1 to 4 | 1-gram and bi-gram (1 to 2) |
| Word vectorization | Bag of Words, TF-IDF | TF-IDF |

Abbreviations: CRT, cluster randomized trial; Ngrams, a sequence of n words from a text document; TF-IDF, term frequency-inverse document frequency; CBOW, continuous bag of words model; AUROC, area under the receiver operating characteristic curve
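
The table does not say how the candidate class weights of 0.59:3.4 were obtained. As an illustrative check only, the sketch below shows that inverse-frequency ("balanced") weighting, n_samples / (n_classes * n_class), applied to the chosen 3411:589 sampling ratio yields the same pair after rounding. The function name `balanced_class_weights` is hypothetical, and treating the weights as derived this way is an assumption, not something the source confirms.

```python
# Class counts taken from the chosen sampling ratio in Table 1
# (non-CRT:CRT = 3411:589).
counts = {"non-CRT": 3411, "CRT": 589}


def balanced_class_weights(class_counts):
    """Inverse-frequency weights: n_samples / (n_classes * n_class).

    Hypothetical helper; the source does not state that the authors
    computed their weights this way.
    """
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * n)
        for label, n in class_counts.items()
    }


weights = balanced_class_weights(counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.2f}")
```

Under this assumption the minority class (CRT) receives roughly 5.8 times the weight of the majority class, which is the same as the ratio of the class counts (3411/589).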
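
To make the search space in Table 1 concrete, the following minimal sketch encodes it as plain Python data, draws random configurations for the convolutional neural network rows, and enumerates the support vector machine candidates. It uses only the standard library. Several points are assumptions rather than details from the source: the dictionary keys and the `sample_cnn_config` helper are hypothetical names, count-like parameters (batch size, filters, kernel size, epochs) are drawn as integers, and the SVM values are treated as an exhaustive grid.

```python
import itertools
import random

# CNN search space transcribed from Table 1.
# Assumption: count-like parameters are sampled as integers.
CNN_INT_RANGES = {
    "batch_size": (10, 30),
    "num_filters": (64, 1526),
    "kernel_size": (3, 12),
    "epochs": (3, 20),
}
CNN_FLOAT_RANGES = {
    "learning_rate": (0.0005, 0.005),
    "dropout_rate": (0.1, 0.5),
}
CNN_CHOICES = {
    "max_length": [100, 150, 200, 250, 300, 350],
    "kernel_constraint": [1, 1.5, 2, 2.5, 3],
    "optimizer": ["Adadelta", "Adam"],
    "embedding": ["Skip-gram", "CBOW"],
    "embedding_dim": [50, 100, 200, 300],
    "embedding_iterations": [5, 10, 15, 20],
}

# SVM values transcribed from Table 1 (key names are hypothetical).
SVM_GRID = {
    "kernel": ["linear", "polynomial", "sigmoid", "rbf"],
    "kernel_coefficient": [1, 0.1, 0.01, 0.001, 0.0001],
    "regularization": [1, 10, 100, 1000],
    "ngram_range": [(1, 1), (1, 2), (1, 3), (1, 4)],
    "vectorizer": ["bag-of-words", "tf-idf"],
}


def sample_cnn_config(rng):
    """Draw one CNN configuration uniformly from the ranges above."""
    config = {}
    for name, (low, high) in CNN_INT_RANGES.items():
        config[name] = rng.randint(low, high)  # both endpoints included
    for name, (low, high) in CNN_FLOAT_RANGES.items():
        config[name] = rng.uniform(low, high)
    for name, options in CNN_CHOICES.items():
        config[name] = rng.choice(options)
    return config


rng = random.Random(0)  # fixed seed so the draws are repeatable
cnn_trials = [sample_cnn_config(rng) for _ in range(5)]

# Cartesian product of every SVM value list, one dict per candidate.
svm_candidates = [
    dict(zip(SVM_GRID, combo))
    for combo in itertools.product(*SVM_GRID.values())
]
print(len(svm_candidates))  # 4 * 5 * 4 * 4 * 2 = 640
```

In a real search, each sampled or enumerated configuration would be used to train a model and scored with the metric listed in the table (AUROC); that training and scoring step is deliberately left out here.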