2022 Aug 25;6:309. Originally published 2021 Nov 12. [Version 2] doi: 10.12688/wellcomeopenres.17164.2

Table 2. Optimal hyper-parameters for models with and without reader embeddings.

Hyper-parameter	ResNet18, without reader embeddings	ResNet18, with reader embeddings	ResNet34, without reader embeddings	ResNet34, with reader embeddings	ResNet50, without reader embeddings	ResNet50, with reader embeddings
Activation function for projected reader embeddings		identity		identity		identity
Batch size	8	32	16	16	8	16
Dropout	0.22	0.28	0.35	0.05	0.36	0.01
L2 regularization of convolutional layers	0.199886	1.9E-05	0.000163	5E-06	0.256886	0.00443
L2 regularization of fully connected layer	4.8E-05	5.1E-05	0.000242	5E-06	1E-05	2.7E-05
L2 regularization of fully connected layer projecting the reader embeddings		4E-06		0.291381		2E-06
Learning rate for convolutional layers	2.1E-05	0.000346	0.000474	0.000282	2E-05	3.6E-05
Learning rate for fully connected layer	0.002909	0.049604	0.00163	0.023335	2.6E-05	0.029704
Learning rate for fully connected layer projecting the reader embeddings		0.000923		0.000141		0.020499
Learning rate for reader embeddings		0.001818		0.007738		0.009301
Max L2-norm of reader embeddings		1		4		1
Proportion of images with color, brightness, and contrast augmentation	0.2	0.5	0	0	0.5	1
Proportion of training images with affine transformation augmentation	0.8	0.2	1	0.2	1	0.5
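
To make one column of Table 2 concrete, the following minimal sketch writes the ResNet18 settings with reader embeddings as a Python configuration and illustrates one possible reading of the "Max L2-norm of reader embeddings" row. The dictionary keys and the function `clip_to_max_norm` are hypothetical names introduced here. The table does not say how the norm limit is enforced, so rescaling any embedding whose norm exceeds the limit is an assumption, not the authors' documented method.

```python
import math

# Values copied from the "ResNet18, with reader embeddings" column of Table 2.
# The key names are hypothetical and chosen only for this sketch.
RESNET18_WITH_READER_EMBEDDINGS = {
    "batch_size": 32,
    "dropout": 0.28,
    "l2_conv": 1.9e-05,
    "l2_fc": 5.1e-05,
    "l2_reader_projection": 4e-06,
    "lr_conv": 0.000346,
    "lr_fc": 0.049604,
    "lr_reader_projection": 0.000923,
    "lr_reader_embeddings": 0.001818,
    "max_embedding_norm": 1.0,
    "p_color_augmentation": 0.5,
    "p_affine_augmentation": 0.2,
}


def clip_to_max_norm(vector, max_norm):
    """Return a copy of `vector` rescaled so its L2 norm is at most `max_norm`.

    Assumption: the norm limit acts as a max-norm constraint, which leaves
    vectors already within the limit unchanged.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm <= max_norm or norm == 0.0:
        return list(vector)
    scale = max_norm / norm
    return [x * scale for x in vector]


# A toy two-dimensional embedding with L2 norm 5, rescaled to the limit of 1.
embedding = [3.0, 4.0]
clipped = clip_to_max_norm(
    embedding, RESNET18_WITH_READER_EMBEDDINGS["max_embedding_norm"]
)
```

The separate learning-rate and L2 entries suggest that each group of layers was optimized with its own settings. In a deep-learning framework these would typically be passed as per-group optimizer options, which is again an assumption about the implementation.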