Table 2.
All deep learning methods are timed on the test dataset of 255,701 samples.
| Model | # Params. | Batch | Hardware | Time (s) | Speedup |
|---|---|---|---|---|---|
| NUPACK 3 | N/A | N/A | 64-core VM | 372.59 | 1.00 |
| RoBERTa | 6.1M | 1024 | RTX 3090 | 388.44 ± 0.32 | 0.96 |
| RNN | 249K | 8192 | RTX 3090 | 15.87 ± 0.10 | 23.47 |
|  |  | 4096 | TPUv2 | 3.60 ± 0.11 | 103.50 |
| CNN | 2.8M | 512 | RTX 3090 | 23.84 ± 0.08 | 15.63 |
|  |  | 4096 | TPUv2 | **1.23 ± 0.17** | **301.74** |
|  | 470K | 512 | RTX 3090 | 9.01 ± 0.00 | 41.34 |
|  |  | 4096 | TPUv2 | 1.28 ± 0.15 | 290.21 |
Bold marks the best model by time/speedup.
The average execution time and its standard deviation are reported in seconds. Each deep learning method is run 10 times after an initial warm-up run. The time taken to load the dataset into memory is not counted, and the batch size was chosen to minimise inference time. All deep learning models use consumer hardware or openly available hardware (the TPU platform is completely free to use).
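For reference, a minimal sketch of this timing protocol, assuming a generic `predict_fn` callable and pre-batched, in-memory data (the framework and exact harness are assumptions, not specified by the table):

```python
import time
import statistics

def benchmark(predict_fn, batches, n_runs=10):
    """Return (mean, std) of inference wall time in seconds.

    `predict_fn` is a hypothetical stand-in for a model's forward
    pass; `batches` are already loaded in memory, so data loading
    is excluded from the measurement, as in Table 2.
    """
    for batch in batches:          # warm-up run, not timed
        predict_fn(batch)
    timings = []
    for _ in range(n_runs):        # 10 timed runs
        start = time.perf_counter()
        for batch in batches:
            predict_fn(batch)
        # NB: on a GPU/TPU, a device synchronisation is needed here
        # before stopping the clock, since execution is asynchronous.
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)

# Speedup is the baseline wall time divided by the model's wall time,
# e.g. for the RNN on TPUv2: 372.59 / 3.60 ≈ 103.5 (cf. Table 2).
```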