2023 Apr 14;4(4):100729. doi: 10.1016/j.patter.2023.100729

Table 5.

Comparison of test performance (and standard deviation) on the ChemProt and DDI relation extraction tasks with major layer-specific adaptation methods, for reduced numbers of training instances

No. of training instances	Baseline	Layer freeze	Layerwise decay	Layer reinit
ChemProt
100	22.45* (3.22)	20.39 (2.53)	20.50 (2.44*)	19.33 (3.18)
500	44.77 (3.71)	48.40 (2.85)	48.55* (1.78*)	43.79 (3.29)
1000	56.62 (1.93)	59.91* (1.26*)	59.67 (1.39)	55.31 (2.65)
DDI
100	10.72 (2.93)	11.13* (3.73)	10.34 (2.50*)	9.83 (2.64)
500	34.36 (5.46)	39.78 (4.39)	40.15* (3.34*)	36.67 (5.50)
1000	58.71 (2.87)	61.40 (2.53)	61.54* (1.46*)	58.67 (3.54)

* Highest performance or lowest standard deviation in the row. Standard deviations are given in parentheses.
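To make the size of the differences easier to read, the sketch below recomputes each method's change in mean score relative to the baseline, using only the mean values transcribed from Table 5. The names `SCORES`, `METHODS`, `gains_over_baseline`, and `best_method` are illustrative and do not come from the paper; standard deviations are deliberately left out, so the output says nothing about statistical significance.

```python
# Mean test scores transcribed from Table 5 (standard deviations omitted).
# Column order follows the table header.
METHODS = ["Baseline", "Layer freeze", "Layerwise decay", "Layer reinit"]

SCORES = {
    ("ChemProt", 100): [22.45, 20.39, 20.50, 19.33],
    ("ChemProt", 500): [44.77, 48.40, 48.55, 43.79],
    ("ChemProt", 1000): [56.62, 59.91, 59.67, 55.31],
    ("DDI", 100): [10.72, 11.13, 10.34, 9.83],
    ("DDI", 500): [34.36, 39.78, 40.15, 36.67],
    ("DDI", 1000): [58.71, 61.40, 61.54, 58.67],
}


def gains_over_baseline(scores):
    """Map (task, n) to {method: mean score minus baseline mean}."""
    gains = {}
    for key, row in scores.items():
        baseline = row[0]
        gains[key] = {
            method: round(value - baseline, 2)
            for method, value in zip(METHODS[1:], row[1:])
        }
    return gains


def best_method(scores):
    """Map (task, n) to the name of the method with the highest mean score."""
    best = {}
    for key, row in scores.items():
        top_index = max(range(len(row)), key=lambda i: row[i])
        best[key] = METHODS[top_index]
    return best


if __name__ == "__main__":
    gains = gains_over_baseline(SCORES)
    winners = best_method(SCORES)
    for (task, n), row in gains.items():
        cells = ", ".join(f"{m}: {g:+.2f}" for m, g in row.items())
        print(f"{task:8s} n={n:<4d} best={winners[(task, n)]:15s} {cells}")
```

Run on the table's values, the largest change is for DDI with 500 training instances, where layerwise decay is 5.79 points above the baseline mean; with 100 ChemProt instances every adaptation method falls below the baseline.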