Table 3.
Results on Split CIFAR-100
| Strategy | Method | Budget | GM | Task-IL | Domain-IL | Class-IL |
|---|---|---|---|---|---|---|
| Baselines | None – lower target | - | - | 61.43 (±0.36) | 18.42 (±0.33) | 7.71 (±0.18) |
| | Joint – upper target | - | - | 78.78 (±0.25) | 46.85 (±0.51) | 49.78 (±0.21) |
| Context-specific components | Separate Networks | - | - | 76.83 (±0.25) | - | - |
| | XdG | - | - | 69.86 (±0.34) | - | - |
| Parameter regularization | EWC | - | - | 76.34 (±0.29) | 21.65 (±0.55) | 8.24 (±0.25) |
| | SI | - | - | 74.84 (±0.39) | 22.58 (±0.42) | 8.10 (±0.24) |
| Functional regularization | LwF | - | - | 78.59 (±0.24) | 29.45 (±0.39) | 25.57 (±0.27) |
| Replay | DGR | - | Yes | 71.40 (±0.32) | 20.52 (±0.43) | 9.67 (±0.22) |
| | BI-R | - | Yes | 79.14 (±0.21) | 30.26 (±0.44) | 25.81 (±0.41) |
| | ER | 100 | - | 76.43 (±0.24) | 39.00 (±0.34) | 37.57 (±0.21) |
| | A-GEM | 100 | - | 73.30 (±0.39) | 20.51 (±0.59) | 20.38 (±1.45) |
| Template-based classification | Generative Classifier | - | Yes | - | - | 46.83 (±0.18) |
| | iCaRL | 100 | - | - | - | 37.83 (±0.21) |
Reported is the final test accuracy (as a percentage, averaged over all contexts) of all compared methods on the Split CIFAR-100 protocol, which was performed according to all three scenarios. The experiments followed the academic continual learning setting, and context identity information was available during training. The column ‘Budget’ indicates the number of examples per class that were allowed to be stored in a memory buffer. The column ‘GM’ indicates whether a generative model was learned, for which additional network capacity was used. Note that we were not able to run the method FROMP on this protocol because of its high computational cost. Each experiment was performed 10 times with different random seeds; reported is the mean (±s.e.m.) over these runs. All compared methods used convolutional layers that were pre-trained on CIFAR-10; see Methods for full details.
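As a sketch of how each table entry can be derived, the snippet below computes the mean and standard error of the mean (s.e.m.) over repeated runs. The per-seed accuracies used here are hypothetical placeholders for illustration, not raw data from the experiments.

```python
import statistics

def mean_sem(accuracies):
    """Return (mean, s.e.m.) over a list of per-seed final accuracies.

    s.e.m. = sample standard deviation / sqrt(number of runs).
    """
    n = len(accuracies)
    mean = statistics.mean(accuracies)
    sem = statistics.stdev(accuracies) / n ** 0.5
    return mean, sem

# Hypothetical final test accuracies from 10 random seeds (placeholder values).
runs = [76.1, 76.5, 76.3, 76.6, 76.2, 76.4, 76.5, 76.3, 76.7, 76.4]
m, s = mean_sem(runs)
print(f"{m:.2f} (±{s:.2f})")  # formatted like the table entries
```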