2024 Mar 8;84(3):241. doi: 10.1140/epjc/s10052-024-12607-x

Table 2.

Numerical breakdown of the events used to construct the given number of synthetic SM samples (in the SR) for the four machine learning methods considered in this report. The CURTAINs method uses a slightly narrower SB region of [2.7, 4.5] TeV to avoid transforming events across the border of the $m_{JJ}$ turn-on region. The SALAD samples are generated by applying the learned weights to an additional, much larger set of Herwig++ simulated SM events not contained in the LHC Olympics dataset. Note that CATHODE and CURTAINs are data-exclusive (i.e. fully data-driven), using only the “detected” (DAT) Pythia set, while SALAD and FETA also require an auxiliary “simulated” (SIM) Herwig++ set.

Method	Training data	Validation data	# Samples	Oversampling
SALAD	793k SIM, 696k DAT	198k SIM, 174k DAT	1,045k	N/A
CATHODE	696k DAT	174k DAT	400k	3
CURTAINs	373k DAT	93k DAT	1,887k	4
FETA	793k SIM, 696k DAT	198k SIM, 174k DAT	732k	6
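
As a quick consistency check on the table, the training and validation counts in each row appear to correspond to a split of roughly 80/20. The short Python sketch below computes the training fraction from the rounded counts listed above. The dictionary keys and the helper name `training_fraction` are illustrative and do not come from the source, and the 80/20 reading is inferred from the rounded numbers rather than stated in the caption.

```python
# Rounded event counts (in thousands) copied from the table above,
# given as (training, validation). The labels are illustrative only.
counts = {
    "SIM (SALAD, FETA)": (793, 198),
    "DAT (SALAD, CATHODE, FETA)": (696, 174),
    "DAT (CURTAINs)": (373, 93),
}


def training_fraction(train, val):
    """Return the fraction of events assigned to the training set."""
    return train / (train + val)


# Training fraction for each distinct (training, validation) pair.
fractions = {
    name: training_fraction(train, val)
    for name, (train, val) in counts.items()
}

for name, frac in fractions.items():
    print(f"{name}: training fraction = {frac:.3f}")
```

Because the counts are rounded to the nearest thousand, the computed fractions can only be expected to agree with an exact 80/20 split to about three decimal places.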