2024 Mar 8;84(3):241. doi: 10.1140/epjc/s10052-024-12607-x

Table 2.

Numerical breakdown of the events used to construct the listed number of synthetic SM samples (in the SR) for the four machine learning methods considered in this report. The CURTAINs method uses a slightly narrower SB region of [2.7, 4.5] TeV to avoid transforming events across the border of the m_JJ turn-on region. The SALAD samples are generated by applying the learned weights to an additional, much larger set of Herwig++ simulated SM events not contained in the LHC Olympics dataset. Note that CATHODE and CURTAINs are fully data-driven, using only the “detected” (DAT) Pythia set, while SALAD and FETA also require an auxiliary “simulated” (SIM) Herwig++ set.

| Method | Training data | Validation data | # Samples | Oversampling |
|---|---|---|---|---|
| SALAD | 793k SIM, 696k DAT | 198k SIM, 174k DAT | 1,045k | N/A |
| CATHODE | 696k DAT | 174k DAT | 400k | 3 |
| CURTAINs | 373k DAT | 93k DAT | 1,887k | 4 |
| FETA | 793k SIM, 696k DAT | 198k SIM, 174k DAT | 732k | 6 |
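
The training and validation counts in the table can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the dictionary layout and the helper name `train_fraction` are assumptions made for illustration, and the counts are the rounded values (in thousands of events) transcribed from Table 2, so the resulting fractions are only approximate.

```python
# Event counts in thousands, transcribed from Table 2 (rounded values).
# SIM = Herwig++ "simulated" set, DAT = Pythia "detected" set.
TABLE_2 = {
    "SALAD": {"train": {"SIM": 793, "DAT": 696}, "val": {"SIM": 198, "DAT": 174}},
    "CATHODE": {"train": {"DAT": 696}, "val": {"DAT": 174}},
    "CURTAINs": {"train": {"DAT": 373}, "val": {"DAT": 93}},
    "FETA": {"train": {"SIM": 793, "DAT": 696}, "val": {"SIM": 198, "DAT": 174}},
}


def train_fraction(method: str, source: str) -> float:
    """Fraction of events from `source` used for training (hypothetical helper)."""
    row = TABLE_2[method]
    n_train = row["train"][source]
    n_val = row["val"][source]
    return n_train / (n_train + n_val)


if __name__ == "__main__":
    # Print the training fraction for every (method, source) pair in the table.
    for method, row in TABLE_2.items():
        for source in row["train"]:
            frac = train_fraction(method, source)
            print(f"{method:9s} {source}: {frac:.3f}")
```

With these rounded counts, every method and source gives a training fraction close to 0.80, which is consistent with an approximately 80/20 training/validation split. The split itself is not stated in the caption, so this is an inference from the numbers rather than a documented setting.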