Table 1. LLM Inference and Decoding Hyperparameters.
| Hyperparameter | Description | Default Value | LLM Role / Effect |
|---|---|---|---|
| | Number of LLM agents in the debate | 3 | Total agents (mixture of cooperative + adversarial) |
| | Number of adversarial LLM agents | 1 | Controls adversarial presence inside the group |
| | Number of debate rounds per question | 3 | Depth of iterative LLM interaction and persuasion |
| | Number of repetitions of each configuration | 1 | Re-runs to estimate robustness and variance |
| | Number of adversarial candidate arguments generated for scoring | 10 | Number of alternative arguments sampled from the adversarial model for selection |
| | GPU device index for HuggingFace model inference | 1 | Hardware assignment for non-OpenAI LLMs |
| | Number of parallel completions for argument generation/selection | 1 | Used to generate multiple argument candidates via OpenAI models |
| | Device allocation strategy for HF models | "auto" | Automatically maps LLM weights across available GPUs/CPU |
| | Maximum number of new tokens generated by HF models | 1000 | Controls length of agent responses and debate arguments |
| | Whether HuggingFace model uses stochastic sampling | True | Enables non-deterministic sampling for diverse arguments and responses [42] |
| | Sampling temperature for HF decoding | 0.6 | Controls randomness; higher values give more diverse generations |
| | Nucleus sampling probability cutoff | 0.9 | Restricts sampling to the top-p portion of probability mass |
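
To make the defaults in Table 1 concrete, the following is a minimal sketch of how they could be gathered into a single configuration object. The class name `DebateConfig`, all field names, and the validation rules are assumptions made for illustration, since the table as given does not show the original parameter names. Only the default values come from the table. The keys returned by `generation_kwargs` (`max_new_tokens`, `do_sample`, `temperature`, `top_p`) match the keyword arguments accepted by `generate` in Hugging Face `transformers`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebateConfig:
    """Hypothetical container for the Table 1 defaults (field names are assumed)."""

    n_agents: int = 3                 # total agents in the debate
    n_adversaries: int = 1            # adversarial agents within the group
    n_rounds: int = 3                 # debate rounds per question
    n_repetitions: int = 1            # re-runs of each configuration
    n_candidate_arguments: int = 10   # adversarial candidates sampled for scoring
    gpu_index: int = 1                # GPU used for Hugging Face inference
    n_parallel_completions: int = 1   # parallel completions for OpenAI models
    device_map: str = "auto"          # device allocation strategy for HF models
    max_new_tokens: int = 1000        # cap on generated tokens
    do_sample: bool = True            # stochastic sampling on/off
    temperature: float = 0.6          # sampling temperature
    top_p: float = 0.9                # nucleus sampling cutoff

    def __post_init__(self) -> None:
        # Assumed sanity checks; the source does not state these constraints.
        if not 0 <= self.n_adversaries <= self.n_agents:
            raise ValueError("n_adversaries must be between 0 and n_agents")
        if self.temperature <= 0:
            raise ValueError("temperature must be positive")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be in (0, 1]")

    def generation_kwargs(self) -> dict:
        """Return decoding arguments for a Hugging Face generate call."""
        kwargs = {
            "max_new_tokens": self.max_new_tokens,
            "do_sample": self.do_sample,
        }
        # Temperature and top-p only matter when sampling is enabled.
        if self.do_sample:
            kwargs["temperature"] = self.temperature
            kwargs["top_p"] = self.top_p
        return kwargs


if __name__ == "__main__":
    print(DebateConfig().generation_kwargs())
```

Under these assumptions, the returned dictionary could be unpacked into a call such as `model.generate(**inputs, **cfg.generation_kwargs())`, and `cfg.device_map` could be passed as the `device_map` argument of `from_pretrained`. Omitting `temperature` and `top_p` when sampling is disabled keeps greedy decoding free of unused sampling arguments.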











