Table 3.
Hyper-parameter settings for training the teacher candidates and a non-KD student.
Hyper-parameter | Value |
---|---|
Batch Size | 16 |
Optimizer | Adam |
Epochs | 30 |
Learning Rate | 0.0001 |
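
As a minimal sketch, the settings in Table 3 could be collected in a single configuration object so that every teacher candidate and the non-KD student are trained with identical values. The class name `TrainConfig`, its field names, and the helper `total_steps` are illustrative assumptions, not part of the original work; only the four values come from the table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from Table 3; class and field names are hypothetical."""
    batch_size: int = 16
    optimizer: str = "Adam"
    epochs: int = 30
    learning_rate: float = 1e-4

    def total_steps(self, num_samples: int) -> int:
        """Number of optimizer updates over the full run.

        Assumes the final partial batch of each epoch is kept, hence the ceiling.
        """
        return math.ceil(num_samples / self.batch_size) * self.epochs


# One shared, immutable instance for all models trained under Table 3.
config = TrainConfig()
```

Here `num_samples` is a placeholder for the training-set size, which the table does not give. Freezing the dataclass guards against one model's run silently diverging from the shared settings.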