Table 1:
Clipping type | NTK matrix | Symmetric NTK | Positive in quadratic form | Positive in eigenvalues | Loss convergence | Monotone loss decay | To zero loss |
---|---|---|---|---|---|---|---|
No clipping | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Batch clipping | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Large R clipping (Flat & layerwise) | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Small R clipping (Flat) | | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ |
Small R clipping (Layerwise) | | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
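
To make the flat versus layerwise distinction in the table concrete, the following is a minimal sketch of the two per-sample clipping styles. It assumes the common clipping factor min(1, R / ‖g‖), which the table itself does not specify; the helper names (`l2_norm`, `clip_factor`, `flat_clip`, `layerwise_clip`) and the toy gradients are hypothetical and are not taken from the source.

```python
import math

def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))

def clip_factor(norm, R):
    # Assumed clipping function min(1, R / norm); returns 1.0 when norm <= R,
    # which also avoids dividing by zero for an all-zero gradient.
    return 1.0 if norm <= R else R / norm

def flat_clip(per_sample_grads, R):
    """Flat clipping: one norm per sample, computed over all layers jointly.

    per_sample_grads is a list of samples; each sample is a list of layers;
    each layer is a list of floats.
    """
    clipped = []
    for layers in per_sample_grads:
        norm = l2_norm([x for layer in layers for x in layer])
        c = clip_factor(norm, R)
        clipped.append([[c * x for x in layer] for layer in layers])
    return clipped

def layerwise_clip(per_sample_grads, R_per_layer):
    """Layerwise clipping: each layer of each sample is clipped by its own norm."""
    clipped = []
    for layers in per_sample_grads:
        clipped.append([
            [clip_factor(l2_norm(layer), R) * x for x in layer]
            for layer, R in zip(layers, R_per_layer)
        ])
    return clipped

# Toy per-sample gradients for a two-layer model (illustrative values only).
grads = [
    [[3.0, 4.0], [0.0, 12.0]],  # sample 0: joint norm 13, layer norms 5 and 12
    [[0.3, 0.4], [0.0, 0.0]],   # sample 1: joint norm 0.5, below R = 1
]
flat = flat_clip(grads, R=1.0)
layerwise = layerwise_clip(grads, R_per_layer=[1.0, 1.0])
```

Under this assumed factor, a clipping norm R larger than every per-sample gradient norm gives a factor of 1 for all samples, so the gradients pass through unchanged. In the small R regime, flat clipping rescales each sample by a single scalar, while layerwise clipping rescales each layer of each sample separately, so the direction of the per-sample gradient can change as well.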