Table 1.
(a) difference d | description |
---|---|
$x_i - m_j$ | input–output value difference; used in likelihood terms |
$m_i - m_j$ | output–output value difference; used in regularization terms |
$x_i - x_j$ | input–input value difference; used in both likelihood and regularization terms |
$i - j$ | sequence difference; used in both likelihood and regularization terms |
(b) kernel function | description |
---|---|
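$1$ | global |
$I(\lvert d \rvert \le W)$ | hard (local in either value or sequence) |
$I(\lvert d \rvert^2/2 \le W)$ | hard (local in either value or sequence) |
$\exp(-\beta \lvert d \rvert)$ | soft (semi-local in either value or sequence) |
$\exp(-\beta \lvert d \rvert^2/2)$ | soft (semi-local in either value or sequence) |
$I(d = 1)$ | isolates only sequentially adjacent terms when used as sequence kernel |
$I(d = 0)$ | isolates only terms that have the same index when used as sequence kernel |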
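As a minimal sketch of how the differences in (a) and the kernels in (b) might be evaluated, the following NumPy code builds the pairwise difference matrices by broadcasting and applies each kernel elementwise. The function names, the `squared` flag and the dictionary keys are illustrative assumptions, not notation from the source.

```python
import numpy as np

def pairwise_differences(x, m):
    """Difference matrices of Table 1(a), indexed [i, j] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(m, dtype=float)
    idx = np.arange(len(x))
    return {
        "x_i - m_j": x[:, None] - m[None, :],      # input-output
        "m_i - m_j": m[:, None] - m[None, :],      # output-output
        "x_i - x_j": x[:, None] - x[None, :],      # input-input
        "i - j": idx[:, None] - idx[None, :],      # sequence
    }

def _magnitude(d, squared):
    # |d| or |d|^2 / 2, matching the two variants listed for each kernel
    a = np.abs(np.asarray(d, dtype=float))
    return a**2 / 2 if squared else a

def kernel_global(d):
    return np.ones_like(d, dtype=float)

def kernel_hard(d, W, squared=False):
    # Indicator I(|d| <= W) or I(|d|^2/2 <= W)
    return (_magnitude(d, squared) <= W).astype(float)

def kernel_soft(d, beta, squared=False):
    # exp(-beta |d|) or exp(-beta |d|^2/2)
    return np.exp(-beta * _magnitude(d, squared))

def kernel_adjacent(d):
    # I(d = 1): keeps only pairs with i - j = 1 when d is the sequence difference
    return (np.asarray(d) == 1).astype(float)

def kernel_same_index(d):
    # I(d = 0): keeps only pairs with i = j when d is the sequence difference
    return (np.asarray(d) == 0).astype(float)
```

For example, applying `kernel_hard` to the `"x_i - x_j"` matrix selects pairs of samples that are close in value, while applying `kernel_adjacent` to the `"i - j"` matrix selects pairs that are neighbours in the sequence.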
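(c) loss function | influence function (derivative of loss function): kernel × direction | composition |
---|---|---|
$L_0(d) = \lvert d \rvert^0$ | | simple |
$L_1(d) = \lvert d \rvert^1$ | | simple |
$L_2(d) = \lvert d \rvert^2/2$ | | simple |
$L_{W,1}(d) = \min(\lvert d \rvert, W)$ | | composite |
$L_{W,2}(d) = \min(\lvert d \rvert^2/2, W)$ | | composite |
$L_{\beta,1}(d) = 1 - \exp(-\beta \lvert d \rvert)/\beta$ | | composite |
$L_{\beta,2}(d) = 1 - \exp(-\beta \lvert d \rvert^2/2)/\beta$ | | composite |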
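The loss functions in (c) can be sketched in NumPy, together with candidate influence functions written as a kernel from (b) times a direction (sgn(d) or d). These influence expressions are derived here by differentiating the losses; they are not read from the table, whose influence-function cells are empty in this copy. The function names, the finite-difference check, and the convention that $\lvert 0 \rvert^0 = 0$ in `loss_L0` are assumptions. The two β losses are implemented exactly as written above; reading them as $(1 - \exp(\cdot))/\beta$ instead would change only a constant offset, not the derivative.

```python
import numpy as np

# Loss functions of Table 1(c)
def loss_L0(d):
    return (np.asarray(d) != 0).astype(float)   # assumes |0|^0 = 0

def loss_L1(d):
    return np.abs(d)

def loss_L2(d):
    return np.abs(d)**2 / 2

def loss_W1(d, W):
    return np.minimum(np.abs(d), W)

def loss_W2(d, W):
    return np.minimum(np.abs(d)**2 / 2, W)

def loss_beta1(d, beta):
    return 1 - np.exp(-beta * np.abs(d)) / beta

def loss_beta2(d, beta):
    return 1 - np.exp(-beta * np.abs(d)**2 / 2) / beta

# Derived influence functions, each written as kernel * direction
def influence_L1(d):
    return 1.0 * np.sign(d)

def influence_L2(d):
    return 1.0 * d

def influence_W1(d, W):
    return (np.abs(d) <= W) * np.sign(d)

def influence_W2(d, W):
    return (np.abs(d)**2 / 2 <= W) * d

def influence_beta1(d, beta):
    return np.exp(-beta * np.abs(d)) * np.sign(d)

def influence_beta2(d, beta):
    return np.exp(-beta * np.abs(d)**2 / 2) * d

def numerical_derivative(f, d, h=1e-6):
    """Central finite difference; valid away from kinks such as d = 0."""
    return (f(d + h) - f(d - h)) / (2 * h)
```

Comparing `numerical_derivative(lambda t: loss_W1(t, W), d)` with `influence_W1(d, W)` at points away from $d = 0$ and $\lvert d \rvert = W$ is one way to confirm that each derived influence function matches its loss.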