Table 2. Some notations for the partial derivative computation.
d | dimension of the input and output
h | number of hidden units
x_l, l ∈ {1, 2, …, d} | value of the lth input
x_l^(j) | value of the lth input of the jth nearest neighbor
z_l, l ∈ {1, 2, …, d} | value of the lth output
y_i, i ∈ {1, 2, …, h} | value of the ith hidden unit
W_ij | connecting weight between the ith hidden unit and the jth input, and between the ith hidden unit and the jth output
b | bias of the hidden layer
c | bias of the output layer
θ | the parameters to be estimated
λ | non-negative regularization hyper-parameter
n | size of each training batch
J(θ; X^(i), S^(i)) | reconstruction error for a given input X^(i)
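To make the roles of these symbols concrete, a minimal sketch of the forward pass they describe is given below. The use of the same W_ij on both the encoding and decoding side follows from its definition above; the activation function s(·) is an assumption made here only for illustration and may differ from the one actually used.

\[
y_i = s\!\left(\sum_{j=1}^{d} W_{ij}\, x_j + b_i\right), \quad i \in \{1,\dots,h\},
\qquad
z_l = s\!\left(\sum_{i=1}^{h} W_{il}\, y_i + c_l\right), \quad l \in \{1,\dots,d\}.
\]

The per-example reconstruction error J(θ; X^(i), S^(i)) is then averaged over the n examples of a training batch, typically together with a λ-weighted regularization term; its exact form, and how the nearest neighbors S^(i) enter it, is not reproduced here.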