Fig. 6.
For each pair of hyperparameters of the proposed unsupervised algorithm (the Lebesgue norm and the ranking parameter), the three hyperparameters of the top layer were optimized on the validation set. For these optimal top-layer values, the mean error on the held-out test set is shown together with the SD across individual runs. In these experiments, a further hyperparameter was set to the optimal value determined on the validation set. The unsupervised algorithm did not converge for two of the (Lebesgue norm, ranking parameter) pairs, indicating that a smaller value of that hyperparameter is required for those settings.