Author manuscript; available in PMC: 2020 Jan 9.
Published in final edited form as: IEEE Access. 2019 Jan 9;7:11093–11104. doi: 10.1109/ACCESS.2019.2891970

Fig. 1.

(a) The Π model with temporal ensembling as described in [11]; and (b) our model (SSLDEC), which, compared to the Π model, replaces the dense layer with softmax activation by a clustering layer whose prediction is passed through a target distribution as explained in Equation (3). While the Π model uses a weighted sum of cross-entropy and the squared difference between predictions, our model uses the Kullback-Leibler divergence as the loss for both labeled and unlabeled data points, as described in Section III-B. Both models use stochastic data augmentation and network dropout for regularization.
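To make the clustering-layer path in panel (b) concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard deep embedded clustering (DEC) formulation: a Student's t kernel for the soft assignment and a squared, frequency-normalized target distribution. Equation (3) and Section III-B are not reproduced in this caption, so the exact forms, the function names (`soft_assign`, `target_distribution`, `kl_divergence`), and the parameter `alpha` are assumptions made for illustration. How labeled points enter the target is defined in Section III-B and is not modeled here.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Soft cluster assignment q (assumed DEC-style Student's t kernel).

    z: (n, d) embedded points; centers: (k, d) cluster centers.
    Returns an (n, k) array whose rows sum to 1.
    """
    # Squared Euclidean distance from every point to every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Target p (assumed DEC form): square q, divide by the soft
    cluster frequency, then renormalize each row."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over the batch; eps guards against log(0)."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))) / p.shape[0])

# Toy batch: 6 embedded points, 3 clusters, 2-D embedding (made-up data).
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 2))
centers = rng.normal(size=(3, 2))

q = soft_assign(z, centers)      # prediction of the clustering layer
p = target_distribution(q)       # prediction passed through the target
loss = kl_divergence(p, q)       # loss that training would minimize
```

In an actual training loop the embedding `z` would come from the network with stochastic augmentation and dropout applied, and the gradient of `loss` would update both the network weights and the cluster centers; this sketch only evaluates the forward quantities.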