eLife. 2017 Dec 5;6:e22901. doi: 10.7554/eLife.22901

Figure 5. Co-ordinated errors between the output and hidden layers.

(A) Illustration of the output loss function (L1) and the local hidden loss function (L0). For a given test example shown to the network in a forward phase, the output layer loss is defined as the squared norm of the difference between the target firing rates ϕ1 and the average firing rates of the output units during the forward phase. The hidden layer loss is defined similarly, except that the target is ϕ0 (as defined in the text). (B) Plot of L1 vs. L0 for all of the ‘2’ images after one epoch of training. There is a strong correlation between hidden layer loss and output layer loss (real data, black), in contrast to the case where output and hidden loss values were randomly paired (shuffled data, gray). (C) Plot of the correlation between hidden layer loss and output layer loss across training for each category of images (each dot represents one category). The correlation is significantly higher in the real data than in the shuffled data throughout training. Note also that the correlation is much lower on the first epoch of training (red oval), suggesting that the conditions for credit assignment are still developing during the first epoch.
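As a concrete illustration of these loss definitions and the shuffled control, the following Python sketch computes L0, L1, and their Pearson correlation. The firing-rate arrays and their shapes are placeholder assumptions, not the paper's data or code.

```python
# Minimal sketch of the loss definitions in panel A and the correlation in panels B-C.
# All arrays below are random placeholders for the average forward-phase firing rates
# and the targets (phi^0, phi^1) of one image category; the shapes are assumptions.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_examples, n_hidden, n_output = 1000, 500, 10

hidden_rates   = rng.random((n_examples, n_hidden))   # average forward-phase hidden rates
hidden_targets = rng.random((n_examples, n_hidden))   # phi^0, hidden layer targets
output_rates   = rng.random((n_examples, n_output))   # average forward-phase output rates
output_targets = rng.random((n_examples, n_output))   # phi^1, output layer targets

def layer_loss(avg_rates, targets):
    """Squared norm of (target - average forward-phase rate), one value per example."""
    return np.sum((targets - avg_rates) ** 2, axis=1)

L0 = layer_loss(hidden_rates, hidden_targets)   # hidden layer loss
L1 = layer_loss(output_rates, output_targets)   # output layer loss

r_real, _ = pearsonr(L0, L1)                       # real pairing (black points in panel B)
r_shuffled, _ = pearsonr(rng.permutation(L0), L1)  # randomly paired control (gray points)
print(f"real r = {r_real:.3f}, shuffled r = {r_shuffled:.3f}")
```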

Figure 5—source data 1.

Fig_5B.csv. The first two columns of the data file contain the hidden layer loss (L0) and output layer loss (L1) of a one hidden layer network in response to all ‘2’ images in the MNIST test set after one epoch of training. The last two columns contain the same data, except that the data in the third column (shuffled L0) was generated by randomly shuffling the hidden layer activity vectors.

Fig_5C.csv. The first 10 columns of the data file contain the mean Pearson correlation coefficient between the hidden layer loss (L0) and output layer loss (L1) of the one hidden layer network in response to each category of handwritten digits across training. Each row represents one epoch of training. The last 10 columns contain the mean Pearson correlation coefficients between the shuffled hidden layer loss and the output layer loss for each category, across training.

Fig_5S1A.csv. This data file contains the maximum eigenvalue of (I − J̄βJ̄γ)ᵀ(I − J̄βJ̄γ) over 60,000 training examples for a one hidden layer network, where J̄β and J̄γ are the mean feedforward and feedback Jacobian matrices for the last 100 training examples.
DOI: 10.7554/eLife.22901.009
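For readers who want to recompute the panel B correlations from the source data, a sketch along the following lines would do it. The column order follows the file description above; the presence of a header row is an assumption.

```python
# Sketch: recompute the Fig. 5B correlations from Fig_5B.csv.
# Column order follows the source data description; skiprows=1 assumes a header row.
import numpy as np
from scipy.stats import pearsonr

data = np.loadtxt("Fig_5B.csv", delimiter=",", skiprows=1)

L0_real, L1_real = data[:, 0], data[:, 1]   # hidden and output loss, real pairing
L0_shuf, L1_shuf = data[:, 2], data[:, 3]   # shuffled hidden loss and output loss

print("real data:     r = %.3f" % pearsonr(L0_real, L1_real)[0])
print("shuffled data: r = %.3f" % pearsonr(L0_shuf, L1_shuf)[0])
```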


Figure 5—figure supplement 1. Weight alignment during first epoch of training.


(A) Plot of the maximum eigenvalue of (I − J̄βJ̄γ)ᵀ(I − J̄βJ̄γ) over 60,000 training examples for a one hidden layer network, where J̄β and J̄γ are the mean feedforward and feedback Jacobian matrices for the last 100 training examples. The maximum eigenvalue drops below one as learning progresses, satisfying the main condition for the learning guarantee described in Theorem 1 to hold. (B) The product of the mean feedforward and feedback Jacobian matrices, J̄βJ̄γ, for a one hidden layer network, before training (left) and after one epoch of training (right). As training progresses, the network updates its weights in a way that causes this product to approach the identity matrix, meaning that the two matrices become approximate inverses of each other.
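A minimal sketch of the check in panel A is given below, assuming Python/NumPy. The Jacobian shapes are placeholder assumptions, and the feedback Jacobian is constructed here as a noisy pseudo-inverse of the feedforward one purely to illustrate the "approximately inverse" regime, not to reproduce the trained network's matrices.

```python
# Sketch of the Theorem 1 condition from panel A: compute the largest eigenvalue of
# (I - Jb Jg)^T (I - Jb Jg) for mean feedforward (Jb) and feedback (Jg) Jacobians.
# The matrices and their shapes are placeholders, not taken from the trained network.
import numpy as np

rng = np.random.default_rng(0)
n_output, n_hidden = 10, 500

Jb = rng.normal(size=(n_output, n_hidden)) / np.sqrt(n_hidden)            # placeholder feedforward Jacobian
Jg = np.linalg.pinv(Jb) + 0.01 * rng.normal(size=(n_hidden, n_output))    # approx. inverse feedback Jacobian

P = Jb @ Jg                                 # product shown in panel B; ideally close to I
M = np.eye(n_output) - P
max_eig = np.linalg.eigvalsh(M.T @ M)[-1]   # symmetric PSD matrix, eigenvalues ascending

print("max eigenvalue of (I - JbJg)^T (I - JbJg):", max_eig)
print("condition satisfied (< 1):", max_eig < 1)
print("distance of JbJg from identity (Frobenius):", np.linalg.norm(P - np.eye(n_output)))
```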