Published in final edited form as: Nature. 2021 May 12;593(7858):249–254. doi: 10.1038/s41586-021-03506-2

Extended Data Fig. 1: Diagram of the RNN architecture.

We used a two-layer gated recurrent unit (GRU) recurrent neural network to convert sequences of neural firing rate vectors x_t (temporally smoothed and binned at 20 ms) into sequences of character probability vectors y_t and 'new character' probability scalars z_t. The y_t vectors describe the probability of each character being written at that moment in time, and the z_t scalars go high whenever the RNN detects that T5 is beginning to write any new character. Note that the top RNN layer runs at a slower frequency than the bottom layer, which we found improved training speed by making it easier to hold information in memory over long time periods. As a result, the RNN outputs are updated only once every 100 ms.
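To make the two-rate structure concrete, the sketch below shows one way such a decoder could be written in PyTorch. It is an illustrative assumption, not the authors' implementation: the input dimensionality, hidden layer size, and number of output characters are placeholder values not given in this caption, and the downsampling-by-5 between layers simply follows from the 20 ms input bins and the 100 ms output update rate described above.

```python
import torch
import torch.nn as nn

class TwoSpeedGRUDecoder(nn.Module):
    """Minimal sketch of a two-layer GRU decoder with a slow top layer.

    The bottom GRU runs at the 20 ms bin rate; the top GRU runs once per
    5 bins (100 ms) and produces character probabilities y_t and a
    'new character' scalar z_t. All sizes here are illustrative.
    """

    def __init__(self, n_inputs=192, hidden=512, n_chars=31, stride=5):
        super().__init__()
        self.stride = stride
        self.bottom = nn.GRU(n_inputs, hidden, batch_first=True)
        self.top = nn.GRU(hidden, hidden, batch_first=True)
        self.char_head = nn.Linear(hidden, n_chars)   # -> y_t (softmax)
        self.new_char_head = nn.Linear(hidden, 1)     # -> z_t (sigmoid)

    def forward(self, x):
        # x: (batch, T, n_inputs) smoothed firing rates in 20 ms bins
        h_fast, _ = self.bottom(x)
        # Keep every 5th bottom-layer state so the top layer steps every 100 ms
        h_slow, _ = self.top(h_fast[:, self.stride - 1 :: self.stride, :])
        y = torch.softmax(self.char_head(h_slow), dim=-1)       # character probabilities
        z = torch.sigmoid(self.new_char_head(h_slow)[..., 0])   # new-character signal
        return y, z

# Example: 10 s of data = 500 bins of 20 ms -> 100 output steps at 100 ms each
rates = torch.randn(1, 500, 192)
y, z = TwoSpeedGRUDecoder()(rates)
print(y.shape, z.shape)  # (1, 100, 31) and (1, 100)
```

In this sketch the slower output rate is obtained simply by striding the bottom layer's hidden states before the top GRU; other mechanisms (for example, pooling over each 100 ms window) would serve the same purpose of letting the top layer integrate over longer time scales.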