
eLife. 2020 Jul 7;9:e53262. doi: 10.7554/eLife.53262


© 2020, Bogacz

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Figure 4. (A) Summary of the algorithm used by the actor. (B) Identifying an action based on the gradient of $F$. The panel shows an example of the dependence of $F$ on $a$, where we wish $a$ to take the value that maximizes $F$. To find the action, we let $a$ change over time in proportion to the gradient of $F$ with respect to $a$ (Equation 4.2, where the dot over $a$ denotes the derivative with respect to time). For example, if the action is initialized to $a = 1.5$, the gradient of $F$ at this point is positive, so $a$ is increased (Equation 4.2), as indicated by a green arrow on the x-axis. These changes in $a$ continue until the gradient is no longer positive, i.e. when $a$ is at the maximum. Analogously, if the action is initialized to $a = 3.5$, the gradient of $F$ is negative, so $a$ is decreased until it reaches the maximum of $F$.
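The procedure in panel B can be illustrated with a minimal numerical sketch. It is not the paper's implementation: the caption does not give the form of $F$, so the code assumes a hypothetical concave function $F(a) = -(a - a^*)^2$ with its maximum placed at $a^* = 2.5$ (chosen only so that it lies between the two initial values mentioned in the caption). The continuous-time dynamics, in which $a$ changes in proportion to the gradient of $F$, are approximated with simple Euler steps; the step size and number of steps are likewise assumptions.

```python
def grad_F(a, a_star=2.5):
    """Gradient of the hypothetical F(a) = -(a - a_star)**2 (an assumed shape)."""
    return -2.0 * (a - a_star)


def find_action(a0, step=0.05, n_steps=500):
    """Euler approximation of letting a change in proportion to dF/da.

    a0      -- initial value of the action
    step    -- assumed integration step (proportionality constant times dt)
    n_steps -- assumed number of Euler steps
    """
    a = a0
    for _ in range(n_steps):
        # Positive gradient increases a; negative gradient decreases it.
        a += step * grad_F(a)
    return a


if __name__ == "__main__":
    for a0 in (1.5, 3.5):
        print(f"start a = {a0}: gradient sign = "
              f"{'+' if grad_F(a0) > 0 else '-'}, final a = {find_action(a0):.4f}")
```

Under these assumptions, both starting points in the caption move toward the same maximum: from $a = 1.5$ the gradient is positive and $a$ increases, while from $a = 3.5$ the gradient is negative and $a$ decreases.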
