Author manuscript; available in PMC: 2016 Dec 22.

Published in final edited form as: J Mach Learn Res. 2016 Dec 1;17:211.

Partial visualization of the members of an example 𝒬_T−1. We fix a state s_T−1 = (50.1, 48.6) in this example, and we plot Q̂_T−1(s_T−1, a_T−1) for each Q̂_T−1 ∈ 𝒬_T−1 and for each of the five available actions a_T−1. For example, the markers near the top of the plot correspond to the expected returns, one per Q̂_T−1 ∈ 𝒬_T−1, that are achievable by taking one particular action at the current time point and then following a particular future policy. This example 𝒬_T−1 contains 20 Q̂_T−1 functions, each assuming a different π_T.
