Figure - PMC


2022 May 25;12(6):746. doi: 10.3390/biom12060746


© 2022 by the authors.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


Framework of the ADQN–FBDD. (A) The ADQN–FBDD consists of a reinforcement learning (RL) agent, a prioritized experience replay algorithm, and a chemical environment that together perform chemical structure generation. At each step, the agent selects an action (insertion, deletion, or none) for the intermediate molecular fragment to generate a new molecule that maximizes the cumulative reward. The prioritized experience replay algorithm allows the agent to repeat the molecule generation based on the updated maximization of rewards. The chemical environment assesses the agent’s actions according to predefined chemical rules and provides rewards. (B) Example of fragment-based actions. (C) Solid lines represent actions taken, including the addition or deletion of different fragments during an episode. Dashed lines represent actions that the RL agent considered but did not take. The red dashed line represents an exploratory action, which was taken even though a sibling action, the one leading to S*, was ranked higher. The exploratory action did not result in any learning; the other actions did, producing the updates shown by the curved arrows, in which estimated values moved up the tree from later nodes to earlier nodes.
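
The mechanics described for panel (C) can be illustrated with a small tabular sketch. This is not the ADQN–FBDD implementation, which uses a deep Q-network and prioritized experience replay; it is a simplified illustration under stated assumptions. The names `select_action`, `backup`, `run_episode`, `successors`, `epsilon`, and `alpha` are all hypothetical, states are plain strings standing in for intermediate molecules, and the value table is an ordinary dictionary. The sketch shows only two ideas from the caption: an exploratory action may be taken even though a sibling is ranked higher, and value estimates move from later nodes to earlier nodes only after non-exploratory actions.

```python
import random


def select_action(values, candidates, epsilon, rng):
    """Choose the next state from `candidates`.

    Returns (state, was_exploratory). With probability `epsilon` a random
    candidate is taken; it counts as exploratory only if it differs from
    the highest-ranked (greedy) candidate.
    """
    greedy = max(candidates, key=lambda s: values.get(s, 0.0))
    if rng.random() < epsilon:
        choice = rng.choice(candidates)
        return choice, choice != greedy
    return greedy, False


def backup(values, earlier, later, alpha=0.5):
    """Move the estimate of `earlier` a fraction `alpha` toward `later`."""
    v_earlier = values.get(earlier, 0.0)
    values[earlier] = v_earlier + alpha * (values.get(later, 0.0) - v_earlier)


def run_episode(values, start, successors, epsilon=0.1, alpha=0.5, rng=None):
    """Walk the tree from `start` until a state has no successors.

    `successors` maps each state to the states reachable by one fragment
    action. Exploratory steps are taken but produce no value update.
    """
    rng = rng or random.Random(0)
    state = start
    path = [state]
    while successors.get(state):
        nxt, exploratory = select_action(values, successors[state], epsilon, rng)
        if not exploratory:
            backup(values, state, nxt, alpha)  # later node updates earlier node
        state = nxt
        path.append(state)
    return path
```

For example, with `values = {"S*": 1.0, "s1": 0.2}` and `successors = {"s0": ["s1", "S*"]}`, a purely greedy episode (`epsilon=0.0`) moves from `s0` to `S*` and raises the estimate for `s0` toward that of `S*`, whereas an exploratory step to `s1` would leave the estimate for `s0` untouched.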