2021 Dec 7;11:23545. doi: 10.1038/s41598-021-02910-y

© The Author(s) 2021, corrected publication 2023

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Quantum deep reinforcement learning algorithm for optimal decision making in knowledge-based adaptive radiotherapy. Schematic of the quantum deep reinforcement learning (qDRL) algorithm. qDRL uses a deep q-net as the decision optimization algorithm and represents the decision as a quantum state. Here, qDRL is a model-based algorithm that uses an artificial radiotherapy environment (ARTE) as the RL model. The qDRL artificial intelligence (AI) agent feeds the patient's state $s_{t}$ into its memory (the deep q-net) and obtains a set of q-values over a range of doses, $\{q_{|d\rangle_{t}}\}$. The agent then selects the dose with the highest q-value and performs quantum amplification of that dose on a superimposed quantum dose decision state, $|D\rangle$. A quantum measurement is then performed on the amplified state. The measured dose, $|d\rangle_{t}$, and the state $s_{t}$ are fed into the ARTE. The ARTE applies three functions in succession: (1) a transition function, which predicts the patient's next state $s_{t+1}$; (2) an RT outcome estimator, which predicts the treatment outcome as the probability of local control, $p_{LC}$, and the probability of radiation-induced pneumonitis of grade 2 or higher, $p_{RP2}$; and (3) a reward function, which assigns the reward value $r_{t+1}$ to the state-dose-decision pair. The quantum agent then uses $s_{t+1}$ and $r_{t+1}$ to update its memory. This cycle repeats until the agent reaches a terminating state, after which a new cycle begins for a different patient. Five relevant biophysical features, drawn from radiomics, cancer and normal tissue radiation, cytokines, and genetics, were selected to represent the patient's state based on our earlier work¹³.
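
The cycle described in this caption can be illustrated with a minimal classical simulation. The sketch below is not the authors' implementation. It assumes a linear q-function in place of the deep q-net, a Grover-style amplitude amplification simulated with NumPy in place of quantum hardware, and a toy environment in place of the ARTE. The function names (`grover_amplify`, `measure`, `toy_arte`, `run_episode`), the dose grid, the outcome formulas, the reward, and the fixed three-step termination rule are all hypothetical placeholders chosen for illustration.

```python
import numpy as np

def grover_amplify(n_doses, target, n_iters):
    """Amplify basis state `target`, starting from a uniform superposition |D>."""
    amp = np.full(n_doses, 1.0 / np.sqrt(n_doses))
    for _ in range(n_iters):
        amp[target] *= -1.0            # oracle: flip the sign of the marked amplitude
        amp = 2.0 * amp.mean() - amp   # diffusion: inversion about the mean
    return amp

def measure(amp, rng):
    """Simulated measurement: sample a dose index with probability |amplitude|^2."""
    probs = amp ** 2
    probs = probs / probs.sum()        # guard against floating-point drift
    return int(rng.choice(len(amp), p=probs))

def toy_arte(state, dose, rng):
    """Hypothetical stand-in for the ARTE: transition, outcome estimator, reward."""
    noise = rng.normal(0.0, 0.01, state.shape)
    next_state = np.clip(state + 0.05 * (dose - 2.5) + noise, 0.0, 1.0)
    p_lc = 1.0 / (1.0 + np.exp(-(dose - 2.5)))   # placeholder local-control model
    p_rp2 = 1.0 / (1.0 + np.exp(-(dose - 3.5)))  # placeholder pneumonitis model
    reward = p_lc * (1.0 - p_rp2)                # placeholder reward function
    return next_state, p_lc, p_rp2, reward

def run_episode(weights, doses, rng, n_steps=3, gamma=0.9, lr=0.05):
    """One simulated patient; `weights` (the agent's memory) is updated in place."""
    state = rng.random(weights.shape[1])         # five features in [0, 1)
    # Standard Grover iteration count for a single marked item (an assumption here).
    n_iters = int(np.floor(np.pi / 4.0 * np.sqrt(len(doses))))
    total = 0.0
    for _ in range(n_steps):                     # fixed horizon as the terminating rule
        q_values = weights @ state               # q-values over the dose range
        best = int(np.argmax(q_values))          # dose with the highest q-value
        amp = grover_amplify(len(doses), best, n_iters)
        chosen = measure(amp, rng)               # measured dose index
        next_state, p_lc, p_rp2, reward = toy_arte(state, doses[chosen], rng)
        # Temporal-difference update of the linear q-function.
        td_target = reward + gamma * np.max(weights @ next_state)
        weights[chosen] += lr * (td_target - q_values[chosen]) * state
        state = next_state
        total += reward
    return total

rng = np.random.default_rng(0)
doses = np.linspace(1.5, 4.0, 8)                 # hypothetical dose grid
weights = np.zeros((len(doses), 5))              # 8 doses x 5 state features
for _ in range(20):                              # 20 simulated patients
    total = run_episode(weights, doses, rng)
```

Because the amplified state is measured rather than read out deterministically, the agent usually, but not always, acts on the dose with the highest q-value. In this sketch that residual randomness comes only from the imperfect amplification.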