2021 Dec 7;11:23545. doi: 10.1038/s41598-021-02910-y

Figure 2.


Quantum deep reinforcement learning algorithm for optimal decision making in knowledge-based adaptive radiotherapy. Schematic of the quantum deep reinforcement learning (qDRL) algorithm. qDRL uses a deep q-net as the decision optimization algorithm and a quantum state as the decision. Here, qDRL is a model-based algorithm that uses an artificial radiotherapy environment (ARTE) as the RL model. The qDRL artificially intelligent (AI) agent feeds the patient’s state s_t into its memory (the deep q-net) and obtains a set of q-values over a range of doses (q_{d_t}). The agent then selects the dose with the highest q-value and performs quantum amplification of that dose on a superimposed quantum dose decision state, D. A quantum measurement is then performed on the amplified state. The measured dose, d_t, is fed into the ARTE together with the state s_t. The ARTE is composed of three functions applied in succession: (1) a transition function, (2) an RT outcome estimator, and (3) a reward function. These predict, respectively, the patient’s next state s_{t+1}; the RT treatment outcome, expressed as the probability of local control, p_LC, and the probability of radiation-induced pneumonitis of grade 2 or higher, p_RP2; and the reward value, r_{t+1}, for the state-dose-decision pair. The quantum agent then uses s_{t+1} and r_{t+1} to update its memory. This cycle is repeated until the agent reaches a terminating state, after which a new cycle is initiated for a different patient. Five relevant biophysical features from radiomics, cancer and normal tissue radiation, cytokines, and genetics were selected to represent the patient’s state, based on our earlier work [13].
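One pass through the cycle described in the caption can be illustrated with a small classical simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: `toy_q_values` and `toy_arte` are hypothetical stand-ins for the deep q-net and the ARTE, the dose grid and state values are made up, and the amplification step is assumed to be textbook Grover amplitude amplification with the standard near-optimal iteration count, which the caption does not specify.

```python
import math
import random


def grover_amplify(n_doses, target, iterations=None):
    """Simulate Grover amplification of one marked dose index.

    Returns the measurement probability of each dose index.
    """
    # Start from a uniform superposition over all candidate doses.
    amp = [1.0 / math.sqrt(n_doses)] * n_doses
    if iterations is None:
        # Assumption: standard near-optimal count for a single marked item.
        iterations = int(math.floor(math.pi / 4 * math.sqrt(n_doses)))
    for _ in range(iterations):
        amp[target] = -amp[target]           # oracle: phase-flip the marked dose
        mean = sum(amp) / n_doses
        amp = [2 * mean - a for a in amp]    # diffusion: inversion about the mean
    return [a * a for a in amp]


def toy_q_values(state, doses):
    """Hypothetical stand-in for the deep q-net (not a trained model)."""
    preferred = 2.0 + 0.5 * sum(state) / len(state)
    return [-(d - preferred) ** 2 for d in doses]


def toy_arte(state, dose):
    """Hypothetical stand-in for the ARTE: transition, outcome, reward."""
    next_state = [0.9 * x + 0.01 * dose for x in state]   # made-up transition
    p_lc = 1.0 / (1.0 + math.exp(-(dose - 2.0)))          # made-up local control
    p_rp2 = 1.0 / (1.0 + math.exp(-(dose - 4.0)))         # made-up pneumonitis risk
    reward = p_lc * (1.0 - p_rp2)                         # made-up reward
    return next_state, p_lc, p_rp2, reward


def decision_step(state, doses, rng):
    """One cycle: q-values, amplification, measurement, environment step."""
    q = toy_q_values(state, doses)
    best = max(range(len(doses)), key=q.__getitem__)      # highest q-value
    probs = grover_amplify(len(doses), best)
    # The "measurement" samples a dose index from the amplified distribution.
    measured = rng.choices(range(len(doses)), weights=probs)[0]
    dose = doses[measured]
    return dose, probs[best], toy_arte(state, dose)


rng = random.Random(0)
doses = [1.5 + 0.5 * i for i in range(8)]   # hypothetical dose grid
state = [0.2, 0.4, 0.1, 0.3, 0.5]           # five hypothetical state features
dose, p_best, (next_state, p_lc, p_rp2, reward) = decision_step(state, doses, rng)
print(dose, round(p_best, 3), round(reward, 3))
```

Because the measurement is probabilistic, the measured dose usually, but not always, coincides with the dose that has the highest q-value. In this sketch that residual randomness is the only source of exploration.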