Figure 8.

Schematic of using RL to protect a quantum memory (a qubit network) from the detrimental effects of noise. Given a quantum device consisting of a few qubits, the RL agent takes actions, i.e., it selects gate sequences and executes measurements. To protect the memory as well as possible, the agent responds to measurement outcomes and collects reward signals that guide it towards good actions.