2022 Feb 16;602(7897):414–419. doi: 10.1038/s41586-021-04301-9

Fig. 1. Representation of the components of our controller design architecture.

a, Depiction of the learning loop. The controller sends voltage commands on the basis of the current plasma state and control targets. These data are sent to the replay buffer, which feeds data to the learner to update the policy. b, Our environment interaction loop, consisting of a power supply model, sensing model, environment physical parameter variation and reward computation. c, Our control policy is an MLP with three hidden layers that takes measurements and control targets and outputs voltage commands. d–f, The interaction of TCV and the real-time-deployed control system implemented using either a conventional controller composed of many subcomponents (f) or our architecture using a single deep neural network to control all 19 coils directly (e). g, A depiction of TCV and the 19 actuated coils. The vessel is 1.5 m high, with major radius 0.88 m and vessel half-width 0.26 m. h, A cross section of the vessel and plasma, with the important aspects labelled.
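
The policy in panel c can be illustrated with a minimal sketch of an MLP forward pass. Only the three hidden layers, the inputs (measurements and control targets) and the 19 voltage outputs come from the caption. The input sizes, hidden width, tanh activation, random initialization and the names `init_mlp` and `policy` are assumptions made for illustration, not the authors' implementation.

```python
import math
import random

NUM_COILS = 19  # from the caption: the policy drives all 19 coils


def init_mlp(sizes, seed=0):
    """Random weights for a fully connected network; `sizes` lists layer widths."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)  # assumed initialization scale
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def policy(layers, measurements, targets):
    """Map measurements and control targets to one voltage command per coil."""
    x = list(measurements) + list(targets)  # concatenate the two inputs
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:  # nonlinearity on hidden layers only (assumed tanh)
            x = [math.tanh(v) for v in x]
    return x


# Hypothetical sizes: 8 measurements, 4 targets, three hidden layers of 16 units.
N_MEAS, N_TARGETS, HIDDEN = 8, 4, 16
layers = init_mlp([N_MEAS + N_TARGETS, HIDDEN, HIDDEN, HIDDEN, NUM_COILS])
voltages = policy(layers, [0.1] * N_MEAS, [0.0] * N_TARGETS)
```

Here `voltages` holds one command per actuated coil. In the learning loop of panel a, the learner would update these weights from replay-buffer data; that training step is not shown in this sketch.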