2022 Feb 7;13(9):2701–2713. doi: 10.1039/d1sc05976a

Fig. 2. Generative model architecture. The input encoder maps a protein–ligand complex to a set of means and standard deviations defining latent variables, which are sampled to produce a latent vector z. The conditional encoder maps a receptor to a conditional encoding vector c. The latent vector and conditional vector are concatenated and provided to the decoder, which maps them to a generated ligand density grid. The input encoder and conditional encoder consist of 3D convolutional blocks with leaky ReLU activation functions and residual connections (ref. 46; see detail of Conv3DBlock), alternated with average pooling. The decoder uses a similar architecture in reverse, with transposed convolutions and nearest-neighbor upsampling instead of pooling. U-Net skip connections (ref. 47) were included between the convolutional features of the conditional encoder and the decoder to enhance the processing of receptor context. Spectral normalization (ref. 48) was applied to all learnable parameters during training. The value displayed after module names in the diagram indicates the number of outputs (or feature maps, for convolutional modules). If not specified, the number of outputs did not change from the previous layer.
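The building block described in the caption can be sketched in PyTorch. This is a minimal, hedged illustration assuming a plain two-convolution residual design; the channel counts, kernel size, and leaky-ReLU slope are placeholders, not values taken from the paper's code.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class Conv3DBlock(nn.Module):
    """Residual 3D convolutional block, as described in the Fig. 2 caption:
    3D convolutions with leaky ReLU activations, a residual connection,
    and spectral normalization on the learnable weights.
    Hyperparameters here are illustrative assumptions."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv1 = spectral_norm(nn.Conv3d(in_ch, out_ch, 3, padding=1))
        self.conv2 = spectral_norm(nn.Conv3d(out_ch, out_ch, 3, padding=1))
        # 1x1x1 convolution to match channel counts on the residual path
        self.skip = (spectral_norm(nn.Conv3d(in_ch, out_ch, 1))
                     if in_ch != out_ch else nn.Identity())
        self.act = nn.LeakyReLU(0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.conv1(x))
        h = self.conv2(h)
        return self.act(h + self.skip(x))

# Example: one block widens the feature maps from 8 to 16 channels
# while preserving the spatial dimensions of the density grid.
block = Conv3DBlock(8, 16)
x = torch.randn(1, 8, 12, 12, 12)
y = block(x)
print(tuple(y.shape))
```

In the full encoders these blocks would alternate with average pooling to halve the grid resolution; the decoder would mirror this with transposed convolutions and nearest-neighbor upsampling.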
