
Figure 6. Architecture of the CNN for image segmentation in TESLA.

A 2-channel image is fed into the CNN to extract deep features through two convolutional components, each consisting of a two-dimensional convolution (5×5, p), a ReLU activation function, and batch normalization. Next, the response vectors of the features in the p-dimensional cluster space are calculated through a one-dimensional convolutional layer (1×1, p) and normalized to $\{r_m\}_{m=1}^{M}$ across the axes of the cluster space using batch normalization. The normalized response map $\{r_m\}_{m=1}^{M}$ is used to calculate the spatial continuity loss. Raw cluster labels are then determined by applying an argmax function to $\{r_m\}_{m=1}^{M}$, assigning each pixel to the cluster with the maximum response. Next, the raw cluster labels are refined to $\{c_m\}_{m=1}^{M}$ and used as pseudo labels to compute the gene expression loss. Finally, the spatial continuity loss and the gene expression loss are combined and backpropagated. After training, this CNN can segment the input image into different clusters, and TESLA then identifies the clusters corresponding to the target region.
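As a concrete illustration of this pipeline, below is a minimal PyTorch sketch of the architecture as described in the caption. It is not TESLA's actual implementation: the cluster-space dimension p, the image size, the learning rate, the adjacent-pixel L1 form of the spatial continuity loss, and the use of raw labels in place of the refined pseudo labels $\{c_m\}_{m=1}^{M}$ are all assumptions made here for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentationCNN(nn.Module):
    """Sketch of the caption's architecture: two convolutional components
    (Conv2d 5x5 -> ReLU -> BatchNorm2d), then a 1x1 convolution into a
    p-dimensional cluster space, normalized with batch normalization."""
    def __init__(self, in_channels=2, p=100):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, p, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(p),
            nn.Conv2d(p, p, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(p),
        )
        # (1x1, p) layer: maps each pixel's features into cluster space.
        self.response = nn.Conv2d(p, p, kernel_size=1)
        self.bn = nn.BatchNorm2d(p)  # normalizes responses across cluster axes

    def forward(self, x):
        r = self.bn(self.response(self.features(x)))  # normalized response map
        labels = r.argmax(dim=1)  # raw cluster label per pixel (argmax)
        return r, labels

model = SegmentationCNN(in_channels=2, p=100)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
img = torch.randn(1, 2, 256, 256)  # toy 2-channel input image

r, raw_labels = model(img)
# Spatial continuity loss: an assumed stand-in, here the L1 difference
# between vertically and horizontally adjacent response vectors.
loss_sc = (r[:, :, 1:, :] - r[:, :, :-1, :]).abs().mean() \
        + (r[:, :, :, 1:] - r[:, :, :, :-1]).abs().mean()
# Gene expression loss: cross-entropy against the pseudo labels. TESLA's
# label-refinement step (raw labels -> {c_m}) is not shown; the raw
# argmax labels stand in for the refined ones here.
loss_ge = F.cross_entropy(r, raw_labels)
# Combine both losses and backpropagate through the CNN.
loss = loss_ge + loss_sc
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the 1×1 convolution plays the role of the (1×1, p) layer in the caption, producing one p-dimensional response vector per pixel; the per-pixel argmax over the normalized response map then yields the raw cluster labels that seed the pseudo-label loss.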