Scientific Reports. 2023 Jun 12;13:9533. doi: 10.1038/s41598-023-36243-9

NISNet3D: three-dimensional nuclear synthesis and instance segmentation for fluorescence microscopy images

Liming Wu 1, Alain Chen 1, Paul Salama 2, Seth Winfree 3, Kenneth W Dunn 4, Edward J Delp 1,
PMCID: PMC10261124  PMID: 37308499

Abstract

The primary step in tissue cytometry is the automated distinction of individual cells (segmentation). Since cell borders are seldom labeled, cells are generally segmented by their nuclei. While tools have been developed for segmenting nuclei in two dimensions, segmentation of nuclei in three-dimensional volumes remains a challenging task. The lack of effective methods for three-dimensional segmentation represents a bottleneck in the realization of the potential of tissue cytometry, particularly as methods of tissue clearing present the opportunity to characterize entire organs. Methods based on deep learning have shown enormous promise, but their implementation is hampered by the need for large amounts of manually annotated training data. In this paper, we describe 3D Nuclei Instance Segmentation Network (NISNet3D) that directly segments 3D volumes through the use of a modified 3D U-Net, 3D marker-controlled watershed transform, and a nuclei instance segmentation system for separating touching nuclei. NISNet3D is unique in that it provides accurate segmentation of even challenging image volumes using a network trained on large amounts of synthetic nuclei derived from relatively few annotated volumes, or on synthetic data obtained without annotated volumes. We present a quantitative comparison of results obtained from NISNet3D with results obtained from a variety of existing nuclei segmentation techniques. We also examine the performance of the methods when no ground truth is available and only synthetic volumes are used for training.

Subject terms: Biological techniques, Computational biology and bioinformatics

Introduction

Over the past 10 years, various technological developments have provided biologists with the ability to collect 3D microscopy volumes of enormous scale and complexity. Methods of tissue clearing combined with automated confocal or lightsheet microscopes have enabled three-dimensional imaging of entire organs or even entire organisms at subcellular resolution. Multiplexing methods have been developed so that one can now simultaneously characterize 50 or more targets in the same tissue. However, as biologists analyze these large volumes (tissue cytometry), they quickly discover that the methods of automated image analysis necessary for extracting quantitative data from volumes of this scale are frequently inadequate for the task. In particular, while effective methods for distinguishing (segmenting) individual cells are available for analyses of two-dimensional images, corresponding methods for segmenting cells in three-dimensional volumes are generally lacking. The problem of three-dimensional image segmentation thus represents a bottleneck in the full realization of 3D tissue cytometry as a tool in biological microscopy.

There are generally two categories of approaches for segmentation: non-machine-learning image processing and computer vision techniques, and techniques based on machine learning, in particular deep learning1,2. Traditional image processing techniques (e.g. watershed transform, thresholding, edge detection, and morphological operations) can be effective on one type of microscopy volume but may not generalize to other types of volumes without careful parameter tuning. Segmentation techniques based on deep learning have shown great promise, in some cases providing accurate and robust results across a range of image types3–7.

The utility of deep learning methods is limited by the large amount of manually annotated data (note: we use the terms “ground truth”, “manual annotations”, and “hand annotated volumes” interchangeably in this paper) needed for training and validation. Annotation is a labor-intensive and time-consuming process, especially for a 3D volume. While tools have been developed to facilitate the laborious process of manual annotation8–11, the generation of training data for 3D microscopy images remains a major obstacle to implementing 3D segmentation approaches based upon deep learning.

The problem of generating sufficient training data can be alleviated using data augmentation, a process in which existing manually annotated training data is supplemented with synthetic data generated from modifications of the manually annotated data12–15. An alternative method is to use synthetic data for training14,16,17. One approach for generating synthetic 3D fluorescence microscopy volumes is to stack 2D synthetic image slices generated using the 2D distributions of the fluorescent markers and Generative Adversarial Networks (GANs)18,19.

Convolutional neural networks (CNNs) have had great success in solving problems such as object classification, detection, and segmentation20,21. The encoder-decoder architecture has been widely used for biomedical image analysis including volumetric segmentation22–24, medical image registration25, nuclear segmentation3,6,15,26–32, and 2D cell nuclei tracking33,34. However, most CNNs are designed for segmenting two-dimensional images and cannot be directly used for segmentation of 3D volumes26,29,35. Other methods process images slice by slice and fuse the two-dimensional results into 3D segmentations, failing to consider the 3D anisotropy of microscopy volumes3,14,27.

True 3D segmentation is important for the analysis of biological structures. Nearly all biological structures require 3D characterization. The most common method for generating 3D images of biological structures is based upon collection of a sequence of 2D images. The axial dimension is sampled at a rate sufficient so that 3D structures are captured in the stack of images. Ideally the sampling rate should be at the level of the resolution of the microscopy technique. While 3D microscopy is based upon volumes assembled from series of 2D images, it is important to appreciate that sequential images collected from a 3D volume, unlike 2D images collected from a single plane, are not independent of one another, and thus contain information that is unique to the axial dimension. Insofar as methods of 3D segmentation recognize that the nature of object boundaries in 3D microscopy images differs in the axial dimension from that in the lateral dimension, the use of this axial information will support more accurate segmentation in the axial dimension.

In this paper, we are interested in segmenting nuclei in 3D microscopy volumes. There are two aspects of segmentation: semantic segmentation and instance segmentation. Semantic segmentation treats multiple objects within one general category as a single entity36; voxels in a volume are indicated as nuclei or not. Instance segmentation identifies objects as different entities. In other words, while semantic segmentation is widely used for separating foreground objects and background structures in an image, instance segmentation is capable of distinguishing individual objects that may overlap one another in an image36. Here we are interested in instance segmentation using deep learning so that we can analyze nuclei whose images overlap with one another. We describe here 3D Nuclei Instance Segmentation Network (NISNet3D), a deep learning-based 3D instance segmentation technique.

NISNet3D is a true 3D instance segmentation method that operates directly on 3D volumes, using 3D CNNs to exploit 3D information in a microscopy volume, thereby generating more accurate segmentations of nuclei in 3D image volumes. It can be trained on actual microscopy volumes, synthetic microscopy volumes, or a combination of both. NISNet3D can also be trained on synthetic data and then lightly retrained on a limited number of other types of microscopy data as an incremental improvement.

Results

NISNet3D—general approach

Nuclei instance segmentation methods typically include a foreground and background separation step and a nuclei instance identification and segmentation step. Non-machine-learning methods use thresholding, such as Otsu's method, for separating background and foreground and use a watershed transform to identify individual cell nuclei. However, thresholding methods are sensitive to variability in intensity and may not generate accurate segmentation masks. Also, watershed segmentation without accurate markers may produce under-segmentation or over-segmentation. Machine learning methods typically generate more accurate results if they are provided with enough training data. However, most current methods designed for nuclei instance segmentation only work with 2D images or use a 2D to 3D construction, which may not fully capture the 3D spatial information.
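To make the non-machine-learning pipeline concrete, the following is a minimal sketch of such a classical baseline (not part of NISNet3D), implemented with SciPy and scikit-image; the smoothing and peak-detection parameters are illustrative placeholders rather than values used in this paper.

```python
# Classical baseline sketch: Otsu thresholding for foreground/background
# separation, a Euclidean distance transform for markers, and a
# marker-controlled watershed to split touching nuclei.
import numpy as np
from scipy import ndimage as ndi
from skimage import feature, filters, morphology, segmentation

def classical_baseline(volume, sigma=1.0, min_distance=5):
    smoothed = ndi.gaussian_filter(volume.astype(np.float32), sigma=sigma)
    mask = smoothed > filters.threshold_otsu(smoothed)        # semantic mask
    distance = ndi.distance_transform_edt(mask)               # distance to background
    peaks = feature.peak_local_max(distance, min_distance=min_distance, labels=mask)
    markers = np.zeros(mask.shape, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)    # one seed per local maximum
    labels = segmentation.watershed(-distance, markers, mask=mask)
    return morphology.remove_small_objects(labels, min_size=20)
```

Because the markers come directly from distance-transform peaks, this kind of pipeline is prone to the over- and under-segmentation errors described above when intensity varies or nuclei touch.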

NISNet3D uses a modified 3D U-Net with attention and residual blocks, a 3D marker-controlled watershed transform, and a nuclei instance segmentation system for separating touching nuclei. NISNet3D also improves on the distance transform used for the marker-controlled watershed transform by generating a 3D gradient volume to obtain more accurate markers for each nucleus. The overall approach of NISNet3D is described here (and shown in Fig. 1). A complete, detailed description of NISNet3D and all methods used here is provided in “Methods”.

Figure 1.

Figure 1

Overview of NISNet3D for 3D nuclei instance segmentation. NISNet3D uses a modified 3D U-Net for nuclei segmentation and 3D vector field array estimation where each voxel represents a 3D vector pointing to the nearest nuclei centroid. NISNet3D then generates a 3D gradient field array from the 3D vector field array and further generates refined markers. Finally, a 3D watershed transform segmentation operates on three inputs (the seeds from the gradients, the binary segmentation masks, and the original microscopy volume) to generate the instance segmentation volume.

3D U-Net was chosen as the base architecture since 2D U-Nets have been shown to be effective in many biomedical applications22. While there are existing 3D techniques including a 3D U-Net architecture23, these methods do not exploit the true 3D information in the microscopy volume, do not address the issue of overlapping nuclei, and are not designed to analyze large microscopy volumes. The instance segmentation built into NISNet3D generates more accurate markers than the direct distance transform used in the typical watershed transform by learning a 3D vector field array which contains information used to accurately estimate the location of nuclei centroids and boundaries. As shown in Fig. 1, using the 3D vector field array, NISNet3D generates a 3D gradient array to aid in the detection of nuclei boundaries. This approach also refines the 3D markers that are used by the watershed transform for segmentation. Our hypothesis is that the 3D gradient array along the boundaries of touching nuclei should be very large, and that by extending 3D U-Net's semantic segmentation using the proposed vector field array, NISNet3D will achieve better instance segmentation.

Quantitative analysis of segmentation quality

Segmentation accuracy in selected subvolumes of each microscopy volume was evaluated by comparison with results obtained by manual annotation with ITK-SNAP8, which serve as ground truth39. Each subvolume was annotated by a single annotator, and each annotation was examined and approved by a biologist (co-author Dunn). We annotated 1, 16, 9, and 5 subvolumes for V1–V4, respectively. See Table 1 for the size of each annotated subvolume. Datasets V1–V4 and corresponding ground truth subvolumes are available for download39. Detailed information on all five original datasets used in our evaluation is shown in Table 1.

Table 1.

The description of the five datasets used for evaluation.

Original microscopy volumes

| Volume ID | Volume description | Original size (X×Y×Z) | Subvolume size (X×Y×Z) | Number of subvolumes in volume | Number of annotated subvolumes | Percentage of volume annotated (%) | Number of subvolumes generated | Percentage of volume generated (%) |
| V1 | Scale37-cleared rat kidney | 512×512×200 | 128×128×64 | 50 | 1 | 2 | 250 | 500 |
| V2 | Shallow rat liver | 512×512×32 | 128×128×32 | 16 | 16 | 100 | 250 | 1562 |
| V3 | BABB38-cleared rat kidney | 512×512×415 | 128×128×64 | 104 | 9 | 8.6 | 250 | 240 |
| V4 | Cleared mouse intestine | 512×930×157 | 128×128×40 | 114 | 5 | 4.3 | 200 | 175 |
| V5 | Zebrafish brain | 2000×1450×397 | 64×64×64 | 4392 | 27 | 0.6 | n/a | n/a |

Generation of synthetic training data

Deep learning methods generally require large amounts of training samples. Manually annotating ground truth is a tedious task, particularly for 3D microscopy volumes. To address this issue, we generated synthetic microscopy volumes for training NISNet3D.

To generate synthetic microscopy volumes, we first generate synthetic segmentation masks, which will be used as the synthetic ground truth masks during training, by iteratively adding initial binary nuclei to an empty 3D volume. Although nuclei are generally ellipsoidal, many cell types are characterized by nuclei of different shapes, which can be approximated as deformed ellipsoids (Fig. 2). We therefore model these initial nuclei as deformed 3D binary ellipsoids having random sizes and orientations. Specifically, we use an elastic transformation40 to deform the 3D binary masks of the nuclei. Details of the procedures for generating initial binary segmentation maps and deforming them are described in “Synthetic segmentation mask generation”. Examples of deformed ellipsoids are shown in Fig. 2.

Figure 2.

Figure 2

Generating synthetic nuclei from deformed ellipsoids. The three columns on the left show XY focal planes of non-ellipsoidal nuclei in actual microscopy volumes. The voxel resolution is 1×1×1 micron³ (X×Y×Z). The columns on the right show 3D renderings of various nuclei from a synthetic binary nuclei segmentation volume after elastic deformation.

We modify our synthetic microscopy volume generation method SpCycleGAN6,7,28 to model non-ellipsoidal or irregularly shaped nuclei. We have shown in our previous work7,41–43 that SpCycleGAN can be used to generate synthetic microscopy volumes suitable for training. Thus, we used the unpaired image-to-image translation model known as SpCycleGAN7 for generating synthetic microscopy volumes. By unpaired we mean that the binary segmentation masks we created above are not the ground truth of actual microscopy images. SpCycleGAN is a variation of CycleGAN44 that better maintains the spatial relationship of the nuclei (see “Synthetic microscopy volume generation”). After training SpCycleGAN, we can generate synthetic 3D volumes with known ground truth. We do this by giving the trained SpCycleGAN binary segmentation masks that indicate where to locate the nuclei. Since SpCycleGAN generates 2D slices, we sequentially generate 2D slices for each focal plane and stack them to construct a 3D synthetic volume with nuclei located where the masks specify. Hence, we have synthetic volumes and their corresponding ground truth. Thus SpCycleGAN is an unsupervised and unpaired method that requires no manually annotated images for training. Using SpCycleGAN, we can create large numbers of synthetic ground truth masks and generate many corresponding synthetic microscopy volumes for training. In total, we generated 950 synthetic microscopy subvolumes (128×128×128 voxels) using representative subvolumes of different datasets (see Table 1).

Examples of actual microscopy images and corresponding synthetic microscopy images are shown in Fig. 3.

Figure 3.

Figure 3

Generation of synthetic images and training data from exemplar datasets. Single optical XY-sections of actual microscopy images (top row), synthetic segmentation masks used by SpCycleGAN (middle row), and corresponding synthetic microscopy images (bottom row). V1 is rat kidney tissue that was fixed and cleared using the Scale37 technique, and then imaged by confocal microscopy. V3 is rat kidney tissue that was fixed, cleared using BABB38 and imaged using confocal microscopy. V1 differs from V3 in that V1 includes fluorescent objects that are not nuclei, and thus must be distinguished from nuclei.

The synthetic images generated by SpCycleGAN were used to train the NISNet3D model. Further, to test the necessity of manually annotated images, we trained two versions of NISNet3D: NISNet3D-synth and NISNet3D-slim. NISNet3D-synth was trained using only synthetic datasets whereas NISNet3D-slim was trained on synthetic images and a small number of manually annotated actual microscopy volumes. The details of NISNet3D-synth and NISNet3D-slim training and evaluation are described in “Methods”.

Evaluation of 3D segmentation—general approach

We compare NISNet3D with other deep learning image segmentation methods including VNet24, 3D U-Net23, Cellpose3, DeepSynth6, nnU-Net45 and StarDist3D15. In addition, we also compare NISNet3D to several non-deep learning segmentation methods including the 3D Watershed transform46, Squassh47, and the “IdentifyPrimaryObject” module from CellProfiler48. We used the above methods for comparison with NISNet3D because they have been commonly used in the literature49–53. We also use the 2D watershed transform and connected component analysis from VTEA for comparison54. We trained and evaluated the methods using the same dataset as used for the evaluation of NISNet3D. We also used the same training and evaluation strategies used for NISNet3D, as described in “Experimental settings”. It should be emphasized that this means we also use synthetic data to train the comparison methods similar to how we used synthetic data to train NISNet3D.

The parameters of all the comparison segmentation methods were manually adjusted to achieve the best visual segmentation results. We feel this is the best way to ensure a fair comparison of the segmentation methods because we tried to allow each of the methods to demonstrate its best performance. This is further discussed in “Experimental settings”. The values of mP, mR, F1, AP, and mAP obtained for all of the methods mentioned above using the microscopy datasets V1–V5 are shown in Tables 2 and 3. Note that in Table 2, the methods above the double line are results using both synthetic and partially annotated actual microscopy volumes for training. The methods below the double line use only synthetic data for training. We were surprised that NISNet3D-synth, using only synthetic volumes for training, outperforms NISNet3D-slim, which uses a combination of both synthetic data and partially annotated actual volumes, in many cases. NISNet3D's unique strength lies in instance segmentation, providing more accurate discrimination of individual nuclei. For qualitative evaluations the entire 3D volume was used, while for quantitative evaluations we use the annotated subvolumes. Therefore mP, mR, F1, AP, and mAP are computed for each dataset.

Table 2.

Comparison of object-based instance segmentation metrics for fluorescence microscopy datasets.

(Table 2 is presented as an image in the original article: 41598_2023_36243_Tab2_HTML.jpg.)

The best performance with respect to each metric is in bold. We use the mean Precision (mP), mean Recall (mR), mean F1 (mF1) score, and mean Average Precision (mAP) on multiple IoU thresholds. The methods above the double line are results using both synthetic and actual microscopy volumes for training, and use actual volumes for testing. The methods below the double line use only synthetic data for training but use actual volumes for testing.

Table 3.

Comparison of object-based instance segmentation metrics on the electron microscopy dataset NucMM (V5).

All metrics are computed on microscopy volume V5.

| Methods | mP | mR | mF1 | AP.50 | AP.75 | mAP | AJI |
| StarDist3D15 | 73.44 | 74.24 | 73.84 | 88.59 | 9.78 | 61.20 | 68.56 |
| DeepSynth6 | 81.63 | 76.04 | 78.72 | 71.47 | 50.88 | 63.74 | 75.54 |
| Cellpose3 | 96.15 | 94.47 | 95.30 | 94.90 | 83.82 | 91.21 | 81.31 |
| NISNet3D-slim | 96.89 | 96.24 | 96.56 | 95.98 | 88.84 | 93.62 | 83.90 |

The best performance with respect to each metric is in bold. We use the mean Precision (mP), mean Recall (mR), mean F1 (mF1) score, and mean Average Precision (mAP) on multiple IoU thresholds. AJI is the Aggregated Jaccard Index55.

Qualitative and quantitative improvement of instance segmentation with NISNet3D

Results presented in Figs. 4 and 5 and Supplementary Figures S4 and S5 show that, in general, the non-deep learning approaches (3D Watershed transform46, Squassh47, CellProfiler's “IdentifyPrimaryObject” module48, and VTEA's 2D watershed transform and connected component analysis54) yielded heterogeneous segmentation results characterized by examples of both over- and under-segmentation. For example, VTEA's 2D watershed transform and connected component analysis and CellProfiler's “IdentifyPrimaryObject” module are uniquely susceptible to axial over-segmentation (asterisks in XZ planes of V3 in Fig. 4), due to their dependence on fusing 2D segmentations into a final 3D volume. The deep learning approaches including StarDist3D15, Cellpose3, DeepSynth6 and nnU-Net45 performed far better in instance segmentation, with some notable weaknesses compared to NISNet3D (Figs. 4 and 5, Supplementary Figures S4, S5). For example, examination of the XY and XZ planes of V3 and V4 shown in Fig. 4 shows multiple examples of nuclei that were either over- or under-segmented by 3D U-Net, StarDist3D, and Cellpose, but were accurately detected and distinguished by NISNet3D (some indicated with asterisks). Results presented in Fig. 5 demonstrate that NISNet3D performs particularly well in the challenging setting of densely packed nuclei, particularly in poorly labeled samples. Asterisks indicate striking examples of nuclei that were accurately detected and distinguished by NISNet3D, but suffered from over- and under-segmentation by Cellpose and VTEA.

Figure 4.

Figure 4

Qualitative assessment of NISNet3D segmentation in 2D and 3D. Three volumes were segmented by CellProfiler's “IdentifyPrimaryObject” module, 2D watershed and 3D connected components (VTEA), 3D U-Net, StarDist3D, Cellpose, and NISNet3D. Shown are a single optical XY section or orthogonal sections of the image volume without (first column) or with unique segmented instances. NISNet3D demonstrates improved instance segmentation in the XY and XZ imaging planes (red asterisks). We use different colors to distinguish the different nuclei instances segmented by the corresponding methods.

Figure 5.

Figure 5

NISNet3D demonstrates improved instance segmentation. Two regions (from V2 and V4) were segmented with the 2D watershed transform and connected component analysis (VTEA), Cellpose, and NISNet3D. A single optical plane and XZ and YZ projections are shown with each segmentation instance uniquely colored. NISNet3D improves semantic and instance segmentation, with examples of over- and under-segmentation (red vs. yellow asterisks, respectively) by both Cellpose and the 2D/3D connected component approach implemented in VTEA. Overall, NISNet3D provides the best instance segmentation in 3D.

Overall, visual comparisons of segmentation results obtained from NISNet3D with those obtained from Cellpose, DeepSynth, StarDist3D, 3D U-Net, and nnU-Net suggest that NISNet3D addresses segmentation weaknesses of these other deep learning techniques. To quantify the improvement in segmentation results generated by NISNet3D, we measured quantitative metrics of instance and semantic segmentation on manually annotated image volumes. As the primary goal of nuclear segmentation is to detect nuclei, instance segmentation accuracy was evaluated using object-based metrics. To reduce bias56, we calculated the mean Precision, mean Recall and mean F1 scores with multiple IoU thresholds. We set T_IoU = {0.25, 0.30, ..., 0.45} for datasets V1–V4, and T_IoU = {0.50, 0.55, ..., 0.75} for dataset V5. Since segmenting 3D objects is more challenging than segmenting 2D objects, the IoU thresholds we used for evaluating 3D segmentation methods (0.25–0.5) are lower than the IoU thresholds reported for evaluating 2D segmentation methods (0.5–0.95), but are comparable to those reported in the literature for 3D segmentation3,15,49,50,57–60. The higher thresholds used for evaluation of nuclei in V5 reflect the fact that the nuclei in this volume are less densely packed, and thus overlap less. Note that these thresholds are not used by any of the segmentation methods and thus do not affect how the nuclei are segmented. For a given threshold, the common object detection metric Average Precision (AP)61,62 was calculated by computing the area under the Precision-Recall curve63. In addition, we use the Aggregated Jaccard Index (AJI)55 to integrate instance and semantic segmentation errors.
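The following is a simplified sketch of this object-based evaluation, assuming labeled ground truth and prediction volumes: instances are matched by their best IoU, and precision, recall, and F1 are averaged over a set of IoU thresholds. The matching here is approximate and the AP/mAP curve integration is omitted, so it is illustrative rather than the exact evaluation code used in the paper.

```python
# Simplified object-level evaluation: match instances by IoU and average
# precision/recall/F1 over several IoU thresholds (e.g. 0.25-0.45 for V1-V4).
import numpy as np

def instance_scores(gt, pred, thresholds=(0.25, 0.30, 0.35, 0.40, 0.45)):
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for a, i in enumerate(gt_ids):                  # pairwise IoU between instances
        g = gt == i
        for b, j in enumerate(pred_ids):
            p = pred == j
            inter = np.logical_and(g, p).sum()
            if inter:
                iou[a, b] = inter / np.logical_or(g, p).sum()
    scores = []
    for t in thresholds:
        # a ground-truth nucleus counts as detected if its best IoU exceeds t
        tp = int((iou.max(axis=1) >= t).sum()) if (gt_ids and pred_ids) else 0
        fn = len(gt_ids) - tp
        fp = max(len(pred_ids) - tp, 0)
        prec = tp / max(tp + fp, 1)
        rec = tp / max(tp + fn, 1)
        f1 = 2 * prec * rec / max(prec + rec, 1e-8)
        scores.append((prec, rec, f1))
    return np.mean(scores, axis=0)                  # (mP, mR, mF1)
```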

NISNet3D had the highest mAP and mF1 scores amongst the tested segmentation approaches. mP and mR were slightly lower in some cases without impacting the mF1 score. In the cases where NISNet3D under-performed in mR and mP, this is likely due to over- and under-segmentation errors resulting from overly large markers being generated. Importantly, when the IoU threshold is increased, NISNet3D continues to outperform the other tested approaches (Fig. 6). We observe that all approaches under-performed on V4, likely due to more crowded nuclei and under-sampling in the Z-axis. Analyses of the accuracy of semantic segmentation (shown in Supplementary Tables S1, S2) show that Cellpose3, StarDist15, nnU-Net45, VNet24, and 3D U-Net23 perform similarly with respect to distinguishing nuclei from background.

Figure 6.

Figure 6

Evaluation results using average precision (AP) for multiple intersection-over-union (IoU) thresholds, T_IoU, for datasets V1–V5. Note that we only show the methods with higher performance.

Discussion

In this paper, we describe NISNet3D, a true 3D nuclei segmentation approach built on 3D U-Net23. Arguably, a significant bottleneck to the development and use of 3D deep learning approaches has been access to high quality training data64. Therefore, to facilitate training NISNet3D we developed a generative model based on our previous work on CycleGAN to make high quality training data without the need for ground truth7,44. To augment this generative approach, we also incorporated elastic deformations to generate non-ellipsoidal models of nuclei to better capture the diversity of nuclei found in tissues. Lastly, to address a need for better instance segmentation in 3D we also implemented an improved instance splitting approach.

We demonstrate that NISNet3D trained exclusively on our synthetic volumes (tested on actual microscopy volumes) performed as well as or better than NISNet3D trained on a combination of manually annotated and synthetic volumes. This suggests that synthetic volumes, made without any manually annotated volumes, are sufficient for training a 3D segmentation deep learning model to produce high-quality instance segmentations of cell nuclei. Further, although NISNet3D semantic segmentation accuracy was similar to other segmentation approaches tested here, NISNet3D instance segmentation was more accurate. We conclude that NISNet3D specifically improves instance segmentation of nuclei through a combination of better 3D training data and our instance-splitting approach.

We acknowledge that NISNet3D may have potential issues with segmenting nuclei that do not conform to our assumption that nuclei are only slightly deformed ellipsoids. For example, multi-lobed nuclei found in leukocytes or multi-nucleated myocytes are cases where our assumption may not hold. Morphological variability is a general issue in machine learning as training must involve all potential morphologies6. During the process of training, machine learning approaches derive features and characteristics from the contents of the training images. Thus, this weakness can be addressed by incorporating more models of varying nuclei morphology, ideally using our SpCycleGAN.

In the future, we will explore the generation of synthetic nuclei segmentation masks to better simulate complex nuclei morphologies and more realistic densities and distributions. Further, we are extending the current SpCycleGAN to generate fully 3D synthetic volumes to include a model of a microscope point spread function. This way, the generated synthetic volumes will more closely model the distribution of actual microscopy volumes, further improving synthetic volumes and the segmentation accuracy of approaches like NISNet3D.

Methods

In this section, we provide more details on the structure of NISNet3D. In our experiments, we used both real annotated volumes and synthetic volumes to train the comparison methods similar to how we used synthetic data to train NISNet3D. We therefore provide more detail as to how we generated the synthetic training data.

Notation and overview

In this paper, we denote a 3D image volume of size X×Y×Z voxels by I, and a voxel having coordinates (x, y, z) in the volume by I(x,y,z). We use superscripts to distinguish between the different types of volumes and arrays. Iorig will be used to denote an original microscopy volume. The objective is to segment nuclei within Iorig. The segmented nuclei are labeled with unique voxel intensities and comprise the labeled volume Iseg. If we have ground truth from real microscopy volumes then the volume Ilabel corresponds to the annotated nuclei with unique voxel intensities. This is done to distinguish each nucleus instance.

If we use synthetic images for training then Ibi and Isyn will be used to denote a generated synthetic binary segmentation mask and a synthetic microscopy volume, respectively. In addition, nuclei with corresponding masks in Ibi are labeled with unique voxel intensities to form the volume Ilabel. This is done to distinguish each nucleus instance. In either case (actual or synthetic training data) Ilabel will be used during the training of NISNet3D and comparison methods.

In addition to Ilabel, NISNet3D generates from Ilabel a vector field array, Ivec, of size X×Y×Z×3, also for training purposes. Each element I(x,y,z)vec of Ivec, located at (x, y, z), is a 3D vector V(x,y,z) that is associated with the nucleus voxel I(x,y,z)label and points to the centroid of the nucleus. A detailed description of vector field generation is given in “Modified 3D U-Net”.

For a given input volume, Iorig, NISNet3D generates a corresponding volume Imask of size X×Y×Z×3 comprising the binary segmentation results as well as an estimate of the vector field array, I^vec, which is used to locate the nuclei centroids. In addition, I^vec is used to generate a 3D array of gradients, IG of size X×Y×Z, that is used in conjunction with morphological operations and watershed segmentation to distinguish and separate touching nuclei as well as remove small objects. The final output, Iseg, is a color-coded segmentation volume. An overview of the entire system is shown in Fig. 7.

Figure 7.

Figure 7

Overview of 3D nuclei instance segmentation using NISNet3D. The training and synthetic image generation are also shown but are not explicitly part of NISNet3D. NISNet3D is trained with synthetic microscopy volumes generated from SpCycleGAN and/or with annotated actual volumes as indicated. Note that Ibi and Iorig are not matched since they are part of the unpaired image-to-image translation of SpCycleGAN.

NISNet3D training and instance segmentation

In this section, we describe the NISNet3D architecture, how to train it, and then use NISNet3D for nuclei instance segmentation.

Modified 3D U-Net

NISNet3D utilizes a modified 3D U-Net as shown in Fig. 8 which outputs the same size array as the input.

Figure 8.

Figure 8

NISNet3D uses a modified 3D U-Net architecture with residual blocks, attention gates, and multi-task learning module.

The 3D U-Net encoder consists of multiple Conv3D Blocks (Fig. 9a) and Residual Blocks (Fig. 9b)65. Instead of using max pooling layers, we use Conv3D Blocks with stride 2 for feature down-sampling, which introduces more learnable parameters. Each convolution block consists of a 3D convolution layer with filter size 3×3×3, a 3D batch normalization layer, and a leaky ReLU layer. The decoder consists of multiple TransConv3D blocks (Fig. 9d) and attention gates (Fig. 9c). Each TransConv3D block includes a 3D transpose convolution with filter size 3×3×3 followed by 3D batch normalization and leaky ReLU. We use a self-attention mechanism66 to refine the feature concatenation while reconstructing the spatial information.

Figure 9.

Figure 9

(a) Conv3D Block, (b) residual block, (c) attention gate, (d) TransConv3D block.
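A sketch of these building blocks in PyTorch is shown below. The channel widths and the exact attention formulation are our own illustrative assumptions; the paper specifies 3×3×3 convolutions, 3D batch normalization, leaky ReLU, stride-2 Conv3D blocks instead of max pooling, residual blocks, and attention gates on the skip connections.

```python
# Illustrative PyTorch building blocks for the modified 3D U-Net (Figs. 8-9).
import torch
import torch.nn as nn

class Conv3DBlock(nn.Module):
    """3x3x3 convolution + BatchNorm3d + leaky ReLU; stride=2 is used for
    learnable down-sampling in place of max pooling."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.BatchNorm3d(out_ch),
            nn.LeakyReLU(0.1, inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class ResidualBlock(nn.Module):
    """Two convolutions with an identity shortcut."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = Conv3DBlock(ch, ch)
        self.conv2 = nn.Sequential(nn.Conv3d(ch, ch, 3, padding=1), nn.BatchNorm3d(ch))
        self.act = nn.LeakyReLU(0.1, inplace=True)
    def forward(self, x):
        return self.act(x + self.conv2(self.conv1(x)))

class AttentionGate(nn.Module):
    """Additive attention on a skip connection: the decoder (gating) feature
    re-weights the encoder feature before concatenation. Assumes the gating
    feature has already been upsampled to the encoder resolution."""
    def __init__(self, enc_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_enc = nn.Conv3d(enc_ch, inter_ch, kernel_size=1)
        self.w_gate = nn.Conv3d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(nn.Conv3d(inter_ch, 1, kernel_size=1), nn.Sigmoid())
    def forward(self, enc_feat, gate_feat):
        attn = self.psi(torch.relu(self.w_enc(enc_feat) + self.w_gate(gate_feat)))
        return enc_feat * attn   # voxel-wise attention weights in [0, 1]
```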

Training the modified 3D U-Net

The modified 3D U-Net can be trained on synthetic microscopy volumes Isyn, on actual microscopy volumes Iorig if manual ground truth annotations are available, or on both.

As shown in Fig. 7, during training, NISNet3D's modified 3D U-Net takes an Isyn or Iorig, an Ilabel, and an Ivec as input. Ilabel is the X×Y×Z gray-scale label volume corresponding to Ibi where different nuclei are marked with unique pixel intensities. It is used by NISNet3D to learn the segmentation masks. To facilitate segmentation NISNet3D identifies nuclei centroids and estimates their locations using a vector field, Ivec. In particular, Ivec is part of the ground truth data used during training to enable the network to learn nuclei centroid locations. In addition, the 3D vector field array helps identify the boundaries of adjacent/touching nuclei since their corresponding vectors tend to point in different directions.

As indicated previously, a vector field, Ivec, is an X×Y×Z×3 array whose elements are 3D vectors that point to the centroids of the corresponding nuclei. Specifically, I(x,y,z)vec is a 3D vector at location (xyz) and points to the nearest nucleus centroid. We also denote the X×Y×Z array consisting of the first component of each 3D vector by Ivec(x). Similarly, we denote the arrays of the second and third components by Ivec(y) and Ivec(z), respectively, that is I(x,y,z)vec=(I(x,y,z)vec(x),I(x,y,z)vec(y),I(x,y,z)vec(z)).

The first step in the 3D vector field array generation (VFG) (see Fig. 10) is to obtain the centroid of each nucleus in Ilabel. We denote the kth nucleus as the set of voxels with intensity k in Ilabel, and the location of the centroid of the kth nucleus by (x_k, y_k, z_k). We define V(x,y,z)k to be the 3D vector from (x, y, z) to (x_k, y_k, z_k) and set I(x,y,z)vec = V(x,y,z)k as long as the voxel at (x, y, z) is not a background voxel. Note that if a voxel I(x,y,z)label is a background voxel (i.e. I(x,y,z)label = 0), the corresponding entry I(x,y,z)vec in the vector field array is set to 0, as shown in Eq. (1).

Figure 10.

Figure 10

Steps for 3D vector field array generation (VFG). Each nucleus voxel in the 3D vector field array Ivec represents a 3D vector that points to the centroid of the current nucleus.

Using Isyn or Iorig, Ilabel, and Ivec, NISNet3D then outputs an estimate of the 3D vector field array I^vec (I^vec(x), I^vec(y), I^vec(z)) and the 3D binary segmentation mask volume Imask.

I^{vec}_{(x,y,z)} = \begin{cases} V^{k}_{(x,y,z)}, & \text{if } I^{label}_{(x,y,z)} \neq 0 \\ 0, & \text{otherwise} \end{cases} \qquad (1)
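A minimal sketch of the vector field generation in Eq. (1) is given below, assuming a labeled volume where each nucleus has a unique integer intensity; centroid computation uses SciPy and the exact implementation in NISNet3D may differ.

```python
# Vector field generation (VFG, Fig. 10): every foreground voxel stores the 3D
# offset from its own location to the centroid of the nucleus it belongs to;
# background voxels remain zero, as in Eq. (1).
import numpy as np
from scipy import ndimage as ndi

def vector_field(label_volume):
    vec = np.zeros(label_volume.shape + (3,), dtype=np.float32)    # X x Y x Z x 3
    ids = [k for k in np.unique(label_volume) if k != 0]
    centroids = ndi.center_of_mass(label_volume > 0, label_volume, ids)
    coords = np.stack(np.meshgrid(*[np.arange(s) for s in label_volume.shape],
                                  indexing="ij"), axis=-1).astype(np.float32)
    for k, c in zip(ids, centroids):
        mask = label_volume == k
        vec[mask] = np.asarray(c, dtype=np.float32) - coords[mask]  # V^k_(x,y,z)
    return vec
```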
Loss functions

Given Isyn or Iorig, Ilabel, and Ivec, the modified 3D U-Net simultaneously learns the nuclei segmentation masks Imask and the 3D vector field array I^vec. In the modified 3D U-Net two branches, but no sigmoid function, are used to obtain I^vec because the vector represented at a voxel can point anywhere in the array; in other words, the entries in I^vec can be negative numbers or large numbers. Unlike previous methods28,67,68 that directly learn the distance transform map, the 3D vector field array contains both the distance and the direction of the nearest nucleus centroid from the current voxel location. This helps NISNet3D avoid multiple detections of irregularly shaped nuclei. The output 3D vector field array I^vec is compared with the ground truth vector field array Ivec and the error between them minimized using the mean square error (MSE) loss function. Similarly, the segmentation result Imask is compared with the ground truth binary volume Ibi and the difference minimized using a combination of the Focal Loss69 LFL and Tversky Loss70 LTL metrics. Denoting a ground truth binary volume Ibi by S, the corresponding segmentation result Imask by S^, a ground truth vector field array Ivec by V, and the estimated vector field array I^vec by V^, the entire loss function is

\mathcal{L}(S,\hat{S},V,\hat{V}) = \lambda_3 \mathcal{L}_{TL}(S,\hat{S}) + \lambda_4 \mathcal{L}_{FL}(S,\hat{S}) + \lambda_5 \mathcal{L}_{MSE}(V,\hat{V}) \qquad (2)

where

\mathcal{L}_{TL} = \frac{\sum_{i=1}^{P} s_i^1 \hat{s}_i^1}{\sum_{i=1}^{P} s_i^1 \hat{s}_i^1 + \alpha_1 \sum_{i=1}^{P} s_i^1 \hat{s}_i^0 + \alpha_2 \sum_{i=1}^{P} s_i^0 \hat{s}_i^1}, \quad
\mathcal{L}_{FL} = -\frac{1}{P} \sum_{i=1}^{P} \left\{ \beta\, s_i^1 (\hat{s}_i^0)^{\gamma} \log(\hat{s}_i^1) + (1-\beta)\, s_i^0 (\hat{s}_i^1)^{\gamma} \log(\hat{s}_i^0) \right\}, \quad
\mathcal{L}_{MSE} = \frac{1}{P} \sum_{i=1}^{P} (v_i - \hat{v}_i)^2 \qquad (3)

where α1 and α2, with α1+α2=1, are two hyper-parameters in the Tversky loss70 that control the balance between false positive and false negative detections. β and γ are two hyper-parameters in the Focal loss69, where β balances the importance of positive/negative voxels and γ adjusts the weights for easily classified voxels. vi ∈ V is the ith element of V, and v^i ∈ V^ is the ith entry in V^. Similarly, si ∈ S is the ith voxel in S, and s^i ∈ S^ is the ith voxel in S^. We define s^i1 to be the probability that the ith voxel in S^ is a nucleus voxel, and s^i0 as the probability that it is a background voxel. Similarly, si1 = 1 if si is a nucleus voxel and 0 if it is a background voxel, and vice versa for si0. Lastly, P is the total number of voxels in a volume.
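A sketch of the combined loss of Eqs. (2)-(3) in PyTorch is given below. The hyper-parameter values are placeholders, the predicted mask is assumed to already be a probability map, and the Tversky index is subtracted from one so that all three terms are minimized; these are implementation assumptions rather than the paper's exact settings.

```python
# Combined loss of Eqs. (2)-(3); hyper-parameter values are placeholders.
import torch

def nisnet3d_loss(pred_mask, gt_mask, pred_vec, gt_vec,
                  a1=0.3, a2=0.7, beta=0.75, gamma=2.0,
                  lam3=1.0, lam4=1.0, lam5=1.0, eps=1e-7):
    p1 = pred_mask.clamp(eps, 1 - eps)   # predicted probability of "nucleus" per voxel
    p0 = 1 - p1
    s1 = gt_mask.float()                 # 1 for nucleus voxels, 0 for background
    s0 = 1 - s1
    # Tversky index of Eq. (3); one minus the index is minimized here (assumption)
    tversky = (s1 * p1).sum() / ((s1 * p1).sum() + a1 * (s1 * p0).sum()
                                 + a2 * (s0 * p1).sum() + eps)
    # focal loss of Eq. (3), averaged over the P voxels
    focal = -(beta * s1 * p0.pow(gamma) * torch.log(p1)
              + (1 - beta) * s0 * p1.pow(gamma) * torch.log(p0)).mean()
    mse = torch.mean((gt_vec - pred_vec) ** 2)            # vector field regression
    return lam3 * (1 - tversky) + lam4 * focal + lam5 * mse
```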

Divide and conquer

To segment a large microscopy volume, we propose a divide-and-conquer scheme shown in Fig. 11. We use an analysis window of size K×K×K that slides along each original microscopy volume Iorig of size X×Y×Z and crops a subvolume. Considering that cropping may result in some nuclei being partially included, and that these partially included nuclei lying on the border of the analysis window may cause inaccurate segmentation results, we construct a padded window by symmetrically padding each cropped subvolume by K/4 voxels on each border. Also, the stride of the moving window is set to K/2, so with every step it slides it will have K/2 voxels overlapping with the previous window. For the segmentation results of every K×K×K window, only the interior, centered K/2×K/2×K/2 subvolume, which we denote as the interior window, is used as the segmentation result. In this paper, we use K=128 for all testing data. Once the analysis window has slid along the entire volume, a segmentation volume Imask and 3D vector field array I^vec of size X×Y×Z will have been generated. In this way, we can analyze an input volume of any size, especially very large volumes.

Figure 11.

Figure 11

Proposed divide-and-conquer scheme for segmenting large microscopy volumes.
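A sketch of this divide-and-conquer inference is shown below, assuming a callable model that maps a K×K×K subvolume to a same-sized prediction; the axis ordering and the handling of volume edges are illustrative assumptions.

```python
# Divide-and-conquer inference (Fig. 11): K x K x K analysis windows with K/4
# symmetric padding and K/2 stride; only the centered K/2 interior of each
# prediction is kept. `model` is assumed to map a K^3 array to a K^3 array.
import numpy as np

def sliding_window_inference(volume, model, K=128):
    pad, stride = K // 4, K // 2
    d0, d1, d2 = volume.shape
    # round each dimension up to a multiple of the stride, then add the border pad
    pad_width = [(pad, pad + (-s) % stride) for s in (d0, d1, d2)]
    padded = np.pad(volume, pad_width, mode="symmetric")
    out = np.zeros(volume.shape, dtype=np.float32)
    for i in range(0, d0, stride):
        for j in range(0, d1, stride):
            for k in range(0, d2, stride):
                pred = model(padded[i:i + K, j:j + K, k:k + K])
                core = pred[pad:pad + stride, pad:pad + stride, pad:pad + stride]
                out[i:i + stride, j:j + stride, k:k + stride] = core[
                    :min(stride, d0 - i), :min(stride, d1 - j), :min(stride, d2 - k)]
    return out
```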

3D nuclei instance segmentation

Based on the output of the modified 3D U-Net, a 3D gradient field and an array of markers are generated and used to separate densely clustered nuclei. Moreover, the markers are refined to attain better separation. As mentioned previously, each element/vector in the estimated 3D vector field array I^vec points to the centroid of the nearest nucleus. Most often, the vectors associated with voxels on the boundary of two touching nuclei point in different directions that correspond to the locations of the centroids of the touching nuclei, and hence have a large difference/gradient. We employ such larger gradients to detect the boundaries of touching and also overlapping nuclei.

Define ∇I^vec as the gradient of I^vec, which is obtained as:

\nabla \hat{I}^{vec} = \left[ \nabla \hat{I}^{vec(x)},\; \nabla \hat{I}^{vec(y)},\; \nabla \hat{I}^{vec(z)} \right]^T = \left[ \frac{\partial \hat{I}^{vec(x)}}{\partial x},\; \frac{\partial \hat{I}^{vec(y)}}{\partial y},\; \frac{\partial \hat{I}^{vec(z)}}{\partial z} \right]^T = \left[ S_x * \hat{I}^{vec(x)},\; S_y * \hat{I}^{vec(y)},\; S_z * \hat{I}^{vec(z)} \right]^T \qquad (4)

where I^vec(x), I^vec(y), and I^vec(z) are the x, y, and z sub-arrays of the estimated 3D vector field array I^vec, respectively, Sx, Sy, Sz are 3D Sobel filters, and ∗ is the convolution operator. We also define the gradient map IG as the maximum of the gradient components along the x, y and z directions:

I^{G} = \max\left( \nabla \hat{I}^{vec(x)},\; \nabla \hat{I}^{vec(y)},\; \nabla \hat{I}^{vec(z)} \right) \qquad (5)

Elements of IG that correspond to boundaries of touching nuclei have larger values (larger gradients), which can be used to identify individual nuclei. We subsequently threshold IG via a thresholding function τ(x, Tm), where τ(x, Tm) = 1 if x ≥ Tm and 0 otherwise. This step is used to highlight and distinguish boundary voxels of individual nuclei from spurious edges that may arise from finding the gradients. The value of Tm affects the number of voxels delineated as boundary voxels: larger values of Tm result in thinner boundaries whereas smaller values of Tm result in thicker nuclei boundaries. We then subtract the result from the binary segmentation mask Imask obtained from the modified 3D U-Net, and apply a non-linear half-wave rectifier/ReLU activation function σ(·) that sets all negative values to 0. We denote the result as Iblob, as given in Eq. (6). The difference Imask − τ(IG, Tm) removes boundary voxels of touching nuclei while emphasizing the interior. Since the difference can be negative in some cases, we use σ(·) to eliminate negative values. Hence, Iblob highlights the interior regions of nuclei, which may be used as markers to perform watershed segmentation46,71.

I^{blob} = \sigma\left( I^{mask} - \tau(I^{G}, T_m) \right) \qquad (6)
I^{mark} = \delta_{t_f}\left( \delta_{t_c}(I^{blob}, B_c),\; B_f \right) \qquad (7)

However, in some cases Iblob may still contain boundary information. To refine it we use 3D conditional erosion with a coarse structuring element Bc and a fine structuring element Bf, as shown in Fig. 12. To achieve this we first identify the connected components72 in Iblob and then iteratively erode each connected component using Bc until its size is smaller than some threshold tc. We then continue eroding each connected component using Bf until its size is smaller than another threshold tf. We denote the final result by Imark, as shown in Eq. (7), where δtc(Iblob, Bc) denotes iterative erosion of all connected components in Iblob by a coarse structuring element Bc until the size of each component is smaller than the coarse object threshold tc. Similarly, δtf(Iblob, Bf) denotes iterative erosion by a fine structuring element, Bf, until the size of the object is smaller than the fine object threshold tf. Finally, marker-controlled watershed46 is used to generate the instance segmentation masks Iseg using Imark. Small objects in Iseg that are less than 20 voxels in size are then removed, and each object is color coded for visualization.

Figure 12.

Figure 12

(a) coarse 3D structuring element Bc and (b) fine 3D structuring element Bf for 3D conditional erosion.
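The following sketch strings together Eqs. (4)-(7): Sobel gradients of the estimated vector field form the gradient map, which is thresholded and subtracted from the semantic mask, and the resulting blobs are eroded into markers for a marker-controlled watershed. The fixed number of binary erosions stands in for the size-gated conditional erosion with Bc and Bf, the gradient magnitude is used in place of the signed gradient, and the inverted image intensity is assumed as the watershed elevation; T_m and the erosion count are placeholders.

```python
# Instance separation following Eqs. (4)-(7); T_m and n_erosions are placeholders,
# and plain binary erosion stands in for the size-gated conditional erosion.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def separate_instances(vec_hat, mask, volume, T_m=0.5, n_erosions=2):
    # maximum Sobel gradient magnitude over the x, y, z vector components (I^G)
    grad_map = np.maximum.reduce(
        [np.abs(ndi.sobel(vec_hat[..., d], axis=d)) for d in range(3)])
    blob = np.clip(mask.astype(np.int8) - (grad_map >= T_m), 0, 1)   # Eq. (6)
    markers, _ = ndi.label(ndi.binary_erosion(blob, iterations=n_erosions))
    seg = watershed(-volume.astype(np.float32), markers, mask=mask.astype(bool))
    return seg
```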

3D nuclei image synthesis

Deep learning methods generally require large amounts of training samples to achieve accurate results. However, manually annotating ground truth is a tedious task and impractical in many situations especially for large 3D microscopy volumes. NISNet3D, like any other trainable segmentation method, can use synthetic 3D volumes, annotated actual 3D volumes, or combinations of both as demonstrated here.

To generate synthetic volumes, we first generate synthetic segmentation masks that are used as ground truth masks, and then translate the synthetic segmentation masks into synthetic microscopy volumes using an unsupervised image-to-image translation model known as SpCycleGAN7.

The way SpCycleGAN works is that we first train SpCycleGAN and then, after it is trained, we can generate 3D volumes with their own inherent ground truth. We do this by giving the trained SpCycleGAN binary segmentation masks that indicate where we want the nuclei to be located in the synthetic image. SpCycleGAN will then generate a synthetic volume with nuclei located where the masks specify. Hence, we have synthetic volumes and their corresponding ground truth. It should be noted that we do not need any ground truth volumes to train SpCycleGAN.

Synthetic segmentation mask generation

To generate a volume of synthetic binary nuclei segmentation masks, we iteratively add N initial binary nuclei to an empty 3D volume of size 128×128×128. Each initial nucleus is modeled as a 3D binary ellipsoid having random size, orientation, and location. The size, orientation, and location of the nth ellipsoid are parameterized by a(n), θ(n), and t(n), respectively, where a(n)=[ax(n),ay(n),az(n)]T denotes a vector of semi-axes lengths of the nth ellipsoid, θ(n)=[θx(n),θy(n),θz(n)]T denotes a vector of rotation angles relative to the X, Y, and Z coordinate axes, and t(n)=[tx(n),ty(n),tz(n)]T denotes the displacement/location vector of the nth ellipsoid relative to the origin. The parameters a(n), θ(n), and t(n) are randomly selected based on observations of nuclei characteristics in actual microscopy volumes.

Let J(x,y,z)nuc,(n) denote the nth initial nucleus at location (x, y, z) having intensity/label n ∈ {1, ..., N}; then

J^{nuc,(n)}_{(x,y,z)} = \begin{cases} n, & \text{if } \left( \frac{x}{a_x^{(n)}} \right)^2 + \left( \frac{y}{a_y^{(n)}} \right)^2 + \left( \frac{z}{a_z^{(n)}} \right)^2 < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (8)

The ellipsoids are assigned different intensity values to differentiate them from each other. Subsequently, each J(x,y,z)nuc,(n) undergoes a random translation given by t(n) followed by random rotations specified by θ(n). Denoting the original coordinates by X and the translated and rotated coordinates by X~, respectively, then

\tilde{X} = \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix} = R_z(\theta_z^{(n)}) \, R_y(\theta_y^{(n)}) \, R_x(\theta_x^{(n)}) \begin{bmatrix} x + t_x^{(n)} \\ y + t_y^{(n)} \\ z + t_z^{(n)} \end{bmatrix} \qquad (9)

where Rx(θx(n)), Ry(θy(n)), and Rz(θz(n)) denote the rotation matrices relative to the X, Y, and Z axes by angles θx(n), θy(n), and θz(n), respectively. Thus, the final nth ellipsoid I(x,y,z)nuc,(n) is given by I(x,y,z)nuc,(n) = J(x~,y~,z~)nuc,(n) for all n ∈ {1, ..., N}. We finally constrain the transformed nuclei such that they do not overlap by more than tov voxels. The ellipsoids I(x,y,z)nuc,(n), n ∈ {1, ..., N}, are then used to fill in a binary volume Ibi of size X×Y×Z.

Since actual nuclei are not strictly ellipsoidal but look more like deformed ellipsoids (see Fig. 2, left column), we use an elastic transformation40,73 to deform Ibi. To achieve this we first generate a coarse displacement vector field, Icoarse, which is an array of size d×d×d×3 whose entries are independent, Gaussian-distributed random variables having zero mean and variance σ² (that is, I(k,x,y,z)coarse ∼ N(0, σ²)). The size parameter d is used to control the amount of deformation applied to the nuclei. Icoarse is then interpolated to size X×Y×Z×3, via spline interpolation74 or bilinear interpolation40, to produce a smooth displacement vector field Ismooth. The entry I(x,y,z,1)smooth indicates the distance by which voxel I(x,y,z)bi will be shifted along the X-axis. Similarly, the elements I(x,y,z,2)smooth and I(x,y,z,3)smooth provide the distances by which I(x,y,z)bi will be shifted along the Y and Z-axes, respectively. In our experiments, we used spline interpolation and values of d ∈ {4, 5, 10}. Each element in Icoarse serves as an anchor point controlling the magnitude of the elastic transform. Larger values of d result in larger deformations being applied through Ismooth whereas smaller d values produce less deformation. Examples of deformed ellipsoids are shown in Fig. 2 (right column).
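A compact sketch of this mask synthesis is given below: random ellipsoids (Eqs. (8)-(9)) are placed into an empty volume and then warped with an interpolated coarse Gaussian displacement field. Overlap checking, the exact rotation conventions, and the parameter ranges of Table 4 are simplified or omitted, so this is an illustration of the procedure rather than the code used to build the datasets.

```python
# Synthetic binary/labeled nuclei masks: random ellipsoids + elastic deformation.
import numpy as np
from scipy import ndimage as ndi
from scipy.spatial.transform import Rotation

def synthetic_masks(size=128, n_nuclei=100, a_range=(4, 10), d=5, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    vol = np.zeros((size, size, size), dtype=np.uint16)
    grid = np.stack(np.meshgrid(*[np.arange(size)] * 3, indexing="ij"), axis=-1)
    for n in range(1, n_nuclei + 1):
        a = rng.uniform(*a_range, size=3)                       # semi-axes a_x, a_y, a_z
        R = Rotation.from_euler("xyz", rng.uniform(0, 180, 3), degrees=True).as_matrix()
        t = rng.uniform(0, size, size=3)                        # centroid location
        local = (grid - t) @ R                                  # rotate into ellipsoid frame
        inside = ((local / a) ** 2).sum(axis=-1) < 1.0          # ellipsoid test, Eq. (8)
        vol[inside] = n                                         # unique label per nucleus
    # coarse d x d x d x 3 Gaussian displacement field, upsampled to full resolution
    coarse = rng.normal(0, sigma, size=(3, d, d, d))
    smooth = np.stack([ndi.zoom(c, size / d, order=3) for c in coarse], axis=-1)
    warped_coords = (grid + smooth).reshape(-1, 3).T
    deformed = ndi.map_coordinates(vol, warped_coords, order=0).reshape(vol.shape)
    return deformed
```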

Synthetic microscopy volume generation

In the previous section, we described how we generate binary segmentation masks. Here we describe how we use the masks to generate synthetic microscopy volumes that have nuclei located as determined by the binary segmentation mask. In particular, we use the unpaired image-to-image translation model known as SpCycleGAN7 for generating synthetic microscopy volumes. SpCycleGAN7 is a variation of CycleGAN44 in which an additional spatial constraint is added to the loss function to maintain the spatial locations of the nuclei.

SpCycleGAN is shown in Fig. 13. The way SpCycleGAN works is that we first train SpCycleGAN and then after it is trained, we can generate 3D volumes with known ground truth. We do this by giving the trained SpCycleGAN binary segmentation masks that indicate where we want the nuclei to be located. SpCycleGAN will then generate a synthetic volume with nuclei located where we told SpCycleGAN we wanted the nuclei. Hence, we have synthetic volumes and their corresponding ground truth. In fact, the synthetic 3D volumes we generate using SpCycleGAN do not need any ground truth annotations at all because SpCycleGAN is an unsupervised and unpaired method. We can create many synthetic ground truth masks and generate many corresponding synthetic microscopy volumes that have nuclei located at the locations we choose using SpCycleGAN. By unpaired we mean that the inputs to SpCycleGAN are volumes of binary segmentation masks and actual microscopy images that do not correspond with each other. In addition, the binary segmentation masks created above are not used as ground truth volumes for performing segmentation of actual microscopy images.

Figure 13.

Figure 13

The architecture of SpCycleGAN that is used for generating synthetic microscopy volumes.

As shown in Fig. 7, we use slices from volumes of the binary segmentation masks, Ibi,(m), m ∈ {1, ..., M}, where M is the number of training samples/volumes, and slices from actual microscopy volumes Iorig,(m), m ∈ {1, ..., M}, for training SpCycleGAN. After training we generate L synthetic microscopy volumes, Isyn,(l), l ∈ {1, ..., L}, using synthetic microscopy segmentation masks, Ibi,(l), l ∈ {1, ..., L}, other than the ones used for training. Note that since SpCycleGAN generates 2D slices, we use the slices to construct a 3D synthetic volume. We acknowledge that this can be a problem because SpCycleGAN may not capture information across the volume stack.
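The slice-wise assembly can be sketched as follows, assuming a trained 2D generator G (a callable that maps one binary mask slice to one synthetic microscopy slice) and a mask volume whose last axis is Z.

```python
# Assemble a 3D synthetic volume from per-slice SpCycleGAN outputs.
import numpy as np

def synthesize_volume(binary_mask_volume, G):
    # translate each XY focal plane independently, then stack along Z
    slices = [G(binary_mask_volume[:, :, z])
              for z in range(binary_mask_volume.shape[2])]
    return np.stack(slices, axis=2)   # X x Y x Z synthetic microscopy volume
```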

SpCycleGAN consists of two generators, G and F, and two discriminators, D1 and D2. G learns the mapping from Ibi to Iorig whereas F performs the reverse mapping. Also, SpCycleGAN introduces a segmentor S for maintaining the spatial location between Ibi and F(G(Ibi)). The entire loss function of SpCycleGAN is:

\mathcal{L}(G,F,S,D_1,D_2) = \mathcal{L}_{GAN}(G,D_1,I^{bi},I^{orig}) + \mathcal{L}_{GAN}(F,D_2,I^{orig},I^{bi}) + \lambda_1 \mathcal{L}_{cycle}(G,F,I^{orig},I^{bi}) + \lambda_2 \mathcal{L}_{spatial}(G,S,I^{orig},I^{bi}) \qquad (10)

where λ1 and λ2 are weight coefficients controlling the loss balance between Lcycle and Lspatial, and

\mathcal{L}_{GAN}(G,D_1,I^{bi},I^{orig}) = \mathbb{E}_{I^{orig}}[\log(D_1(I^{orig}))] + \mathbb{E}_{I^{bi}}[\log(1 - D_1(G(I^{bi})))]
\mathcal{L}_{GAN}(F,D_2,I^{orig},I^{bi}) = \mathbb{E}_{I^{bi}}[\log(D_2(I^{bi}))] + \mathbb{E}_{I^{orig}}[\log(1 - D_2(F(I^{orig})))]
\mathcal{L}_{cycle}(G,F,I^{orig},I^{bi}) = \mathbb{E}_{I^{bi}}\left[\left\| F(G(I^{bi})) - I^{bi} \right\|_1\right] + \mathbb{E}_{I^{orig}}\left[\left\| G(F(I^{orig})) - I^{orig} \right\|_1\right]
\mathcal{L}_{spatial}(G,S,I^{bi},I^{orig}) = \mathbb{E}_{I^{bi}}\left[\left\| S(G(I^{bi})) - I^{bi} \right\|_2\right] \qquad (11)

where ‖·‖_1 and ‖·‖_2 denote the L1 norm and L2 norm, respectively, and E_I is the expected value over all the input volumes in a batch to the network.
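As an illustration of the spatial constraint, the sketch below computes Lspatial for a batch of binary masks, assuming G and S are PyTorch modules; the mean squared error is used here as a stand-in for the L2 penalty in Eq. (11).

```python
# SpCycleGAN spatial constraint: push the segmentor's output on the generated
# image back toward the input binary mask, keeping nuclei at the mask locations.
import torch

def spatial_loss(G, S, mask_batch):
    fake_microscopy = G(mask_batch)         # synthetic microscopy image from the mask
    recovered_mask = S(fake_microscopy)     # segmentor's estimate of the mask
    return torch.mean((recovered_mask - mask_batch) ** 2)   # L2-style penalty
```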

We use 512 slices of V1, 512 slices of V2, 4096 slices of V3, and 2048 slices of V4 for training SpCycleGAN.

A limitation of our method is that we need to model the nuclei shape in the synthetic binary mask as closely as possible to the real nuclei in the microscopy volume. This way SpCycleGAN can generate more realistic microscopy images.

Comparison methods

We compared NISNet3D with several deep learning image segmentation methods including VNet24, 3D U-Net23, Cellpose3, DeepSynth6, nnU-Net45 and StarDist3D15. In addition, we also compare NISNet3D with several commonly used biomedical image analysis tools and software including 3D Watershed46, Squassh47, and Cellprofiler’s “IdentifyPrimaryObject” module48. We also compare to a blob-slice method using the 2D watershed transform with 3D connected components from VTEA54. In the cases of software packages Cellprofiler and VTEA, we configured them for segmentation using our data sets. The details are described in the next subsection.

VNet24 and 3D U-Net23 are two popular 3D encoder-decoder networks with shortcut concatenations designed for biomedical image segmentation. Cellpose3 uses a modified 2D U-Net for estimating image segmentation and spatial flows, and uses a dynamic system3 to cluster pixels and further separate touching nuclei. When segmenting 3D volumes, Cellpose works from three different directions slice by slice and combines the 2D segmentation results into a 3D segmentation volume3. DeepSynth uses a modified 3D U-Net to segment 3D microscopy volumes and the watershed transform to separate touching nuclei6. nnU-Net45 is a self-configuring U-Net-based method that mainly does semantic segmentation for biomedical applications and uses connected component analysis of the center class, followed by iterative outgrowing into the nuclei border region on the semantic segmentation results. Similarly, StarDist3D uses a modified 3D U-Net to estimate the star-convex polyhedra used to represent nuclei15.

Alternative non-deep learning techniques such as 3D Watershed46 use the watershed transformation71 and conditional erosion46 for nuclei segmentation. Similarly, Squassh, an ImageJ plugin for both 2D and 3D microscopy image segmentation, uses active contours47. CellProfiler is an image processing toolbox that provides customized image processing and analysis modules48. We used CellProfiler's “IdentifyPrimaryObject” module for our study. VTEA is an ImageJ plugin that combines various approaches including Otsu's thresholding and the watershed transform to segment 2D nuclei slice by slice and reconstruct the results into a 3D volume54.

For comparison purposes, we trained and evaluated the methods above using the same dataset as used for NISNet3D. We also used the same training and evaluation strategies used for NISNet3D as described in “Experimental settings”. This means that the methods that required training were also trained on synthetic data. We provide a discussion of the results in “Discussion”. Note that the 3D Watershed transform, Squassh, CellProfiler, and VTEA do not need to be trained because they use more traditional image analysis techniques.

nnU-Net is primarily used for semantic segmentation in biomedical applications36,45. In order to obtain nuclei instance segmentation, nnU-Net uses post-processing with connected component analysis of the center class, followed by iterative outgrowing into the nuclei border region from the semantic segmentation results. We used nnU-Net's post-processing steps to obtain the instance segmentation results presented. We observe that nnU-Net has difficulty separating touching nuclei on an object-level basis. This is due to the connected component analysis during the post-processing, where touching nuclei will be grouped as one object.

Experimental settings

The parameters used for generating Ibi are provided in Table 4 where (amin, amax) is the range of the ellipsoid semi-axes lengths, tov is the maximum allowed overlapping voxels between two nuclei, and N is the total number of nuclei in a synthetic volume. These parameters are based on visual inspection of nuclei characteristics in actual microscopy volumes.

Table 4.

The synthetic volume generation parameters and the training and evaluation strategies used for NISNet3D and the comparison methods. The left columns (amin–σ) are the parameters used for generating synthetic microscopy volumes; the right columns give the NISNet3D-slim training and evaluation strategies.

| Volume name | amin | amax | tov | N | d | σ | Version | Training | Evaluation |
| V1 | 4 | 8 | 5 | 500 | 10 | 1 | M1 | 50 volumes of synthetic V1 | Tested on 1 subvolume of V1 |
| V2 | 10 | 14 | 10 | 70 | 4 | 1 | M2 | 50 volumes of synthetic V2 | Tested on 16 subvolumes of V2 |
| V3 | 6 | 10 | 200 | 200 | 5 | 2 | M3 | 50 volumes of synthetic V3 | 3-fold cross-validated on 9 subvolumes of V3 |
| V4 | 8 | 10 | 100 | 560 | 4 | 4 | M4 | Lightly retrained using M3 on 1 subvolume of V4 | Tested on 4 subvolumes of V4 |
| V5 | – | – | – | – | – | – | M5 | 27 subvolumes of V5 | Tested on 27 subvolumes of V5 |

SpCycleGAN was trained on unpaired Ibi and Iorig and the trained model was used to generate synthetic microscopy volumes. The weight coefficients of the loss functions were set to λ1=λ2=10 for the SpCycleGAN7. The synthetic microscopy volumes were verified by a biologist (co-author Dunn).

For the 3D Watershed transform, we used a 3D Gaussian filter to preprocess the images and used Otsu’s method to segment/isolate objects from the background of each volume. Subsequently the 3D conditional erosion described in “3D nuclei instance segmentation” was used to obtain the markers, and the marker-controlled 3D watershed method, which was implemented by the Python Scikit-image library, was used to separate touching nuclei.

For CellProfiler, customized image processing modules including inhomogeneity correction, median filtering, and morphological erosion were used to preprocess the images, and the default “IdentifyPrimaryObject” module was used to obtain 2D segmentation masks for each slice. “IdentifyPrimaryObjects” uses the 2D watershed transform segmentation for identification of objects in 3D biological structures48.

With regards to Squassh47, we used “Background subtraction” with a manually tuned “rolling ball window size” parameter. The rest of the parameters were set to their default values. We observed that Squassh did fairly well on volume V2 but totally failed on V4 due to the densely clustered nuclei. For VTEA54, a Gaussian filter and background subtraction were used to preprocess the image. The object building method in VTEA was set to “Connect 3D”, and the segmentation threshold was determined automatically. We manually tuned the parameters “Centroid offset”, “Min vol”, and “Max vol” to obtain the best visual segmentation results. Finally, the watershed transform was used for 2D segmentation. The 2D segmentations were then merged into a 3D segmentation volume using a blob-slice method54,75. For Cellpose, we used the “nuclei” style and, since the training of Cellpose is limited to 2D images, we trained Cellpose on every XY focal plane of our subvolumes following the training schemes given in Table 4. For nnU-Net, we trained nnU-Net for 500 epochs using the “3d_fullres” configuration.

For VNet24, 3D U-Net23, and DeepSynth6, we improved the segmentation results by applying our 3D conditional erosion described in “3D nuclei instance segmentation”, followed by the 3D marker-controlled watershed transform, to separate touching nuclei.

Both SpCycleGAN and NISNet3D were implemented using PyTorch. We used a 9-block ResNet for the generators G and F and for the segmentor S (see Fig. 13). The discriminators D1 and D2 (Fig. 13) were implemented with the “PatchGAN” classifier76. SpCycleGAN was trained with the Adam optimizer77 for 200 epochs with an initial learning rate of 0.0002 that linearly decays to 0 after the first 100 epochs. Figure 2.3 depicts the generated synthetic nuclei segmentation masks and the corresponding synthetic microscopy images.
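
The learning-rate schedule described above (constant for the first 100 epochs, then linear decay to zero) can be sketched in PyTorch as follows; the network and the Adam betas here are placeholders rather than the actual SpCycleGAN configuration:

```python
import torch

# Placeholder network standing in for the SpCycleGAN generators/discriminators.
net = torch.nn.Conv3d(1, 1, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(net.parameters(), lr=0.0002, betas=(0.5, 0.999))

total_epochs, constant_epochs = 200, 100

def lr_lambda(epoch):
    # Constant for the first 100 epochs, then linear decay to 0 by epoch 200.
    if epoch < constant_epochs:
        return 1.0
    return 1.0 - (epoch - constant_epochs) / float(total_epochs - constant_epochs)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)

for epoch in range(total_epochs):
    # ... training iterations for this epoch ...
    scheduler.step()
```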

As mentioned above, we investigated two versions of NISNet3D for different application scenarios: NISNet3D-slim (which includes 5 models, M1–M5) and NISNet3D-synth (which includes 1 model, Msyn). NISNet3D-synth is designed for the situation where no ground truth annotations are available. In contrast, NISNet3D-slim is used for the case where a limited number of ground truth annotated volumes are available: synthetic volumes are used for training, and a small amount of actual ground truth data is used for light retraining. We trained NISNet3D using both strategies to show the effectiveness of our approach. We trained 5 versions of NISNet3D-slim, denoted M1–M5, using three training methods (see Table 4). First, we trained three models, M1, M2, and M3, using the synthetic versions of the three volumes V1–V3, respectively. Next, we transferred the weights from M3 and continued training on a limited number of actual microscopy subvolumes of V4 to produce the fourth model, M4. Finally, we trained M5 directly on subvolumes of the actual microscopy data V5 only. After training, we evaluated the NISNet3D-slim models M1–M5 using two different evaluation schemes, summarized in Supplementary Figure S3. In the first scheme we directly tested the models M1, M2, M4, and M5 using all the subvolumes of the actual microscopy volumes V1, V2, V4, and V5, respectively, since these subvolumes were not used for training. In the case of M3, we first trained on 50 volumes of synthetic V3 to obtain a pre-trained model M3pre.

For the second scheme we used 3-fold cross-validation to lightly retrain M3pre and obtain the final M3. Specifically, we randomly shuffled the 9 subvolumes of V3 and divided them into 3 equal sets of 3 subvolumes each, and then iteratively retrained M3pre on one of the sets and tested on the other two sets. We report the average of the evaluation results from the three iterations. We used cross-validation to show the effectiveness of our method when the evaluation data are limited. Note that when lightly retraining M3pre we update all of its parameters while continuing to train on actual microscopy volumes. The training and evaluation scheme for all models is provided in Table 4.
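
A minimal sketch of this fold construction (the subvolume identifiers and the retraining/evaluation helpers are hypothetical):

```python
import random

# Hypothetical identifiers for the 9 annotated subvolumes of V3.
subvolumes = [f"V3_sub{i}" for i in range(9)]
random.seed(0)
random.shuffle(subvolumes)
folds = [subvolumes[0:3], subvolumes[3:6], subvolumes[6:9]]

results = []
for k, retrain_set in enumerate(folds):
    # Lightly retrain on one set of 3 subvolumes, test on the other 6.
    test_set = [s for i, fold in enumerate(folds) if i != k for s in fold]
    # model_k = lightly_retrain(m3_pre, retrain_set)   # hypothetical helper
    # results.append(evaluate(model_k, test_set))      # hypothetical helper
    print(f"fold {k}: retrain on {retrain_set}, test on {test_set}")
# final_score = sum(results) / len(results)            # average of the 3 iterations
```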

We then investigated how all the methods perform when there is no ground truth data for training. For this we trained a version of NISNet3D referred to as NISNet3D-synth in these experiments. NISNet3D-synth, Msyn, was trained on 950 synthetic microscopy subvolumes that include the synthetic versions of V1–V4. Our experience has been that more than 800 synthetic microscopy subvolumes are needed to train on the types of 3D volumes described in Table 1. We then tested Msyn separately on each manually annotated original microscopy dataset described in Table 1. A figure summarizing the training schemes of NISNet3D-synth and NISNet3D-slim can be found in Supplementary Figure S3.

Both NISNet3D-slim and NISNet3D-synth were trained for 100 epochs using the Adam optimizer77 with a constant learning rate of 0.001. The weight coefficients of the loss function were set to λ3=1, λ4=10, and λ5=10 based on the settings in SpCycleGAN7,78; the hyper-parameters α1 and α2 in LTL were set to 0.3 and 0.7 based on the highest-performing configuration70; and the hyper-parameters β and γ of LFL79 were set to 0.8 and 2 to minimize the training losses (a sketch of these standard loss forms follows Table 5). The nuclei instance segmentation parameters used in our experiments are provided in Table 5. These parameters control how aggressively touching nuclei are split: a lower Tm splits touching nuclei more readily, and smaller nuclei call for smaller values of Tc and Tf. We chose these parameters based on feedback on segmentation quality (the ability to visually separate touching nuclei) from a biologist.

Table 5.

Parameters used for NISNet3D nuclei instance segmentation.

Parameter | NISNet3D-slim: V1 | V2 | V3 | V4 | V5 | NISNet3D-synth: V1 | V2 | V3 | V4
Tm | 5 | 5 | 1 | 1 | 0 | 0 | 0 | 4 | 0
Tc | 700 | 3000 | 2000 | 2000 | 2000 | 2000 | 2000 | 2000 | 2000
Tf | 200 | 500 | 700 | 300 | 200 | 200 | 200 | 500 | 200
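
The Tversky70 and focal79 loss terms referred to above follow standard formulations; the sketch below shows these standard forms only (not the exact NISNet3D loss code), assuming β acts as the class-balance weight and γ as the focusing parameter:

```python
import torch

def tversky_loss(pred, target, alpha1=0.3, alpha2=0.7, eps=1e-6):
    """Standard Tversky loss: alpha1 weights false positives, alpha2 weights
    false negatives. pred and target are voxel probabilities in [0, 1]."""
    p, g = pred.flatten(), target.flatten()
    tp = (p * g).sum()
    fp = (p * (1 - g)).sum()
    fn = ((1 - p) * g).sum()
    return 1 - tp / (tp + alpha1 * fp + alpha2 * fn + eps)

def focal_loss(pred, target, beta=0.8, gamma=2.0, eps=1e-6):
    """Standard binary focal loss with class-balance weight beta and
    focusing parameter gamma."""
    p = pred.clamp(eps, 1 - eps)
    loss_pos = -beta * (1 - p) ** gamma * torch.log(p) * target
    loss_neg = -(1 - beta) * p ** gamma * torch.log(1 - p) * (1 - target)
    return (loss_pos + loss_neg).mean()
```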

Evaluation metrics

We use object-based metrics to evaluate nuclei instance segmentation accuracy. We define $N_{TP}^{t}$ as the number of True Positive detections for which the Intersection-over-Union (IoU)80 between a detected nucleus and a ground truth nucleus is greater than a threshold $t$. Similarly, $N_{FP}^{t}$ and $N_{FN}^{t}$ are the numbers of False Positive and False Negative detections81,82 at IoU threshold $t$, respectively. $N_{TP}^{t}$ measures how many nuclei in a volume are correctly detected; the higher the value of $N_{TP}^{t}$, the more accurate the detection. $N_{FP}^{t}$ counts detections that do not correspond to actual nuclei, and $N_{FN}^{t}$ counts nuclei that were not detected. A precise and accurate detection method should have a high $N_{TP}^{t}$ but low $N_{FP}^{t}$ and $N_{FN}^{t}$ values.

Based on $N_{TP}^{t}$, $N_{FP}^{t}$, and $N_{FN}^{t}$ we define the following metrics, which are used to reduce bias56: the mean Precision $mP = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} \frac{N_{TP}^{t}}{N_{TP}^{t} + N_{FP}^{t}}$, the mean Recall $mR = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} \frac{N_{TP}^{t}}{N_{TP}^{t} + N_{FN}^{t}}$, and the mean F1 score $mF1 = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} \frac{2 N_{TP}^{t}}{2 N_{TP}^{t} + N_{FP}^{t} + N_{FN}^{t}}$, which are the means of Precision, Recall, and the F1 score, respectively, over a set of IoU thresholds $T_{IoUs}$. We set $T_{IoUs} = \{0.25, 0.30, \ldots, 0.45\}$ for datasets V1–V4, and $T_{IoUs} = \{0.5, 0.55, \ldots, 0.75\}$ for dataset V5. In addition, we obtained the Average Precision (AP)61,62, a commonly used object detection metric, by estimating the area under the Precision-Recall curve63 using the same sets of thresholds $T_{IoUs}$. For example, AP.25 is the average precision evaluated at an IoU threshold of 0.25. The mean Average Precision (mAP) is then obtained as $mAP = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} AP_{t}$.
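
A small sketch of how mP, mR, and mF1 follow from the per-threshold counts (the counts below are made up purely for illustration):

```python
import numpy as np

def mean_detection_metrics(counts, thresholds):
    """counts maps an IoU threshold t to (N_TP, N_FP, N_FN) for a volume;
    returns mean precision, mean recall, and mean F1 over the threshold set."""
    precisions, recalls, f1s = [], [], []
    for t in thresholds:
        tp, fp, fn = counts[t]
        precisions.append(tp / (tp + fp))
        recalls.append(tp / (tp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return np.mean(precisions), np.mean(recalls), np.mean(f1s)

# Illustrative counts at the thresholds used for V1-V4.
thresholds = [0.25, 0.30, 0.35, 0.40, 0.45]
counts = {t: (90 - i * 5, 10 + i * 2, 10 + i * 3) for i, t in enumerate(thresholds)}
print(mean_detection_metrics(counts, thresholds))
```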

The reason for using different thresholds $T_{IoUs}$ for V1–V4 compared to those used for V5 is that throughout the experiments we observed that the nuclei in datasets V1–V4 are more challenging to segment than the nuclei in V5. Using the same IoU thresholds for evaluating all the datasets resulted in a lower evaluation accuracy for the volumes V1–V4 than for V5. Thus, we chose two different sets of IoU thresholds for V1–V4 and V5, respectively.

Finally, we use the Aggregated Jaccard Index (AJI)55 to integrate object-level and voxel-level errors. The AJI is defined as:

$$\mathrm{AJI} = \frac{\sum_{i=1}^{N} \left| G_i \cap S_{m_i} \right|}{\sum_{i=1}^{N} \left| G_i \cup S_{m_i} \right| + \sum_{j \in U} \left| S_j \right|} \qquad (12)$$

where $G_i$ denotes the $i$th nucleus in the ground truth volume $G$, which contains a total of $N$ nuclei, $U$ is the set of segmented nuclei that have no corresponding ground truth nucleus, and $S_{m_i}$ is the $m_i$th connected component of the segmentation mask, i.e., the segmented nucleus with the largest Jaccard index (similarity) with $G_i$. Note that each segmented nucleus (connected component with index $m$) cannot be used more than once.
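
A straightforward, unoptimized sketch of this computation for labeled volumes follows Eq. (12); as a simplification, reuse of a segmented nucleus is only accounted for in the unmatched term:

```python
import numpy as np

def aggregated_jaccard_index(gt, pred):
    """AJI for labeled volumes (0 = background). Each ground-truth nucleus is
    matched to the predicted nucleus with the largest Jaccard index; unmatched
    predicted nuclei are added to the denominator."""
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    used = set()
    numerator = denominator = 0
    for i in gt_ids:
        g = gt == i
        # Predicted labels overlapping this ground-truth nucleus.
        candidates = [j for j in np.unique(pred[g]) if j != 0]
        best_j, best_jacc = None, -1.0
        best_inter, best_union = 0, g.sum()  # no match: union is just |G_i|
        for j in candidates:
            s = pred == j
            inter = np.logical_and(g, s).sum()
            union = np.logical_or(g, s).sum()
            if inter / union > best_jacc:
                best_j, best_jacc = j, inter / union
                best_inter, best_union = inter, union
        numerator += best_inter
        denominator += best_union
        if best_j is not None:
            used.add(best_j)
    for j in pred_ids:
        if j not in used:
            denominator += (pred == j).sum()
    return numerator / denominator if denominator else 0.0
```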

The values of all the parameters for each method were chosen to achieve the best visual results, as discussed in “Discussion”. The values of the various metrics are given in Tables 2 and 3 for each microscopy dataset. Figure 6 shows the AP scores at multiple IoU thresholds for each subvolume of datasets V1–V5. The orthogonal views (XY and XZ focal planes) of the segmentation masks are overlaid on the original microscopy subvolumes for each method on V1–V5, as shown in Fig. 4 and Supplementary Figures S4 and S5. Note that the different colors correspond to different nuclei.

Test microscopy volumes and ground truth annotations

We used four different actual microscopy 3D volumes in our experiments, denoted V1–V4, containing fluorescently labeled (DAPI or Hoechst 33342 stain) nuclei collected using confocal microscopy from cleared rat kidneys (Scale37 and BABB38), rat livers, and cleared mouse intestines. These volumes were used to demonstrate the performance of NISNet3D, to generate training data for NISNet3D-slim, and to generate annotations that provide the ground truth for quantitative evaluation of NISNet3D and other segmentation methods. V1 differs from V3 in that V1 includes fluorescent objects that are not nuclei and must therefore be distinguished from nuclei37. In addition, we used a publicly available electron microscopy zebrafish brain volume, V5, known as NucMM83. We used the NucMM dataset and its provided ground truth to demonstrate that NISNet3D works not only on fluorescence microscopy volumes but also with other high-resolution imaging modalities.

Acknowledgements

This research was partially supported by a George M. O’Brien Award from the National Institutes of Health under Grant NIH/NIDDK P30 DK079312 and the endowment of the Charles William Harrison Distinguished Professorship at Purdue University. The original image volumes used in this paper were provided by Malgorzata Kamocka, Sherry Clendenon, and Michael Ferkowicz at Indiana University. We gratefully acknowledge their cooperation.

Author contributions

L.W., E.D., and P.S. conceived and designed NISNet3D. L.W. and A.C. designed the convolutional network and developed the segmentation approach, generated the synthetic data, conducted the segmentations, generated ground truth data, and conducted the comparisons with alternative methods. L.W., P.S., A.C., and E.D. wrote the manuscript. L.W., A.C., P.S., S.W., K.D., and E.D. revised the manuscript.

Data availability

The datasets analyzed during the current study are available in the Zenodo repository https://doi.org/10.5281/zenodo.706514739.

Code availability

The NISNet3D source code package is available at https://gitlab.com/viper-purdue/nisnet-release. The source code is released under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) license and cannot be used for commercial purposes. Address all correspondence to Edward J. Delp, ace@ecn.purdue.edu.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-023-36243-9.

References

  • 1.Lucas AM, et al. Open-source deep-learning software for bioimage segmentation. Mol. Biol. Cell. 2021;32:823–829. doi: 10.1091/mbc.E20-10-0660. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Piccinini F, et al. Software tools for 3d nuclei segmentation and quantitative analysis in multicellular aggregates. Comput. Struct. Biotechnol. J. 2020;18:1287–1300. doi: 10.1016/j.csbj.2020.05.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Stringer C, Wang T, Michaelos M, Pachitariu M. Cellpose: A generalist algorithm for cellular segmentation. Nat. Methods. 2021;18:100–106. doi: 10.1038/s41592-020-01018-x. [DOI] [PubMed] [Google Scholar]
  • 4.Greenwald NF, et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. 2022;40:555–565. doi: 10.1038/s41587-021-01094-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Kromp F, et al. Evaluation of deep learning architectures for complex immunofluorescence nuclear image segmentation. IEEE Trans. Med. Imaging. 2021;40:1934–1949. doi: 10.1109/TMI.2021.3069558. [DOI] [PubMed] [Google Scholar]
  • 6.Dunn KW, et al. Deepsynth: Three-dimensional nuclear segmentation of biological images using neural networks trained with synthetic data. Sci. Rep. 2019;9:18295–18309. doi: 10.1038/s41598-019-54244-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Fu, C. et al. Three dimensional fluorescence microscopy image synthesis and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2302–2310 (2018). Salt Lake City, UT.
  • 8.Yushkevich PA, et al. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage. 2006;31:1116–1128. doi: 10.1016/j.neuroimage.2006.01.015. [DOI] [PubMed] [Google Scholar]
  • 9.Berger DR, Seung HS, Lichtman JW. Vast (volume annotation and segmentation tool): Efficient manual and semi-automatic labeling of large 3d image stacks. Front. Neural Circ. 2018;12:88. doi: 10.3389/fncir.2018.00088. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Hollandi R, Diósdi Á, Hollandi G, Moshkov N, Horváth P. Annotatorj: An imagej plugin to ease hand annotation of cellular compartments. Mol. Biol. Cell. 2020;31:2179–2186. doi: 10.1091/mbc.E20-02-0156. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Borland D, et al. Segmentor: A tool for manual refinement of 3d microscopy annotations. BMC Bioinform. 2021;22:1–12. doi: 10.1186/s12859-021-04202-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Wang, J. & Perez, L. The effectiveness of data augmentation in image classification using deep learning. arXiv:1712.04621 (arXiv preprint) (2017).
  • 13.Mikołajczyk A, Grochowski M. Data augmentation for improving deep learning in image classification problem. Int. Interdiscip. PhD Works. 2018;20:117–122. [Google Scholar]
  • 14.Yang L, et al. Nuset: A deep learning tool for reliably separating and analyzing crowded cells. PLoS Comput. Biol. 2020;16:e1008193. doi: 10.1371/journal.pcbi.1008193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Weigert, M., Schmidt, U., Haase, R., Sugawara, K. & Myers, G. Star-convex polyhedra for 3d object detection and segmentation in microscopy. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision 3666–3673 (2020).
  • 16.Sadanandan SK, Ranefall P, Le Guyader S, Wählby C. Automated training of deep convolutional neural networks for cell segmentation. Sci. Rep. 2017;7:1–7. doi: 10.1038/s41598-017-07599-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Caicedo JC, et al. Evaluation of deep learning strategies for nucleus segmentation in fluorescence images. Cytometry A. 2019;95:952–965. doi: 10.1002/cyto.a.23863. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Baniukiewicz P, Lutton EJ, Collier S, Bretschneider T. Generative adversarial networks for augmenting training data of microscopic cell images. Front. Comput. Sci. 2019;10:25. [Google Scholar]
  • 19.Goodfellow IJ, et al. Generative adversarial networks. IEEE Signal Process. Mag. 2014;1406:2661. [Google Scholar]
  • 20.Liu L, et al. Deep learning for generic object detection: A survey. Int. J. Comput. Vision. 2020;128:261–318. [Google Scholar]
  • 21.Carneiro G, Zheng Y, Xing F, Yang L. Review of deep learning methods in mammography, cardiovascular, and microscopy image analysis. Deep Learn. Convolut. Neural Netw. Med. Image Comput. 2017;20:11–32. [Google Scholar]
  • 22.Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. Med. Image Comput. Comput. Assist. Intervention. 2015;9351:231–241. [Google Scholar]
  • 23.Çiçek Ö, Abdulkadir A, Lienkamp S, Brox T, Ronneberger O. 3D u-net: Learning dense volumetric segmentation from sparse annotation. Med. Image Comput. Comput. Assist. Intervention. 2016;9901:424–432. [Google Scholar]
  • 24.Milletari, F., Navab, N. & Ahmadi, S. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In International Conference on 3D Vision 565–571 (2016).
  • 25.Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV. Voxelmorph: A learning framework for deformable medical image registration. IEEE Trans. Med. Imaging. 2019;38:1788–1800. doi: 10.1109/TMI.2019.2897538. [DOI] [PubMed] [Google Scholar]
  • 26.Graham S, et al. Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med. Image Anal. 2019;58:101563. doi: 10.1016/j.media.2019.101563. [DOI] [PubMed] [Google Scholar]
  • 27.Fu, C. et al. Nuclei segmentation of fluorescence microscopy images using convolutional neural networks. In Proceedings of the IEEE International Symposium on Biomedical Imaging 704–708 (2017).
  • 28.Ho, D. J., Fu, C., Salama, P., Dunn, K. W. & Delp, E. J. Nuclei segmentation of fluorescence microscopy images using three dimensional convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition Workshops 834–842 (2017).
  • 29.Schmidt, U., Weigert, M., Broaddus, C. & Myers, G. Cell detection with star-convex polygons. In Medical Image Computing and Computer Assisted Intervention 265–273 (2018).
  • 30.Ho DJ, et al. Sphere estimation network: Three-dimensional nuclei detection of fluorescence microscopy images. J. Med. Imaging. 2020;7:1–16. doi: 10.1117/1.JMI.7.4.044003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Lux, F. & Matula, P. Dic image segmentation of dense cell populations by combining deep learning and watershed. In IEEE International Symposium on Biomedical Imaging 236–239 (2019).
  • 32.Arbelle A, Cohen S, Raviv TR. Dual-task convlstm-unet for instance segmentation of weakly annotated microscopy videos. IEEE Trans. Med. Imaging. 2022;41:1948–1960. doi: 10.1109/TMI.2022.3152927. [DOI] [PubMed] [Google Scholar]
  • 33.Bao, R., Al-Shakarji, N. M., Bunyak, F. & Palaniappan, K. Dmnet: Dual-stream marker guided deep network for dense cell segmentation and lineage tracking. In IEEE International Conference on Computer Vision Workshops 3354–3363 (2021). [DOI] [PMC free article] [PubMed]
  • 34.Scherr T, Löffler K, Böhland M, Mikut R. Cell segmentation and tracking using cnn-based distance predictions and a graph-based matching strategy. PLoS One. 2020;15:e0243219. doi: 10.1371/journal.pone.0243219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Mandal, S. & Uhlmann, V. Splinedist: Automated cell segmentation with spline curves. In Proceedings of the International Symposium on Biomedical Imaging 1082–1086 (2021).
  • 36.Ruiz-Santaquiteria J, Bueno G, Deniz O, Vallez N, Cristobal G. Semantic versus instance segmentation in microscopic algae detection. Eng. Appl. Artif. Intell. 2020;87:103271. [Google Scholar]
  • 37.Hama H, et al. Scale: A chemical approach for fluorescence imaging and reconstruction of transparent mouse brain. Nat. Neurosci. 2011;14:1481–1488. doi: 10.1038/nn.2928. [DOI] [PubMed] [Google Scholar]
  • 38.Clendenon SG, Young PA, Ferkowicz M, Phillips C, Dunn KW. Deep tissue fluorescent imaging in scattering specimens using confocal microscopy. Microsc. Microanal. 2011;17:614–617. doi: 10.1017/S1431927611000535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Chen, A. et al. 3d ground truth annotations of nuclei in 3d microscopy volumes. bioRxiv (2022).
  • 40.Simard, P. Y., Steinkraus, D. & Platt, J. C. Best practices for convolutional neural networks applied to visual document analysis. In Proceedings of the International Conference on Document Analysis and Recognition 958–963 (2003).
  • 41.Chen, A. et al. Three dimensional synthetic non-ellipsoidal nuclei volume generation using Bezier curves. In Proceedings of the IEEE International Symposium on Biomedical Imaging (2021).
  • 42.Wu, L. et al. Rcnn-slicenet: A slice and cluster approach for nuclei centroid detection in three-dimensional fluorescence microscopy images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 3750–3760 (2021).
  • 43.Wu, L., Chen, A., Salama, P., Dunn, K. W. & Delp, E. J. An ensemble learning and slice fusion strategy for three-dimensional nuclei instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2022).
  • 44.Zhu, J., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision 2242–2251 (2017).
  • 45.Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnu-net: A self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods. 2021;18:203–211. doi: 10.1038/s41592-020-01008-z. [DOI] [PubMed] [Google Scholar]
  • 46.Yang X, Li H, Zhou X. Nuclei segmentation using marker-controlled watershed, tracking using mean-shift, and Kalman filter in time-lapse microscopy. IEEE Trans. Circ. Syst. I Regul. Pap. 2006;53:2405–2414. [Google Scholar]
  • 47.Rizk A, et al. Segmentation and quantification of subcellular structures in fluorescence microscopy images using squassh. Nat. Protoc. 2014;9:586–596. doi: 10.1038/nprot.2014.037. [DOI] [PubMed] [Google Scholar]
  • 48.McQuin C, et al. Cellprofiler 3.0: Next-generation image processing for biology. PLoS Biol. 2018;16:e2005970-1-17. doi: 10.1371/journal.pbio.2005970. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Hirsch, P. & Kainmueller, D. An auxiliary task for learning nuclei segmentation in 3d microscopy images. In Proceedings of the Third Conference on Medical Imaging with Deep Learning, vol. 121 of Proceedings of Machine Learning Research (Arbel, T. et al. eds) 304–321 (2020).
  • 50.Englbrecht F, Ruider IE, Bausch AR. Automatic image annotation for fluorescent cell nuclei segmentation. PLoS One. 2021;16:1–13. doi: 10.1371/journal.pone.0250093. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Lee MY, et al. Cellseg: A robust, pre-trained nucleus segmentation and pixel quantification software for highly multiplexed fluorescence images. BMC Bioinform. 2022;23:46. doi: 10.1186/s12859-022-04570-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Cutler KJ, et al. Omnipose: A high-precision morphology-independent solution for bacterial cell segmentation. Nat. Methods. 2022;19:1438–1448. doi: 10.1038/s41592-022-01639-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Mougeot G, et al. Deep learning—promises for 3D nuclear imaging: a guide for biologists. J. Cell Sci. 2022 doi: 10.1242/jcs.258986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Winfree S, et al. Quantitative three-dimensional tissue cytometry to study kidney tissue and resident immune cells. J. Am. Soc. Nephrol. 2017;28:2108–2118. doi: 10.1681/ASN.2016091027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Kumar N, et al. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Trans. Med. Imaging. 2017;36:1550–1560. doi: 10.1109/TMI.2017.2677499. [DOI] [PubMed] [Google Scholar]
  • 56.Hosang J, Benenson R, Dollár P, Schiele B. What makes for effective detection proposals? IEEE Trans. Pattern Anal. Mach. Intell. 2016;38:814–830. doi: 10.1109/TPAMI.2015.2465908. [DOI] [PubMed] [Google Scholar]
  • 57.Shen X, Stamos I. 3d object detection and instance segmentation from 3d range and 2d color images. Sensors. 2021 doi: 10.3390/s21041213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Basu A, Senapati P, Deb M, Rai R, Dhal KG. A survey on recent trends in deep learning for nucleus segmentation from histopathology images. Evol. Syst. 2023 doi: 10.1007/s12530-023-09491-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Schürch CM, et al. Coordinated cellular neighborhoods orchestrate antitumoral immunity at the colorectal cancer invasive front. Cell. 2020;182:1341–1359.e19. doi: 10.1016/j.cell.2020.07.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Caicedo JC, et al. Nucleus segmentation across imaging experiments: The 2018 data science bowl. Nat. Methods. 2019;16:1247–1253. doi: 10.1038/s41592-019-0612-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Everingham M, et al. The pascal visual object classes challenge: A retrospective. Int. J. Comput. Vision. 2015;111:98–136. [Google Scholar]
  • 62.Padilla, R., Netto, S. L. & da Silva, E. A. B. A survey on performance metrics for object-detection algorithms. In Proceedings of the International Conference on Systems, Signals and Image Processing 237–242 (2020).
  • 63.Davis, J. & Goadrich, M. The relationship between precision-recall and roc curves. In Proceedings of the international conference on Machine learning 233–240 (2006).
  • 64.Winfree S. User-accessible machine learning approaches for cell segmentation and analysis in tissue. Front. Physiol. 2022 doi: 10.3389/fphys.2022.833333. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
  • 66.Oktay, O. et al. Attention u-net: Learning where to look for the pancreas. arXiv:1804.03999 (arXiv preprint) (2018).
  • 67.Han, S. et al. Nuclei counting in microscopy images with three dimensional generative adversarial networks. In Proceedings of the SPIE Conference on Medical Imaging 10949, 753–763 (2019).
  • 68.Zhang, D. et al. Nuclei instance segmentation with dual contour-enhanced adversarial network. In Proceedings of the IEEE International Symposium on Biomedical Imaging 409–412 (2018).
  • 69.Lin, T., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision 2999–3007 (2017).
  • 70.Salehi, M., Sadegh, S., Erdogmus, D. & Gholipour, A. Tversky loss function for image segmentation using 3d fully convolutional deep networks. In Proceedings of the International Workshop on Machine Learning in Medical Imaging 379–387 (2017).
  • 71.Soille, P. & Vincent, L. M. Determining watersheds in digital pictures via flooding simulations. In Visual Communications and Image Processing ’90: Fifth in a Series 1360, 240–250 (1990).
  • 72.Di Stefano, L. & Bulgarelli, A. A simple and efficient connected components labeling algorithm. In Proceedings of the International Conference on Image Analysis and Processing 322–327 (1999).
  • 73.Svoboda D, Kozubek M, Stejskal S. Generation of digital phantoms of cell nuclei and simulation of image formation in 3d image cytometry. Cytometry Part A J. Int. Soc. Adv. Cytometry. 2009;75:494–509. doi: 10.1002/cyto.a.20714. [DOI] [PubMed] [Google Scholar]
  • 74.McKinley S, Levine M. Cubic spline interpolation. Coll. Redwoods. 1998;45:1049–1060. [Google Scholar]
  • 75.Santella A, Du Z, Nowotschin S, Hadjantonakis AK, Bao Z. A hybrid blob-slice model for accurate and efficient detection of fluorescence labeled nuclei in 3d. BMC Bioinform. 2010;11:1–13. doi: 10.1186/1471-2105-11-580. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Isola, P., Zhu, J., Zhou, T. & Efros, A. A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 5967–5976 (2017).
  • 77.Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv:1412.6980 (arXiv preprint) (2017).
  • 78.Wang J, et al. 3D GAN image synthesis and dataset quality assessment for bacterial biofilm. Bioinformatics. 2022;38:4598–4604. doi: 10.1093/bioinformatics/btac529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Lin, T., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. In IEEE International Conference on Computer Vision 2999–3007 (2017).
  • 80.Rahman, M. A. & Wang, Y. Optimizing intersection-over-union in deep neural networks for image segmentation. In International Symposium on Visual Computing 234–244 (2016).
  • 81.Rezatofighi, H. et al. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 658–666 (2019).
  • 82.Powers, D. M. Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation. arXiv:2010.16061 (arXiv preprint) (2020).
  • 83.Lin, Z. et al. Nucmm dataset: 3d neuronal nuclei instance segmentation at sub-cubic millimeter scale. In International Conference on Medical Image Computing and Computer-Assisted Intervention 164–174 (2021).
