Scientific Reports. 2023 Jun 12;13:9533. doi: 10.1038/s41598-023-36243-9

NISNet3D: three-dimensional nuclear synthesis and instance segmentation for fluorescence microscopy images

Liming Wu 1, Alain Chen 1, Paul Salama 2, Seth Winfree 3, Kenneth W Dunn 4, Edward J Delp 1,
PMCID: PMC10261124  PMID: 37308499

Abstract

The primary step in tissue cytometry is the automated distinction of individual cells (segmentation). Since cell borders are seldom labeled, cells are generally segmented by their nuclei. While tools have been developed for segmenting nuclei in two dimensions, segmentation of nuclei in three-dimensional volumes remains a challenging task. The lack of effective methods for three-dimensional segmentation represents a bottleneck in the realization of the potential of tissue cytometry, particularly as methods of tissue clearing present the opportunity to characterize entire organs. Methods based on deep learning have shown enormous promise, but their implementation is hampered by the need for large amounts of manually annotated training data. In this paper, we describe 3D Nuclei Instance Segmentation Network (NISNet3D) that directly segments 3D volumes through the use of a modified 3D U-Net, 3D marker-controlled watershed transform, and a nuclei instance segmentation system for separating touching nuclei. NISNet3D is unique in that it provides accurate segmentation of even challenging image volumes using a network trained on large amounts of synthetic nuclei derived from relatively few annotated volumes, or on synthetic data obtained without annotated volumes. We present a quantitative comparison of results obtained from NISNet3D with results obtained from a variety of existing nuclei segmentation techniques. We also examine the performance of the methods when no ground truth is available and only synthetic volumes are used for training.

Subject terms: Biological techniques, Computational biology and bioinformatics

Introduction

Over the past 10 years, various technological developments have provided biologists with the ability to collect 3D microscopy volumes of enormous scale and complexity. Methods of tissue clearing combined with automated confocal or lightsheet microscopes have enabled three-dimensional imaging of entire organs or even entire organisms at subcellular resolution. Multiplexing methods have been developed so that one can now simultaneously characterize 50 or more targets in the same tissue. However, as biologists analyze these large volumes (tissue cytometry), they quickly discover that the methods of automated image analysis necessary for extracting quantitative data from volumes of this scale are frequently inadequate for the task. In particular, while effective methods for distinguishing (segmenting) individual cells are available for analyses of two-dimensional images, corresponding methods for segmenting cells in three-dimensional volumes are generally lacking. The problem of three-dimensional image segmentation thus represents a bottleneck in the full realization of 3D tissue cytometry as a tool in biological microscopy.

There are generally two categories of approaches for segmentation: non-machine-learning image processing and computer vision techniques, and techniques based on machine learning, in particular deep learning1,2. Traditional image processing techniques (e.g. watershed transform, thresholding, edge detection, and morphological operations) can be effective on one type of microscopy volume but may not generalize to other types of volumes without careful parameter tuning. Segmentation techniques based on deep learning have shown great promise, in some cases providing accurate and robust results across a range of image types3–7.

The utility of deep learning methods is limited by the large amount of manually annotated data (note: we use the terms “ground truth”, “manual annotations”, and “hand annotated volumes” interchangeably in this paper) needed for training and validation. Annotation is a labor-intensive and time-consuming process, especially for a 3D volume. While tools have been developed to facilitate the laborious process of manual annotation8–11, the generation of training data for 3D microscopy images remains a major obstacle to implementing 3D segmentation approaches based upon deep learning.

The problem of generating sufficient training data can be alleviated using data augmentation, a process in which existing manually annotated training data is supplemented with synthetic data generated from modifications of the manually annotated data12–15. An alternative method is to use synthetic data for training14,16,17. One approach for generating synthetic 3D fluorescence microscopy volumes is to stack 2D synthetic image slices generated using the 2D distributions of the fluorescent markers and Generative Adversarial Networks (GANs)18,19.

Convolutional neural networks (CNNs) have had great success in solving problems such as object classification, detection, and segmentation20,21. The encoder-decoder architecture has been widely used for biomedical image analysis including volumetric segmentation22–24, medical image registration25, nuclear segmentation3,6,15,26–32, and 2D cell nuclei tracking33,34. However, most CNNs are designed for segmenting two-dimensional images and cannot be directly used for segmentation of 3D volumes26,29,35. Other methods process images slice by slice and fuse the two-dimensional results into 3D segmentations, failing to consider the 3D anisotropy of microscopy volumes3,14,27.

True 3D segmentation is important for the analysis of biological structures. Nearly all biological structures require 3D characterization. The most common method for generating 3D images of biological structures is based upon collection of a sequence of 2D images. The axial dimension is sampled at a rate sufficient so that 3D structures are captured in the stack of images. Ideally the sampling rate should be at the level of the resolution of the microscopy technique. While 3D microscopy is based upon volumes assembled from series of 2D images, it is important to appreciate that sequential images collected from a 3D volume, unlike 2D images collected from a single plane, are not independent of one another, and thus contain information that is unique to the axial dimension. Insofar as methods of 3D segmentation recognize that the nature of object boundaries in 3D microscopy images differs in the axial dimension from that in the lateral dimension, the use of this axial information will support more accurate segmentation in the axial dimension.

In this paper, we are interested in segmenting nuclei in 3D microscopy volumes. There are two aspects of segmentation: semantic segmentation and instance segmentation. Semantic segmentation treats multiple objects within one general category as a single entity36; voxels in a volume are indicated as nuclei or not. Instance segmentation identifies objects as different entities. In other words, while semantic segmentation is widely used for separating foreground objects and background structures in an image, instance segmentation is capable of distinguishing individual objects that may overlap one another in an image36. Here we are interested in instance segmentation using deep learning so that we can analyze nuclei whose images overlap with one another. We describe here 3D Nuclei Instance Segmentation Network (NISNet3D), a deep learning-based 3D instance segmentation technique.

NISNet3D is a true 3D instance segmentation method that operates directly on 3D volumes, using 3D CNNs to exploit 3D information in a microscopy volume, thereby generating more accurate segmentations of nuclei in 3D image volumes. It can be trained on actual microscopy volumes, synthetic microscopy volumes, or a combination of both. NISNet3D can also be trained on synthetic data and then lightly retrained on a limited number of other types of microscopy data as an incremental improvement.

Results

NISNet3D—general approach

Nuclei instance segmentation methods typically include a foreground and background separation step and a nuclei instance identification and segmentation step. Non-machine-learning methods use thresholding, such as Otsu's method, for separating background and foreground and use a watershed transform to identify individual cell nuclei. However, thresholding methods are sensitive to variability in intensity and may not generate accurate segmentation masks. Also, watershed segmentation without accurate markers may produce under-segmentation or over-segmentation. Machine learning methods typically generate more accurate results if they are provided with enough training data. However, most current methods designed for nuclei instance segmentation only work with 2D images or use a 2D to 3D construction, which may not fully capture the 3D spatial information.
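To make the non-machine-learning pipeline concrete, the following is a minimal sketch of such a classical baseline (not part of NISNet3D), implemented with SciPy and scikit-image; the smoothing and peak-detection parameters are illustrative placeholders rather than values used in this paper.

```python
# Classical baseline sketch: Otsu thresholding for foreground/background
# separation, a Euclidean distance transform for markers, and a
# marker-controlled watershed to split touching nuclei.
import numpy as np
from scipy import ndimage as ndi
from skimage import feature, filters, morphology, segmentation

def classical_baseline(volume, sigma=1.0, min_distance=5):
    smoothed = ndi.gaussian_filter(volume.astype(np.float32), sigma=sigma)
    mask = smoothed > filters.threshold_otsu(smoothed)        # semantic mask
    distance = ndi.distance_transform_edt(mask)               # distance to background
    peaks = feature.peak_local_max(distance, min_distance=min_distance, labels=mask)
    markers = np.zeros(mask.shape, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)    # one seed per local maximum
    labels = segmentation.watershed(-distance, markers, mask=mask)
    return morphology.remove_small_objects(labels, min_size=20)
```

Because the markers come directly from distance-transform peaks, this kind of pipeline is prone to the over- and under-segmentation errors described above when intensity varies or nuclei touch.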

NISNet3D uses a modified 3D U-Net with attention and residual blocks, a 3D marker-controlled watershed transform, and a nuclei instance segmentation system for separating touching nuclei. NISNet3D also improves on the distance transform used for the marker-controlled watershed transform by generating a 3D gradient volume to obtain more accurate markers for each nucleus. The overall approach of NISNet3D is described here (and shown in Fig. 1). A complete, detailed description of NISNet3D and all methods used here is provided in “Methods”.

Figure 1.

Figure 1

Overview of NISNet3D for 3D nuclei instance segmentation. NISNet3D uses a modified 3D U-Net for nuclei segmentation and 3D vector field array estimation where each voxel represents a 3D vector pointing to the nearest nuclei centroid. NISNet3D then generates a 3D gradient field array from the 3D vector field array and further generates refined markers. Finally, a 3D watershed transform segmentation operates on three inputs (the seeds from the gradients, the binary segmentation masks, and the original microscopy volume) to generate the instance segmentation volume.

3D U-Net was chosen as the base architecture since 2D U-Nets have been shown to be effective in many biomedical applications22. While there are existing 3D techniques including a 3D U-Net architecture23, these methods do not exploit the true 3D information in the microscopy volume, do not address the issue of overlapping nuclei, and are not designed to analyze large microscopy volumes. The instance segmentation built into NISNet3D generates more accurate markers than the direct distance transform used in the typical watershed transform by learning a 3D vector field array which contains information used to accurately estimate the location of nuclei centroids and boundaries. As shown in Fig. 1, using the 3D vector field array, NISNet3D generates a 3D gradient array to aid in the detection of nuclei boundaries. This approach also refines the 3D markers that are used by the watershed transform for segmentation. Our hypothesis is that the 3D gradient array along the boundaries of touching nuclei should be very large, and that by extending 3D U-Net's semantic segmentation using the proposed vector field array, NISNet3D will achieve better instance segmentation.

Quantitative analysis of segmentation quality

Segmentation accuracy in selected subvolumes of each microscopy volume was evaluated by comparison with results obtained by manual annotation with ITK-SNAP8, which serve as ground truth39. Each subvolume was annotated by a single annotator, and each annotation was examined and approved by a biologist (co-author Dunn). We annotated 1, 16, 9, and 5 subvolumes for V1–V4, respectively. See Table 1 for the size of each annotated subvolume. Datasets V1–V4 and corresponding ground truth subvolumes are available for download39. Detailed information on all five original datasets used in our evaluation is shown in Table 1.

Table 1.

The description of the five datasets used for evaluation.

Original microscopy volumes

| Volume ID | Volume description | Original size (X×Y×Z) | Subvolume size (X×Y×Z) | Number of subvolumes in volume | Number of annotated subvolumes | Percentage of volume annotated (%) | Number of subvolumes generated | Percentage of volume generated (%) |
| V1 | Scale37-cleared rat kidney | 512×512×200 | 128×128×64 | 50 | 1 | 2 | 250 | 500 |
| V2 | Shallow rat liver | 512×512×32 | 128×128×32 | 16 | 16 | 100 | 250 | 1562 |
| V3 | BABB38-cleared rat kidney | 512×512×415 | 128×128×64 | 104 | 9 | 8.6 | 250 | 240 |
| V4 | Cleared mouse intestine | 512×930×157 | 128×128×40 | 114 | 5 | 4.3 | 200 | 175 |
| V5 | Zebrafish brain | 2000×1450×397 | 64×64×64 | 4392 | 27 | 0.6 | n/a | n/a |

Generation of synthetic training data

Deep learning methods generally require large amounts of training samples. Manually annotating ground truth is a tedious task, particularly for 3D microscopy volumes. To address this issue, we generated synthetic microscopy volumes for training NISNet3D.

To generate synthetic microscopy volumes, we first generate synthetic segmentation masks, which will be used as the synthetic ground truth masks during training, by iteratively adding initial binary nuclei to an empty 3D volume. Although nuclei are generally ellipsoidal, many cell types are characterized by nuclei of different shapes, which can be approximated as deformed ellipsoids (Fig. 2). We therefore model these initial nuclei as deformed 3D binary ellipsoids having random sizes and orientations. Specifically, we use an elastic transformation40 to deform the 3D binary masks of the nuclei. Details of the procedures for generating initial binary segmentation maps and deforming them are described in “Synthetic segmentation mask generation”. Examples of deformed ellipsoids are shown in Fig. 2.

Figure 2.

Figure 2

Generating synthetic nuclei from deformed ellipsoids. The three columns on the left show XY focal planes of non-ellipsoidal nuclei in actual microscopy volumes. The voxel resolution is 1×1×1 micron³ (X×Y×Z). The columns on the right show 3D renderings of various nuclei from a synthetic binary nuclei segmentation volume after elastic deformation.

We modify our synthetic microscopy volume generation method SpCycleGAN6,7,28 to model non-ellipsoidal or irregularly shaped nuclei. We have shown in our previous work7,41–43 that SpCycleGAN can be used to generate synthetic microscopy volumes suitable for training. Thus, we used the unpaired image-to-image translation model known as SpCycleGAN7 for generating synthetic microscopy volumes. By unpaired we mean that the binary segmentation masks we created above are not the ground truth of actual microscopy images. SpCycleGAN is a variation of CycleGAN44 that better maintains the spatial relationship of the nuclei (see “Synthetic microscopy volume generation”). After training SpCycleGAN, we can generate synthetic 3D volumes with known ground truth. We do this by giving the trained SpCycleGAN binary segmentation masks that indicate where to locate the nuclei. Since SpCycleGAN generates 2D slices, we sequentially generate 2D slices for each focal plane and stack them to construct a 3D synthetic volume with nuclei located where the masks specify. Hence, we have synthetic volumes and their corresponding ground truth. Thus SpCycleGAN is an unsupervised and unpaired method that requires no manually annotated images for training. Using SpCycleGAN, we can create large numbers of synthetic ground truth masks and generate many corresponding synthetic microscopy volumes for training. In total, we generated 950 synthetic microscopy subvolumes (128×128×128 voxels) using representative subvolumes of different datasets (see Table 1).

Examples of actual microscopy images and corresponding synthetic microscopy images are shown in Fig. 3.

Figure 3.

Figure 3

Generation of synthetic images and training data from exemplar datasets. Single optical XY-sections of actual microscopy images (top row), synthetic segmentation masks used by SpCycleGAN (middle row), and corresponding synthetic microscopy images (bottom row). V1 is rat kidney tissue that was fixed and cleared using the Scale37 technique, and then imaged by confocal microscopy. V3 is rat kidney tissue that was fixed, cleared using BABB38 and imaged using confocal microscopy. V1 differs from V3 in that V1 includes fluorescent objects that are not nuclei, and thus must be distinguished from nuclei.

The synthetic images generated by SpCycleGAN were used to train the NISNet3D model. Further, to test the necessity of manually annotated images, we trained two versions of NISNet3D: NISNet3D-synth and NISNet3D-slim. NISNet3D-synth was trained using only synthetic datasets whereas NISNet3D-slim was trained on synthetic images and a small number of manually annotated actual microscopy volumes. The details of NISNet3D-synth and NISNet3D-slim training and evaluation are described in “Methods”.

Evaluation of 3D segmentation—general approach

We compare NISNet3D with other deep learning image segmentation methods including VNet24, 3D U-Net23, Cellpose3, DeepSynth6, nnU-Net45 and StarDist3D15. In addition, we also compare NISNet3D to several non-deep learning segmentation methods including the 3D Watershed transform46, Squassh47, and the “IdentifyPrimaryObject” module from CellProfiler48. We used the above methods for comparison with NISNet3D because they have been commonly used in the literature49–53. We also use the 2D watershed transform and connected component analysis from VTEA for comparison54. We trained and evaluated the methods using the same dataset as used for the evaluation of NISNet3D. We also used the same training and evaluation strategies used for NISNet3D, as described in “Experimental settings”. It should be emphasized that this means we also use synthetic data to train the comparison methods similar to how we used synthetic data to train NISNet3D.

The parameters of all the comparison segmentation methods were manually adjusted to achieve the best visual segmentation results. We feel this is the best way to ensure a fair comparison of the segmentation methods because we tried to allow each of the methods to demonstrate its best performance. This is further discussed in “Experimental settings”. The values of mP, mR, F1, AP, and mAP obtained for all of the methods mentioned above using the microscopy datasets V1–V5 are shown in Tables 2 and 3. Note that in Table 2, the methods above the double line are results using both synthetic and partially annotated actual microscopy volumes for training. The methods below the double line use only synthetic data for training. We were surprised that NISNet3D-synth, using only synthetic volumes for training, outperforms NISNet3D-slim, which uses a combination of both synthetic data and partially annotated actual volumes, in many cases. NISNet3D's unique strength lies in instance segmentation, providing more accurate discrimination of individual nuclei. For qualitative evaluations the entire 3D volume was used, while for quantitative evaluations we use the annotated subvolumes. Therefore mP, mR, F1, AP, and mAP are computed for each dataset.

Table 2.

Comparison of object-based instance segmentation metrics for fluorescence microscopy datasets.

(Table 2 is presented as an image in the original article: 41598_2023_36243_Tab2_HTML.jpg.)

The best performance with respect to each metric is in bold. We use the mean Precision (mP), mean Recall (mR), mean F1 (mF1) score, and mean Average Precision (mAP) on multiple IoU thresholds. The methods above the double line are results using both synthetic and actual microscopy volumes for training, and use actual volumes for testing. The methods below the double line use only synthetic data for training but use actual volumes for testing.

Table 3.

Comparison of object-based instance segmentation metrics on the electron microscopy dataset NucMM (V5).

All metrics are computed on microscopy volume V5.

| Methods | mP | mR | mF1 | AP.50 | AP.75 | mAP | AJI |
| StarDist3D15 | 73.44 | 74.24 | 73.84 | 88.59 | 9.78 | 61.20 | 68.56 |
| DeepSynth6 | 81.63 | 76.04 | 78.72 | 71.47 | 50.88 | 63.74 | 75.54 |
| Cellpose3 | 96.15 | 94.47 | 95.30 | 94.90 | 83.82 | 91.21 | 81.31 |
| NISNet3D-slim | 96.89 | 96.24 | 96.56 | 95.98 | 88.84 | 93.62 | 83.90 |

The best performance with respect to each metric is in bold. We use the mean Precision (mP), mean Recall (mR), mean F1 (mF1) score, and mean Average Precision (mAP) on multiple IoU thresholds. AJI is the Aggregated Jaccard Index55.

Qualitative and quantitative improvement of instance segmentation with NISNet3D

Results presented in Figs. 4 and 5 and Supplementary Figures S4 and S5 show that, in general, the non-deep learning approaches (3D Watershed transform46, Squassh47, CellProfiler's “IdentifyPrimaryObject” module48, and VTEA's 2D watershed transform and connected component analysis54) yielded heterogeneous segmentation results characterized by examples of both over- and under-segmentation. For example, VTEA's 2D watershed transform and connected component analysis and CellProfiler's “IdentifyPrimaryObject” module are uniquely susceptible to axial over-segmentation (asterisks in XZ planes of V3 in Fig. 4), due to their dependence on fusing 2D segmentations into a final 3D volume. The deep learning approaches including StarDist3D15, Cellpose3, DeepSynth6 and nnU-Net45 performed far better in instance segmentation, with some notable weaknesses compared to NISNet3D (Figs. 4 and 5, Supplementary Figures S4, S5). For example, examination of the XY and XZ planes of V3 and V4 shown in Fig. 4 shows multiple examples of nuclei that were either over- or under-segmented by 3D U-Net, StarDist3D, and Cellpose, but were accurately detected and distinguished by NISNet3D (some indicated with asterisks). Results presented in Fig. 5 demonstrate that NISNet3D performs particularly well in the challenging setting of densely packed nuclei, particularly in poorly labeled samples. Asterisks indicate striking examples of nuclei that were accurately detected and distinguished by NISNet3D, but suffered from over- and under-segmentation by Cellpose and VTEA.

Figure 4.

Figure 4

Qualitative assessment of NISNet3D segmentation in 2D and 3D. Three volumes were segmented by CellProfiler's “IdentifyPrimaryObject” module, 2D watershed and 3D connected components (VTEA), 3D U-Net, StarDist3D, Cellpose, and NISNet3D. Shown are a single optical XY section or orthogonal sections of the image volume without (first column) or with unique segmented instances. NISNet3D demonstrates improved instance segmentation in the XY and XZ imaging planes (red asterisks). We use different colors to distinguish the different nuclei instances segmented by the corresponding methods.

Figure 5.

Figure 5

NISNet3D demonstrates improved instance segmentation. Two regions (from V2 and V4) were segmented with the 2D watershed transform and connected component analysis (VTEA), Cellpose, and NISNet3D. A single optical plane and XZ and YZ projections are shown with each segmentation instance uniquely colored. NISNet3D improves semantic and instance segmentation, with examples of over- and under-segmentation (red vs. yellow asterisks, respectively) by both Cellpose and the 2D/3D connected component approach implemented in VTEA. Overall, NISNet3D provides the best instance segmentation in 3D.

Overall, visual comparisons of segmentation results obtained from NISNet3D with those obtained from Cellpose, DeepSynth, StarDist3D, 3D U-Net, and nnU-Net suggest that NISNet3D addresses segmentation weaknesses of these other deep learning techniques. To quantify the improvement in segmentation results generated by NISNet3D, we measured quantitative metrics of instance and semantic segmentation on manually annotated image volumes. As the primary goal of nuclear segmentation is to detect nuclei, instance segmentation accuracy was evaluated using object-based metrics. To reduce bias56, we calculated the mean Precision, mean Recall and mean F1 scores with multiple IoU thresholds. We set T_IoU = {0.25, 0.30, ..., 0.45} for datasets V1–V4, and T_IoU = {0.50, 0.55, ..., 0.75} for dataset V5. Since segmenting 3D objects is more challenging than segmenting 2D objects, the IoU thresholds we used for evaluating 3D segmentation methods (0.25–0.5) are lower than the IoU thresholds reported for evaluating 2D segmentation methods (0.5–0.95), but are comparable to those reported in the literature for 3D segmentation3,15,49,50,57–60. The higher thresholds used for evaluation of nuclei in V5 reflect the fact that the nuclei in this volume are less densely packed, and thus overlap less. Note that these thresholds are not used by any of the segmentation methods and thus do not affect how the nuclei are segmented. For a given threshold, the common object detection metric Average Precision (AP)61,62 was calculated by computing the area under the Precision-Recall curve63. In addition, we use the Aggregated Jaccard Index (AJI)55 to integrate instance and semantic segmentation errors.
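The following is a simplified sketch of this object-based evaluation, assuming labeled ground truth and prediction volumes: instances are matched by their best IoU, and precision, recall, and F1 are averaged over a set of IoU thresholds. The matching here is approximate and the AP/mAP curve integration is omitted, so it is illustrative rather than the exact evaluation code used in the paper.

```python
# Simplified object-level evaluation: match instances by IoU and average
# precision/recall/F1 over several IoU thresholds (e.g. 0.25-0.45 for V1-V4).
import numpy as np

def instance_scores(gt, pred, thresholds=(0.25, 0.30, 0.35, 0.40, 0.45)):
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for a, i in enumerate(gt_ids):                  # pairwise IoU between instances
        g = gt == i
        for b, j in enumerate(pred_ids):
            p = pred == j
            inter = np.logical_and(g, p).sum()
            if inter:
                iou[a, b] = inter / np.logical_or(g, p).sum()
    scores = []
    for t in thresholds:
        # a ground-truth nucleus counts as detected if its best IoU exceeds t
        tp = int((iou.max(axis=1) >= t).sum()) if (gt_ids and pred_ids) else 0
        fn = len(gt_ids) - tp
        fp = max(len(pred_ids) - tp, 0)
        prec = tp / max(tp + fp, 1)
        rec = tp / max(tp + fn, 1)
        f1 = 2 * prec * rec / max(prec + rec, 1e-8)
        scores.append((prec, rec, f1))
    return np.mean(scores, axis=0)                  # (mP, mR, mF1)
```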

NISNet3D had the highest mAP and mF1 scores amongst the tested segmentation approaches. mP and mR were slightly lower in some cases without impacting the mF1 score. In the cases where NISNet3D under-performed in mR and mP, this is likely due to over- and under-segmentation errors resulting from overly large markers being generated. Importantly, when the IoU threshold is increased, NISNet3D continues to outperform the other tested approaches (Fig. 6). We observe that all approaches under-performed on V4, likely due to more crowded nuclei and under-sampling in the Z-axis. Analyses of the accuracy of semantic segmentation (shown in Supplementary Tables S1, S2) show that Cellpose3, StarDist15, nnU-Net45, VNet24, and 3D U-Net23 perform similarly with respect to distinguishing nuclei from background.

Figure 6.

Figure 6

Evaluation results using average precision (AP) for multiple intersection-over-union (IoU) thresholds, T_IoU, for datasets V1–V5. Note that we only show the methods with higher performance.

Discussion

In this paper, we describe NISNet3D, a true 3D nuclei segmentation approach built on 3D U-Net23. Arguably, a significant bottleneck to the development and use of 3D deep learning approaches has been access to high quality training data64. Therefore, to facilitate training NISNet3D we developed a generative model based on our previous work on CycleGAN to make high quality training data without the need for ground truth7,44. To augment this generative approach, we also incorporated elastic deformations to generate non-ellipsoidal models of nuclei to better capture the diversity of nuclei found in tissues. Lastly, to address a need for better instance segmentation in 3D we also implemented an improved instance splitting approach.

We demonstrate that NISNet3D trained exclusively on our synthetic volumes (tested on actual microscopy volumes) performed as well as or better than NISNet3D trained on a combination of manually annotated and synthetic volumes. This suggests that synthetic volumes, made without any manually annotated volumes, are sufficient for training a 3D segmentation deep learning model to produce high-quality instance segmentations of cell nuclei. Further, although NISNet3D semantic segmentation accuracy was similar to other segmentation approaches tested here, NISNet3D instance segmentation was more accurate. We conclude that NISNet3D specifically improves instance segmentation of nuclei through a combination of better 3D training data and our instance-splitting approach.

We acknowledge that NISNet3D may have potential issues with segmenting nuclei that do not conform to our assumption that nuclei are only slightly deformed ellipsoids. For example, multi-lobed nuclei found in leukocytes or multi-nucleated myocytes are cases where our assumption may not hold. Morphological variability is a general issue in machine learning as training must involve all potential morphologies6. During the process of training, machine learning approaches derive features and characteristics from the contents of the training images. Thus, this weakness can be addressed by incorporating more models of varying nuclei morphology, ideally using our SpCycleGAN.

In the future, we will explore the generation of synthetic nuclei segmentation masks to better simulate complex nuclei morphologies and more realistic densities and distributions. Further, we are extending the current SpCycleGAN to generate fully 3D synthetic volumes to include a model of a microscope point spread function. This way, the generated synthetic volumes will more closely model the distribution of actual microscopy volumes, further improving synthetic volumes and the segmentation accuracy of approaches like NISNet3D.

Methods

In this section, we provide more details on the structure of NISNet3D. In our experiments, we used both real annotated volumes and synthetic volumes to train the comparison methods similar to how we used synthetic data to train NISNet3D. We therefore provide more detail as to how we generated the synthetic training data.

Notation and overview

In this paper, we denote a 3D image volume of size X×Y×Z voxels by I, and a voxel having coordinates (x, y, z) in the volume by I(x,y,z). We use superscripts to distinguish between the different types of volumes and arrays. Iorig will be used to denote an original microscopy volume. The objective is to segment nuclei within Iorig. The segmented nuclei are labeled with unique voxel intensities and comprise the labeled volume Iseg. If we have ground truth from real microscopy volumes then the volume Ilabel corresponds to the annotated nuclei with unique voxel intensities. This is done to distinguish each nucleus instance.

If we use synthetic images for training then Ibi and Isyn will be used to denote a generated synthetic binary segmentation mask and a synthetic microscopy volume, respectively. In addition, nuclei with corresponding masks in Ibi are labeled with unique voxel intensities to form the volume Ilabel. This is done to distinguish each nucleus instance. In either case (actual or synthetic training data) Ilabel will be used during the training of NISNet3D and comparison methods.

In addition to Ilabel, NISNet3D generates from Ilabel a vector field array, Ivec, of size X×Y×Z×3, also for training purposes. Each element I(x,y,z)vec of Ivec, located at (x, y, z), is a 3D vector V(x,y,z) that is associated with the nucleus voxel I(x,y,z)label and points to the centroid of the nucleus. A detailed description of vector field generation is given in “Modified 3D U-Net”.

For a given input volume, Iorig, NISNet3D generates a corresponding volume Imask of size X×Y×Z×3 comprising the binary segmentation results as well as an estimate of the vector field array, I^vec, which is used to locate the nuclei centroids. In addition, I^vec is used to generate a 3D array of gradients, IG of size X×Y×Z, that is used in conjunction with morphological operations and watershed segmentation to distinguish and separate touching nuclei as well as remove small objects. The final output, Iseg, is a color-coded segmentation volume. An overview of the entire system is shown in Fig. 7.

Figure 7.

Figure 7

Overview of 3D nuclei instance segmentation using NISNet3D. The training and synthetic image generation are also shown but are not explicitly part of NISNet3D. NISNet3D is trained with synthetic microscopy volumes generated from SpCycleGAN and/or with annotated actual volumes as indicated. Note that Ibi and Iorig are not matched since they are part of the unpaired image-to-image translation of SpCycleGAN.

NISNet3D training and instance segmentation

In this section, we describe the NISNet3D architecture, how to train it, and then use NISNet3D for nuclei instance segmentation.

Modified 3D U-Net

NISNet3D utilizes a modified 3D U-Net as shown in Fig. 8 which outputs the same size array as the input.

Figure 8.

Figure 8

NISNet3D uses a modified 3D U-Net architecture with residual blocks, attention gates, and multi-task learning module.

The 3D U-Net encoder consists of multiple Conv3D Blocks (Fig. 9a) and Residual Blocks (Fig. 9b)65. Instead of using max pooling layers, we use Conv3D Blocks with stride 2 for feature down-sampling, which introduces more learnable parameters. Each convolution block consists of a 3D convolution layer with filter size 3×3×3, a 3D batch normalization layer, and a leaky ReLU layer. The decoder consists of multiple TransConv3D blocks (Fig. 9d) and attention gates (Fig. 9c). Each TransConv3D block includes a 3D transpose convolution with filter size 3×3×3 followed by 3D batch normalization and leaky ReLU. We use a self-attention mechanism66 to refine the feature concatenation while reconstructing the spatial information.

Figure 9.

Figure 9

(a) Conv3D Block, (b) residual block, (c) attention gate, (d) TransConv3D block.
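A sketch of these building blocks in PyTorch is shown below. The channel widths and the exact attention formulation are our own illustrative assumptions; the paper specifies 3×3×3 convolutions, 3D batch normalization, leaky ReLU, stride-2 Conv3D blocks instead of max pooling, residual blocks, and attention gates on the skip connections.

```python
# Illustrative PyTorch building blocks for the modified 3D U-Net (Figs. 8-9).
import torch
import torch.nn as nn

class Conv3DBlock(nn.Module):
    """3x3x3 convolution + BatchNorm3d + leaky ReLU; stride=2 is used for
    learnable down-sampling in place of max pooling."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.BatchNorm3d(out_ch),
            nn.LeakyReLU(0.1, inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class ResidualBlock(nn.Module):
    """Two convolutions with an identity shortcut."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = Conv3DBlock(ch, ch)
        self.conv2 = nn.Sequential(nn.Conv3d(ch, ch, 3, padding=1), nn.BatchNorm3d(ch))
        self.act = nn.LeakyReLU(0.1, inplace=True)
    def forward(self, x):
        return self.act(x + self.conv2(self.conv1(x)))

class AttentionGate(nn.Module):
    """Additive attention on a skip connection: the decoder (gating) feature
    re-weights the encoder feature before concatenation. Assumes the gating
    feature has already been upsampled to the encoder resolution."""
    def __init__(self, enc_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_enc = nn.Conv3d(enc_ch, inter_ch, kernel_size=1)
        self.w_gate = nn.Conv3d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(nn.Conv3d(inter_ch, 1, kernel_size=1), nn.Sigmoid())
    def forward(self, enc_feat, gate_feat):
        attn = self.psi(torch.relu(self.w_enc(enc_feat) + self.w_gate(gate_feat)))
        return enc_feat * attn   # voxel-wise attention weights in [0, 1]
```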

Training the modified 3D U-Net

The modified 3D U-Net can be trained on synthetic microscopy volumes Isyn, on actual microscopy volumes Iorig if manual ground truth annotations are available, or on both.

As shown in Fig. 7, during training, NISNet3D's modified 3D U-Net takes an Isyn or Iorig, an Ilabel, and an Ivec as input. Ilabel is the X×Y×Z gray-scale label volume corresponding to Ibi where different nuclei are marked with unique pixel intensities. It is used by NISNet3D to learn the segmentation masks. To facilitate segmentation NISNet3D identifies nuclei centroids and estimates their locations using a vector field, Ivec. In particular, Ivec is part of the ground truth data used during training to enable the network to learn nuclei centroid locations. In addition, the 3D vector field array helps identify the boundaries of adjacent/touching nuclei since their corresponding vectors tend to point in different directions.

As indicated previously, a vector field, Ivec, is an X×Y×Z×3 array whose elements are 3D vectors that point to the centroids of the corresponding nuclei. Specifically, I(x,y,z)vec is a 3D vector at location (xyz) and points to the nearest nucleus centroid. We also denote the X×Y×Z array consisting of the first component of each 3D vector by Ivec(x). Similarly, we denote the arrays of the second and third components by Ivec(y) and Ivec(z), respectively, that is I(x,y,z)vec=(I(x,y,z)vec(x),I(x,y,z)vec(y),I(x,y,z)vec(z)).

The first step in the 3D vector field array generation (VFG) (see Fig. 10) is to obtain the centroid of each nucleus in Ilabel. We denote the kth nucleus as the set of voxels with intensity k in Ilabel, and the location of the centroid of the kth nucleus by (x_k, y_k, z_k). We define V(x,y,z)k to be the 3D vector from (x, y, z) to (x_k, y_k, z_k) and set I(x,y,z)vec = V(x,y,z)k as long as the voxel at (x, y, z) is not a background voxel. Note that if a voxel I(x,y,z)label is a background voxel (i.e. I(x,y,z)label = 0), the corresponding entry I(x,y,z)vec in the vector field array is set to 0, as shown in Eq. (1).

Figure 10.

Figure 10

Steps for 3D vector field array generation (VFG). Each nucleus voxel in the 3D vector field array Ivec represents a 3D vector that points to the centroid of the current nucleus.

Using Isyn or Iorig, Ilabel, and Ivec, NISNet3D then outputs an estimate of the 3D vector field array I^vec (I^vec(x), I^vec(y), I^vec(z)) and the 3D binary segmentation mask volume Imask.

I^{vec}_{(x,y,z)} = \begin{cases} V^{k}_{(x,y,z)}, & \text{if } I^{label}_{(x,y,z)} \neq 0 \\ 0, & \text{otherwise} \end{cases} \qquad (1)
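A minimal sketch of the vector field generation in Eq. (1) is given below, assuming a labeled volume where each nucleus has a unique integer intensity; centroid computation uses SciPy and the exact implementation in NISNet3D may differ.

```python
# Vector field generation (VFG, Fig. 10): every foreground voxel stores the 3D
# offset from its own location to the centroid of the nucleus it belongs to;
# background voxels remain zero, as in Eq. (1).
import numpy as np
from scipy import ndimage as ndi

def vector_field(label_volume):
    vec = np.zeros(label_volume.shape + (3,), dtype=np.float32)    # X x Y x Z x 3
    ids = [k for k in np.unique(label_volume) if k != 0]
    centroids = ndi.center_of_mass(label_volume > 0, label_volume, ids)
    coords = np.stack(np.meshgrid(*[np.arange(s) for s in label_volume.shape],
                                  indexing="ij"), axis=-1).astype(np.float32)
    for k, c in zip(ids, centroids):
        mask = label_volume == k
        vec[mask] = np.asarray(c, dtype=np.float32) - coords[mask]  # V^k_(x,y,z)
    return vec
```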
Loss functions

Given Isyn or Iorig, Ilabel, and Ivec, the modified 3D U-Net simultaneously learns the nuclei segmentation masks Imask and the 3D vector field array I^vec. In the modified 3D U-Net two branches, but no sigmoid function, are used to obtain I^vec because the vector represented at a voxel can point anywhere in the array; in other words, the entries in I^vec can be negative numbers or large numbers. Unlike previous methods28,67,68 that directly learn the distance transform map, the 3D vector field array contains both the distance and the direction of the nearest nucleus centroid from the current voxel location. This helps NISNet3D avoid multiple detections of irregularly shaped nuclei. The output 3D vector field array I^vec is compared with the ground truth vector field array Ivec and the error between them minimized using the mean square error (MSE) loss function. Similarly, the segmentation result Imask is compared with the ground truth binary volume Ibi and the difference minimized using a combination of the Focal Loss69 LFL and Tversky Loss70 LTL metrics. Denoting a ground truth binary volume Ibi by S, the corresponding segmentation result Imask by S^, a ground truth vector field array Ivec by V, and the estimated vector field array I^vec by V^, the entire loss function is

\mathcal{L}(S,\hat{S},V,\hat{V}) = \lambda_3 \mathcal{L}_{TL}(S,\hat{S}) + \lambda_4 \mathcal{L}_{FL}(S,\hat{S}) + \lambda_5 \mathcal{L}_{MSE}(V,\hat{V}) \qquad (2)

where

\mathcal{L}_{TL} = \frac{\sum_{i=1}^{P} s_i^1 \hat{s}_i^1}{\sum_{i=1}^{P} s_i^1 \hat{s}_i^1 + \alpha_1 \sum_{i=1}^{P} s_i^1 \hat{s}_i^0 + \alpha_2 \sum_{i=1}^{P} s_i^0 \hat{s}_i^1}, \quad
\mathcal{L}_{FL} = -\frac{1}{P} \sum_{i=1}^{P} \left\{ \beta\, s_i^1 (\hat{s}_i^0)^{\gamma} \log(\hat{s}_i^1) + (1-\beta)\, s_i^0 (\hat{s}_i^1)^{\gamma} \log(\hat{s}_i^0) \right\}, \quad
\mathcal{L}_{MSE} = \frac{1}{P} \sum_{i=1}^{P} (v_i - \hat{v}_i)^2 \qquad (3)

where α1 and α2, with α1+α2=1, are two hyper-parameters in the Tversky loss70 that control the balance between false positive and false negative detections. β and γ are two hyper-parameters in the Focal loss69, where β balances the importance of positive/negative voxels and γ adjusts the weights for easily classified voxels. vi ∈ V is the ith element of V, and v^i ∈ V^ is the ith entry in V^. Similarly, si ∈ S is the ith voxel in S, and s^i ∈ S^ is the ith voxel in S^. We define s^i1 to be the probability that the ith voxel in S^ is a nucleus voxel, and s^i0 as the probability that it is a background voxel. Similarly, si1 = 1 if si is a nucleus voxel and 0 if it is a background voxel, and vice versa for si0. Lastly, P is the total number of voxels in a volume.
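A sketch of the combined loss of Eqs. (2)-(3) in PyTorch is given below. The hyper-parameter values are placeholders, the predicted mask is assumed to already be a probability map, and the Tversky index is subtracted from one so that all three terms are minimized; these are implementation assumptions rather than the paper's exact settings.

```python
# Combined loss of Eqs. (2)-(3); hyper-parameter values are placeholders.
import torch

def nisnet3d_loss(pred_mask, gt_mask, pred_vec, gt_vec,
                  a1=0.3, a2=0.7, beta=0.75, gamma=2.0,
                  lam3=1.0, lam4=1.0, lam5=1.0, eps=1e-7):
    p1 = pred_mask.clamp(eps, 1 - eps)   # predicted probability of "nucleus" per voxel
    p0 = 1 - p1
    s1 = gt_mask.float()                 # 1 for nucleus voxels, 0 for background
    s0 = 1 - s1
    # Tversky index of Eq. (3); one minus the index is minimized here (assumption)
    tversky = (s1 * p1).sum() / ((s1 * p1).sum() + a1 * (s1 * p0).sum()
                                 + a2 * (s0 * p1).sum() + eps)
    # focal loss of Eq. (3), averaged over the P voxels
    focal = -(beta * s1 * p0.pow(gamma) * torch.log(p1)
              + (1 - beta) * s0 * p1.pow(gamma) * torch.log(p0)).mean()
    mse = torch.mean((gt_vec - pred_vec) ** 2)            # vector field regression
    return lam3 * (1 - tversky) + lam4 * focal + lam5 * mse
```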

Divide and conquer

To segment a large microscopy volume, we propose a divide-and-conquer scheme shown in Fig. 11. We use an analysis window of size K×K×K that slides along each original microscopy volume Iorig of size X×Y×Z and crops a subvolume. Considering that cropping may result in some nuclei being partially included, and that these partially included nuclei lying on the border of the analysis window may cause inaccurate segmentation results, we construct a padded window by symmetrically padding each cropped subvolume by K/4 voxels on each border. Also, the stride of the moving window is set to K/2, so with every step it slides it will have K/2 voxels overlapping with the previous window. For the segmentation results of every K×K×K window, only the interior, centered K/2×K/2×K/2 subvolume, which we denote as the interior window, is used as the segmentation result. In this paper, we use K=128 for all testing data. Once the analysis window has slid along the entire volume, a segmentation volume Imask and 3D vector field array I^vec of size X×Y×Z will have been generated. In this way, we can analyze an input volume of any size, especially very large volumes.

Figure 11.

Figure 11

Proposed divide-and-conquer scheme for segmenting large microscopy volumes.
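A sketch of this divide-and-conquer inference is shown below, assuming a callable model that maps a K×K×K subvolume to a same-sized prediction; the axis ordering and the handling of volume edges are illustrative assumptions.

```python
# Divide-and-conquer inference (Fig. 11): K x K x K analysis windows with K/4
# symmetric padding and K/2 stride; only the centered K/2 interior of each
# prediction is kept. `model` is assumed to map a K^3 array to a K^3 array.
import numpy as np

def sliding_window_inference(volume, model, K=128):
    pad, stride = K // 4, K // 2
    d0, d1, d2 = volume.shape
    # round each dimension up to a multiple of the stride, then add the border pad
    pad_width = [(pad, pad + (-s) % stride) for s in (d0, d1, d2)]
    padded = np.pad(volume, pad_width, mode="symmetric")
    out = np.zeros(volume.shape, dtype=np.float32)
    for i in range(0, d0, stride):
        for j in range(0, d1, stride):
            for k in range(0, d2, stride):
                pred = model(padded[i:i + K, j:j + K, k:k + K])
                core = pred[pad:pad + stride, pad:pad + stride, pad:pad + stride]
                out[i:i + stride, j:j + stride, k:k + stride] = core[
                    :min(stride, d0 - i), :min(stride, d1 - j), :min(stride, d2 - k)]
    return out
```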

3D nuclei instance segmentation

Based on the output of the modified 3D U-Net, a 3D gradient field and an array of markers are generated and used to separate densely clustered nuclei. Moreover, the markers are refined to attain better separation. As mentioned previously, each element/vector in the estimated 3D vector field array I^vec points to the centroid of the nearest nucleus. Most often, the vectors associated with voxels on the boundary of two touching nuclei point in different directions that correspond to the locations of the centroids of the touching nuclei, and hence have a large difference/gradient. We employ such larger gradients to detect the boundaries of touching and also overlapping nuclei.

Define ∇I^vec as the gradient of I^vec, which is obtained as:

\nabla \hat{I}^{vec} = \left[ \nabla \hat{I}^{vec(x)},\; \nabla \hat{I}^{vec(y)},\; \nabla \hat{I}^{vec(z)} \right]^T = \left[ \frac{\partial \hat{I}^{vec(x)}}{\partial x},\; \frac{\partial \hat{I}^{vec(y)}}{\partial y},\; \frac{\partial \hat{I}^{vec(z)}}{\partial z} \right]^T = \left[ S_x * \hat{I}^{vec(x)},\; S_y * \hat{I}^{vec(y)},\; S_z * \hat{I}^{vec(z)} \right]^T \qquad (4)

where I^vec(x), I^vec(y), and I^vec(z) are the x, y, and z sub-arrays of the estimated 3D vector field array I^vec, respectively, Sx, Sy, Sz are 3D Sobel filters, and ∗ is the convolution operator. We also define the gradient map IG as the maximum of the gradient components along the x, y and z directions:

I^{G} = \max\left( \nabla \hat{I}^{vec(x)},\; \nabla \hat{I}^{vec(y)},\; \nabla \hat{I}^{vec(z)} \right) \qquad (5)

Elements of IG that correspond to boundaries of touching nuclei have larger values (larger gradients), which can be used to identify individual nuclei. We subsequently threshold IG via a thresholding function τ(x, Tm), where τ(x, Tm) = 1 if x ≥ Tm and 0 otherwise. This step is used to highlight and distinguish boundary voxels of individual nuclei from spurious edges that may arise from finding the gradients. The value of Tm affects the number of voxels delineated as boundary voxels: larger values of Tm result in thinner boundaries whereas smaller values of Tm result in thicker nuclei boundaries. We then subtract the result from the binary segmentation mask Imask obtained from the modified 3D U-Net, and apply a non-linear half-wave rectifier/ReLU activation function σ(·) that sets all negative values to 0. We denote the result as Iblob, as given in Eq. (6). The difference Imask − τ(IG, Tm) removes boundary voxels of touching nuclei while emphasizing the interior. Since the difference can be negative in some cases, we use σ(·) to eliminate negative values. Hence, Iblob highlights the interior regions of nuclei, which may be used as markers to perform watershed segmentation46,71.

I^{blob} = \sigma\left( I^{mask} - \tau(I^{G}, T_m) \right) \qquad (6)
I^{mark} = \delta_{t_f}\left( \delta_{t_c}(I^{blob}, B_c),\; B_f \right) \qquad (7)

However, in some cases Iblob may still contain boundary information. To refine it we use 3D conditional erosion with a coarse structuring element Bc and a fine structuring element Bf, as shown in Fig. 12. To achieve this we first identify the connected components72 in Iblob and then iteratively erode each connected component using Bc until its size is smaller than some threshold tc. We then continue eroding each connected component using Bf until its size is smaller than another threshold tf. We denote the final result by Imark, as shown in Eq. (7), where δtc(Iblob, Bc) denotes iterative erosion of all connected components in Iblob by a coarse structuring element Bc until the size of each component is smaller than the coarse object threshold tc. Similarly, δtf(Iblob, Bf) denotes iterative erosion by a fine structuring element, Bf, until the size of the object is smaller than the fine object threshold tf. Finally, marker-controlled watershed46 is used to generate the instance segmentation masks Iseg using Imark. Small objects in Iseg that are less than 20 voxels in size are then removed, and each object is color coded for visualization.

Figure 12.

Figure 12

(a) coarse 3D structuring element Bc and (b) fine 3D structuring element Bf for 3D conditional erosion.
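The following sketch strings together Eqs. (4)-(7): Sobel gradients of the estimated vector field form the gradient map, which is thresholded and subtracted from the semantic mask, and the resulting blobs are eroded into markers for a marker-controlled watershed. The fixed number of binary erosions stands in for the size-gated conditional erosion with Bc and Bf, the gradient magnitude is used in place of the signed gradient, and the inverted image intensity is assumed as the watershed elevation; T_m and the erosion count are placeholders.

```python
# Instance separation following Eqs. (4)-(7); T_m and n_erosions are placeholders,
# and plain binary erosion stands in for the size-gated conditional erosion.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def separate_instances(vec_hat, mask, volume, T_m=0.5, n_erosions=2):
    # maximum Sobel gradient magnitude over the x, y, z vector components (I^G)
    grad_map = np.maximum.reduce(
        [np.abs(ndi.sobel(vec_hat[..., d], axis=d)) for d in range(3)])
    blob = np.clip(mask.astype(np.int8) - (grad_map >= T_m), 0, 1)   # Eq. (6)
    markers, _ = ndi.label(ndi.binary_erosion(blob, iterations=n_erosions))
    seg = watershed(-volume.astype(np.float32), markers, mask=mask.astype(bool))
    return seg
```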

3D nuclei image synthesis

Deep learning methods generally require large amounts of training samples to achieve accurate results. However, manually annotating ground truth is a tedious task and impractical in many situations especially for large 3D microscopy volumes. NISNet3D, like any other trainable segmentation method, can use synthetic 3D volumes, annotated actual 3D volumes, or combinations of both as demonstrated here.

To generate synthetic volumes, we first generate synthetic segmentation masks that are used as ground truth masks, and then translate the synthetic segmentation masks into synthetic microscopy volumes using an unsupervised image-to-image translation model known as SpCycleGAN7.

The way SpCycleGAN works is that we first train SpCycleGAN and then, after it is trained, we can generate 3D volumes with their own inherent ground truth. We do this by giving the trained SpCycleGAN binary segmentation masks that indicate where we want the nuclei to be located in the synthetic image. SpCycleGAN will then generate a synthetic volume with nuclei located where the masks specify. Hence, we have synthetic volumes and their corresponding ground truth. It should be noted that we do not need any ground truth volumes to train SpCycleGAN.

Synthetic segmentation mask generation

To generate a volume of synthetic binary nuclei segmentation masks, we iteratively add N initial binary nuclei to an empty 3D volume of size 128×128×128. Each initial nucleus is modeled as a 3D binary ellipsoid having random size, orientation, and location. The size, orientation, and location of the nth ellipsoid are parameterized by a(n), θ(n), and t(n), respectively, where a(n)=[ax(n),ay(n),az(n)]T denotes a vector of semi-axes lengths of the nth ellipsoid, θ(n)=[θx(n),θy(n),θz(n)]T denotes a vector of rotation angles relative to the X, Y, and Z coordinate axes, and t(n)=[tx(n),ty(n),tz(n)]T denotes the displacement/location vector of the nth ellipsoid relative to the origin. The parameters a(n), θ(n), and t(n) are randomly selected based on observations of nuclei characteristics in actual microscopy volumes.

Let J(x,y,z)nuc,(n) denote the nth initial nucleus at location (x, y, z) having intensity/label n ∈ {1, ..., N}; then

J^{nuc,(n)}_{(x,y,z)} = \begin{cases} n, & \text{if } \left( \frac{x}{a_x^{(n)}} \right)^2 + \left( \frac{y}{a_y^{(n)}} \right)^2 + \left( \frac{z}{a_z^{(n)}} \right)^2 < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (8)

The ellipsoids are assigned different intensity values to differentiate them from each other. Subsequently, each J(x,y,z)nuc,(n) undergoes a random translation given by t(n) followed by random rotations specified by θ(n). Denoting the original coordinates by X and the translated and rotated coordinates by X~, respectively, then

\tilde{X} = \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix} = R_z(\theta_z^{(n)}) \, R_y(\theta_y^{(n)}) \, R_x(\theta_x^{(n)}) \begin{bmatrix} x + t_x^{(n)} \\ y + t_y^{(n)} \\ z + t_z^{(n)} \end{bmatrix} \qquad (9)

where Rx(θx(n)), Ry(θy(n)), and Rz(θz(n)) denote the rotation matrices relative to the X, Y, and Z axes by angles θx(n), θy(n), and θz(n), respectively. Thus, the final nth ellipsoid I(x,y,z)nuc,(n) is given by I(x,y,z)nuc,(n) = J(x~,y~,z~)nuc,(n) for all n ∈ {1, ..., N}. We finally constrain the transformed nuclei such that they do not overlap by more than tov voxels. The ellipsoids I(x,y,z)nuc,(n), n ∈ {1, ..., N}, are then used to fill in a binary volume Ibi of size X×Y×Z.

Since actual nuclei are not strictly ellipsoidal but look more like deformed ellipsoids (see Fig. 2, left column), we use an elastic transformation40,73 to deform Ibi. To achieve this we first generate a coarse displacement vector field, Icoarse, which is an array of size d×d×d×3 whose entries are independent, Gaussian-distributed random variables having zero mean and variance σ² (that is, I(k,x,y,z)coarse ∼ N(0, σ²)). The size parameter d is used to control the amount of deformation applied to the nuclei. Icoarse is then interpolated to size X×Y×Z×3, via spline interpolation74 or bilinear interpolation40, to produce a smooth displacement vector field Ismooth. The entry I(x,y,z,1)smooth indicates the distance by which voxel I(x,y,z)bi will be shifted along the X-axis. Similarly, the elements I(x,y,z,2)smooth and I(x,y,z,3)smooth provide the distances by which I(x,y,z)bi will be shifted along the Y and Z-axes, respectively. In our experiments, we used spline interpolation and values of d ∈ {4, 5, 10}. Each element in Icoarse serves as an anchor point controlling the magnitude of the elastic transform. Larger values of d result in larger deformations being applied through Ismooth whereas smaller d values produce less deformation. Examples of deformed ellipsoids are shown in Fig. 2 (right column).
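A compact sketch of this mask synthesis is given below: random ellipsoids (Eqs. (8)-(9)) are placed into an empty volume and then warped with an interpolated coarse Gaussian displacement field. Overlap checking, the exact rotation conventions, and the parameter ranges of Table 4 are simplified or omitted, so this is an illustration of the procedure rather than the code used to build the datasets.

```python
# Synthetic binary/labeled nuclei masks: random ellipsoids + elastic deformation.
import numpy as np
from scipy import ndimage as ndi
from scipy.spatial.transform import Rotation

def synthetic_masks(size=128, n_nuclei=100, a_range=(4, 10), d=5, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    vol = np.zeros((size, size, size), dtype=np.uint16)
    grid = np.stack(np.meshgrid(*[np.arange(size)] * 3, indexing="ij"), axis=-1)
    for n in range(1, n_nuclei + 1):
        a = rng.uniform(*a_range, size=3)                       # semi-axes a_x, a_y, a_z
        R = Rotation.from_euler("xyz", rng.uniform(0, 180, 3), degrees=True).as_matrix()
        t = rng.uniform(0, size, size=3)                        # centroid location
        local = (grid - t) @ R                                  # rotate into ellipsoid frame
        inside = ((local / a) ** 2).sum(axis=-1) < 1.0          # ellipsoid test, Eq. (8)
        vol[inside] = n                                         # unique label per nucleus
    # coarse d x d x d x 3 Gaussian displacement field, upsampled to full resolution
    coarse = rng.normal(0, sigma, size=(3, d, d, d))
    smooth = np.stack([ndi.zoom(c, size / d, order=3) for c in coarse], axis=-1)
    warped_coords = (grid + smooth).reshape(-1, 3).T
    deformed = ndi.map_coordinates(vol, warped_coords, order=0).reshape(vol.shape)
    return deformed
```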

Synthetic microscopy volume generation

In the previous section, we described how we generate binary segmentation masks. Here we describe how we use the masks to generate synthetic microscopy volumes that have nuclei located as determined by the binary segmentation mask. In particular, we use the unpaired image-to-image translation model known as SpCycleGAN7 for generating synthetic microscopy volumes. SpCycleGAN7 is a variation of CycleGAN44 in which an additional spatial constraint is added to the loss function to maintain the spatial locations of the nuclei.

SpCycleGAN is shown in Fig. 13. The way SpCycleGAN works is that we first train SpCycleGAN and then after it is trained, we can generate 3D volumes with known ground truth. We do this by giving the trained SpCycleGAN binary segmentation masks that indicate where we want the nuclei to be located. SpCycleGAN will then generate a synthetic volume with nuclei located where we told SpCycleGAN we wanted the nuclei. Hence, we have synthetic volumes and their corresponding ground truth. In fact, the synthetic 3D volumes we generate using SpCycleGAN do not need any ground truth annotations at all because SpCycleGAN is an unsupervised and unpaired method. We can create many synthetic ground truth masks and generate many corresponding synthetic microscopy volumes that have nuclei located at the locations we choose using SpCycleGAN. By unpaired we mean that the inputs to SpCycleGAN are volumes of binary segmentation masks and actual microscopy images that do not correspond with each other. In addition, the binary segmentation masks created above are not used as ground truth volumes for performing segmentation of actual microscopy images.

Figure 13.

Figure 13

The architecture of SpCycleGAN that is used for generating synthetic microscopy volumes.

As shown in Fig. 7, we use slices from volumes of the binary segmentation masks, Ibi,(m), m ∈ {1, ..., M}, where M is the number of training samples/volumes, and slices from actual microscopy volumes Iorig,(m), m ∈ {1, ..., M}, for training SpCycleGAN. After training we generate L synthetic microscopy volumes, Isyn,(l), l ∈ {1, ..., L}, using synthetic microscopy segmentation masks, Ibi,(l), l ∈ {1, ..., L}, other than the ones used for training. Note that since SpCycleGAN generates 2D slices, we use the slices to construct a 3D synthetic volume. We acknowledge that this can be a problem because SpCycleGAN may not capture information across the volume stack.
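The slice-wise assembly can be sketched as follows, assuming a trained 2D generator G (a callable that maps one binary mask slice to one synthetic microscopy slice) and a mask volume whose last axis is Z.

```python
# Assemble a 3D synthetic volume from per-slice SpCycleGAN outputs.
import numpy as np

def synthesize_volume(binary_mask_volume, G):
    # translate each XY focal plane independently, then stack along Z
    slices = [G(binary_mask_volume[:, :, z])
              for z in range(binary_mask_volume.shape[2])]
    return np.stack(slices, axis=2)   # X x Y x Z synthetic microscopy volume
```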

SpCycleGAN consists of two generators, G and F, and two discriminators, D1 and D2. G learns the mapping from Ibi to Iorig whereas F performs the reverse mapping. Also, SpCycleGAN introduces a segmentor S for maintaining the spatial location between Ibi and F(G(Ibi)). The entire loss function of SpCycleGAN is:

\mathcal{L}(G,F,S,D_1,D_2) = \mathcal{L}_{GAN}(G,D_1,I^{bi},I^{orig}) + \mathcal{L}_{GAN}(F,D_2,I^{orig},I^{bi}) + \lambda_1 \mathcal{L}_{cycle}(G,F,I^{orig},I^{bi}) + \lambda_2 \mathcal{L}_{spatial}(G,S,I^{orig},I^{bi}) \qquad (10)

where λ1 and λ2 are weight coefficients controlling the loss balance between Lcycle and Lspatial, and

\mathcal{L}_{GAN}(G,D_1,I^{bi},I^{orig}) = \mathbb{E}_{I^{orig}}[\log(D_1(I^{orig}))] + \mathbb{E}_{I^{bi}}[\log(1 - D_1(G(I^{bi})))]
\mathcal{L}_{GAN}(F,D_2,I^{orig},I^{bi}) = \mathbb{E}_{I^{bi}}[\log(D_2(I^{bi}))] + \mathbb{E}_{I^{orig}}[\log(1 - D_2(F(I^{orig})))]
\mathcal{L}_{cycle}(G,F,I^{orig},I^{bi}) = \mathbb{E}_{I^{bi}}\left[\left\| F(G(I^{bi})) - I^{bi} \right\|_1\right] + \mathbb{E}_{I^{orig}}\left[\left\| G(F(I^{orig})) - I^{orig} \right\|_1\right]
\mathcal{L}_{spatial}(G,S,I^{bi},I^{orig}) = \mathbb{E}_{I^{bi}}\left[\left\| S(G(I^{bi})) - I^{bi} \right\|_2\right] \qquad (11)

where ‖·‖_1 and ‖·‖_2 denote the L1 norm and L2 norm, respectively, and E_I is the expected value over all the input volumes in a batch to the network.
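As an illustration of the spatial constraint, the sketch below computes Lspatial for a batch of binary masks, assuming G and S are PyTorch modules; the mean squared error is used here as a stand-in for the L2 penalty in Eq. (11).

```python
# SpCycleGAN spatial constraint: push the segmentor's output on the generated
# image back toward the input binary mask, keeping nuclei at the mask locations.
import torch

def spatial_loss(G, S, mask_batch):
    fake_microscopy = G(mask_batch)         # synthetic microscopy image from the mask
    recovered_mask = S(fake_microscopy)     # segmentor's estimate of the mask
    return torch.mean((recovered_mask - mask_batch) ** 2)   # L2-style penalty
```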

We use 512 slices of V1, 512 slices of V2, 4096 slices of V3, and 2048 slices of V4 for training SpCycleGAN.

A limitation of our method is that we need to model the nuclei shape in the synthetic binary mask as closely as possible to the real nuclei in the microscopy volume. This way SpCycleGAN can generate more realistic microscopy images.

Comparison methods

We compared NISNet3D with several deep learning image segmentation methods including VNet24, 3D U-Net23, Cellpose3, DeepSynth6, nnU-Net45 and StarDist3D15. In addition, we also compare NISNet3D with several commonly used biomedical image analysis tools and software including 3D Watershed46, Squassh47, and Cellprofiler’s “IdentifyPrimaryObject” module48. We also compare to a blob-slice method using the 2D watershed transform with 3D connected components from VTEA54. In the cases of software packages Cellprofiler and VTEA, we configured them for segmentation using our data sets. The details are described in the next subsection.

VNet24 and 3D U-Net23 are two popular 3D encoder-decoder networks with shortcut concatenations designed for biomedical image segmentation. Cellpose3 uses a modified 2D U-Net for estimating image segmentation and spatial flows, and uses a dynamic system3 to cluster pixels and further separate touching nuclei. When segmenting 3D volumes, Cellpose works from three different directions slice by slice and combines the 2D segmentation results into a 3D segmentation volume3. DeepSynth uses a modified 3D U-Net to segment 3D microscopy volumes and the watershed transform to separate touching nuclei6. nnU-Net45 is a self-configuring U-Net-based method that mainly does semantic segmentation for biomedical applications and uses connected component analysis of the center class, followed by iterative outgrowing into the nuclei border region on the semantic segmentation results. Similarly, StarDist3D uses a modified 3D U-Net to estimate the star-convex polyhedra used to represent nuclei15.

Alternative non-deep learning techniques such as 3D Watershed46 use the watershed transformation71 and conditional erosion46 for nuclei segmentation. Similarly, Squassh, an ImageJ plugin for both 2D and 3D microscopy image segmentation, uses active contours47. CellProfiler is an image processing toolbox that provides customized image processing and analysis modules48. We used CellProfiler's “IdentifyPrimaryObject” module for our study. VTEA is an ImageJ plugin that combines various approaches including Otsu's thresholding and the watershed transform to segment 2D nuclei slice by slice and reconstruct the results into a 3D volume54.

For comparison purposes, we trained and evaluated the methods above using the same dataset as used for NISNet3D. We also used the same training and evaluation strategies used for NISNet3D as described in “Experimental settings”. This means that the methods that required training were also trained on synthetic data. We provide a discussion of the results in “Discussion”. Note that the 3D Watershed transform, Squassh, CellProfiler, and VTEA do not need to be trained because they use more traditional image analysis techniques.

nnU-Net is primarily used for semantic segmentation in biomedical applications36,45. In order to obtain nuclei instance segmentation, nnU-Net uses post-processing with connected component analysis of the center class, followed by iterative outgrowing into the nuclei border region from the semantic segmentation results. We used nnU-Net's post-processing steps to obtain the instance segmentation results presented. We observe that nnU-Net has difficulty separating touching nuclei on an object-level basis. This is due to the connected component analysis during the post-processing, where touching nuclei will be grouped as one object.

Experimental settings

The parameters used for generating Ibi are provided in Table 4 where (amin, amax) is the range of the ellipsoid semi-axes lengths, tov is the maximum allowed overlapping voxels between two nuclei, and N is the total number of nuclei in a synthetic volume. These parameters are based on visual inspection of nuclei characteristics in actual microscopy volumes.

Table 4.

The synthetic volume generation parameters and the training and evaluation strategies used for NISNet3D and the comparison methods. The left columns (amin–σ) are the parameters used for generating synthetic microscopy volumes; the right columns give the NISNet3D-slim training and evaluation strategies.

| Volume name | amin | amax | tov | N | d | σ | Version | Training | Evaluation |
| V1 | 4 | 8 | 5 | 500 | 10 | 1 | M1 | 50 volumes of synthetic V1 | Tested on 1 subvolume of V1 |
| V2 | 10 | 14 | 10 | 70 | 4 | 1 | M2 | 50 volumes of synthetic V2 | Tested on 16 subvolumes of V2 |
| V3 | 6 | 10 | 200 | 200 | 5 | 2 | M3 | 50 volumes of synthetic V3 | 3-fold cross-validated on 9 subvolumes of V3 |
| V4 | 8 | 10 | 100 | 560 | 4 | 4 | M4 | Lightly retrained using M3 on 1 subvolume of V4 | Tested on 4 subvolumes of V4 |
| V5 | – | – | – | – | – | – | M5 | 27 subvolumes of V5 | Tested on 27 subvolumes of V5 |

SpCycleGAN was trained on unpaired Ibi and Iorig and the trained model was used to generate synthetic microscopy volumes. The weight coefficients of the loss functions were set to λ1=λ2=10 for the SpCycleGAN7. The synthetic microscopy volumes were verified by a biologist (co-author Dunn).

For the 3D Watershed transform, we used a 3D Gaussian filter to preprocess the images and used Otsu’s method to segment/isolate objects from the background of each volume. Subsequently the 3D conditional erosion described in “3D nuclei instance segmentation” was used to obtain the markers, and the marker-controlled 3D watershed method, which was implemented by the Python Scikit-image library, was used to separate touching nuclei.

For CellProfiler, customized image processing modules including inhomogeneity correction, median filtering, and morphological erosion were used to preprocess the images, and the default “IdentifyPrimaryObject” module was used to obtain 2D segmentation masks for each slice. “IdentifyPrimaryObjects” uses the 2D watershed transform segmentation for identification of objects in 3D biological structures48.

With regards to Squassh47, we used “Background subtraction” with a manually tuned “rolling ball window size” parameter. The rest of the parameters were set to their default values. We observed that Squassh did fairly well on volume V2 but totally failed on V4 due to the densely clustered nuclei. For VTEA54, a Gaussian filter and background subtraction were used to preprocess the image. The object building method in VTEA was set to “Connect 3D”, and the segmentation threshold was determined automatically. We manually tuned the parameters “Centroid offset”, “Min vol”, and “Max vol” to obtain the best visual segmentation results. Finally, the watershed transform was used for 2D segmentation. The 2D segmentations were then merged into a 3D segmentation volume using a blob-slice method54,75. For Cellpose, we used the “nuclei” style and, since the training of Cellpose is limited to 2D images, we trained Cellpose on every XY focal plane of our subvolumes following the training schemes given in Table 4. For nnU-Net, we trained nnU-Net for 500 epochs using the “3d_fullres” configuration.

For VNet24, 3D U-Net23, and DeepSynth6, we improved the segmentation results by applying our 3D conditional erosion described in “3D nuclei instance segmentation”, followed by the 3D marker-controlled watershed transform, to separate touching nuclei.

Both SpCycleGAN and NISNet3D were implemented using PyTorch. We used a 9-block ResNet for the generators G and F and for the segmentor S (see Fig. 13). The discriminators D1 and D2 (Fig. 13) were implemented with the “PatchGAN” classifier76. SpCycleGAN was trained with the Adam optimizer77 for 200 epochs with an initial learning rate of 0.0002 that linearly decays to 0 after the first 100 epochs. Figure 2.3 depicts the generated synthetic nuclei segmentation masks and the corresponding synthetic microscopy images.
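
The learning-rate schedule described above (constant for the first 100 epochs, then linear decay to zero) can be sketched in PyTorch as follows; the network and the Adam betas here are placeholders rather than the actual SpCycleGAN configuration:

```python
import torch

# Placeholder network standing in for the SpCycleGAN generators/discriminators.
net = torch.nn.Conv3d(1, 1, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(net.parameters(), lr=0.0002, betas=(0.5, 0.999))

total_epochs, constant_epochs = 200, 100

def lr_lambda(epoch):
    # Constant for the first 100 epochs, then linear decay to 0 by epoch 200.
    if epoch < constant_epochs:
        return 1.0
    return 1.0 - (epoch - constant_epochs) / float(total_epochs - constant_epochs)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)

for epoch in range(total_epochs):
    # ... training iterations for this epoch ...
    scheduler.step()
```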

As mentioned above, we investigated two versions of NISNet3D for different application scenarios: NISNet3D-slim (which includes 5 models, M1–M5) and NISNet3D-synth (which includes 1 model, Msyn). NISNet3D-synth is designed for the situation where no ground truth annotations are available. In contrast, NISNet3D-slim is used for the case where a limited number of ground truth annotated volumes are available: synthetic volumes are used for training, and a small amount of actual ground truth data is used for light retraining. We trained NISNet3D using both strategies to show the effectiveness of our approach. We trained 5 versions of NISNet3D-slim, denoted M1–M5, using three training methods (see Table 4). First, we trained three models, M1, M2, and M3, using the synthetic versions of the three volumes V1–V3, respectively. Next, we transferred the weights from M3 and continued training on a limited number of actual microscopy subvolumes of V4 to produce the fourth model, M4. Finally, we trained M5 directly on subvolumes of the actual microscopy data V5 only. After training, we evaluated the NISNet3D-slim models M1–M5 using two different evaluation schemes, summarized in Supplementary Figure S3. In the first scheme we directly tested the models M1, M2, M4, and M5 using all the subvolumes of the actual microscopy volumes V1, V2, V4, and V5, respectively, since these subvolumes were not used for training. In the case of M3, we first trained on 50 volumes of synthetic V3 to obtain a pre-trained model M3pre.

For the second scheme we used 3-fold cross-validation to lightly retrain M3pre and obtain the final M3. Specifically, we randomly shuffled the 9 subvolumes of V3 and divided them into 3 equal sets of 3 subvolumes each, and then iteratively retrained M3pre on one of the sets and tested on the other two sets. We report the average of the evaluation results from the three iterations. We used cross-validation to show the effectiveness of our method when the evaluation data are limited. Note that when lightly retraining M3pre we update all of its parameters while continuing to train on actual microscopy volumes. The training and evaluation scheme for all models is provided in Table 4.
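
A minimal sketch of this fold construction (the subvolume identifiers and the retraining/evaluation helpers are hypothetical):

```python
import random

# Hypothetical identifiers for the 9 annotated subvolumes of V3.
subvolumes = [f"V3_sub{i}" for i in range(9)]
random.seed(0)
random.shuffle(subvolumes)
folds = [subvolumes[0:3], subvolumes[3:6], subvolumes[6:9]]

results = []
for k, retrain_set in enumerate(folds):
    # Lightly retrain on one set of 3 subvolumes, test on the other 6.
    test_set = [s for i, fold in enumerate(folds) if i != k for s in fold]
    # model_k = lightly_retrain(m3_pre, retrain_set)   # hypothetical helper
    # results.append(evaluate(model_k, test_set))      # hypothetical helper
    print(f"fold {k}: retrain on {retrain_set}, test on {test_set}")
# final_score = sum(results) / len(results)            # average of the 3 iterations
```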

We then investigated how all the methods perform when there is no ground truth data for training. For this we trained a version of NISNet3D referred to as NISNet3D-synth in these experiments. NISNet3D-synth, Msyn, was trained on 950 synthetic microscopy subvolumes that include the synthetic versions of V1–V4. Our experience has been that more than 800 synthetic microscopy subvolumes are needed to train on the types of 3D volumes described in Table 1. We then tested Msyn separately on each manually annotated original microscopy dataset described in Table 1. A figure summarizing the training schemes of NISNet3D-synth and NISNet3D-slim can be found in Supplementary Figure S3.

Both NISNet3D-slim and NISNet3D-synth were trained for 100 epochs using the Adam optimizer77 with a constant learning rate of 0.001. The weight coefficients of the loss function were set to λ3=1, λ4=10, and λ5=10 based on the settings in SpCycleGAN7,78; the hyper-parameters α1 and α2 in LTL were set to 0.3 and 0.7 based on the highest-performing configuration70; and the hyper-parameters β and γ of LFL79 were set to 0.8 and 2 to minimize the training losses (a sketch of these standard loss forms follows Table 5). The nuclei instance segmentation parameters used in our experiments are provided in Table 5. These parameters control how aggressively touching nuclei are split: a lower Tm splits touching nuclei more readily, and smaller nuclei call for smaller values of Tc and Tf. We chose these parameters based on feedback on segmentation quality (the ability to visually separate touching nuclei) from a biologist.

Table 5.

Parameters used for NISNet3D nuclei instance segmentation.

Parameter | NISNet3D-slim: V1 | V2 | V3 | V4 | V5 | NISNet3D-synth: V1 | V2 | V3 | V4
Tm | 5 | 5 | 1 | 1 | 0 | 0 | 0 | 4 | 0
Tc | 700 | 3000 | 2000 | 2000 | 2000 | 2000 | 2000 | 2000 | 2000
Tf | 200 | 500 | 700 | 300 | 200 | 200 | 200 | 500 | 200
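
The Tversky70 and focal79 loss terms referred to above follow standard formulations; the sketch below shows these standard forms only (not the exact NISNet3D loss code), assuming β acts as the class-balance weight and γ as the focusing parameter:

```python
import torch

def tversky_loss(pred, target, alpha1=0.3, alpha2=0.7, eps=1e-6):
    """Standard Tversky loss: alpha1 weights false positives, alpha2 weights
    false negatives. pred and target are voxel probabilities in [0, 1]."""
    p, g = pred.flatten(), target.flatten()
    tp = (p * g).sum()
    fp = (p * (1 - g)).sum()
    fn = ((1 - p) * g).sum()
    return 1 - tp / (tp + alpha1 * fp + alpha2 * fn + eps)

def focal_loss(pred, target, beta=0.8, gamma=2.0, eps=1e-6):
    """Standard binary focal loss with class-balance weight beta and
    focusing parameter gamma."""
    p = pred.clamp(eps, 1 - eps)
    loss_pos = -beta * (1 - p) ** gamma * torch.log(p) * target
    loss_neg = -(1 - beta) * p ** gamma * torch.log(1 - p) * (1 - target)
    return (loss_pos + loss_neg).mean()
```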

Evaluation metrics

We use object-based metrics to evaluate nuclei instance segmentation accuracy. We define $N_{TP}^{t}$ as the number of True Positive detections for which the Intersection-over-Union (IoU)80 between a detected nucleus and a ground truth nucleus is greater than a threshold $t$. Similarly, $N_{FP}^{t}$ and $N_{FN}^{t}$ are the numbers of False Positive and False Negative detections81,82 at IoU threshold $t$, respectively. $N_{TP}^{t}$ measures how many nuclei in a volume are correctly detected; the higher the value of $N_{TP}^{t}$, the more accurate the detection. $N_{FP}^{t}$ counts detections that do not correspond to actual nuclei, and $N_{FN}^{t}$ counts nuclei that were not detected. A precise and accurate detection method should have a high $N_{TP}^{t}$ but low $N_{FP}^{t}$ and $N_{FN}^{t}$ values.

Based on $N_{TP}^{t}$, $N_{FP}^{t}$, and $N_{FN}^{t}$ we define the following metrics, which are used to reduce bias56: the mean Precision $mP = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} \frac{N_{TP}^{t}}{N_{TP}^{t} + N_{FP}^{t}}$, the mean Recall $mR = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} \frac{N_{TP}^{t}}{N_{TP}^{t} + N_{FN}^{t}}$, and the mean F1 score $mF1 = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} \frac{2 N_{TP}^{t}}{2 N_{TP}^{t} + N_{FP}^{t} + N_{FN}^{t}}$, which are the means of Precision, Recall, and the F1 score, respectively, over a set of IoU thresholds $T_{IoUs}$. We set $T_{IoUs} = \{0.25, 0.30, \ldots, 0.45\}$ for datasets V1–V4, and $T_{IoUs} = \{0.5, 0.55, \ldots, 0.75\}$ for dataset V5. In addition, we obtained the Average Precision (AP)61,62, a commonly used object detection metric, by estimating the area under the Precision-Recall curve63 using the same sets of thresholds $T_{IoUs}$. For example, AP.25 is the average precision evaluated at an IoU threshold of 0.25. The mean Average Precision (mAP) is then obtained as $mAP = \frac{1}{|T_{IoUs}|} \sum_{t \in T_{IoUs}} AP_{t}$.
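
A small sketch of how mP, mR, and mF1 follow from the per-threshold counts (the counts below are made up purely for illustration):

```python
import numpy as np

def mean_detection_metrics(counts, thresholds):
    """counts maps an IoU threshold t to (N_TP, N_FP, N_FN) for a volume;
    returns mean precision, mean recall, and mean F1 over the threshold set."""
    precisions, recalls, f1s = [], [], []
    for t in thresholds:
        tp, fp, fn = counts[t]
        precisions.append(tp / (tp + fp))
        recalls.append(tp / (tp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return np.mean(precisions), np.mean(recalls), np.mean(f1s)

# Illustrative counts at the thresholds used for V1-V4.
thresholds = [0.25, 0.30, 0.35, 0.40, 0.45]
counts = {t: (90 - i * 5, 10 + i * 2, 10 + i * 3) for i, t in enumerate(thresholds)}
print(mean_detection_metrics(counts, thresholds))
```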

The reason for using different thresholds $T_{IoUs}$ for V1–V4 compared to those used for V5 is that throughout the experiments we observed that the nuclei in datasets V1–V4 are more challenging to segment than the nuclei in V5. Using the same IoU thresholds for evaluating all the datasets resulted in a lower evaluation accuracy for the volumes V1–V4 than for V5. Thus, we chose two different sets of IoU thresholds for V1–V4 and V5, respectively.

Finally, we use the Aggregated Jaccard Index (AJI)55 to integrate object-level and voxel-level errors. The AJI is defined as:

$$\mathrm{AJI} = \frac{\sum_{i=1}^{N} \left| G_i \cap S_{m_i} \right|}{\sum_{i=1}^{N} \left| G_i \cup S_{m_i} \right| + \sum_{j \in U} \left| S_j \right|} \qquad (12)$$

where $G_i$ denotes the $i$th nucleus in the ground truth volume $G$, which contains a total of $N$ nuclei, $U$ is the set of segmented nuclei that have no corresponding ground truth nucleus, and $S_{m_i}$ is the $m_i$th connected component of the segmentation mask, i.e., the segmented nucleus with the largest Jaccard index (similarity) with $G_i$. Note that each segmented nucleus (connected component with index $m$) cannot be used more than once.
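
A straightforward, unoptimized sketch of this computation for labeled volumes follows Eq. (12); as a simplification, reuse of a segmented nucleus is only accounted for in the unmatched term:

```python
import numpy as np

def aggregated_jaccard_index(gt, pred):
    """AJI for labeled volumes (0 = background). Each ground-truth nucleus is
    matched to the predicted nucleus with the largest Jaccard index; unmatched
    predicted nuclei are added to the denominator."""
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    used = set()
    numerator = denominator = 0
    for i in gt_ids:
        g = gt == i
        # Predicted labels overlapping this ground-truth nucleus.
        candidates = [j for j in np.unique(pred[g]) if j != 0]
        best_j, best_jacc = None, -1.0
        best_inter, best_union = 0, g.sum()  # no match: union is just |G_i|
        for j in candidates:
            s = pred == j
            inter = np.logical_and(g, s).sum()
            union = np.logical_or(g, s).sum()
            if inter / union > best_jacc:
                best_j, best_jacc = j, inter / union
                best_inter, best_union = inter, union
        numerator += best_inter
        denominator += best_union
        if best_j is not None:
            used.add(best_j)
    for j in pred_ids:
        if j not in used:
            denominator += (pred == j).sum()
    return numerator / denominator if denominator else 0.0
```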

The values of all the parameters for each method were chosen to achieve the best visual results, as discussed in “Discussion”. The values of the various metrics are given in Tables 2 and 3 for each microscopy dataset. Figure 6 shows the AP scores at multiple IoU thresholds for each subvolume of datasets V1–V5. The orthogonal views (XY and XZ focal planes) of the segmentation masks are overlaid on the original microscopy subvolumes for each method on V1–V5, as shown in Fig. 4 and Supplementary Figures S4 and S5. Note that the different colors correspond to different nuclei.

Test microscopy volumes and ground truth annotations

We used four different actual microscopy 3D volumes in our experiments, denoted V1–V4, containing fluorescently labeled (DAPI or Hoechst 33342 stain) nuclei collected using confocal microscopy from cleared rat kidneys (Scale37 and BABB38), rat livers, and cleared mouse intestines. These volumes were used to demonstrate the performance of NISNet3D, to generate training data for NISNet3D-slim, and to generate annotations that provide the ground truth for quantitative evaluation of NISNet3D and other segmentation methods. V1 differs from V3 in that V1 includes fluorescent objects that are not nuclei and must therefore be distinguished from nuclei37. In addition, we used a publicly available electron microscopy zebrafish brain volume, V5, known as NucMM83. We used the NucMM dataset and its provided ground truth to demonstrate that NISNet3D works not only on fluorescence microscopy volumes but also with other high-resolution imaging modalities.

Acknowledgements

This research was partially supported by a George M. O’Brien Award from the National Institutes of Health under Grant NIH/NIDDK P30 DK079312 and the endowment of the Charles William Harrison Distinguished Professorship at Purdue University. The original image volumes used in this paper were provided by Malgorzata Kamocka, Sherry Clendenon, and Michael Ferkowicz at Indiana University. We gratefully acknowledge their cooperation.

Author contributions

L.W., E.D., and P.S. conceived and designed NISNet3D. L.W. and A.C. designed the convolutional network and developed the segmentation approach, generated the synthetic data, conducted the segmentations, generated ground truth data, and conducted the comparisons with alternative methods. L.W., P.S., A.C., and E.D. wrote the manuscript. L.W., A.C., P.S., S.W., K.D., and E.D. revised the manuscript.

Data availability

The datasets analyzed during the current study are available in the Zenodo repository https://doi.org/10.5281/zenodo.706514739.

Code availability

The NISNet3D source code package is available at https://gitlab.com/viper-purdue/nisnet-release. The source code is released under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) license and cannot be used for commercial purposes. Address all correspondence to Edward J. Delp, ace@ecn.purdue.edu.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-023-36243-9.

References

  • 1.Lucas AM, et al. Open-source deep-learning software for bioimage segmentation. Mol. Biol. Cell. 2021;32:823–829. doi: 10.1091/mbc.E20-10-0660. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Piccinini F, et al. Software tools for 3d nuclei segmentation and quantitative analysis in multicellular aggregates. Comput. Struct. Biotechnol. J. 2020;18:1287–1300. doi: 10.1016/j.csbj.2020.05.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Stringer C, Wang T, Michaelos M, Pachitariu M. Cellpose: A generalist algorithm for cellular segmentation. Nat. Methods. 2021;18:100–106. doi: 10.1038/s41592-020-01018-x. [DOI] [PubMed] [Google Scholar]
  • 4.Greenwald NF, et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. 2022;40:555–565. doi: 10.1038/s41587-021-01094-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Kromp F, et al. Evaluation of deep learning architectures for complex immunofluorescence nuclear image segmentation. IEEE Trans. Med. Imaging. 2021;40:1934–1949. doi: 10.1109/TMI.2021.3069558. [DOI] [PubMed] [Google Scholar]
  • 6.Dunn KW, et al. Deepsynth: Three-dimensional nuclear segmentation of biological images using neural networks trained with synthetic data. Sci. Rep. 2019;9:18295–18309. doi: 10.1038/s41598-019-54244-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Fu, C. et al. Three dimensional fluorescence microscopy image synthesis and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2302–2310 (2018). Salt Lake City, UT.
  • 8.Yushkevich PA, et al. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage. 2006;31:1116–1128. doi: 10.1016/j.neuroimage.2006.01.015. [DOI] [PubMed] [Google Scholar]
  • 9.Berger DR, Seung HS, Lichtman JW. Vast (volume annotation and segmentation tool): Efficient manual and semi-automatic labeling of large 3d image stacks. Front. Neural Circ. 2018;12:88. doi: 10.3389/fncir.2018.00088. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Hollandi R, Diósdi Á, Hollandi G, Moshkov N, Horváth P. Annotatorj: An imagej plugin to ease hand annotation of cellular compartments. Mol. Biol. Cell. 2020;31:2179–2186. doi: 10.1091/mbc.E20-02-0156. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Borland D, et al. Segmentor: A tool for manual refinement of 3d microscopy annotations. BMC Bioinform. 2021;22:1–12. doi: 10.1186/s12859-021-04202-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Wang, J. & Perez, L. The effectiveness of data augmentation in image classification using deep learning. arXiv:1712.04621 (arXiv preprint) (2017).
  • 13.Mikołajczyk A, Grochowski M. Data augmentation for improving deep learning in image classification problem. Int. Interdiscip. PhD Works. 2018;20:117–122. [Google Scholar]
  • 14.Yang L, et al. Nuset: A deep learning tool for reliably separating and analyzing crowded cells. PLoS Comput. Biol. 2020;16:e1008193. doi: 10.1371/journal.pcbi.1008193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Weigert, M., Schmidt, U., Haase, R., Sugawara, K. & Myers, G. Star-convex polyhedra for 3d object detection and segmentation in microscopy. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision 3666–3673 (2020).
  • 16.Sadanandan SK, Ranefall P, Le Guyader S, Wählby C. Automated training of deep convolutional neural networks for cell segmentation. Sci. Rep. 2017;7:1–7. doi: 10.1038/s41598-017-07599-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Caicedo JC, et al. Evaluation of deep learning strategies for nucleus segmentation in fluorescence images. Cytometry A. 2019;95:952–965. doi: 10.1002/cyto.a.23863. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Baniukiewicz P, Lutton EJ, Collier S, Bretschneider T. Generative adversarial networks for augmenting training data of microscopic cell images. Front. Comput. Sci. 2019;10:25. [Google Scholar]
  • 19.Goodfellow IJ, et al. Generative adversarial networks. IEEE Signal Process. Mag. 2014;1406:2661. [Google Scholar]
  • 20.Liu L, et al. Deep learning for generic object detection: A survey. Int. J. Comput. Vision. 2020;128:261–318. [Google Scholar]
  • 21.Carneiro G, Zheng Y, Xing F, Yang L. Review of deep learning methods in mammography, cardiovascular, and microscopy image analysis. Deep Learn. Convolut. Neural Netw. Med. Image Comput. 2017;20:11–32. [Google Scholar]
  • 22.Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. Med. Image Comput. Comput. Assist. Intervention. 2015;9351:231–241. [Google Scholar]
  • 23.Çiçek Ö, Abdulkadir A, Lienkamp S, Brox T, Ronneberger O. 3D u-net: Learning dense volumetric segmentation from sparse annotation. Med. Image Comput. Comput. Assist. Intervention. 2016;9901:424–432. [Google Scholar]
  • 24.Milletari, F., Navab, N. & Ahmadi, S. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In International Conference on 3D Vision 565–571 (2016).
  • 25.Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV. Voxelmorph: A learning framework for deformable medical image registration. IEEE Trans. Med. Imaging. 2019;38:1788–1800. doi: 10.1109/TMI.2019.2897538. [DOI] [PubMed] [Google Scholar]
  • 26.Graham S, et al. Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med. Image Anal. 2019;58:101563. doi: 10.1016/j.media.2019.101563. [DOI] [PubMed] [Google Scholar]
  • 27.Fu, C. et al. Nuclei segmentation of fluorescence microscopy images using convolutional neural networks. In Proceedings of the IEEE International Symposium on Biomedical Imaging 704–708 (2017).
  • 28.Ho, D. J., Fu, C., Salama, P., Dunn, K. W. & Delp, E. J. Nuclei segmentation of fluorescence microscopy images using three dimensional convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition Workshops 834–842 (2017).
  • 29.Schmidt, U., Weigert, M., Broaddus, C. & Myers, G. Cell detection with star-convex polygons. In Medical Image Computing and Computer Assisted Intervention 265–273 (2018).
  • 30.Ho DJ, et al. Sphere estimation network: Three-dimensional nuclei detection of fluorescence microscopy images. J. Med. Imaging. 2020;7:1–16. doi: 10.1117/1.JMI.7.4.044003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Lux, F. & Matula, P. Dic image segmentation of dense cell populations by combining deep learning and watershed. In IEEE International Symposium on Biomedical Imaging 236–239 (2019).
  • 32.Arbelle A, Cohen S, Raviv TR. Dual-task convlstm-unet for instance segmentation of weakly annotated microscopy videos. IEEE Trans. Med. Imaging. 2022;41:1948–1960. doi: 10.1109/TMI.2022.3152927. [DOI] [PubMed] [Google Scholar]
  • 33.Bao, R., Al-Shakarji, N. M., Bunyak, F. & Palaniappan, K. Dmnet: Dual-stream marker guided deep network for dense cell segmentation and lineage tracking. In IEEE International Conference on Computer Vision Workshops 3354–3363 (2021). [DOI] [PMC free article] [PubMed]
  • 34.Scherr T, Löffler K, Böhland M, Mikut R. Cell segmentation and tracking using cnn-based distance predictions and a graph-based matching strategy. PLoS One. 2020;15:e0243219. doi: 10.1371/journal.pone.0243219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Mandal, S. & Uhlmann, V. Splinedist: Automated cell segmentation with spline curves. In Proceedings of the International Symposium on Biomedical Imaging 1082–1086 (2021).
  • 36.Ruiz-Santaquiteria J, Bueno G, Deniz O, Vallez N, Cristobal G. Semantic versus instance segmentation in microscopic algae detection. Eng. Appl. Artif. Intell. 2020;87:103271. [Google Scholar]
  • 37.Hama H, et al. Scale: A chemical approach for fluorescence imaging and reconstruction of transparent mouse brain. Nat. Neurosci. 2011;14:1481–1488. doi: 10.1038/nn.2928. [DOI] [PubMed] [Google Scholar]
  • 38.Clendenon SG, Young PA, Ferkowicz M, Phillips C, Dunn KW. Deep tissue fluorescent imaging in scattering specimens using confocal microscopy. Microsc. Microanal. 2011;17:614–617. doi: 10.1017/S1431927611000535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Chen, A. et al. 3d ground truth annotations of nuclei in 3d microscopy volumes. bioRxiv (2022).
  • 40.Simard, P. Y., Steinkraus, D. & Platt, J. C. Best practices for convolutional neural networks applied to visual document analysis. In Proceedings of the International Conference on Document Analysis and Recognition 958–963 (2003).
  • 41.Chen, A. et al. Three dimensional synthetic non-ellipsoidal nuclei volume generation using Bezier curves. In Proceedings of the IEEE International Symposium on Biomedical Imaging (2021).
  • 42.Wu, L. et al. Rcnn-slicenet: A slice and cluster approach for nuclei centroid detection in three-dimensional fluorescence microscopy images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 3750–3760 (2021).
  • 43.Wu, L., Chen, A., Salama, P., Dunn, K. W. & Delp, E. J. An ensemble learning and slice fusion strategy for three-dimensional nuclei instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2022).
  • 44.Zhu, J., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision 2242–2251 (2017).
  • 45.Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnu-net: A self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods. 2021;18:203–211. doi: 10.1038/s41592-020-01008-z. [DOI] [PubMed] [Google Scholar]
  • 46.Yang X, Li H, Zhou X. Nuclei segmentation using marker-controlled watershed, tracking using mean-shift, and Kalman filter in time-lapse microscopy. IEEE Trans. Circ. Syst. I Regul. Pap. 2006;53:2405–2414. [Google Scholar]
  • 47.Rizk A, et al. Segmentation and quantification of subcellular structures in fluorescence microscopy images using squassh. Nat. Protoc. 2014;9:586–596. doi: 10.1038/nprot.2014.037. [DOI] [PubMed] [Google Scholar]
  • 48.McQuin C, et al. Cellprofiler 3.0: Next-generation image processing for biology. PLoS Biol. 2018;16:e2005970-1-17. doi: 10.1371/journal.pbio.2005970. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Hirsch, P. & Kainmueller, D. An auxiliary task for learning nuclei segmentation in 3d microscopy images. In Proceedings of the Third Conference on Medical Imaging with Deep Learning, vol. 121 of Proceedings of Machine Learning Research (Arbel, T. et al. eds) 304–321 (2020).
  • 50.Englbrecht F, Ruider IE, Bausch AR. Automatic image annotation for fluorescent cell nuclei segmentation. PLoS One. 2021;16:1–13. doi: 10.1371/journal.pone.0250093. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Lee MY, et al. Cellseg: A robust, pre-trained nucleus segmentation and pixel quantification software for highly multiplexed fluorescence images. BMC Bioinform. 2022;23:46. doi: 10.1186/s12859-022-04570-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Cutler KJ, et al. Omnipose: A high-precision morphology-independent solution for bacterial cell segmentation. Nat. Methods. 2022;19:1438–1448. doi: 10.1038/s41592-022-01639-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Mougeot G, et al. Deep learning—promises for 3D nuclear imaging: a guide for biologists. J. Cell Sci. 2022 doi: 10.1242/jcs.258986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Winfree S, et al. Quantitative three-dimensional tissue cytometry to study kidney tissue and resident immune cells. J. Am. Soc. Nephrol. 2017;28:2108–2118. doi: 10.1681/ASN.2016091027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Kumar N, et al. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Trans. Med. Imaging. 2017;36:1550–1560. doi: 10.1109/TMI.2017.2677499. [DOI] [PubMed] [Google Scholar]
  • 56.Hosang J, Benenson R, Dollár P, Schiele B. What makes for effective detection proposals? IEEE Trans. Pattern Anal. Mach. Intell. 2016;38:814–830. doi: 10.1109/TPAMI.2015.2465908. [DOI] [PubMed] [Google Scholar]
  • 57.Shen X, Stamos I. 3d object detection and instance segmentation from 3d range and 2d color images. Sensors. 2021 doi: 10.3390/s21041213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Basu A, Senapati P, Deb M, Rai R, Dhal KG. A survey on recent trends in deep learning for nucleus segmentation from histopathology images. Evol. Syst. 2023 doi: 10.1007/s12530-023-09491-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Schürch CM, et al. Coordinated cellular neighborhoods orchestrate antitumoral immunity at the colorectal cancer invasive front. Cell. 2020;182:1341–1359.e19. doi: 10.1016/j.cell.2020.07.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Caicedo JC, et al. Nucleus segmentation across imaging experiments: The 2018 data science bowl. Nat. Methods. 2019;16:1247–1253. doi: 10.1038/s41592-019-0612-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Everingham M, et al. The pascal visual object classes challenge: A retrospective. Int. J. Comput. Vision. 2015;111:98–136. [Google Scholar]
  • 62.Padilla, R., Netto, S. L. & da Silva, E. A. B. A survey on performance metrics for object-detection algorithms. In Proceedings of the International Conference on Systems, Signals and Image Processing 237–242 (2020).
  • 63.Davis, J. & Goadrich, M. The relationship between precision-recall and roc curves. In Proceedings of the international conference on Machine learning 233–240 (2006).
  • 64.Winfree S. User-accessible machine learning approaches for cell segmentation and analysis in tissue. Front. Physiol. 2022 doi: 10.3389/fphys.2022.833333. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
  • 66.Oktay, O. et al. Attention u-net: Learning where to look for the pancreas. arXiv:1804.03999 (arXiv preprint) (2018).
  • 67.Han, S. et al. Nuclei counting in microscopy images with three dimensional generative adversarial networks. In Proceedings of the SPIE Conference on Medical Imaging 10949, 753–763 (2019).
  • 68.Zhang, D. et al. Nuclei instance segmentation with dual contour-enhanced adversarial network. In Proceedings of the IEEE International Symposium on Biomedical Imaging 409–412 (2018).
  • 69.Lin, T., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision 2999–3007 (2017).
  • 70.Salehi, M., Sadegh, S., Erdogmus, D. & Gholipour, A. Tversky loss function for image segmentation using 3d fully convolutional deep networks. In Proceedings of the International Workshop on Machine Learning in Medical Imaging 379–387 (2017).
  • 71.Soille, P. & Vincent, L. M. Determining watersheds in digital pictures via flooding simulations. In Visual Communications and Image Processing ’90: Fifth in a Series 1360, 240–250 (1990).
  • 72.Di Stefano, L. & Bulgarelli, A. A simple and efficient connected components labeling algorithm. In Proceedings of the International Conference on Image Analysis and Processing 322–327 (1999).
  • 73.Svoboda D, Kozubek M, Stejskal S. Generation of digital phantoms of cell nuclei and simulation of image formation in 3d image cytometry. Cytometry Part A J. Int. Soc. Adv. Cytometry. 2009;75:494–509. doi: 10.1002/cyto.a.20714. [DOI] [PubMed] [Google Scholar]
  • 74.McKinley S, Levine M. Cubic spline interpolation. Coll. Redwoods. 1998;45:1049–1060. [Google Scholar]
  • 75.Santella A, Du Z, Nowotschin S, Hadjantonakis AK, Bao Z. A hybrid blob-slice model for accurate and efficient detection of fluorescence labeled nuclei in 3d. BMC Bioinform. 2010;11:1–13. doi: 10.1186/1471-2105-11-580. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Isola, P., Zhu, J., Zhou, T. & Efros, A. A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 5967–5976 (2017).
  • 77.Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv:1412.6980 (arXiv preprint) (2017).
  • 78.Wang J, et al. 3D GAN image synthesis and dataset quality assessment for bacterial biofilm. Bioinformatics. 2022;38:4598–4604. doi: 10.1093/bioinformatics/btac529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Lin, T., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. In IEEE International Conference on Computer Vision 2999–3007 (2017).
  • 80.Rahman, M. A. & Wang, Y. Optimizing intersection-over-union in deep neural networks for image segmentation. In International Symposium on Visual Computing 234–244 (2016).
  • 81.Rezatofighi, H. et al. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 658–666 (2019).
  • 82.Powers, D. M. Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation. arXiv:2010.16061 (arXiv preprint) (2020).
  • 83.Lin, Z. et al. Nucmm dataset: 3d neuronal nuclei instance segmentation at sub-cubic millimeter scale. In International Conference on Medical Image Computing and Computer-Assisted Intervention 164–174 (2021).
