Abstract
New tissue-clearing techniques and improvements in optical microscopy have rapidly advanced capabilities to acquire volumetric imagery of neural tissue at resolutions of one micron or better. As sizes for data collections increase, accurate automatic segmentation of cell nuclei becomes increasingly important for quantitative analysis of imaged tissue. We present a cell nucleus segmentation method that is formulated as a parameter estimation problem with the goal of determining the count, shapes, and locations of nuclei that most accurately describe an image. We applied our new voting-based approach to fluorescence confocal microscopy images of neural tissue stained with DAPI, which highlights nuclei. Compared to manual counting of cells in three DAPI images, our method outperformed three existing approaches. On a manually labeled high-resolution DAPI image, our method also outperformed those methods and achieved a cell count accuracy of 98.99% and mean Dice coefficient of 0.6498.
1. INTRODUCTION
With the advent of tissue clearing techniques and large-scale microscopy imaging methods, it is possible to image large tissue samples while sampling at a sub-micron level. Improvements in acquisition rate have introduced new challenges when analyzing these large datasets. Methods that can streamline image analysis will be of particular value to reduce processing burden. Segmentation of cell nuclei in microscopy images has been a topic of heightened interest because of its utility in biological studies. Algorithms such as watershed and Laplacian filtering can be inaccurate when applied to more complex, high cell-density images [1, 2, 3]. Recent segmentation approaches have addressed this problem using more state-of-the-art methods such as active contours, deep learning, and convex decomposition [4, 5, 6, 7, 8]. More general approaches for segmentation have been developed through analysis toolkits such as FARSight [9] and CellProfiler [10]. While both packages are limited to processing of 2-D images, CellProfiler provides a set of modular tools that can be used to create a custom segmentation pipeline.
We present a method that combines a 3-D extension of the Iterative Radial Voting algorithm [11, 12] with the Random Sample Consensus (RANSAC) algorithm [13]. Iterative Radial Voting, a discriminative model, predicts cell locations and candidate cell nucleus boundary points. RANSAC, a generative model, is then used at each location to fit a set of ellipsoid parameters to the region’s local nucleus shape.
2. METHODS
We focus on identification of cell nuclei in neocortex samples stained using 4′,6-diamidino-2-phenylindole (DAPI), a fluorescent stain that binds to nucleic acids and fluoresces blue when exposed to UV light. Specifically, DAPI binds to the double-stranded DNA within the tissue, which is found only in the nucleus. Because DNA is distributed throughout the nucleus, nuclear boundaries can be imaged using DAPI. Other cellular structures, such as axons and dendrites, do not bind DAPI. As a result, only nuclei are visible in the microscopy image, with varied intensity within each nucleus corresponding to non-uniform DNA concentration.
Within a typical image (see Fig. 1a), there may be hundreds of nuclei intersecting the plane of section. The density of the nuclei within the image is high relative to those acquired from other types of tissue, for which more rudimentary segmentation approaches, e.g., watershed, may be effective. Because the imaged sections are thick relative to the nuclei dimensions and the distances between nuclei, the boundaries of nuclei within an image may overlap, even though they do not intersect in physical space (see Fig. 1b). It is thus important that we perform this segmentation task in 3D and identify nuclei shapes across the image slices. We do this using a locally generative method, which models the imaged nuclei as a collection of ellipsoids of varying sizes, positions, and orientations. Once nuclei positions have been identified, the generative method seeks to identify the optimal set of ellipsoid parameters that best matches the shape of the nucleus.
Fig. 1: CLARITY image.
(a) A 2048×2048 image of DAPI-stained PCW17 human neocortex showing numerous cell nuclei. (b) Close-up of boxed region. Nuclei appear to have intersecting boundaries because of thick imaging sections.
Algorithm.
We use three descriptors to characterize cell nuclei. The cell count c is a scalar representing the total number of cells in the stack. For each nucleus, indexed by i, we define its location, li, as a point in voxel space and its shape, si, as the orientation and axis lengths of an ellipsoid. The collection of nuclei is thus described by c, the set of nucleus positions L = {l1, l2, …, lc}, and the set of nucleus shapes S = {s1, s2, …, sc}. The objective of our algorithm is to find the set of parameters that best fits the input image. We first use Iterative Radial Voting to estimate the cell count and a set of approximate locations for these nuclei. By determining the cell count and defining a range of possible ellipsoid axis lengths, we reduce the solution space for our generative method to a more tractable size. We then apply RANSAC around each location li to find the set of ellipsoid parameters most representative of the local nuclear shape.
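As an illustrative sketch of this parameterization (the container names `Nucleus` and `NucleiModel` are ours, not part of the method), the descriptors c, L, and S can be held together as:

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class Nucleus:
    """One detected nucleus: a centroid l_i plus an ellipsoid shape s_i."""
    location: np.ndarray      # l_i: centroid in voxel space (z, y, x)
    axes: np.ndarray          # s_i: ellipsoid semi-axis lengths
    orientation: np.ndarray   # s_i: 3x3 rotation matrix of the ellipsoid

@dataclass
class NucleiModel:
    """Full parameter set: cell count c, locations L, and shapes S."""
    nuclei: List[Nucleus] = field(default_factory=list)

    @property
    def count(self) -> int:   # the cell count c
        return len(self.nuclei)

    @property
    def locations(self):      # the set L = {l_1, ..., l_c}
        return [n.location for n in self.nuclei]

# Demo: a model holding two nuclei.
model = NucleiModel()
model.nuclei.append(Nucleus(np.array([1.0, 2.0, 3.0]), np.array([4.0, 3.0, 2.0]), np.eye(3)))
model.nuclei.append(Nucleus(np.array([5.0, 6.0, 7.0]), np.array([4.0, 3.0, 2.0]), np.eye(3)))
```

The count c is derived from the collection rather than stored separately, so it cannot drift out of sync with L and S.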
Estimating cell count and nucleus locations.
Iterative Radial Voting [11, 12] identifies spherical objects using radial symmetry and can separate overlapping objects. It operates by iteratively tallying votes at each voxel based on image gradients, which contribute votes according to their magnitudes and directions. At each iteration, voting directions are adjusted to sharpen the predicted object centers in the voting landscape and improve the distinction between overlapping objects. The iterative radial voting algorithm works as follows:
1. Calculate the image’s normalized gradient Gn and Euclidean distance transform D. Define a gradient threshold γ and the pixel subset s = {p0 : |Gn(p0)| > γ}. Initialize the voting direction Γ(p0) to Gn(p0).
2. For each pixel p0 in subset s, initialize a cone C(p1; ϕ, h, p0) with height h and radius r = h tan(ϕ) at the pixel’s location p0, oriented along its voting direction Γ(p0).
3. Update the voting landscape V by accumulating each pixel’s vote within its cone.
4. Update the voting direction Γ(p0) to point toward the local maximum of V within the cone.
5. Decrement the cone’s angular range, ϕnew = ϕ − Δϕ, and repeat steps 2 to 5 until ϕ ≤ ϕstopping.
Once the iterations have stopped, V (p0) is thresholded and pixel clusters are identified. The nucleus centroids L are calculated from each cluster’s center of mass, yielding the nucleus locations of the image and the cell count c.
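The voting scheme can be illustrated with a much-simplified, single-pass 2-D sketch. It casts votes along each strong-gradient pixel's inward gradient direction, but omits the cone restriction and the iterative narrowing of ϕ; all parameter values are illustrative, not those used in this work:

```python
import numpy as np

def radial_vote(image, gamma=0.2, r_min=3, r_max=12):
    """Single-pass 2-D radial voting: every strong-gradient pixel casts
    votes along its (inward-pointing) gradient direction at radii
    r_min..r_max; object centers emerge as peaks in the vote landscape."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    votes = np.zeros_like(mag)
    ys, xs = np.nonzero(mag > gamma * mag.max())        # the pixel subset s
    for y, x in zip(ys, xs):
        dy, dx = gy[y, x] / mag[y, x], gx[y, x] / mag[y, x]
        for r in range(r_min, r_max + 1):               # votes along the ray
            vy, vx = int(round(y + r * dy)), int(round(x + r * dx))
            if 0 <= vy < votes.shape[0] and 0 <= vx < votes.shape[1]:
                votes[vy, vx] += mag[y, x]              # weight by |gradient|
    return votes

# Demo: a single bright disk; for a bright object on a dark background the
# gradient at the boundary points inward, so the vote peak lands near the
# disk center.
yy, xx = np.mgrid[0:64, 0:64]
disk = ((yy - 32) ** 2 + (xx - 32) ** 2 <= 8 ** 2).astype(float)
peak = np.unravel_index(np.argmax(radial_vote(disk)), disk.shape)
```

Because every boundary pixel's ray passes (approximately) through the object center, the votes reinforce there even when two objects overlap, which is the property the full iterative scheme exploits.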
Nucleus segmentation.
Random Sample Consensus estimates model parameters in the presence of outliers. While RANSAC requires the model to be predefined, it is robust to noise and highly efficient. The underlying method operates by selecting a small, random subset of points and fitting the model to them. The fit is evaluated against the remaining points, and this step is repeated until the model is determined to estimate the entire set well. Our ellipse-fitting RANSAC model operates as follows:
1. Identify the set of boundary points in a rectangular neighborhood around each centroid li from the gradient image Gn, generating a binary image B.
2. Skeletonize each z-slice of B: S(p) = skel(B(pz = z)).
3. From the set of non-zero coordinates in S(pz = z), randomly select 5 points, the fewest necessary to determine a conic, and perform a least-squares ellipse fitting to obtain ellipse coordinates (ex, ey). The number of inliers n between the ellipse and the binary image S(pz = z) is calculated as the number of points within distance d of the ellipse.
4. For each z value, repeat step 3 until n > nthreshold for an ellipse whose major axis lies within pre-defined lower and upper bounds [Mmin, Mmax], or until a fixed number of iterations has been completed.
5. Define the set of ellipse points across all z planes, E = {(ex, ey, z) ∀z ∈ S(p)}.
6. Generate an ellipsoid from the set of points E using least-squares estimation.
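A minimal 2-D sketch of the RANSAC ellipse-fitting step might look as follows. We use a normalized algebraic conic residual as a stand-in for the geometric point-to-ellipse distance (which has no closed form), and the thresholds are illustrative rather than the values used in this work:

```python
import numpy as np

def fit_conic(pts):
    """Least-squares conic Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 through the
    given points: the null-space vector of the design matrix via SVD."""
    x, y = pts[:, 0], pts[:, 1]
    M = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    return np.linalg.svd(M)[2][-1]

def ransac_ellipse(points, d=0.05, n_threshold=None, iters=200, seed=0):
    """Repeatedly fit a conic to 5 random boundary points and keep the fit
    with the most inliers, stopping early once n_threshold is reached."""
    rng = np.random.default_rng(seed)
    if n_threshold is None:
        n_threshold = int(0.8 * len(points))
    x, y = points[:, 0], points[:, 1]
    design = np.array([x * x, x * y, y * y, x, y, np.ones_like(x)])
    best_coef, best_n = None, -1
    for _ in range(iters):
        sample = points[rng.choice(len(points), 5, replace=False)]
        coef = fit_conic(sample)
        # Algebraic residual, scaled by the quadratic terms so it is
        # comparable across candidate fits.
        resid = np.abs(coef @ design) / (np.linalg.norm(coef[:3]) + 1e-12)
        n = int((resid < d).sum())
        if n > best_n:
            best_coef, best_n = coef, n
        if best_n >= n_threshold:
            break
    return best_coef, best_n

# Demo: 40 points on an ellipse (semi-axes 10 and 6) plus 8 random outliers.
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
pts = np.vstack([
    np.column_stack([10 * np.cos(t), 6 * np.sin(t)]),
    np.random.default_rng(1).uniform(-15, 15, size=(8, 2)),
])
coef, n_inliers = ransac_ellipse(pts)
```

A production version would additionally reject fits whose major axis falls outside [Mmin, Mmax] and fits that are not ellipses (B² − 4AC ≥ 0).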
Data.
Post-conception week 17 (PCW17) human neocortex was treated with A1P4 CLARITY hydrogel, polymerized, sectioned into 1mm slices, cleared, and stained with DAPI to reveal nuclei [14]. Fetal tissue was obtained according to IRB guidelines following voluntary termination of pregnancy. Three stacks of 54 images were acquired on a Zeiss LSM 780 confocal laser scanning microscope using a 20X objective with 16-bit resolution and interslice spacing 0.8682μm. Image dimensions and resolutions were: 1) 2048×2048 pixels, pixel size 0.2076μm×0.2076μm; 2) 1024×1024 pixels, pixel size 0.4152μm × 0.4152μm; and 3) 512 × 512 pixels, pixel size 0.8303μm × 0.8303μm.
We performed manual cell counting on slice 27 of all three image stacks using the ImageJ plug-in Cell Counter [15]. We counted 1539 unique nuclei in the 512 × 512 image, 2171 nuclei in the 1024 × 1024 image, and 2015 nuclei in the 2048 × 2048 image. The 2048 × 2048 image stack was also labeled manually to identify all pixels corresponding to each cell intersecting a single slice of the stack (slice 27). Labeling followed a written protocol that specified how to identify nuclei and how to label overlapping nuclei. Each nucleus in the slice was assigned a unique integer label, which was also assigned to any pixels determined to belong to the same nucleus in other slices in the image stack. This process identified 1985 unique nuclei that intersected slice 27 and required approximately 100 hours of person effort. We note that the two cell counts for this image were performed by different raters using different methods, and thus differ slightly (1.51%).
Preprocessing.
We applied two preprocessing steps to normalize the intensity within and between the slices. Image intensity depends on the intensity of the laser illuminating the plane of section during imaging. Light travels through the sample to illuminate the plane of section, so deeper sections receive less light, producing an image intensity drop-off with z. We corrected for this by: 1) computing the mean of each z-slice image; 2) fitting a 3rd-order curve to the mean values; and 3) dividing each z-slice by the corresponding value in the fitted curve. Within each slice, we applied an anisotropic diffusion filter, which removes noise and smooths the image content within each nucleus while preserving nucleus boundaries. Optical section thickness creates a physical limitation to interslice resolution, making sampling sparser along the interslice axis than in the in-plane axes. As a result, voxel dimensions are larger in z than in x and y, increasing the gradient contribution along this axis. We used a cubic spline to interpolate the image along the z-axis and then subsampled the gradient of this interpolated image to create a set of image gradients with normalized components.
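The z drop-off correction (steps 1–3 above) is straightforward to sketch; the function name and the synthetic decay profile in the demo are illustrative:

```python
import numpy as np

def correct_z_dropoff(stack, order=3):
    """Correct the z intensity drop-off: fit a 3rd-order polynomial to the
    per-slice mean intensities and divide each slice by its fitted value.
    `stack` is a (z, y, x) array."""
    z = np.arange(stack.shape[0], dtype=float)
    means = stack.reshape(stack.shape[0], -1).mean(axis=1)   # step 1
    fitted = np.polyval(np.polyfit(z, means, order), z)      # step 2
    return stack / fitted[:, None, None]                     # step 3

# Demo: a synthetic stack whose brightness decays exponentially with depth;
# after correction, every slice mean is restored to roughly the same level.
z = np.arange(10, dtype=float)
stack = np.ones((10, 8, 8)) * np.exp(-0.1 * z)[:, None, None]
corrected = correct_z_dropoff(stack)
```

Dividing by the fitted curve, rather than by the raw slice means, avoids amplifying slice-to-slice noise in the normalization.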
Validation.
For each image stack, we applied our segmentation method to a subset of 11 slices centered on slice z = 27, for which cell nuclei had been counted manually and, in the case of the 2048 × 2048 image, labeled manually. For each result, we then calculated the intersection of each identified ellipsoid model with that slice and labeled each corresponding voxel within the ellipsoid with an integer designating the ellipsoid. This produced a 2D image, S(x, y), of individually labeled nuclei (see Fig. 2).
Fig. 2: Segmentation results.
(a) Original microscopy image section. (b) Image section feature map, calculated as its gradient magnitude. (c) Voting landscape. (d) Manual segmentation. (e) Segmentation generated by our algorithm. Section was selected to show correct and incorrect segmentation, e.g., the algorithm failed to detect a background nucleus in Fig. 2d’s upper-left corner.
We also processed the data using three other methods: Watershed (implemented in MATLAB), FARSight [9], and CellProfiler [10]. We ran Watershed 3D after Laplacian filtering of the preprocessed image to improve its segmentation results. We ran FARSight on both the preprocessed image and on the original image; the original data produced better results, which we present in this work. In CellProfiler, Gaussian global thresholding and shape-based object splitting with a predefined diameter range were used to segment the original microscopy image. For each method, this produced an integer labeling of each identified nucleus in slice z = 27 for each image stack. Nuclei counts were evaluated using accuracy a = 1 − |nseg − ngt|/ngt, where nseg and ngt are the number of nuclei in the automatically segmented and manually counted images, respectively.
For the high-resolution image, we compared the label result, S(x, y), of each algorithm against the corresponding slice, T(x, y), extracted from slice z = 27 of the manually labeled data. In this case, we computed accuracy using the count from the manual labeling. We computed recall and precision to determine the number of correctly detected nuclei. Label identifiers in S(x, y) and T(x, y) were matched to minimize the total distance between segmented nuclei and their ground-truth counterparts, where each ground-truth label was matched to exactly one segmented nucleus. The number of true positives TP was calculated as the number of segmented nuclei that were matched to nuclei from the manually labeled data. This was done by searching a spherical neighborhood around each manually labeled nucleus for a corresponding segmented nucleus. If a nucleus was found in this neighborhood that had not already been matched to a different manually labeled nucleus, it was counted as a true positive. If multiple unmatched nuclei were found, the nearest was assigned as the corresponding match. The number of false positives was calculated as FP = nseg − TP and the number of false negatives was calculated as FN = ngt − TP, yielding recall r = TP/(TP + FN), precision p = TP/(TP + FP), and F-score F = 2pr/(p + r).
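The counting-based metrics follow directly from these definitions; a minimal sketch (the function name is ours):

```python
def detection_metrics(tp, n_seg, n_gt):
    """Counting-based detection metrics as defined in the text:
    FP = n_seg - TP and FN = n_gt - TP."""
    fp, fn = n_seg - tp, n_gt - tp
    recall = tp / (tp + fn)              # equivalently tp / n_gt
    precision = tp / (tp + fp)           # equivalently tp / n_seg
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = 1 - abs(n_seg - n_gt) / n_gt
    return accuracy, recall, precision, f_score

# Demo with illustrative counts: 90 matched detections, 100 segmented
# nuclei, 95 ground-truth nuclei.
acc, rec, prec, f = detection_metrics(90, 100, 95)
```

Note that the count accuracy a can be high even when many detections are mismatched, which is why recall, precision, and the Dice comparison below are reported alongside it.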
Because manual and automated labels were assigned independently, there is no direct correspondence of label identifiers between T and S. For each labeled nucleus λi in T(x, y), we determined the corresponding nucleus λj in S(x, y) as the label value in S with the largest overlap with λi, producing a new labeled image T′. We then computed the Dice similarity coefficient [16], Dice(λ) = 2|Sλ ∩ T′λ| / (|Sλ| + |T′λ|), where Sλ and T′λ are the subsets of pixels in S and T′ that are labeled λ, respectively. If no nucleus in S overlapped a nucleus in T′, a Dice score of zero was assigned.
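A sketch of the largest-overlap matching and per-nucleus Dice computation described above, assuming integer label images with 0 as background (the function name is ours):

```python
import numpy as np

def dice_per_nucleus(S, T):
    """For each ground-truth label in T (0 = background), find the segmented
    label in S with the largest overlap and return its Dice coefficient;
    ground-truth nuclei with no overlapping segmentation score zero."""
    scores = {}
    for lam in np.unique(T):
        if lam == 0:
            continue
        mask_t = T == lam
        overlapping = S[mask_t]
        overlapping = overlapping[overlapping != 0]
        if overlapping.size == 0:
            scores[int(lam)] = 0.0           # nothing segmented here
            continue
        match = np.bincount(overlapping).argmax()   # largest-overlap label
        mask_s = S == match
        inter = np.logical_and(mask_s, mask_t).sum()
        scores[int(lam)] = 2.0 * inter / (mask_s.sum() + mask_t.sum())
    return scores

# Demo: a 2x2 ground-truth nucleus (label 1) overlapped by a segmented
# nucleus (label 5) shifted one column, plus an unmatched nucleus (label 2).
T = np.zeros((4, 4), dtype=int); T[0:2, 0:2] = 1; T[3, 3] = 2
S = np.zeros((4, 4), dtype=int); S[0:2, 1:3] = 5
scores = dice_per_nucleus(S, T)
```

In the demo, label 1 overlaps its match in 2 of 4 pixels on each side, giving Dice = 2·2/(4+4) = 0.5, while label 2 has no overlap and scores 0.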
3. RESULTS
We implemented our method using MATLAB and applied it to the test data. Processing the 2048×2048×11 image stack required 8.2 h on a 4.2 GHz Intel i7-7700K with 64 GB RAM. Fig. 2 shows an example of the segmentation output and manual labeling for a small subset of the 2048 × 2048 image, selected to demonstrate successful and unsuccessful detections.
Cell count accuracy relative to the manually counted data is shown in Table 1. All methods showed reduced accuracy at lower resolutions, likely due to poorly defined boundaries between overlapping nuclei. Our method achieved the highest count accuracy at each resolution. For the 2048×2048 image, our method identified 1965 cells, yielding a count accuracy of 97.52% compared to the 2015 cells counted manually.
Table 1: Nuclei count accuracy.
Nuclei count accuracy for each test image resolution comparing each algorithm’s results with the manual count obtained using ImageJ.
| Algorithm | 512×512 | 1024×1024 | 2048×2048 |
|---|---|---|---|
| Voting | 70.83% | 90.23% | 97.52% |
| CellProfiler | 67.06% | 64.67% | 74.99% |
| FARSight | 43.60% | 61.95% | 84.91% |
| Watershed | 6.36% | 35.00% | 61.14% |
Detailed comparisons with manually labeled data were performed on the 2048 × 2048 image, including accuracy, recall, precision, and F-scores (Table 2), and mean Dice similarity coefficients and Dice score percentiles (Table 3). Our method segmented 62.34% of the cells with a Dice score of at least 0.7, compared to 43.68% for CellProfiler, 32.31% for FARSight, and 0.51% for Watershed. At a Dice threshold of 0.85, our method labeled 44.22% of the cells, CellProfiler labeled 29.32%, and FARSight labeled 2.11%; Watershed 3D segmented no cells with Dice scores at or above 0.85.
Table 2: Nuclei classification performance.
Classification performance for each algorithm compared to the manually labeled 2048 × 2048 nuclei image.
| Algorithm | Accuracy | Recall | Precision | F-score |
|---|---|---|---|---|
| Voting | 98.99% | 0.9768 | 0.9833 | 0.9800 |
| CellProfiler | 76.12% | 0.6766 | 0.9414 | 0.7873 |
| FARSight | 86.20% | 0.5864 | 0.7395 | 0.6541 |
| Watershed | 62.07% | 0.5441 | 0.9532 | 0.6928 |
Table 3: Segmentation performance.
Average Dice similarity coefficients and Dice percentiles relative to the manually labeled data. Percentiles reflect the fraction of segmented nuclei with Dice scores greater than or equal to 0.7 and 0.85.
| Algorithm | Mean Dice | % Dice≥0.7 | % Dice≥0.85 |
|---|---|---|---|
| Voting | 0.6498 | 62.34% | 44.22% |
| CellProfiler | 0.5963 | 43.68% | 29.32% |
| FARSight | 0.5061 | 32.31% | 2.11% |
| Watershed | 0.2224 | 0.51% | 0.00% |
4. DISCUSSION
The main contribution of this work is a new method for segmenting cell nuclei in fluorescence microscopy images using a 3-D extension to the existing Iterative Radial Voting algorithm described in [11] combined with a RANSAC-based ellipsoid fitting model. Our approach produced strong agreement with manually labeled high-resolution data based on measures of total cell count (98.99% accuracy) and average Dice similarity coefficient (0.6498) computed over all labeled cell nuclei. For this CLARITY data set, our method outperformed CellProfiler, FARSight, and Watershed. The purpose of this comparison was to demonstrate performance on the problem of segmenting overlapping nuclei in high-resolution CLARITY images, a problem not specifically addressed by these existing cell segmentation toolkits. While we do not claim that our method is a superior general-purpose segmentation algorithm, we have shown that it can segment nuclei in DAPI-stained CLARITY images with higher accuracy at a number of different resolutions.
These promising results suggest several future directions for further evaluation and improvements to the method. One key improvement would be to reduce the time required to apply the method, particularly in the radial-voting portion of our algorithm. This might be achieved by improving the efficiency of the voting scheme or by developing a GPU-based implementation. The method would also benefit from a more extensive evaluation, based on additional manually labeled datasets and comparison with additional competing algorithms. We expect that these further studies will help to make our method more practical as a tool for quantitative analysis of cellular microscopy imaging.
Acknowledgments
This work was supported by NIH grants to D.H.G. (1U01 MH105991–03) and to D.W.S. (R01 NS074980), and the California Institute for Regenerative Medicine (CIRM)-BSCRC Training Grant (TG2–01169) to L.T.U. Cortical tissue was collected from the UCLA CFAR (5P30 AI028697).
5. REFERENCES
- [1]. Harder Nathalie, Bodnar Megan, Eils Roland, Spector David L, and Rohr Karl, “3D segmentation and quantification of mouse embryonic stem cells in fluorescence microscopy images,” in 2011 IEEE International Symposium on Biomedical Imaging (ISBI). IEEE, 2011, pp. 216–219.
- [2]. Lin Gang, Adiga Umesh, Olson Kathy, Guzowski John F, Barnes Carol A, and Roysam Badrinath, “A hybrid 3D watershed algorithm incorporating gradient cues and object models for automatic segmentation of nuclei in confocal image stacks,” Cytometry Part A, vol. 56, no. 1, pp. 23–36, 2003.
- [3]. Loukas Constantinos G, Wilson George D, Vojnovic Borivoj, and Linney Alf, “An image analysis-based approach for automated counting of cancer cell nuclei in tissue sections,” Cytometry Part A, vol. 55, no. 1, pp. 30–42, 2003.
- [4]. Dufour Alexandre, Shinin Vasily, Tajbakhsh Shahragim, Guillén-Aghion Nancy, Olivo-Marin Jean-Christophe, and Zimmer Christophe, “Segmenting and tracking fluorescent cells in dynamic 3-D microscopy with coupled active surfaces,” IEEE Transactions on Image Processing, vol. 14, no. 9, pp. 1396–1410, 2005.
- [5]. Mathew Biena, Schmitz Alexander, Muñoz-Descalzo Silvia, Ansari Nariman, Pampaloni Francesco, Stelzer Ernst HK, and Fischer Sabine C, “Robust and automated three-dimensional segmentation of densely packed cell nuclei in different biological specimens with lines-of-sight decomposition,” BMC Bioinformatics, vol. 16, no. 1, pp. 1, 2015.
- [6]. Ronneberger Olaf, Fischer Philipp, and Brox Thomas, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
- [7]. Stegmaier Johannes, Amat Fernando, Lemon William C, McDole Katie, Wan Yinan, Teodoro George, Mikut Ralf, and Keller Philipp J, “Real-time three-dimensional cell segmentation in large-scale microscopy data of developing embryos,” Developmental Cell, vol. 36, no. 2, pp. 225–240, 2016.
- [8]. Yin Zhaozheng, Bise Ryoma, Chen Mei, and Kanade Takeo, “Cell segmentation in microscopy imagery using a bag of local Bayesian classifiers,” in 2010 IEEE International Symposium on Biomedical Imaging (ISBI). IEEE, 2010, pp. 125–128.
- [9]. Roysam B, Shain W, Robey E, Chen Y, Narayanaswamy A, Tsai CL, Al-Kofahi Y, Bjornsson C, Ladi Ena, and Herzmark Paul, “The FARSight project: associative 4D/5D image analysis methods for quantifying complex and dynamic biological microenvironments,” Microscopy and Microanalysis, vol. 14, no. S2, pp. 60, 2008.
- [10]. Carpenter Anne E, Jones Thouis R, Lamprecht Michael R, Clarke Colin, Kang In Han, Friman Ola, Guertin David A, Chang Joo Han, Lindquist Robert A, Moffat Jason, et al., “CellProfiler: image analysis software for identifying and quantifying cell phenotypes,” Genome Biology, vol. 7, no. 10, pp. R100, 2006.
- [11]. Parvin Bahram, Yang Qing, Han Ju, Chang Hang, Rydberg Bjorn, and Barcellos-Hoff Mary Helen, “Iterative voting for inference of structural saliency and characterization of subcellular events,” IEEE Transactions on Image Processing, vol. 16, no. 3, pp. 615–623, 2007.
- [12]. Han J, Chang H, Yang Q, Fontenay G, Groesser T, Barcellos-Hoff M Helen, and Parvin B, “Multiscale iterative voting for differential analysis of stress response for 2D and 3D cell culture models,” Journal of Microscopy, vol. 241, no. 3, pp. 315–326, 2011.
- [13]. Schnabel Ruwen, Wahl Roland, and Klein Reinhard, “Efficient RANSAC for point-cloud shape detection,” in Computer Graphics Forum. Wiley, 2007, vol. 26, pp. 214–226.
- [14]. Chung Kwanghun and Deisseroth Karl, “CLARITY for mapping the nervous system,” Nature Methods, vol. 10, no. 6, pp. 508–513, 2013.
- [15]. Schindelin Johannes, Arganda-Carreras Ignacio, Frise Erwin, Kaynig Verena, Longair Mark, Pietzsch Tobias, Preibisch Stephan, Rueden Curtis, Saalfeld Stephan, Schmid Benjamin, et al., “Fiji: an open-source platform for biological-image analysis,” Nature Methods, vol. 9, no. 7, pp. 676–682, 2012.
- [16]. Dice Lee R, “Measures of the amount of ecologic association between species,” Ecology, vol. 26, no. 3, pp. 297–302, 1945.


