Abstract
Detection of nuclei is an important step in phenotypic profiling of histology sections, which are usually imaged in bright field. However, nuclei can have multiple phenotypes, which are difficult to model. It is shown that convolutional neural networks (CNNs) can learn different phenotypic signatures for nuclear detection, and that performance improves with a feature-based representation of the original image. The feature-based representation uses the Laplacian of Gaussian (LoG) filter, which accentuates blob-shaped objects. Several combinations of input data representations are evaluated to show that the LoG representation improves detection of nuclei. In addition, the efficacy of the CNN for vesicular and hyperchromatic nuclei is evaluated. In particular, the frequency of detection of nuclei with the vesicular and apoptotic phenotypes is increased. The overall system has been evaluated against manually annotated nuclei, and the F-Scores for alternative representations are reported.
I. Introduction
Cellular organization is an important index for profiling diseased regions of microanatomy and histopathology. For example, the normal cellular organization is often lost as a result of rapid proliferation in malignant tissue. More specifically, the degree of cellularity is one of the indices for (i) diagnosis of Glioblastoma Multiforme (GBM) as a result of increased proliferation of glial cells, (ii) evaluating the efficacy of neoadjuvant chemotherapy of breast carcinoma [1], and (iii) grading prostate cancer based on the Gleason score [2]. Furthermore, cellularity is often heterogeneous, which is potentially the result of cellular plasticity for recruiting lymphocytes, promoting angiogenesis, and potential hypoxia. The goal of this paper is to develop validated computational tools for quantifying cellularity from a large cohort of H&E stained histology sections so that clinical relevance can be investigated, where quantification of cellularity depends on nuclear detection. However, a large cohort of H&E stained histology sections often suffers from technical and biological variations, for which a number of methods have been proposed in the context of nuclear segmentation [3], [4]. In this context, technical variations refer to variations in fixation and staining, and biological heterogeneity refers to the fact that no two patients are alike and that local and global patterns of diseased tissues vary widely.
There are many variations of the nuclear phenotypes, which provide insights into the cellular states. Often, detection of nuclei is limited to those with a hyperchromatic signature, which appear as a dark signal against the background. However, one of the main challenges is the detection of vesicular and necrotic phenotypes, which are difficult to model using procedural methods. Therefore, one of our goals has been to evaluate whether automatic feature learning can improve detection of these phenotypes. Our approach to automatic feature learning is based on convolutional neural networks (CNN) and is evaluated with alternative representations of the raw data. Another novelty of our study is the finding that human-engineered features improve nuclear detection using CNN. The human-engineered feature is based on the Laplacian of Gaussian (LoG) filter, which accentuates blob-like objects. In short, LoG filter responses provide an improved representation of the spatial landscape for training a CNN.
The organization of this paper is as follows: Section II reviews previous research. Section III describes the details of the proposed method. Section IV presents our preliminary experimental results and the performance of alternative architectures. Lastly, Section V concludes the paper.
II. Background
The topic of nuclear detection and segmentation has been explored widely [5], [6], [7], [8]. Traditionally, nuclear detection has relied on procedural models from the field of computer vision [9]. More recently, because of the popularity of deep learning, CNNs have been evaluated for the purpose of nuclear detection.
Various CNN configurations have been suggested for detection of nuclei in histology images. Xie et al. [10] suggested a deep voting method, a CNN-based approach that localizes nucleus centroids by assigning each input a voting confidence. Sirinukunwattana et al. [11] proposed a spatially constrained CNN for nuclei detection. They enforced a spatial constraint on the predicted likelihood of a pixel by assigning higher probability values to the pixels located in the vicinity of the nuclei centers.
Although CNNs have been applied to nuclear detection in histology sections using raw representation, CNNs have not been applied to nuclei with vesicular or necrotic phenotypes and the impact of engineered features has not been evaluated extensively. This paper examines various permutations of input representations (e.g., RGB, gray, engineered features) coupled with network architecture.
III. Proposed Method
Detection of nuclei can be accomplished by using a CNN as a classifier and applying it in a sliding window across the whole image. The result is a probability map that indicates, for each pixel, the probability of being the centroid of a nucleus. A CNN classifier consists of two parts: (i) the feature extraction part, which includes a few convolution layers followed by pooling layers and an activation function such as sigmoid, tanh, or ReLU; and (ii) the classification part, which consists of a few fully connected layers complemented by a loss function.
There are many permutations of the CNN architecture (e.g., in terms of convolution size, the size of the filter bank, activation, contrast enhancement); thus, several variations of the CNN architecture were designed and evaluated. We found that a CNN with two convolutional layers, 2 × 2 max-pooling, and ReLU as the activation function provided the best performance. This network is shown in Figure 1. The last stage has a LogSoftmax function that computes the probability of each pixel being the centroid of a nucleus. Table I lists the best CNN architecture found in our analysis.
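The sliding-window scheme can be sketched as follows. This is a minimal illustration rather than the implementation used in this study: `classify_patch` is a hypothetical stand-in for the trained CNN, and replicating edge pixels at the image border is an assumption, since border handling is not specified here.

```python
def sliding_window_probability_map(image, classify_patch, patch_size=51):
    """Return one centroid probability per pixel of a 2D image.

    image: 2D list of floats (rows x cols).
    classify_patch: hypothetical stand-in for the trained CNN; it takes a
        patch_size x patch_size patch and returns the probability that the
        patch center is a nucleus centroid.
    """
    half = patch_size // 2
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        # Assumption: replicate edge pixels outside the image.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return image[r][c]

    prob_map = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            patch = [[pixel(r + dr, c + dc) for dc in range(-half, half + 1)]
                     for dr in range(-half, half + 1)]
            prob_map[r][c] = classify_patch(patch)
    return prob_map


def toy_classifier(patch):
    """Toy stand-in: a darker center pixel gives a higher 'probability'."""
    center = patch[len(patch) // 2][len(patch) // 2]
    return 1.0 - center


image = [[1.0] * 5 for _ in range(5)]
image[2][2] = 0.0  # a single dark "nucleus" pixel
prob_map = sliding_window_probability_map(image, toy_classifier, patch_size=3)
```

In practice the per-pixel loop would be replaced by batched evaluation of the network, but the mapping from patches to a probability map is the same.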
TABLE I.
Layer | Type | Input/Output Dimensions | Filter Dimensions |
---|---|---|---|
0 | Input | 51 × 51 × 1 | |
1 | Conv | 28 × 28 × 256 | 24 × 24 × 1 × 256 |
2 | Max-Pooling | 14 × 14 × 256 | 2 × 2 |
3 | Conv | 10 × 10 × 128 | 5 × 5 × 256 × 128 |
4 | Max-Pooling | 5 × 5 × 128 | 2 × 2 |
5 | Full | 1 × 2 | - |
6 | LogSoftmax | - | - |
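The spatial dimensions in Table I can be checked with a few lines of arithmetic. The sketch below assumes stride-1 convolutions without padding and non-overlapping pooling; these settings are not stated explicitly, but they reproduce the dimensions listed in the table.

```python
def conv_valid(size, kernel):
    """Spatial output size of a stride-1 convolution without padding."""
    return size - kernel + 1


def max_pool(size, window):
    """Spatial output size of non-overlapping max-pooling."""
    return size // window


size = 51
trace = [("Input", size)]
size = conv_valid(size, 24)
trace.append(("Conv 24x24", size))
size = max_pool(size, 2)
trace.append(("Max-Pooling 2x2", size))
size = conv_valid(size, 5)
trace.append(("Conv 5x5", size))
size = max_pool(size, 2)
trace.append(("Max-Pooling 2x2", size))

# Number of features passed to the fully connected layer (128 channels).
flattened = size * size * 128

for name, spatial in trace:
    print(f"{name}: {spatial} x {spatial}")
print("Inputs to fully connected layer:", flattened)
```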
Gray or RGB normalized images have been used widely as inputs to deep networks; however, nuclei detection can benefit from engineered features that accentuate the blob-like shape of nuclei. One of the most promising filters for blob detection is the LoG, which is evaluated here as an alternative to the raw grayscale image. Nevertheless, there are several permutations of the input representations. For example, one can apply the LoG filter to the gray-level representation of the original image or to the nuclear channel following color decomposition.
In order to separate the nuclear channel of a color histology image, color decomposition is required. Usually, color decomposition requires estimation of the stain matrix, which indicates the relative proportions of red, green, and blue in each stain channel. One method to estimate the stain matrix is based on the singular value decomposition and was proposed by Macenko et al. [12]; it is publicly available and has been evaluated in our study.
With respect to training of the CNN, there are two dominant strategies for patch selection: random selection from the image, or selection of nuclei-centered patches. In the former, data augmentation is less important because random selection can intrinsically increase the sample size. In the latter case, data augmentation is highly desirable and necessary. Strategies for data augmentation include affine transforms, perturbations by manipulating the basis functions, and elastic deformation. Our analysis revealed that the policy of random selection provided a more diverse signature and is more effective than nuclei-centered patches.
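As an illustration of how the LoG accentuates blob-like objects, the sketch below builds a discrete LoG kernel from its closed form and applies it to a synthetic bright blob. The kernel scale, the truncation radius, and the synthetic image are assumptions chosen for illustration and are not the settings used in this study.

```python
import math


def log_kernel(sigma, radius=None):
    """Discrete Laplacian of Gaussian kernel, shifted to sum to zero."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # assumed truncation at 3 sigma
    s2 = sigma * sigma
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            value = (-1.0 / (math.pi * s2 * s2)) \
                * (1.0 - r2 / (2.0 * s2)) * math.exp(-r2 / (2.0 * s2))
            row.append(value)
        kernel.append(row)
    # Subtract the mean so that flat regions produce zero response.
    count = (2 * radius + 1) ** 2
    mean = sum(sum(row) for row in kernel) / count
    return [[v - mean for v in row] for row in kernel]


def filter2d_valid(image, kernel):
    """Correlate image with kernel over positions where the kernel fits."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(k) for j in range(k))
             for c in range(out_cols)]
            for r in range(out_rows)]


# Synthetic 21 x 21 image with one bright Gaussian blob at the center.
size, center = 21, 10
image = [[math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2 * 2.0 ** 2))
          for x in range(size)] for y in range(size)]

kernel = log_kernel(sigma=2.0)            # 13 x 13 kernel
response = filter2d_valid(image, kernel)  # 9 x 9 response
# With this sign convention bright blobs give negative responses,
# so negate to obtain a map that peaks at blob centers.
blobness = [[-v for v in row] for row in response]
peak_value, peak_row, peak_col = max(
    (value, r, c) for r, row in enumerate(blobness)
    for c, value in enumerate(row))
```

The peak of `blobness` falls at the center of the response map, which corresponds to the blob center in the input. For dark nuclei in a grayscale image the sign of the response is reversed, so the negation is not needed.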
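Once a stain matrix is available, the decomposition itself reduces to a small least-squares problem in optical density space. The sketch below illustrates only this unmixing step. The hematoxylin and eosin vectors are placeholder values of the kind commonly quoted for H&E, not estimates obtained with the method of [12], and the function names are hypothetical.

```python
import math

# Assumed placeholder stain vectors in (R, G, B) optical density space;
# in this study the stain matrix is estimated per image instead.
HEMATOXYLIN = (0.65, 0.70, 0.29)
EOSIN = (0.07, 0.99, 0.11)


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return tuple(x / norm for x in vector)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def optical_density(rgb, background=255.0):
    """Beer-Lambert transform; intensities are clamped to avoid log(0)."""
    return tuple(-math.log10(max(ch, 1.0) / background) for ch in rgb)


def unmix(rgb, stain_a=HEMATOXYLIN, stain_b=EOSIN):
    """Least-squares concentrations of two stains for one RGB pixel."""
    a, b = normalize(stain_a), normalize(stain_b)
    od = optical_density(rgb)
    # Solve the 2 x 2 normal equations of the least-squares problem.
    aa, bb, ab = dot(a, a), dot(b, b), dot(a, b)
    a_od, b_od = dot(a, od), dot(b, od)
    det = aa * bb - ab * ab
    conc_a = (bb * a_od - ab * b_od) / det
    conc_b = (aa * b_od - ab * a_od) / det
    return conc_a, conc_b


# Synthetic pixel built from known concentrations (0.8 and 0.3).
a_unit, b_unit = normalize(HEMATOXYLIN), normalize(EOSIN)
od_true = [0.8 * x + 0.3 * y for x, y in zip(a_unit, b_unit)]
pixel = tuple(255.0 * 10 ** (-v) for v in od_true)
nuclear, eosin = unmix(pixel)
```

Applying `unmix` to every pixel and keeping the first concentration yields a nuclear channel image of the kind used as CNN input.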
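Random patch selection can be sketched as follows. The labeling rule, in which a patch is positive when its center lies within a few pixels of an annotated centroid, and the `tolerance` value are assumptions made for illustration; the exact rule is not specified here.

```python
import random


def sample_random_patches(image_shape, centroids, num_patches,
                          patch_size=51, tolerance=3, seed=0):
    """Draw random patch centers and label them against annotations.

    image_shape: (rows, cols) of the section.
    centroids: list of (row, col) annotated nucleus centroids.
    tolerance: hypothetical radius in pixels within which a patch center
        counts as a positive sample.
    Returns a list of (row, col, label) tuples with label 1 or 0.
    """
    rng = random.Random(seed)
    half = patch_size // 2
    rows, cols = image_shape
    samples = []
    for _ in range(num_patches):
        # Keep the whole patch inside the image.
        r = rng.randint(half, rows - half - 1)
        c = rng.randint(half, cols - half - 1)
        is_nucleus = any((r - cr) ** 2 + (c - cc) ** 2 <= tolerance ** 2
                         for cr, cc in centroids)
        samples.append((r, c, 1 if is_nucleus else 0))
    return samples


samples = sample_random_patches((1000, 1000), [(500, 500), (120, 640)],
                                num_patches=256)
```

Because every draw produces a different crop, repeated sampling enlarges the effective training set without explicit augmentation.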
IV. Experiments
In order to evaluate the proposed concept, several configurations are implemented and their performance is quantified. The validation dataset consists of 29 histology sections of size 1k × 1k, which include 21 brain and 8 breast images. These images have been hand segmented, totaling 13,766 nuclei. Images were divided equally between training and testing samples (i.e., 50–50). The implementation of the color decomposition method has been borrowed from the stain normalization toolbox [13]. Gradient descent with a batch size of 256 is used for optimization with backpropagation. The learning rate is set at $10^{-5}$ and the learning rate decay is set at $10^{-7}$. L1 and L2 regularizations were applied with weights of 0.001 and 0.01, respectively. Since proper initialization is critical for deep networks, the weights and biases are initialized using the method proposed in [14]. Accordingly, the biases are initialized to zero and the weights in each layer are initialized with a uniform distribution as follows:
$$W_{ij} \sim U\left[-\frac{1}{\sqrt{n}},\ \frac{1}{\sqrt{n}}\right] \tag{1}$$
Here, $W_{ij}$ denotes the weights of a layer, $U$ denotes the uniform distribution, and $n$ is the size of the previous layer. The input samples have been scaled to have zero mean and to lie in the range [−1, 1]. Since some of the nuclei may be detected more than once, the percentage of over-detected nuclei is also reported. The nuclei detection accuracy of the various approaches is calculated based on precision, recall, and F-Score as follows:
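$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$\text{F-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
where $TP$, $FP$, and $FN$ are the numbers of true positive, false positive, and false negative detections, respectively.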
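The three scores can be computed from detection counts as sketched below, assuming the standard definitions in terms of true positives, false positives, and false negatives. The counts in the example are hypothetical; the final line recomputes the F-Score from the precision and recall reported for the Nuclear channel+CNN row of Table II.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F-Score) from detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall, f_score(precision, recall)


# Hypothetical counts, for illustration only.
precision, recall, f = detection_scores(true_positives=70,
                                        false_positives=30,
                                        false_negatives=30)

# Precision and recall from the Nuclear channel+CNN row of Table II.
nuclear_channel_f = f_score(0.7151, 0.6301)
```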
Table II shows the recall, precision, F-Score, and percentage of over-detected nuclei for the different representations discussed in this study for detection of nuclei in histology sections. The results indicate that the LoG representation of the nuclear channel has superior performance. In addition, the use of the nuclear channel, following color decomposition, improves performance over the RGB representation. Detection results for two samples with hyperchromatic and vesicular nuclear phenotypes are shown in Figures 2 and 3, respectively. Figure 2 shows a sample with the hyperchromatic nuclear phenotype and the detection results of the CNN for different input representations: RGB, the nuclear channel following color decomposition, and the LoG of the nuclear channel. Figure 3 shows a sample with the vesicular nuclear phenotype together with the CNN detection results. These results indicate qualitatively that the LoG response of the nuclear channel, following color decomposition, performs well for detection of hyperchromatic and vesicular phenotypes. Figure 4 shows a subset of the learned filters of the first layer of the CNN. These filters encode the different shapes, sizes, and phenotypes that appear in the dataset. Finally, we applied bootstrapping, i.e., retraining the network with misclassified samples, which improved the F-Score by a further 4%.
TABLE II.
Method | Recall | Precision | F-Score | Over-detected |
---|---|---|---|---|
LoG of nuclear channel+CNN | 0.6978 | 0.7433 | 0.7222 | 0.0805 |
Nuclear channel+CNN | 0.6301 | 0.7151 | 0.6699 | 0.1158 |
RGB+CNN | 0.3836 | 0.8894 | 0.5361 | 0.1586 |
V. Conclusion
Experiments in this paper indicate that nuclei detection can be improved by training a CNN on the LoG representation following color decomposition. The LoG filter tends to accentuate the underlying spatial distribution of the nuclei regions and to perform a rudimentary initial detection. Furthermore, one of the major challenges for nuclear detection has been the vesicular phenotype, which can be biological or caused by poor sample preparation. The proposed model has significantly improved detection of this class of phenotypes. These observations suggest that engineered features and color decomposition are important for improving the performance of nuclear detection using CNN.
Acknowledgments
This work was supported, in part, by a grant from the NIH under the award number R01CA140663.
References
- 1. Rajan R, Poniecka A, Smith TL, Yang Y, Frye D, Pusztai L, Fiterman DJ, Gal-Gombos E, Whitman G, Rousier R, Green M, Kuerer H, Buzdar AU, Hortobagyi GN. Change in tumor cellularity of breast carcinoma after neoadjuvant chemotherapy as a variable in the pathologic assessment of response. Cancer. 2004 Apr;100(7):1365–1373. doi: 10.1002/cncr.20134.
- 2. Lawrence EM, Warren AY, Priest AN, Barret T, Goldman DA, Gill AB, Gnanapragasam VJ, Sala E, Gallagher FA. Evaluating prostate cancer using fractional tissue composition of radical prostatectomy specimens and pre-operative diffusional kurtosis magnetic resonance imaging. PLoS ONE. 2016;11(7):e0159652. doi: 10.1371/journal.pone.0159652.
- 3. Chang H, Han J, Borowsky A, Loss L, Gray JW, Spellman PT, Parvin B. Invariant delineation of nuclear architecture in glioblastoma multiforme for clinical and molecular association. IEEE Trans Med Imaging. 2013 Apr;32(4):670–682. doi: 10.1109/TMI.2012.2231420.
- 4. Kothari S, Phan JH, Moffitt RA, Strokes TH, Hassberger SE, Chaudry Q, Young AN, Wang MD. Automatic batch-invariant color segmentation of histological cancer images. Proc IEEE Int Symp Biomed Imaging. 2011;2011:657–660. doi: 10.1109/ISBI.2011.5872492.
- 5. Xing F, Yang L. Robust nucleus/cell detection and segmentation in digital pathology and microscopy images: A comprehensive review. IEEE Reviews in Biomedical Engineering. 2016;9:234–263. doi: 10.1109/RBME.2016.2515127.
- 6. Kothari S, Phan JH, Osunkoya AO, Wang MD. Biological interpretation of morphological patterns in histopathological whole slide images. 2012. doi: 10.1145/2382936.2382964.
- 7. Latson L, Sebek N, Powell K. Automated cell nuclear segmentation in color images of hematoxylin and eosin-stained breast biopsy. Analytical and Quantitative Cytology and Histology. 2003;26(6):321–331.
- 8. Al-Kofahi Y, Lassoued W, Lee W, Roysam B. Improved automatic detection and segmentation of cell nuclei in histopathology images. IEEE Trans Biomed Eng. 2010;57(4):841–852. doi: 10.1109/TBME.2009.2035102.
- 9. Parvin B, Yang Q, Han J, Chang H, Rydberg B, Barcellos-Hoff MH. Iterative voting for inference of structural saliency and characterization of subcellular events. IEEE Trans on Imag Proc. 2007;16:615–623. doi: 10.1109/tip.2007.891154.
- 10. Xie Y, Kong X, Xing F, Liu F, Su H, Yang L. Deep voting: A robust approach toward nucleus localization in microscopy images. Medical Image Computing and Computer-Assisted Intervention-MICCAI. New York: Springer; 2015. pp. 374–382.
- 11. Sirinukunwattana K, Raza S, Tsang Y, Snead D, Cree IA, Rajpoot NM. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Transactions on Medical Imaging. 2016 May;35 doi: 10.1109/TMI.2016.2525803.
- 12. Macenko M, Neithammer M, Marron JS, Borland D, Woosley JT, Guan X, Schmitt C, Thomas NE. A method for normalizing histology slides for quantitative analysis. IEEE International Symposium on Biomedical Imaging: From Nano to Macro; Boston, MA. June 2009. pp. 1107–1110.
- 13. Rajpoot Nasir. Stain normalization toolbox. Jan, 2015.
- 14. Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS); May, 2010. pp. 249–256.