Abstract
Context:
According to the Nottingham grading system, mitosis count plays a critical role in cancer diagnosis and grading. Manual counting of mitosis is tedious and subject to considerable inter- and intra-reader variation.
Aims:
The aim is to improve the accuracy of mitosis detection by selecting the color channels that better capture the statistical and morphological features that distinguish mitosis from other objects.
Materials and Methods:
We propose a framework that includes a comprehensive analysis of statistical and morphological features in selected channels of various color spaces to assist pathologists in mitosis detection. In the candidate detection phase, we apply Laplacian of Gaussian filtering, thresholding, morphological processing and an active contour model to the blue-ratio image to detect and segment candidates. In the candidate classification phase, we extract a total of 143 features, including morphological, first order and second order (texture) statistical features, for each candidate in the selected channels and finally classify the candidates using a decision tree classifier.
Results and Discussion:
The proposed method has been evaluated on the Mitosis Detection in Breast Cancer Histological Images (MITOS) dataset provided for the International Conference on Pattern Recognition 2012 contest and achieved 74% and 71% detection rate, 70% and 56% precision and 72% and 63% F-Measure on Aperio and Hamamatsu images, respectively.
Conclusions and Future Work:
The proposed multi-channel feature computation scheme uses a fixed image scale and extracts nuclei features in selected channels of various color spaces. This simple but robust model proved to be highly efficient in capturing multi-channel statistical features for mitosis detection during the MITOS international benchmark. Indeed, mitosis detection, which is of critical importance in cancer diagnosis, is a very challenging visual task. In future work, we plan to use color deconvolution as preprocessing and Hough transform or local extrema based candidate detection in order to reduce the number of candidates in the mitosis and non-mitosis classes.
Keywords: Breast cancer grading, classification, feature computation, histopathology, mitosis detection, nuclei detection, texture analysis
INTRODUCTION
Quantitative and qualitative assessment of biological objects in histopathological images plays a key role in breast cancer prognosis. According to the Nottingham Grading System,[1] an international grading system for breast cancer recommended by the World Health Organization, mitosis count is one of the main factors in breast cancer grading. Indeed, the mitotic count provides clues for estimating the proliferation and aggressiveness of the tumor[2] and is a critical step in the histological grading of several types of cancer. In clinical practice, pathologists count mitosis after a tedious microscopic examination of hematoxylin and eosin (H and E) stained slides at high magnification, usually ×40. The area visible in the microscope under a ×40 magnification lens is called a high power field (HPF). This mitotic counting process is cumbersome and often subject to sampling bias because of the very large size of histological images. This results in considerable inter- and intra-reader variation, of up to 20% between central and institutional reviewers, in tumor prognosis.[3]
In histopathological image analysis, mitosis detection is a difficult task that must cope with several challenges, such as irregularly shaped objects, artifacts and unwanted objects introduced by slide preparation and acquisition. Mitosis has four main phases (prophase, metaphase, anaphase and telophase), and each phase has a different shape and texture. Artifacts also produce objects that look similar to mitosis. As a result, there is no simple way to detect mitosis based on shape and pixel values. However, the major problem is the very low density of mitosis in a single HPF; it is not unusual to have an HPF without any mitosis.
Several approaches have been reported in the literature; some of their limitations are discussed below and addressed in our approach. Sertel et al. proposed a computer-aided system using pixel-level likelihood functions and two-step component-based thresholding for mitosis counting in digitized images of neuroblastoma tissue slides, which achieved an 81% detection rate and a 12% false positive (FP) rate.[4] A fuzzy c-means clustering algorithm, along with the ultra-erosion operation in the Commission Internationale de l’Eclairage (CIE) Lab color space (L = luminance, a = red-green axis, and b = blue-yellow axis), was used to detect proliferative nuclei and the mitosis index in immunohistochemistry images of meningioma.[5] Roullier et al. developed a multi-resolution unsupervised clustering method driven by domain-specific knowledge that resulted in more than 70% sensitivity and 80% specificity.[6] Weyn et al. performed a limited study that explored the ability of wavelet, Haralick, and densitometric features to distinguish nuclei from low, intermediate and high grade breast cancer tissue.[7] The diagnostic importance of nuclei texture has been widely studied,[7–9] yet recent work exploring nuclei classification of meningioma subtypes in H and E images via analysis of nuclear texture has been limited to one color channel.[8] Irshad et al. proposed a texture-feature-based framework using the red green blue (RGB) color space[9] that resulted in a 76% F-measure on the MITOS database.[10] Haralick features have previously been used in both nuclear and global textural analysis for classification of tumor grade in numerous cancers.[8,9]
In this study, we address some of the shortcomings of previous works through (1) a comprehensive analysis of second order statistical features, such as Haralick features and run length (RL) features, in various color channels of different color spaces rather than in a single color space,[4–7] and (2) the combination of selected statistical features with morphological features in order to distinguish mitosis from other nuclei. The main novel contributions of this work are: (1) a robust multi-channel statistical feature computation for segmented nuclei in various color spaces and (2) nuclei features describing both nuclear morphology and texture that are able to quantify the difference between mitosis and non-mitosis nuclei. The rest of the paper is organized as follows. Section 2 describes the proposed framework for mitosis detection. Experimental results are presented in section 3. Finally, the concluding remarks with future work are given in section 4.
MATERIALS AND METHODS
We propose a strategy that combines multi-channel statistical and morphological features for mitosis detection in H and E images. The aim is to improve the accuracy of mitosis detection by selecting the color channels that capture the features that discriminate mitosis from other objects. The proposed method involves three main stages, as shown in Figure 1.
Color Channels Selection
Initially, we convert the RGB images into other color spaces, namely hue saturation value (HSV) (more intuitive for human perception) and Lab and Luv (uniform color separation), and investigate which color channels better capture the pixel and texture information and discriminate the mitosis regions from other nuclei and the background. Based on histogram analysis of the mitosis regions and the background in all channels of the RGB, HSV, Lab and Luv color spaces, the selected channels are red (RGB), blue (RGB), V (HSV), L (Lab) and L (Luv).
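The histogram comparison behind this channel selection can be illustrated with a small sketch. The text does not state how the histograms were compared, so the histogram-intersection score, the bin count and the pixel values below are assumptions chosen purely for illustration. Only the V (HSV) channel is shown, because the Lab and Luv conversions need more than the Python standard library provides.

```python
import colorsys

def value_channel(r, g, b):
    """V channel of HSV for one 8-bit RGB pixel, rescaled to 0..255."""
    _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(v * 255)

def histogram(values, bins=16, top=256):
    """Normalized histogram of intensities in the range 0..top-1."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v) * bins // top, bins - 1)] += 1
    n = float(len(values))
    return [c / n for c in counts]

def overlap(hist_a, hist_b):
    """Histogram intersection: 0 = fully separated, 1 = identical."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))

# Hypothetical RGB samples (not taken from the dataset).
mitosis_pixels = [(60, 30, 110), (70, 40, 140), (55, 35, 120)]
background_pixels = [(230, 150, 190), (240, 200, 220), (225, 160, 200)]

mitosis_v = [value_channel(*p) for p in mitosis_pixels]
background_v = [value_channel(*p) for p in background_pixels]
score = overlap(histogram(mitosis_v), histogram(background_v))
```

Under this assumed criterion, a channel with a low overlap score separates mitosis pixels from the background better than a channel with a high one.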
Candidate Detection and Segmentation
In H and E stained images, nuclear and cytoplasmic regions appear as hues of blue and purple, while extracellular material has hues of pink. To reduce the complexity of integrating Laplacian of Gaussian (LoG) responses, the RGB images are transformed into a new image, called the blue ratio (BR) image, that accentuates the nuclear dye:[11]
BR = (100 × B) / (1 + R + G) × 256 / (1 + B + R + G)
where B, R and G are the blue, red and green channels of RGB, respectively. On the BR image, we compute LoG responses, which discriminate the nuclei regions from the background and hence assist in the detection of mitosis candidates. Then, we perform binary thresholding followed by morphological processing to eliminate regions that are too small and to fill holes, and we refine the candidate boundaries using an active contour model. Finally, we select candidates by filtering on candidate size.
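The blue-ratio transform can be sketched in a few lines. This is a minimal illustration that assumes the blue-ratio definition used in the nuclear segmentation literature cited as [11], BR = 100 × B / (1 + R + G) × 256 / (1 + B + R + G), with 8-bit channel intensities. The function names and the two sample pixels are hypothetical, and the LoG, thresholding and active contour steps are not shown.

```python
def blue_ratio(r, g, b):
    """Blue-ratio value of one RGB pixel (assumed 8-bit intensities)."""
    return (100.0 * b / (1.0 + r + g)) * (256.0 / (1.0 + b + r + g))

def blue_ratio_image(rgb_rows):
    """Apply the transform to an image given as rows of (R, G, B) tuples."""
    return [[blue_ratio(r, g, b) for (r, g, b) in row] for row in rgb_rows]

# Hypothetical pixels: a hematoxylin-like purple and an eosin-like pink.
nucleus_like = (70, 40, 140)
stroma_like = (230, 150, 190)

br_nucleus = blue_ratio(*nucleus_like)
br_stroma = blue_ratio(*stroma_like)
```

The first factor rewards blue relative to red and green, and the second penalizes bright pixels, so dark blue nuclei receive much higher BR values than pink or white regions.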
Feature Computation and Classification
For each candidate, we extract two sets of quantitative image features: morphological and statistical. The five morphological features, computed from the mask of each segmented candidate, are area, roundness, elongation, perimeter and equivalent spherical perimeter. These morphological features reflect the phenotypic information of mitotic nuclei. Using the pixel intensities of the five selected color channels and of the BR image, we extract five first order statistical features (mean, median, variance, kurtosis and skewness) for each segmented candidate, which gives 30 first order statistical features per candidate. Using the mask from candidate segmentation, Haralick co-occurrence (HC)[12] and RL[13] matrices are computed with a displacement vector of 1 in four directions (0°, 45°, 90°, 135°) for all the selected channels. To make these texture features approximately rotation invariant, each feature is averaged over the four directions. The eight HC features and 10 RL features computed in this way are given in Table 1, where g (i, j) is the element in cell (i, j) of a normalized HC or RL matrix, and μt and σt are the mean and standard deviation of the row and column sums, respectively. This computation is repeated for each of the six images (the five selected channels and the BR image), resulting in 48 HC features and 60 RL features for each candidate.
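The direction-averaged co-occurrence computation can be sketched as follows. This is a simplified illustration rather than the implementation used in this work: it assumes that the intensities have already been quantized to integer grey levels in the range 0 to levels - 1, it counts each pixel pair symmetrically, and it shows only the energy feature. All function names are hypothetical.

```python
# Pixel offsets (row, col) for a displacement of 1 at 0, 45, 90 and 135 degrees.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]

def cooccurrence(image, mask, offset, levels):
    """Normalized, symmetric co-occurrence matrix restricted to a mask."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if not (0 <= r2 < rows and 0 <= c2 < cols):
                continue
            # Only pairs with both pixels inside the candidate contribute.
            if mask[r][c] and mask[r2][c2]:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both orders
    total = sum(map(sum, counts))
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]

def energy(g):
    """Sum of squared entries of a normalized co-occurrence matrix."""
    return sum(v * v for row in g for v in row)

def direction_averaged(feature, image, mask, levels):
    """Average one texture feature over the four directions."""
    values = [feature(cooccurrence(image, mask, off, levels))
              for off in OFFSETS]
    return sum(values) / len(values)

# Hypothetical 4 x 4 candidate with two grey levels and a full mask.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
full_mask = [[True] * 4 for _ in range(4)]
avg_energy = direction_averaged(energy, checker, full_mask, levels=2)
```

Averaging over the four directions reduces the dependence of the feature on the orientation of the nucleus, which is the purpose of the averaging step described above.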
Table 1.
Conceptually, a large number of descriptive features is highly desirable for classifying nuclei as mitosis or non-mitosis. However, we obtain poor classification results when using all 143 extracted features (5 morphological, 30 first order, 48 HC and 60 RL features). Removing irrelevant and redundant features from the data improves both the classification accuracy and the computational cost. We therefore use the consistency subset evaluation method[14] to select a subset of features that maximizes the consistency in the class values. The worth of a subset of features is evaluated by the level of consistency in the class values when the training instances are projected onto that subset. The search retains only subsets whose consistency is not less than that of the full set of features. Finally, we use this evaluator in conjunction with a hill climbing search augmented with backtracking (backtracking value 5), which looks for the smallest subset with consistency equal to that of the full set of features. This procedure achieves an 86% reduction in the dimensionality of the feature set by selecting only 20 features. The selected features comprise one morphological feature (equivalent spherical perimeter), eight first order statistical features (median in the BR image and in the blue and L (Lab) channels, variance in the BR image and the blue channel, kurtosis in the red and blue channels, and skewness in the blue channel), five HC features (energy in the BR image, difference moment in the red and blue channels, cluster shade and hara-correlation in the V (HSV) channel) and six RL features (high grey level runs emphasis (HGRE) in the BR image and the red channel, low grey level runs emphasis (LGRE) in the red, blue and L (Lab) channels and long run low grey level emphasis (LRLGE) in the L (Luv) channel). The selected feature set is used to train a decision trees (DT) classifier with maximum depth = 10 and number of trees = 10.
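The consistency criterion can be illustrated with a short sketch of the inconsistency rate in the spirit of reference [14]. The function name, the toy data and the assumption that the feature values are already discretized are ours. Training instances that share the same values on the chosen subset but carry different class labels count as inconsistent, except for those in the majority class of their group.

```python
from collections import Counter, defaultdict

def inconsistency_rate(rows, labels, subset):
    """Fraction of instances that clash with the majority class of their pattern.

    rows   : list of tuples of discretized feature values
    labels : class label of each row
    subset : indices of the features kept in the projection
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(rows)

# Hypothetical discretized candidates with three features each.
rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]
labels = ["mitosis", "mitosis", "non-mitosis", "non-mitosis"]

full_rate = inconsistency_rate(rows, labels, (0, 1, 2))
first_only = inconsistency_rate(rows, labels, (0,))
third_only = inconsistency_rate(rows, labels, (2,))
```

In this toy example the first feature alone is as consistent as the full set, so a search for the smallest equally consistent subset would keep it and discard the third feature, which separates nothing.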
RESULTS AND DISCUSSION
We evaluate the proposed framework on the MITOS dataset,[10] a freely available mitosis dataset. A total of five H and E stained breast cancer biopsy cores from five patients were scanned using high-resolution whole slide scanners (Aperio and Hamamatsu systems) at ×40 optical magnification in the Pitié-Salpêtrière Hospital in Paris, France. In each slide, the pathologists selected ten HPFs, giving a total of 50 HPF images from the five whole slides. An HPF has a size of 512 μm × 512 μm (that is, an area of 0.262 mm²), which is equivalent to a microscope field diameter of 0.58 mm. Each HPF has a digital resolution of 2084 × 2084 pixels. Two senior breast cancer oncologists from the Pitié-Salpêtrière Hospital provided the ground truth for the spatial presence of mitotic nuclei. These 50 HPFs contain a total of 326 mitoses. The training and testing sets consist of 35 and 15 HPFs containing 226 and 100 mitoses, respectively. We applied the candidate detection stage to all the training set images (Aperio and Hamamatsu) and labeled as non-mitosis every detected candidate that did not correspond to a ground-truth mitosis. When this dataset was used directly for training, most classifiers were biased toward the non-mitosis class, which resulted in a high number of FPs. We therefore applied the synthetic minority over-sampling technique[15] to the training dataset to increase the number of samples in the mitosis class. In addition, down-sampling was applied to the non-mitosis class, which resulted in a 30% reduction. Table 2 gives the numbers of mitosis and non-mitosis samples in the Aperio and Hamamatsu training sets before and after sampling.
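The core idea of the over-sampling step can be sketched as follows: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors.[15] This is a simplified illustration, not the implementation used in this work. The neighbor count, the seed and the toy feature vectors are assumptions, and the function needs at least two minority samples.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples by interpolating between neighbors.

    minority : list of feature tuples from the minority (mitosis) class
    k        : number of nearest minority neighbors to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbors of x (Euclidean distance).
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical two-feature mitosis samples.
mitosis_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
extra = smote(mitosis_features, n_new=10, k=2)
```

Because the new samples are interpolated rather than duplicated, the classifier sees a broader mitosis region in feature space instead of the same points repeated.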
Table 2.
On the Aperio and Hamamatsu testing sets, the candidate detection stage detects 6238 and 5720 candidates, containing 88 and 81 of the 100 ground-truth mitoses, respectively. The candidate sets therefore include 6150 and 5639 non-mitosis in the Aperio and Hamamatsu testing sets, respectively. The candidate detection stage thus generates a large number of non-mitosis candidates and misses 12 and 19 ground-truth mitoses in the Aperio and Hamamatsu testing sets, respectively.
In candidate classification stage, we compare the results of these classification methods with ground-truth information provided along with the dataset. The metrics used to evaluate the mitosis detection of each method include: Number of true positives (TP), number of FPs, number of false negatives (FN), sensitivity or true positive rate, precision or positive predictive value and F-Measure. We evaluate the ability of our multi-channel statistical and morphological features based mitosis detection using DT classifier on Aperio and Hamamatsu images. A comparison of their classification results is presented in Table 3.
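The evaluation metrics listed above follow the usual definitions, sketched here for reference. The function names are ours, and the counts in the first example are hypothetical. The second example recomputes the F-Measure from the detection rate and precision reported in the abstract.

```python
def detection_metrics(tp, fp, fn):
    """Sensitivity (detection rate), precision and F-Measure from counts.

    Assumes tp + fn > 0 and tp + fp > 0.
    """
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, precision, f

def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)

# Hypothetical counts, for illustration only.
sens, prec, f = detection_metrics(tp=8, fp=2, fn=2)

# Reported Aperio and Hamamatsu precision and detection rate.
f_aperio = f_measure(0.70, 0.74)
f_hamamatsu = f_measure(0.56, 0.71)
```

Rounded to two decimals, the recomputed values agree with the reported F-Measures of 72% and 63%.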
Table 3.
When we use all features, both statistical and morphological, with the DT classifier, we obtain very few FPs but also few TPs. Compared with the Hamamatsu testing set, the experiments on the Aperio testing set give better performance in terms of both TP and FP. Although this paper focuses on combining statistical features from different color channels, the ability to integrate the morphology of nuclei with selected statistical features in various color channels is also important for mitosis classification. By selecting features from the set of statistical and morphological features, we improve the classification accuracy and reduce the number of FPs. With morphological and statistical features in selected channels, we achieve a higher mitosis detection rate and F-Measure on Aperio images than on Hamamatsu images. Figure 2 shows examples of detected (green circle), undetected (blue circle) and mistakenly detected (yellow circle) mitosis using selected statistical and morphological features with the DT classifier.
CONCLUSION AND FUTURE WORK
An automated mitosis detection framework for H and E images, based on multi-channel statistical features combined with morphological features, has been proposed. The candidate detection stage detects candidate mitosis using thresholding and morphological processing on the BR image. Specifically, our multi-channel feature computation scheme uses a fixed image scale and extracts nuclei features in different color channels, which makes it a highly efficient model for capturing texture features for nuclei classification. In future work, we plan to use color deconvolution as pre-processing and Hough transform or local extrema based candidate detection in order to reduce the number of candidates in the mitosis and non-mitosis classes. We also plan to investigate other model-based feature computation methods for mitosis detection.
ACKNOWLEDGMENT
This work is supported by the French National Research Agency ANR, Project Cognitive virtual Microscopy (MICO) under reference ANR-10-TECS-015. We also acknowledge the Pathology Department of La Pitié-Salpêtrière hospital, led by Professor Frédérique Capron, for the knowledge provided and the data preparation during this project.
Footnotes
Available FREE in open access from: http://www.jpathinformatics.org/text.asp?2013/4/1/10/112695
REFERENCES
- 1. Bloom HJ, Richardson WW. Histological grading and prognosis in breast cancer: A study of 1409 cases of which 359 have been followed for 15 years. Br J Cancer. 1957;11:359–77. doi: 10.1038/bjc.1957.43.
- 2. Elston CW, Ellis IO. Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: Experience from a large study with long-term follow-up. Histopathology. 1991;19:403–10. doi: 10.1111/j.1365-2559.1991.tb00229.x.
- 3. Teot LA, Sposto R, Khayat A, Qualman S, Reaman G, Parham D, et al. The problems and promise of central pathology review: Development of a standardized procedure for the children’s oncology group. Pediatr Dev Pathol. 2007;10:199–207. doi: 10.2350/06-06-0121.1.
- 4. Sertel O, Catalyurek UV, Shimada H, Gurcan MN. Computer-aided prognosis of neuroblastoma: Detection of mitosis and karyorrhexis cells in digitized histological images. Conf Proc IEEE Eng Med Biol Soc. 2009:1433–6. doi: 10.1109/IEMBS.2009.5332910.
- 5. Anari V, Mahzouni P, Amirfattahi R. Computer-aided detection of proliferative cells and mitosis index in immunohistochemically images of meningioma. Machine vision and image processing (MVIP) 2010;6th Iranian:1–5.
- 6. Roullier V, Lézoray O, Ta VT, Elmoataz A. Multi-resolution graph-based analysis of histopathological whole slide images: Application to mitotic cell extraction and visualization. Comput Med Imaging Graph. 2011;35:603–15. doi: 10.1016/j.compmedimag.2011.02.005.
- 7. Weyn B, van de Wouwer G, van Daele A, Scheunders P, van Dyck D, van Marck E, et al. Automated breast tumor diagnosis and grading based on wavelet chromatin texture description. Cytometry. 1998;33:32–40. doi: 10.1002/(sici)1097-0320(19980901)33:1<32::aid-cyto4>3.0.co;2-d.
- 8. Al-Kadi OS. Texture measures combination for improved meningioma classification of histopathological images. Patt Recog. 2010;43:2043–53.
- 9. Irshad H, Jalali S, Roux L, Racoceanu D, Hwee LJ, Naour GL, Capron F. Automated mitosis detection using texture, SIFT features and HMAX biologically inspired approach in HIMA @ MICCAI 2012. J Pathol Inform. 2013;4:12. doi: 10.4103/2153-3539.109870.
- 10. Mitosis detection contest website. [Last accessed on 2013 Jan 22]. Available from: http://www.ipal.cnrs.fr/ICPR2012
- 11. Chang H, Loss LA, Parvin B. Nuclear segmentation in H and E sections via multi-reference graph-cut (MRGC). International Symposium Biomedical Imaging. 2012.
- 12. Haralick RM, Shanmugam K, Dinstein I. Textural features for image classification. Systems, Man and Cybernetics, IEEE Transactions on. 1973;6:610–21.
- 13. Galloway MM. Texture analysis using gray level run lengths. Computer graphics and image processing. 1975;4:172–9.
- 14. Liu H, Setiono R. A probabilistic approach to feature selection: A filter solution. Machine learning international workshop then conference; Burlington, Massachusetts: Morgan Kaufmann Publishers, Inc. 1996. pp. 319–27.
- 15. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: Synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321–57.