Abstract
Background
Mammography is one of the most widely used tools for the early detection of breast cancer. The contour of a breast mass in a mammogram carries important information for distinguishing benign from malignant masses: a benign mass typically has a smooth, round or oval contour, whereas a malignant mass has an irregular shape and a spiculated contour. Several studies have shown that a 1D signature derived from the 2D contour can describe these contour features well.
Methods
In this paper, we propose a new method for translating the 2D contour of a breast mass in a mammogram into a 1D signature. The method describes not only the contour features but also the regularity of the mass. We then segment the whole 1D signature into subsections and extract four local features from them, including a new contour descriptor, the root mean square (RMS) slope, which describes the roughness of the contour. K-nearest neighbor (KNN), support vector machine (SVM) and artificial neural network (ANN) classifiers are used to separate benign from malignant masses.
Results
The proposed method is tested on a set of 323 contours, comprising 143 benign masses and 180 malignant ones, from the Digital Database for Screening Mammography (DDSM). The best classification accuracy is 99.66%, obtained using the RMS slope feature with the SVM classifier.
Conclusion
The proposed method performs better than the traditional method. In addition, RMS slope is an effective feature, comparable to most existing features.
Electronic supplementary material
The online version of this article (doi:10.1186/s12938-017-0332-0) contains supplementary material, which is available to authorized users.
Keywords: Breast mass, 1D signature, Contour subsection, RMS slope
Background
Breast cancer is now the most common cancer in women worldwide. China accounts for 12.2% of all newly diagnosed breast cancers and 9.6% of all deaths from breast cancer [1]. Early detection of breast cancer can increase the survival rate [2]. Currently, mammography is the most reliable method for detecting abnormalities in the breast [3–5]. However, distinguishing malignant from benign masses remains challenging for radiologists. Abnormal cases vary in contour shape, texture, and size, and even experienced radiologists find it difficult to decide whether a breast mass is malignant. At present, diagnosis depends on needle biopsy, which causes patients physical and psychological distress. A computer-aided diagnosis (CAD) system, which combines image processing and pattern recognition, can provide diagnostic suggestions that reduce the false detection rate and the false negative rate [6]. In screening mammography, if a doctor sees a clearly defined mass whose contour is microlobulated or spiculated, the doctor can be quite sure that the mass is malignant without asking the patient to undergo a biopsy (Fig. 1b). If the contour of a breast mass is regular and the shape is nearly round, the mass is probably benign (Fig. 1a). Computer-aided diagnosis can distinguish these two classes of breast mass and thereby spare patients the pain of a biopsy.
Researchers have proposed many methods for describing shape and texture in CAD systems. Shape descriptors include compactness, eccentricity, moments, Fourier descriptors, and statistical marginal characteristics [7–11]. Texture descriptors include the gray-level co-occurrence matrix and fractal dimension, among others [5, 12–14]. Pohlman et al. [15] proposed a method for transforming the 2D contour of a breast mass into a 1D signature. The signature is obtained as the radial distance from the centroid to the contour as a function of the angle of the radial line over the range 0°–360°. A benign mass yields a signature with small fluctuations, whereas a malignant mass yields a signature with large fluctuations. Fractal characteristics can describe this fluctuation, so in [16] breast masses were classified using fractal analysis with a classification accuracy greater than 80%. However, the radial-distance-versus-angle function can become multi-valued for an irregular or spiculated margin [17]; a signature computed in this manner also has ranges of undefined values when the centroid falls outside the region enclosed by the contour. Guliato et al. [16] improved the method by transforming the 2D contour into a 1D signature through polygonal modeling of the contour and the turning angle function. Rangayyan and Nguyen [2] demonstrated the usefulness of fractal analysis for the classification of breast masses, using the box-counting and ruler methods to derive the fractal dimension (FD) of the 2D contours of masses as well as of their 1D signatures. Several studies [2, 9, 18–20] have shown that the degree of regularity is also very important for distinguishing benign from malignant breast masses: a mass with a circular or oval shape is more likely to be benign than malignant.
We therefore propose a new method that expresses the regularity of the contour of a breast mass. First, the abnormal area in the mammogram is labelled by experienced doctors. Second, we translate the 2D contour into a 1D signature using the Euclidean distance from the edge of the breast mass to the periphery of a circle or ellipse centered at the centroid. This signature describes not only the roughness of the contour but also its degree of regularity. Third, we segment the whole 1D signature into subsections. Fourth, we extract several local contour features. Finally, the feature vectors, re-organized according to the local feature value of each subsection, are fed to different classifiers. The flowchart of the proposed method is shown in Fig. 2.
The remainder of this paper is organized as follows. The new method for translating a 2D contour into a 1D signature is presented in “Methods”. In “Features”, we describe the features that characterize the contour: the fractal dimension (FD), the root mean square roughness $w$, the ratio $\mu/\sigma$ (where $\mu$ is the mean radial distance of the tumor boundary and $\sigma$ is its standard deviation), and the root mean square slope. Experimental results and analysis follow in the next section. The last section summarizes our work and outlines future work.
Methods
In this part, the database is introduced first. Second, the method for translating a 2D contour into a 1D signature is described in detail. Finally, we explain how the 1D signature is segmented into subsections and how these subsections are re-organized.
Database
In this paper, the Digital Database for Screening Mammography (DDSM) provides the mammography images. The database was assembled by the Massachusetts General Hospital, the University of South Florida, and Sandia National Laboratories [21, 22] and includes about 2620 cases. Each case has four mammography images, two views of each breast, along with associated patient information. Images containing suspicious areas have associated pixel-level ground truth information about the locations and types of the suspicious regions, saved as an overlay file. Each overlay file may specify multiple abnormalities. Each abnormality has information on the lesion type, the assessment, the subtlety, the pathology and at least one outline, and each boundary is specified as a chain code. Details of the DDSM database can be found in [23] or under “Availability of data and materials” at the end of this article. The database includes normal, benign and cancer volumes. The object of study in this article is the contour of benign and malignant masses, so we choose 323 contours from the DDSM database: 143 from benign images and 180 from malignant images. For simplicity and convenience, we choose mammography images that contain a single abnormality. The identifiers of the images used are listed in Additional file 1: Appendix S1. Among the 143 benign images, most contours are approximately elliptical; such benign masses are prone to misclassification by existing methods. All images come from different patients.
2D contour to 1D signature
A benign mass has a smooth shape that results in a simple signature, whereas a malignant tumor has a jagged contour that leads to a rough signature. The contour of each abnormality is extracted by connecting the points given by the chain code in the overlay file. Figure 3a and b show the contours of a benign and a malignant breast mass. A 2D contour can be formalized as an ordered set of N points traversed in the anticlockwise direction, each given by its coordinates, where N is the number of points on the contour. The center of the 2D contour is its centroid. The first point on the contour is chosen to the right of the center: it is the intersection of the contour with the horizontal line passing through the center. The radius is the distance between a point on the contour and the center. The diameters of the two axes of a reference ellipse centered at the center are derived from the contour; if the two diameters are equal, the ellipse reduces to a circle. This ellipse or circle is the reference shape for the breast mass contour. If all points on the contour lie near the ellipse, the contour is regular and the mass is likely to be benign; otherwise, the mass is judged to be malignant. For each contour point, we compute its distance to the center and the distance from the corresponding point on the ellipse to the center. The signature value is defined from these two distances as the distance between the contour point and the corresponding ellipse point, and it is therefore a function of the index of the pixel along the contour. Figure 4a and b show the 1D signatures of the benign and malignant breast masses in Fig. 3a and b.
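As a minimal sketch of this translation step, the Python function below computes, for each contour point, the difference between its distance to the center and the radius of a reference ellipse in the same direction. Several details are assumptions made only for illustration, because the defining expressions are not preserved in the text above: the center is taken as the mean of the contour points, the ellipse is axis-aligned with semi-axes equal to half the contour extent in x and y, and the signature is the signed difference of the two distances. The name `contour_signature` is hypothetical.

```python
import math

def contour_signature(points):
    """Distance from each contour point to a reference ellipse, in contour order.

    points: list of (x, y) tuples ordered along the contour.
    ASSUMPTIONS (not from the source): the centre is the mean of the contour
    points, and the ellipse semi-axes are half the contour extent in x and y.
    """
    n = len(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = sum(xs) / n, sum(ys) / n
    a = (max(xs) - min(xs)) / 2.0  # assumed horizontal semi-axis
    b = (max(ys) - min(ys)) / 2.0  # assumed vertical semi-axis
    if a <= 0.0 or b <= 0.0:
        raise ValueError("contour must have non-zero extent in x and y")
    signature = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        r_contour = math.hypot(dx, dy)
        # Unit direction from the centre; arbitrary if the point is the centre.
        c, s = (dx / r_contour, dy / r_contour) if r_contour > 0.0 else (1.0, 0.0)
        # Radius of the axis-aligned ellipse along that direction.
        r_ellipse = (a * b) / math.sqrt((b * c) ** 2 + (a * s) ** 2)
        # Signed difference (an assumption): positive outside the ellipse.
        signature.append(r_contour - r_ellipse)
    return signature
```

Under these assumptions, a perfectly circular or elliptical contour gives a signature that is zero everywhere, while a spiculated contour gives a signature with large excursions.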
Subsection and integration
Transforming the 2D contour into a 1D signature describes the features of the whole contour. However, local features are sometimes also very important for separating benign from malignant breast masses. In Fig. 1b, for example, the left part of the contour is smooth and regular, whereas the right part is microlobulated. Extracting a feature from the whole contour is therefore imprecise. We propose dividing the whole signature into C subsections. If C = 1, the signature is kept whole and the feature is a single value. Otherwise, the feature of each subsection is extracted separately, the subsections are ranked by their feature values, and they are then integrated into a whole signature in that order. In other words, the feature of each breast mass contour is a vector of dimension C. The number of subsections affects the classification accuracy. Because the optimal number of subsections depends on the size of the mass contour and on the feature, we evaluate every value of C in a fixed set and take the average accuracy over all values as the final performance of each feature. For example, if C = 4, each contour is segmented into 4 subsections and the feature is a 4-dimensional vector; the 323 feature vectors are then fed to the classifiers. Once C has run through the whole set, we obtain 11 results, and their average is taken as the final performance.
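The subsection step can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the helper name `subsection_feature_vector` is hypothetical, the subsections are assumed to be contiguous and of nearly equal length, and the ranking is assumed to be ascending (the text states only that subsections are ordered by feature value).

```python
def subsection_feature_vector(signature, c, feature):
    """Split a signature into c contiguous subsections and return the
    per-subsection feature values sorted in ascending order.

    signature: list of numbers (the 1D signature).
    c: number of subsections (1 keeps the signature whole).
    feature: function mapping a list of numbers to a single number.
    """
    n = len(signature)
    if c < 1 or c > n:
        raise ValueError("c must be between 1 and len(signature)")
    # Integer boundaries so that subsection lengths differ by at most one.
    bounds = [k * n // c for k in range(c + 1)]
    values = [feature(signature[bounds[k]:bounds[k + 1]]) for k in range(c)]
    # ASSUMPTION: ascending order; the source only says "ranked by value".
    return sorted(values)
```

Sorting makes the vector independent of where along the contour a rough subsection occurs, which is one plausible reason for re-organizing the subsections before classification.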
Features
In this part, four features are introduced. Among them, the RMS slope is newly proposed in this work. It describes the variation of the 1D signature in the vertical direction well.
Root mean square roughness w
Root mean square roughness describes the degree of irregularity of the 1D signature. For a signature $h(x)$, the root mean square roughness $w$ is defined as $w = \sqrt{\langle [h(x) - \langle h \rangle]^{2} \rangle}$, where $\langle\,\rangle$ denotes the statistical average; $w$ expresses the degree of fluctuation of $h(x)$ in the vertical direction. The smaller the value of $w$, the more regular the shape, that is, the closer the margin is to a circle or ellipse, and the more likely the mass is benign rather than malignant. Root mean square roughness can therefore be used as a feature for classifying benign and malignant breast masses.
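A minimal sketch of this feature, assuming the usual population form (the average is taken over all N samples of the signature); the function name `rms_roughness` is hypothetical:

```python
import math

def rms_roughness(signature):
    """Root mean square deviation of the signature from its mean value."""
    n = len(signature)
    mean = sum(signature) / n
    # ASSUMPTION: divide by n (population form), not n - 1.
    return math.sqrt(sum((h - mean) ** 2 for h in signature) / n)
```

A flat signature gives 0, and a signature alternating between +1 and -1 gives 1.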
The ratio $\mu/\sigma$
The ratio $\mu/\sigma$ (where $\mu$ is the mean radial distance of the tumor boundary and $\sigma$ is its standard deviation) describes the circularity of the breast mass contour. Malignant masses should have smaller circularity values than benign masses. Haralick [24] introduced this ratio as a measure of the circularity of digital figures. Pohlman et al. [15] applied this feature to their 1D signature and obtained good results.
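For illustration, the ratio can be computed directly from the contour points. The sketch below assumes the center is the mean of the contour points and uses the population standard deviation; the name `circularity_ratio` is hypothetical.

```python
import math

def circularity_ratio(points):
    """Mean radial distance divided by its standard deviation.

    points: list of (x, y) tuples on the contour.
    ASSUMPTION: the centre is the mean of the contour points.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    mu = sum(radii) / n
    sigma = math.sqrt(sum((r - mu) ** 2 for r in radii) / n)
    # A perfect circle has sigma == 0; report infinite circularity.
    return math.inf if sigma == 0.0 else mu / sigma
```

A nearly circular contour yields a very large ratio, whereas a contour whose radius alternates between two values yields a small one.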
Fractal dimension
According to Mandelbrot's fractal geometry, the fractal dimension describes the self-similarity of an object. Many fractal models have been proposed to analyze fractal phenomena in nature. A popular one is the differential box-counting method, which studies have shown to be appropriate for self-similar fractal models. For medical images, the fractional Brownian motion (fBm) model has been shown to be suitable: it belongs to the class of statistically self-affine fractals and regards naturally occurring rough surfaces as the end result of random walks, and the roughness of the intensity surface of a medical image can likewise be viewed as the end result of a random walk. For the self-affine fractal random rough surface model, the autocorrelation function and the height-height correlation function can be expressed as [23]:
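$$R(r) = w^{2} \exp\left[-\left(\frac{r}{\xi}\right)^{2\alpha}\right] \tag{1}$$

$$H(r) = 2w^{2}\left\{1 - \exp\left[-\left(\frac{r}{\xi}\right)^{2\alpha}\right]\right\} \tag{2}$$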
where $\alpha$ is the fractal exponent, which is related to the fractal dimension $D$ through the space dimension $d$ and lies between 0 and 1; $w$ is the root mean square roughness, expressing the fluctuation of the signature $h(x)$ in the vertical direction; and $\xi$ is the correlation length, expressing the fluctuation of $h(x)$ in the horizontal direction. The autocorrelation function of $h(x)$ can be defined as:
$$R(x, x + r) = \langle h(x)\, h(x + r) \rangle \tag{3}$$
Here, $r$ is the interval between two points on the signature. The autocorrelation function has the following characteristics. If the signal is a smooth, stationary random process, the autocorrelation is independent of the position $x$ and depends only on $r$, i.e. $R(x, x + r) = R(r)$. As the correlation interval $r$ increases, $R(r)$ gradually decreases and tends to zero; the rate of decrease is determined by the distance over which two points become uncorrelated. The correlation length $\xi$ is defined as the correlation interval at which the autocorrelation function falls to $1/e$ of its maximum, and it expresses how quickly $R(r)$ decreases with $r$. If the interval between two points is less than $\xi$, the two points are correlated; otherwise, they are independent. Fluctuation in the horizontal direction is expressed by $\xi$, and fluctuation in the vertical direction by $w$.
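The correlation length can be estimated numerically from a sampled signature. The sketch below is only an illustration under stated assumptions: the signature is mean-subtracted before the autocorrelation is computed, lags are measured in sample indices, and the correlation length is taken as the first lag at which the estimate falls to 1/e of its zero-lag value. The names `autocorrelation` and `correlation_length` are hypothetical.

```python
import math

def autocorrelation(signature, max_lag):
    """Estimate R(r) for r = 0..max_lag from a mean-subtracted signature."""
    n = len(signature)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than len(signature)")
    mean = sum(signature) / n
    h = [v - mean for v in signature]
    return [sum(h[i] * h[i + r] for i in range(n - r)) / (n - r)
            for r in range(max_lag + 1)]

def correlation_length(signature, max_lag):
    """Smallest lag at which R(r) <= R(0)/e, or None if never reached."""
    acf = autocorrelation(signature, max_lag)
    if acf[0] <= 0.0:
        return None  # constant signature: no fluctuation to correlate
    threshold = acf[0] / math.e
    for r, value in enumerate(acf):
        if value <= threshold:
            return r
    return None
```

A rapidly alternating signature decorrelates within one sample, whereas a slowly varying one has a much longer correlation length.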
At short range ($r \ll \xi$), a self-affine fractal surface satisfies the following self-affine transformation for a scaling factor $\varepsilon$:

$$h(x) \sim \varepsilon^{-\alpha}\, h(\varepsilon x) \tag{4}$$

When the scale is small ($r \ll \xi$), the average height difference varies as $r^{\alpha}$. This corresponds to the power-law behaviour of the height-height correlation function $H(r)$ over short distances:

$$H(r) \propto r^{2\alpha}, \qquad r \ll \xi \tag{5}$$
The power-law behaviour of the height-height correlation function $H(r)$ describes the statistical self-similarity and the local fluctuation of the signature. The smaller the fractal exponent $\alpha$, the more violent the local fluctuation and the larger the fractal dimension. From Eq. (5), $\log H(r)$ is a linear function of $\log r$ when $r$ is much smaller than the correlation length $\xi$, so $\alpha$ can be estimated from the slope of a linear least-squares fit of $\log H(r)$ versus $\log r$ over a range of small scales $r$. Figure 5 shows the curve of $\log H(r)$ versus $\log r$ and the linear fit for a benign and a malignant mass. In this paper, we treat the 1D signature of the contour as the height distribution of a self-affine fractal random surface. The fractal dimension indicates self-similarity and also expresses the local non-smooth fluctuation of the signature: the larger the fractal dimension $D$, the more drastic the local fluctuation. Here we use the fractal exponent $\alpha$ of the 1D signature as the third feature for distinguishing benign from malignant masses.
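The log-log fitting procedure can be sketched as below. This is a hedged illustration rather than the authors' code: it assumes the height-height correlation is the mean squared height difference at lag r (measured in sample indices), that its short-range slope in log-log coordinates equals twice the fractal exponent, and that the caller chooses `max_lag` small enough to stay in the short-range regime. The function names are hypothetical.

```python
import math

def height_height_correlation(signature, r):
    """Mean squared height difference between points r samples apart."""
    n = len(signature)
    return sum((signature[i + r] - signature[i]) ** 2 for i in range(n - r)) / (n - r)

def estimate_alpha(signature, max_lag):
    """Estimate the fractal exponent from the log-log slope of H(r).

    ASSUMPTION: H(r) follows a power law with exponent 2*alpha for
    r = 1..max_lag, so alpha is half the least-squares slope.
    """
    if max_lag >= len(signature):
        raise ValueError("max_lag must be smaller than len(signature)")
    xs, ys = [], []
    for r in range(1, max_lag + 1):
        value = height_height_correlation(signature, r)
        if value > 0.0:  # log is undefined for zero
            xs.append(math.log(r))
            ys.append(math.log(value))
    if len(xs) < 2:
        raise ValueError("need at least two lags with non-zero H(r)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope / 2.0
```

As a sanity check, a straight ramp has a mean squared height difference that grows as the square of the lag, so the estimate for a ramp is 1; rougher signatures give smaller values.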
RMS slope
Each point on the contour has a different slope, and the variation of the slope describes the shape of the contour. If the contour is smooth, the slope varies slowly and regularly; otherwise, it varies drastically. When the 2D contour is transformed into a 1D signature, the value on the Y-axis expresses the circularity, and the absolute value of the slope shows how quickly the contour varies. We therefore take the slope distribution over the points of the contour as one of the features for discriminating malignant from benign masses. The slope is computed over linear intervals of the signature. The root mean square slope is defined as:
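$$\rho_{\mathrm{rms}} = \sqrt{\left\langle \left( \frac{\mathrm{d}h}{\mathrm{d}x} \right)^{2} \right\rangle} \tag{6}$$

where $h(x)$ is the 1D signature, $x$ is the position along the contour, and $\langle\,\rangle$ denotes the statistical average.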
Figure 4 and Eq. (6) show that the slope of a benign mass is small and fluctuates gently, whereas the slope of a malignant mass is large and fluctuates violently. The RMS slope of malignant masses also varies over a wider range than that of benign masses.
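A minimal sketch of the RMS slope, assuming the slope at each point is approximated by the forward difference between adjacent signature samples with unit spacing (the exact discretization used by the authors is not preserved in the text above); the name `rms_slope` and the `spacing` parameter are hypothetical:

```python
import math

def rms_slope(signature, spacing=1.0):
    """Root mean square of the finite-difference slopes of a signature.

    ASSUMPTION: slopes are forward differences between adjacent samples,
    separated by `spacing` along the contour.
    """
    if len(signature) < 2:
        raise ValueError("need at least two samples")
    slopes = [(signature[i + 1] - signature[i]) / spacing
              for i in range(len(signature) - 1)]
    return math.sqrt(sum(s * s for s in slopes) / len(slopes))
```

A constant signature has an RMS slope of 0, while a zigzag signature has a large one even when its RMS roughness is modest, which is why the two features can complement each other.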
Classification
K-nearest neighbor (KNN), support vector machine (SVM) and artificial neural network (ANN) classifiers are used in this paper to separate benign from malignant breast masses. We choose K = 1 for the KNN classifier and use a linear SVM. The neural network classifier is configured with 10 nodes in the hidden layer, and its internal weights are initialized with randomly chosen values. The 323 contours are divided into two subsets: 300 contours for training and 23 for testing. The software used is Matlab R2015b on a Windows 10 operating system.
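The experiments were run in Matlab; purely as an illustration of the simplest of the three classifiers, the sketch below shows a 300/23 split and a K = 1 nearest-neighbor prediction in Python. The Euclidean distance, the random shuffling with a fixed seed, and the names `split_indices` and `nearest_neighbor_predict` are assumptions introduced here, not details taken from the paper.

```python
import math
import random

def split_indices(n_total, n_train, seed=0):
    """Randomly split range(n_total) into training and test index lists."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # ASSUMPTION: random split
    return indices[:n_train], indices[n_train:]

def nearest_neighbor_predict(train_vectors, train_labels, query):
    """Return the label of the training vector closest to the query (K = 1)."""
    best = min(range(len(train_vectors)),
               key=lambda i: math.dist(train_vectors[i], query))
    return train_labels[best]

# Example with made-up one-dimensional feature vectors.
train_x = [[0.0], [1.0], [10.0]]
train_y = ["benign", "benign", "malignant"]
print(nearest_neighbor_predict(train_x, train_y, [8.0]))
```

With subsection features, each training vector would be the C-dimensional sorted feature vector of one contour.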
Experimental results and analysis
In this part, the performance of the proposed method is reported first. Then the performance of the four features is compared. Third, the effect of subsections is analyzed. Finally, classifier performance is shown.
Performance evaluation for 2D contour to 1D signature
Table 1 compares the proposed method with the existing method. The accuracy obtained with our method is higher than that obtained with the existing method. The most obvious improvement is in the accuracy of $\alpha$, which rises by 14.90%, whereas the accuracy of the RMS slope barely changes because it is already close to 100% and has little room to rise. For the near-elliptical breast masses in the selected database, the proposed method describes not only the circularity of the contour but also the degree of margin fluctuation, whereas the traditional method uses only the standard deviation between the median-filtered and original boundaries to quantify margin fluctuation. Figure 6 shows that accuracy, sensitivity and specificity are all improved by our method. In particular, the specificity of $w$ with SVM rises by 10.90%.
Table 1.
Feature | Method | KNN (%) | SVM (%) | ANN (%) |
---|---|---|---|---|
 | | 76.68 | 82.21 | 79.84 |
 | | 81.82 | 88.14 | 88.54 |
 | | 76.68 | 84.51 | 79.45 |
 | | 81.82 | 90.57 | 88.54 |
 | | 83.00 | 87.21 | 83.79 |
 | | 86.17 | 91.92 | 89.33 |
 | | 92.09 | 99.33 | 94.86 |
 | | 92.47 | 99.66 | 99.60 |
Italic type indicates the maximum value
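For reference, the three evaluation measures discussed here (accuracy, sensitivity and specificity) can be computed from true and predicted labels as in the sketch below. Treating "malignant" as the positive class is an assumption, since the text does not state which class is positive, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(true_labels, predicted_labels, positive="malignant"):
    """Return (accuracy, sensitivity, specificity) as fractions in [0, 1].

    ASSUMPTION: `positive` names the class counted as a positive finding.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity
```

Sensitivity is the fraction of positive cases correctly identified, and specificity is the fraction of negative cases correctly identified.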
Performance evaluation for four features with three classifiers
Figure 7 and Table 1 show the performance of the four features with the three classifiers. Whichever classifier is used, the results show that the proposed feature performs better than the existing ones. For the features $w$ and $\mu/\sigma$, the SVM classifier performs almost the same as ANN and better than KNN. For the other features, SVM is the best of the three classifiers; SVM is robust for small samples. The accuracy of the fractal feature is 99.33%, which is better than that of $w$ and $\mu/\sigma$, because the 1D signature of a breast mass contour exhibits fractal characteristics. The highest accuracy is 99.66%, obtained using the RMS slope feature with the SVM classifier, because the RMS slope describes the variation of the 1D signature in the vertical direction, which is very important for distinguishing benign from malignant masses.
Performance evaluation for subsection
Figure 8 shows the performance of the four features with the subsection scheme proposed in this paper. Performance improves because our method takes local features into account; the experiment shows that subsections improve the performance of all four features. Because the slope feature already performs well, its improvement is not obvious. For the fractal dimension feature, accuracy at first increases quickly with the number of subsections N and then stabilizes at larger N. This is because with a larger N each segment is shorter and contains fewer contour points, which affects accuracy. Among the three classifiers, SVM achieves the best performance using the RMS slope feature. With the ANN classifier, subsection performance is stable for all four features.
Conclusion and future work
The contour is very important for distinguishing benign from malignant breast masses. In this paper, we propose three shape features of the broken-line contour to classify benign and malignant breast masses. The accuracy reaches 99.66% with the RMS slope feature. In addition, we compute the fractal dimension by an alternative method based on the height-height correlation function in log–log coordinates; its accuracy reaches 99.33%, higher than that of $w$ and $\mu/\sigma$. In future work, the selection of N and additional texture features could be studied to improve classification performance. More cases could be included to broaden the applicability of our study. More advanced classification methods, such as deep neural networks, could also be used to improve classification accuracy.
Authors’ contributions
HL designed the study and drafted the manuscript. YY and XM designed the study and revised the manuscript significantly. TW and YT oversaw the study and revised the manuscript. All authors read and approved the final manuscript.
Acknowledgements
This research is supported by the National Natural Science Foundation of China (Grant No. 61573219); the NSFC Joint Fund with Guangdong under Key Project (Grant No. U1201258); the Shandong Natural Science Funds for Distinguished Young Scholar (Grant No. JQ201316); the Fundamental Research Funds of Shandong University (2014JC028); the Fostering Project of Dominant Discipline and Talent Team of Shandong Province Higher Education Institutions; and the Colleges and Universities of Shandong Province Science and Technology Plan Projects (J16LN19).
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
The dataset analyzed during the current study was derived from the following public domain resources: http://marathon.csee.usf.edu/Mammography/Database.html
Ethics approval
All human data used in this study were obtained from public dataset DDSM. This experiment was approved by the hospital’s ethical research committee.
Abbreviations
- RMS: root mean square
- KNN: K-nearest neighbor
- SVM: support vector machine
- ANN: artificial neural network
- DDSM: digital database for screening mammography
- FD: fractal dimension
- ROI: region of interest
Additional file
Contributor Information
Haixia Li, Email: lihaixiajinan@qq.com.
Xianjing Meng, Email: menglixianjing@qq.com.
Tingwen Wang, Email: 521wtw@163.com.
Yuchun Tang, Email: tyc@sdu.edu.cn.
Yilong Yin, Email: ylyin@sdu.edu.cn.
References
- 1. Fan L, Strasser-Weippl K, Li JJ, St Louis J, Finkelstein DM, Yu KD, Chen WQ, Shao ZM, Goss PE. Breast cancer in China. Lancet Oncol. 2014;15(7):279–289. doi: 10.1016/S1470-2045(13)70567-9.
- 2. Rangayyan RM, Nguyen TM. Fractal analysis of contours of breast masses in mammograms. J Digit Imaging. 2007;20(3):223–237. doi: 10.1007/s10278-006-0860-9.
- 3. Wang Y, Li J, Gao X. Latent feature mining of spatial and marginal characteristics for mammographic mass classification. Neurocomputing. 2014;144:107–118. doi: 10.1016/j.neucom.2013.11.050.
- 4. Choi JY, Kim DH, Plataniotis KN, Ro YM. Classifier ensemble generation and selection with multiple feature representations for classification applications in computer-aided detection and diagnosis on mammography. Expert Syst Appl. 2016;46:106–121. doi: 10.1016/j.eswa.2015.10.014.
- 5. Zheng Y, Keller BM, Ray S, Wang Y, Conant EF, Gee JC, Kontos D. Parenchymal texture analysis in digital mammography: a fully automated pipeline for breast cancer risk assessment. Med Phys. 2015;42(7):4149–4160. doi: 10.1118/1.4921996.
- 6. Cupples TE, Cunningham JE, Reynolds JC. Impact of computer-aided detection in a regional screening mammography program. Am J Roentgenol. 2005;185(4):944–950. doi: 10.2214/AJR.04.1300.
- 7. Xie W, Li Y, Ma Y. Breast mass classification in digital mammography based on extreme learning machine. Neurocomputing. 2016;173:930–941. doi: 10.1016/j.neucom.2015.08.048.
- 8. Surendiran B, Vadivel A. Mammogram mass classification using various geometric shape and margin features for early detection of breast cancer. Int J Med Eng Inform. 2012;4(1):36–54. doi: 10.1504/IJMEI.2012.045302.
- 9. Don S, Chung D, Revathy K, Choi E, Min D. A new approach for mammogram image classification using fractal properties. Cybern Inform Technol. 2012;12(2):69–83.
- 10. Casti P, Mencattini A, Salmeri M, Ancona A, Mangeri F, Pepe ML, Rangayyan RM. Contour-independent detection and classification of mammographic lesions. Biomed Signal Process Control. 2016;25:165–177. doi: 10.1016/j.bspc.2015.11.010.
- 11. Dong M, Lu X, Ma Y, Guo Y, Ma Y, Wang K. An efficient approach for automated mass segmentation and classification in mammograms. J Digit Imaging. 2015;28(5):613–625. doi: 10.1007/s10278-015-9778-4.
- 12. Torrents-Barrena J, Puig D, Ferre M, Melendez J, Diez-Presa L, Arenas M, Marti J. Breast masses identification through pixel-based texture classification. In: Hiroshi Fujita, editor. Berlin: Springer International Publishing; 2014. p. 581–88.
- 13. Ramos RP, do Nascimento MZ, Pereira DC. Texture extraction: An evaluation of ridgelet, wavelet and co-occurrence based methods applied to mammograms. Expert Syst Appl. 2012;39(12):11036–11047. doi: 10.1016/j.eswa.2012.03.020.
- 14. Costa DD, Campos LF, Barros AK. Classification of breast tissue in mammograms using efficient coding. Biomed Eng Online. 2011;10(7):157. doi: 10.1186/1475-925X-10-55.
- 15. Pohlman S, Powell KA, Obuchowski NA, Chilcote WA, Grundfest Broniatowski S. Quantitative classification of breast tumors in digitized mammograms. Med Phys. 1996;23(8):1337–1345. doi: 10.1118/1.597707.
- 16. Guliato D, de Carvalho JD, Rangayyan RM, Santiago SA. Feature extraction from a signature based on the turning angle function for the classification of breast tumors. J Digit Imaging. 2008;21(2):129–144. doi: 10.1007/s10278-007-9069-9.
- 17. Menon Dhanalekshmi PS, Phadke AC. Classification using fractal features of well defined mammographic masses using power spectral analysis and differential box counting approaches. 2015;325:495–501.
- 18. Tahmasbi A, Saki F, Shokouhi SB. Classification of benign and malignant masses based on Zernike moments. Comput Biol Med. 2011;41(8):726–735. doi: 10.1016/j.compbiomed.2011.06.009.
- 19. Rangayyan RM, Ayres FJ, Desautels JEL. A review of computer-aided diagnosis of breast cancer: Toward the detection of subtle signs. J Franklin Inst. 2007;344(34):312–348. doi: 10.1016/j.jfranklin.2006.09.003.
- 20. Rangayyan RM. Fractal analysis and classification of breast masses using the power spectra of signatures of contours. J Electron Imaging. 2012;21(2):023018. doi: 10.1117/1.JEI.21.2.023018.
- 21. Heath M, Bowyer K, Kopans D, Moore R, Kegelmeyer P. The digital database for screening mammography. In: Yaffe MJ, editor. Madison: Medical Physics Publishing; 2001. p. 212–18.
- 22. Heath M, Bowyer K, Kopans D, Kegelmeyer PK, Moore R, Chang K, Munishkumaran S. Current status of the digital database for screening mammography computational imaging vision. 1998;13(8982):457–460.
- 23. Sinha SK, Sirota EB, Garoff S, Stanley HB. X-ray and neutron scattering from rough surfaces. Phys Rev B: Condens Matter. 1988;38(4):2297–2311. doi: 10.1103/PhysRevB.38.2297.
- 24. Haralick RM. A measure for circularity of digital figures. IEEE Trans Syst Man Cybern. 1974;4:394–396. doi: 10.1109/TSMC.1974.5408463.