Author manuscript; available in PMC: 2014 Sep 22.
Published in final edited form as: Proc SPIE Int Soc Opt Eng. 2014 Feb 15;9034:90342T. doi: 10.1117/12.2043990

Automated Volumetric Breast Density derived by Shape and Appearance Modeling

Serghei Malkov a, Karla Kerlikowske b, John Shepherd a
PMCID: PMC4112966  NIHMSID: NIHMS581798  PMID: 25083119

Abstract

Shape and texture (appearance) modeling, originally designed for facial recognition, is a novel and promising approach for application in breast imaging. The purpose of this study was to apply a shape and appearance model to automatically estimate percent breast fibroglandular volume (%FGV) from digital mammograms. We built a shape and appearance model using 2000 full-field digital mammograms from the San Francisco Mammography Registry with known %FGV measured by the single energy absorptiometry method. An affine transformation was used to remove rotation, translation and scale. Principal Component Analysis (PCA) was applied to extract significant and uncorrelated components. To build an appearance model, we transformed the breast images into the mean texture image by piecewise linear image transformation. Using PCA, the image pixel grey-scale values were converted into a reduced set of shape and texture features. Stepwise regression with forward selection and backward elimination was used to estimate the outcome %FGV from the shape and appearance features and other system parameters. The shape and appearance scores were found to correlate moderately with breast %FGV, dense tissue volume, actual breast volume, body mass index (BMI) and age. The highest Pearson correlation coefficient was 0.77, between the first shape PCA component and actual breast volume. The stepwise regression method with ten-fold cross-validation, predicting %FGV from shape and appearance variables and other system outcome parameters, generated a model with r² = 0.8. In conclusion, a shape and appearance model demonstrated excellent feasibility to extract variables useful for automatic %FGV estimation. Further exploration and testing of this approach are warranted.

Keywords: shape and appearance model, breast density, digital mammography

1. INTRODUCTION

Interest in developing automated breast density measures is growing because of the strong association between breast density and breast cancer risk. Although a number of automated methods to quantify mammographic and volumetric density have been developed, there are issues with accuracy and reproducibility. There is a demand for new accurate, automated breast density estimation techniques that are strongly associated with breast cancer risk and can be easily measured in clinical screening. There are several approaches to automatically estimate planimetric mammographic and true volumetric breast densities. They can be classified as in-image phantom-based calibration,1–3 prior calibration,4,5 physical image formation model,6,7 image processing with adaptive thresholding,8–12 and statistical model building13,14 approaches. Recently, single energy absorptiometry (SXA), an automated method for quantifying fibroglandular tissue volume, has been developed.1 It exhibited good accuracy and precision for a broad range of breast thicknesses, paddle tilt angles, and %FGV values. The method uses a breast tissue-equivalent phantom in the unused portion of the mammogram as a reference to estimate breast composition. To perform quality control monitoring and cross-validation between sites and machines, a modified calibration approach for the SXA method15 was developed. It provides stable thickness measurements and grayscale-to-density pixel conversion, as well as cross-validation between machines and sites. The cross-calibration is achieved through weekly scans of a specially designed calibration phantom, which monitor the stability of the thickness and grey-scale conversion. The new automated approach for volumetric breast density estimation proposed in this paper combines volumetric density measures derived by the SXA method with a statistical model building technique based on image parameters extracted from the mammogram.
Thus, we achieved automatic volumetric breast density estimation from digital mammograms without using the SXA phantom. Shape and texture (appearance) estimation, designed for face recognition, appears to be a promising approach for extracting features from breast images that are suitable for statistical model building. Potentially, the breast shape and appearance model parameters could provide new information valuable for breast density estimation, breast cancer risk assessment and diagnostics. The purpose of this study is to apply a shape and appearance model approach to clinical screening full-field digital mammograms for automatically quantifying true volumetric fibroglandular tissue volumes without the use of phantoms.

2. METHODS

Our approach to volumetric breast density estimation consists of building a statistical model using a training set of digital mammograms with known percent fibroglandular tissue volume measured by SXA. To derive the model we follow the standard procedure in supervised machine learning: feature generation, feature selection, regression/classification of outputs, final model building and validation. The main set of features was generated using the shape and appearance model approach.

2.1 Shape and Appearance model

The shape and appearance model approach was implemented according to the method of Cootes et al.16 To build the shape model, we used 137 edge and grid points (x, y coordinate pairs) inside the breast area, as shown in Fig. 1. The edge line was calculated using a global threshold method. Principal Component Analysis (PCA) was applied to extract significant and uncorrelated components. As a preprocessing step, an affine transformation was used to remove rotation, translation and scale. The scale factor of this transformation was used as an additional feature alongside the shape PCA components. To build the appearance model, we first transformed the breast images into the mean texture image by piecewise linear image transformation. That step aligned all texture information inside the reference mean image. Then, a set of significant principal component vectors of the appearance model was calculated from the image pixel grey-scale values. Finally, the shape and texture features of each image were created using the principal component vectors. Equations (1–4) describe the calculation of the shape (1, 2) and appearance (3, 4) PCA scores:

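$$[V_S, \lambda_S, X_M] = \mathrm{PCA}(X) \tag{1}$$
$$B_S = V_S \cdot (X - X_M) \tag{2}$$
$$[V_A, \lambda_A, G_M] = \mathrm{PCA}(G_W) \tag{3}$$
$$G_A = V_A \cdot (G_W - G_M) \tag{4}$$

where $X$ denotes the x, y coordinates; $X_M$ is the mean of the x, y coordinates; $\lambda_S$, $V_S$, $B_S$ are the PCA shape eigenvalues, eigenvectors and scores; $G_M$, $G_W$ are the mean and normalized breast pixel values; and $\lambda_A$, $V_A$, $G_A$ are the PCA appearance eigenvalues, eigenvectors and scores. The PCA() function performs the PCA transformation.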
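The score calculation in equations (1) and (2) can be sketched as follows. This is a minimal illustration using NumPy's singular value decomposition, not the authors' implementation; the function names `pca` and `scores`, the choice of 8 retained components, and the random stand-in data are assumptions made for the example. The same two calls would apply to the appearance model in equations (3) and (4), with normalized pixel values in place of coordinates.

```python
import numpy as np

def pca(data, n_components):
    """Return (eigenvectors, eigenvalues, mean) for observations stored in rows."""
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data yields the principal axes as rows of vt.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = singular_values ** 2 / (data.shape[0] - 1)
    return vt[:n_components], eigenvalues[:n_components], mean

def scores(vectors, data, mean):
    """Project mean-centered observations onto the principal axes (Eq. 2 / Eq. 4)."""
    return (data - mean) @ vectors.T

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 50 shapes, each 137 (x, y) points flattened to 274 values.
X = rng.normal(size=(50, 274))
V_S, lam_S, X_M = pca(X, n_components=8)
B_S = scores(V_S, X, X_M)
```

Each column of `B_S` is one shape feature per image; the columns are mutually uncorrelated and their variances equal the corresponding eigenvalues.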

Figure 1.

a) The breast with 137 markers and triangulation; b) mean shape breast template with 137 markers and triangulation.
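The preprocessing step that removes translation, rotation and scale before the shape PCA can be sketched as a least-squares similarity alignment (orthogonal Procrustes). The paper does not specify how the transform was estimated, so the SVD-based solution, the function name `align_to_reference`, and the synthetic landmark data below are assumptions for illustration only.

```python
import numpy as np

def align_to_reference(shape, reference):
    """Align `shape` (N x 2) to `reference` (N x 2) by translation, rotation and scale.

    Returns the aligned points and the scale factor that was applied.
    """
    shape_c = shape - shape.mean(axis=0)
    ref_c = reference - reference.mean(axis=0)
    # The optimal rotation follows from the SVD of the 2 x 2 cross-covariance matrix.
    u, s, vt = np.linalg.svd(shape_c.T @ ref_c)
    d = np.sign(np.linalg.det(u @ vt))          # guard against a reflection
    rotation = u @ np.diag([1.0, d]) @ vt
    scale = (s[0] + d * s[1]) / (shape_c ** 2).sum()
    aligned = scale * shape_c @ rotation + reference.mean(axis=0)
    return aligned, scale

rng = np.random.default_rng(0)
reference = rng.normal(size=(137, 2))           # hypothetical mean shape
theta = np.deg2rad(25.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
# A rotated, enlarged and shifted copy of the reference shape.
shape = 2.5 * reference @ rot + np.array([40.0, -15.0])
aligned, scale = align_to_reference(shape, reference)
```

In this sketch the returned `scale` plays the role of the scale-factor feature described above: a breast larger than the mean shape gets a factor below one.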

2.2 Datasets and processing method

To build the shape and appearance model we used a training set of 2000 deidentified full-field digital mammograms from the San Francisco Mammography Registry with known %FGV. The %FGV, dense tissue volume (FGV) and breast volume were obtained by the phantom-based x-ray SXA method described by Malkov et al.1 The mammograms in this study were acquired over a 4.5-year period on 10 Hologic Selenia machines located at 3 different sites (California Pacific Medical Center, Marin General Hospital, Novato Community Hospital). During one and a half years of this period, a recalibration procedure was performed for quality assurance monitoring using weekly scans of a specially designed quality assurance phantom, as described in ref. 15. All images that contained implants were excluded from our analysis. In addition, validation and testing sets of 480 mammograms each were created. The selected image sets represented mammograms with %FGV from 0% to 100%. A group of breast region of interest (ROI) parameters, such as the breast area, image width (nipple to chest wall) and image height (top to bottom), was obtained from the images using global threshold segmentation and a skin detection algorithm. To further improve the model we added several image formation and breast parameters, such as filter/target combination, mAs, kVp, compression force and breast thickness, to the final model. We expected that combining these parameters and features would maximize the performance of the final model in predicting %FGV. The correlation between %FGV, breast volume, FGV and the most significant shape and appearance features was also analyzed.

The linear regression method was applied to build the statistical model using the SAS GLMSELECT procedure with different options. Forward selection with ten-fold cross-validation was used to estimate the outcome %FGV from different combinations of shape and appearance features and other parameters. The Schwarz Bayesian information criterion, Akaike's information criterion, the predicted residual sum of squares statistic, and the adjusted R-square statistic were monitored.
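The idea of forward selection driven by a cross-validated predicted residual sum of squares (PRESS) can be sketched in NumPy as below. This is a simplified stand-in and not the SAS GLMSELECT procedure: here the candidate that most reduces the ten-fold PRESS enters at each step and selection stops when PRESS no longer improves, whereas SAS offers several entry and stop criteria. The function names and the synthetic data are assumptions for the example.

```python
import numpy as np

def cv_press(X, y, n_folds=10, seed=0):
    """Sum of squared out-of-fold prediction errors for least squares with intercept."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    press = 0.0
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        A_train = np.column_stack((np.ones(train.sum()), X[train]))
        A_test = np.column_stack((np.ones(len(test_idx)), X[test_idx]))
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        press += ((y[test_idx] - A_test @ coef) ** 2).sum()
    return press

def forward_select(X, y, n_folds=10):
    """Greedy forward selection; stop when the cross-validated PRESS stops decreasing."""
    selected, remaining = [], list(range(X.shape[1]))
    best = cv_press(X[:, :0], y, n_folds)       # intercept-only baseline
    while remaining:
        trial = {j: cv_press(X[:, selected + [j]], y, n_folds) for j in remaining}
        j_best = min(trial, key=trial.get)
        if trial[j_best] >= best:
            break
        best = trial[j_best]
        selected.append(j_best)
        remaining.remove(j_best)
    return selected, best

rng = np.random.default_rng(0)
# Hypothetical synthetic data: only features 2 and 5 carry signal.
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 2] - 2.0 * X[:, 5] + 0.1 * rng.normal(size=200)
selected, press = forward_select(X, y)
```

The same fold assignment is reused for every candidate model so that PRESS values are directly comparable between steps.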

3. RESULTS

3.1 Breast shape and appearance model components

Figs. 1–3 demonstrate the steps of the shape and appearance model creation. Fig. 1 shows the edge and grid points inside the breast area and their triangulation for a single breast and for the mean breast. Using equations 1 and 2 and the x, y coordinates of the points (Fig. 1, a-b), the shape model and the PCA shape principal components were calculated.

Figure 3.

The 1st PCA shape (a) and appearance (b) components. Blue markers represent the mean breast shape, red markers represent the mean breast plus three standard deviations.

To build the appearance model, we first transformed the breast images into the mean texture image by piecewise linear image transformation, as shown in Fig. 2. That step aligned all texture information inside the reference mean image. The figure shows an example of the smooth conversion of the pixel grey-scale values of a breast smaller than the mean breast into the pixels inside the mean breast template. Then, the image pixel grey-scale values were converted into a set of significant principal component vectors. Fig. 3 presents the first PCA component of the shape model (a) and an image of the first PCA appearance component (b). These shape and texture images were created using the first principal component vectors. As can be seen from Fig. 3, the first shape PCA component characterizes stretching of the breast shape along the chest wall to nipple direction. The first appearance component describes the maximum pixel value variation, which occurs in the breast region adjacent to the chest wall and inside the breast area. It should be noted that grey-scale values were normalized by the standard deviation of the breast pixel grey-scale values. The bright areas at the edge could be explained by the steep thickness gradient and the larger pixel grey-scale variation in the peripheral region close to the skin edge.

Figure 2.

Warping transform to mean breast shape: red markers represent mean breast shape, blue markers represent the breast to register.
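The core of a piecewise linear warp over a triangulated mesh is the mapping of a point inside one triangle of the mean shape to the matching location in the corresponding triangle of the individual breast. A minimal sketch using barycentric coordinates is given below; the function names and the two example triangles are hypothetical, and a full warp would repeat this for every pixel and triangle and then interpolate grey-scale values.

```python
import numpy as np

def barycentric(point, triangle):
    """Barycentric coordinates of a 2-D point with respect to a triangle (3 x 2 array)."""
    a, b, c = triangle
    T = np.column_stack((b - a, c - a))
    beta, gamma = np.linalg.solve(T, point - a)
    return np.array([1.0 - beta - gamma, beta, gamma])

def map_point(point, tri_mean, tri_source):
    """Map a point in a mean-shape triangle to the matching source-image location."""
    weights = barycentric(point, tri_mean)
    return weights @ tri_source

# Hypothetical pair of corresponding triangles (mean template and individual breast).
tri_mean = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
tri_source = np.array([[1.0, 1.0], [3.0, 1.0], [1.0, 2.0]])
centroid_mean = tri_mean.mean(axis=0)
mapped = map_point(centroid_mean, tri_mean, tri_source)
```

Because the mapping is affine within each triangle and neighbouring triangles share their edges, the resulting warp is continuous across the whole mesh.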

The Pearson correlation coefficients between %FGV, dense tissue volume and actual breast volume measured by the SXA method, the first (bs_1) and third (bs_3) shape PCA component scores, and the second (gs_2) and fourteenth (gs_14) appearance PCA component scores are presented in Table 1. These shape and appearance components make the most significant contributions to the ten-fold cross-validated regression models for %FGV built separately from the shape and the appearance components. In addition, the table contains the scale factor of the breast transform to the mean shape (scale), BMI and age. The shape and appearance PCA scores were found to correlate moderately with breast %FGV, dense tissue and actual volumes, BMI and age (see Table 1). The highest Pearson correlation coefficients, −0.88 and 0.77, were found between actual breast volume and the scale factor and the first shape PCA component, respectively. As expected, there is also some correlation between the shape and appearance PCA components. The strongest correlations with %FGV were shown by the scale factor, which was positively correlated, and the 2nd appearance PCA component, which was negatively correlated.

Table 1.

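Parameter correlation (Pearson's coefficients)

| Variable | %FGV | Volume | FGV | bs_1 | bs_3 | gs_2 | gs_14 | scale | age | BMI |
|---|---|---|---|---|---|---|---|---|---|---|
| %FGV | 1 | −0.58 | 0.21 | −0.41 | 0.19 | −0.56 | 0.29 | 0.62 | −0.37 | −0.52 |
| Volume | | 1 | 0.47 | 0.77 | 0.05 | 0.5 | −0.13 | −0.88 | 0.17 | 0.75 |
| FGV | | | 1 | 0.5 | 0.12 | −0.01 | 0.04 | −0.49 | 0.29 | 0.50 |
| bs_1 | | | | 1 | 0 | 0.37 | −0.12 | −0.82 | 0.20 | 0.49 |
| bs_3 | | | | | 1 | −0.01 | 0.12 | 0.02 | 0.04 | −0.08 |
| gs_2 | | | | | | 1 | 0 | −0.51 | 0.33 | 0.42 |
| gs_14 | | | | | | | 1 | 0.2 | −0.13 | −0.17 |
| scale | | | | | | | | 1 | −0.25 | −0.68 |
| age | | | | | | | | | 1 | 0.12 |
| BMI | | | | | | | | | | 1 |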

where bs_1 and bs_3 are the 1st and 3rd shape components; gs_2 and gs_14 are the 2nd and 14th appearance components. Volume and FGV represent the breast and dense tissue volumes. The scale is the scale factor of the breast transform into the mean shape.
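A correlation table of this kind can be computed as sketched below. The variables here are synthetic and purely hypothetical (they are not the study data); the names `volume`, `scale` and `noise` are chosen only to echo the table, and `pearson` is an illustrative helper checked against NumPy's `corrcoef`.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

rng = np.random.default_rng(1)
# Hypothetical synthetic variables, not the study data.
volume = rng.normal(size=500)
scale = -0.9 * volume + 0.4 * rng.normal(size=500)   # built to correlate negatively
noise = rng.normal(size=500)                          # unrelated variable

# Full symmetric matrix; Table 1 reports only its upper triangle.
table = np.corrcoef(np.vstack((volume, scale, noise)))
```

Since the matrix is symmetric with ones on the diagonal, listing only the upper triangle, as Table 1 does, loses no information.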

3.2 Model outputs

To estimate the performance of different feature groups for %FGV prediction, we analyzed the model outputs of each group separately. The results of this analysis are presented in Table 2. Five groups of features were analyzed: 1 - Shape PCA; 2 - Shape PCA and scale factor; 3 - Appearance PCA; 4 - Shape PCA, Appearance PCA, scale factor; 5 - Shape PCA, Appearance PCA, scale factor, filter/target, mAs, kVp, compression force, breast thickness and ROI y-dimension. The models for the different feature groups were created by the forward selection method using 10-fold cross-validation. The selection stopped at a local minimum of the cross-validation predicted residual sum of squares. The shape was described by 8 PCA components, and the resulting %FGV prediction had an adjusted r² of 0.234. Adding the scale factor to the shape PCA components boosted the %FGV prediction to an adjusted r² of 0.462 with 6 selected features. The breast appearance PCA components alone achieved an adjusted r² of 0.625 using 77 appearance PCA components. By combining the groups we increased the adjusted r² to 0.723 using 80 features. To further improve the association between predicted and measured %FGV, the filter/target, mAs, kVp, compression force, breast thickness and ROI y-dimension were added to the model; the final model, which includes these parameters together with the shape and appearance PCA components, demonstrated r² = 0.8 between predicted and measured %FGV.

Table 2.

Model outputs for predicted %FGV of different feature groups with 10-fold cross validation

| Group | Number in model | r² | r² adjusted | Number selected |
|---|---|---|---|---|
| Shape PCA | 8 | 0.236 | 0.234 | 6 |
| Shape PCA, scale | 9 | 0.464 | 0.462 | 6 |
| Appearance PCA | 285 | 0.64 | 0.625 | 77 |
| Shape PCA, Appearance PCA, scale | 294 | 0.734 | 0.723 | 80 |
| Shape PCA, Appearance PCA, scale, physical conditions | 300 | 0.8 | 0.79 | 90 |

Physical conditions include filter/target, mAs, kVp, compression force, breast thickness and ROI y-dimension.
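The relationship between the r² and adjusted r² columns of Table 2 can be illustrated with the standard adjusted r² formula. The paper does not state which formula or sample size was used, so taking n = 2000 training images and p equal to the number of selected features is an assumption; under that assumption the tabulated adjusted values are reproduced to within rounding.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Standard adjusted r-squared for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# (group, r2, number of selected features) as listed in Table 2.
rows = [
    ("Shape PCA", 0.236, 6),
    ("Shape PCA, scale", 0.464, 6),
    ("Shape PCA, Appearance PCA, scale", 0.734, 80),
]
# Assumption: n = 2000 training mammograms.
adjusted = {name: adjusted_r2(r2, 2000, p) for name, r2, p in rows}
```

The penalty grows with the number of selected features, which is why the gap between r² and adjusted r² is larger for the models with 77 to 90 features than for the shape-only models.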

In addition to the cross-validation method, we tested the final model using three independent sets: training, validation, and testing. To build this model, forward selection regression on the training set was used with a stop criterion based on the minimum average squared error of the validation set. Then, predicted values of %FGV for the testing set were calculated using the results of this regression. Figure 4 shows the progression of the average squared errors for %FGV of the training, validation and testing sets as new features are added in the order of their contribution. The final model had 33 degrees of freedom as a result of selection among 300 input parameters. The first 20 parameters in order of their contribution to the model are: kVp, mAs, compression force, thickness, gs_14, gs_11, gs_2, bs_3, gs_7, bs_25, gs_68, gs_27, gs_58, gs_93, gs_53, gs_106, gs_163, gs_52, gs_46, gs_15. The bs_x and gs_x terms represent shape and appearance components respectively, and x corresponds to the component number. The progression is characterized by continuously decreasing average squared errors, with a fast drop at the beginning and a slow monotonic downward trend for the remaining features. All three sets follow the same trend, but the errors of the testing set are higher than those of the training and validation sets. Figure 5 shows the plot of predicted versus measured %FGV for the final selected model. The model demonstrates an association between the predicted and measured volumetric breast densities of r² = 0.7 for the testing set. The deviation between predicted and measured %FGV is characterized by a root mean square error of 11.3%. The fitted line has a slope of 0.7 and an intercept of 13.9.
The r² value obtained for the testing set is lower than that achieved by ten-fold cross-validation of the training set, which indicates additional variability in the testing set compared with the training set. Further research is required to improve the model creation.
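The test-set summary statistics quoted above (r², root mean square error, and the slope and intercept of the fitted line) can be computed as in the following sketch. The paper does not spell out the exact definitions, so regressing predicted on measured values and taking the RMSE of their difference are assumptions; the five data points are made up for the example.

```python
import numpy as np

def evaluate(measured, predicted):
    """Summarize agreement between measured and predicted values."""
    # Straight-line fit of predicted against measured values.
    slope, intercept = np.polyfit(measured, predicted, deg=1)
    r = np.corrcoef(measured, predicted)[0, 1]
    rmse = np.sqrt(np.mean((predicted - measured) ** 2))
    return {"slope": slope, "intercept": intercept, "r2": r ** 2, "rmse": rmse}

# Hypothetical values for illustration only (not study data).
measured = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
predicted = 0.5 * measured + 10.0
result = evaluate(measured, predicted)
```

Note that r² alone does not capture bias: in this toy case the points lie exactly on a line (r² of 1) while the RMSE is still large because the slope differs from one.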

Figure 4.

Progression of average squared errors by role for the %FGV prediction of the training, validation and testing sets.

Figure 5.

The plot of predicted %FGV versus measured %FGV values for a testing set of 480 mammograms.

4. CONCLUSIONS

We applied the shape and appearance model approach to digital mammograms and demonstrated the feasibility of this method to extract variables useful for automatic %FGV estimation. We obtained the maps and pixel distributions of different shape and appearance PCA components. The shape and appearance scores were found to correlate moderately with measured breast %FGV, dense tissue volume and actual breast volume. Models built from different groups of features were analyzed. Further exploration and testing of this approach are warranted.

REFERENCES

1. Malkov S, Wang J, Kerlikowske K, Cummings S, Shepherd JA. Single x-ray absorptiometry method for the quantitative mammographic measure of fibroglandular tissue volume. Med. Phys. 2009;36(12):5525–5536. doi: 10.1118/1.3253972.
2. Pawluczyk O, Augustine BJ, Yaffe MJ, Rico D, Yang J, Mawdsley GE, Boyd NF. A volumetric method for estimation of breast density on digitized screen-film mammograms. Med. Phys. 2003;30(3):352–364. doi: 10.1118/1.1539038.
3. Diffey J, Hufton A, Astley S. A new step-wedge for the volumetric measurement of mammographic density. In: IWDM 2006, Proceedings of the Eighth International Workshop, edited by S. Astley, Manchester, UK. Springer, New York; 2006. Paper No. LNCS 4046. pp. 1–9.
4. Kaufhold J, Thomas JA, Eberhard JW, Galbo CE, Trotter DE. A calibration approach to glandular tissue composition estimation in digital mammography. Med. Phys. 2002;29(8):1867–1880. doi: 10.1118/1.1493215.
5. Heine JJ, Cao K, Rollison DE. Calibrated measures for breast density estimation. Acad. Radiol. 2011;18:547–555. doi: 10.1016/j.acra.2010.12.007.
6. Highnam R, Pan X, Warren R, Jeffreys M, Davey Smith G, Brady M. Breast composition measurements using retrospective standard mammogram form (SMF). Phys. Med. Biol. 2006;51(11):2695–2713. doi: 10.1088/0031-9155/51/11/001.
7. van Engeland S, Snoeren PR, Huisman H, Boetes C, Karssemeijer N. Volumetric breast density estimation from full-field digital mammograms. IEEE Trans. Med. Imaging. 2006;25(3):273–282. doi: 10.1109/TMI.2005.862741.
8. Zhou C, Chan H, Petrick N, Helvie MA, Goodsitt MM, Sahiner B, Hadjiiski LM. Computerized image analysis: Estimation of breast density on mammograms. Med. Phys. 2001;28(6):1056–1069. doi: 10.1118/1.1376640.
9. Kallenberg MG, Lokate M, van Gils CH, Karssemeijer N. Automatic breast density segmentation: an integration of different approaches. Phys. Med. Biol. 2011;56(9):2715–2729. doi: 10.1088/0031-9155/56/9/005.
10. Kim Y, Kim C, Kim J. Automated Estimation of Breast Density on Mammogram using Combined Information of Histogram Statistics and Boundary Gradients. Proc. SPIE. 2010;7624:76242F-1.
11. Keller BM, Nathan DL, Wang Y, Zheng Y, Gee JC, Conant EF, Kontos D. Estimation of breast percent density in raw and processed full field digital mammography images via adaptive fuzzy c-means clustering and support vector machine segmentation. Med. Phys. 2012;39(8):4903–17. doi: 10.1118/1.4736530.
12. Saidin N, Mat Sakim HA, Ngah UK, Shuaib IL. Computer Aided Detection of Breast Density and Mass, and Visualization of Other Breast Anatomical Regions on Mammograms Using Graph Cuts. Computational and Mathematical Methods in Medicine. 2013; Volume 2013, Article ID 205384, 13 pages.
13. Heine JJ, Velthuizen RP. A statistical methodology for mammographic density detection. Med. Phys. 2000;27:2644–51. doi: 10.1118/1.1323981.
14. Li J, Szekely L, Eriksson L, Heddson B, Sundbom A, Czene K, Hall P, Humphreys K. High-throughput mammographic-density measurement: a tool for risk prediction of breast cancer. Breast Cancer Research. 2012;14(4):R114. doi: 10.1186/bcr3238.
15. Malkov S, Wang J, Duewer F, Shepherd JA. A Calibration Approach for Single-Energy X-ray Absorptiometry Method to Provide Absolute Breast Tissue Composition Accuracy for the Long Term. IWDM 2012, LNCS 7361. 2012:769–774.
16. Cootes TF, Edwards GJ, Taylor CJ. Active Appearance Models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001;23(6):681–685.
