Abstract
Background
Accurate estimation of breast volume and weight is critical for post-mastectomy reconstruction. Existing methods are frequently costly or complex. We developed a machine learning framework that leverages demographic and anthropometric data to address these challenges.
Methods
We collected data from 199 patients between 2021 and 2023. The workflow comprised data collection, pre-processing, feature selection, model training, and performance evaluation. Three feature selection techniques were applied: domain expert knowledge, Spearman's rank correlation, and the Boruta algorithm. Each feature set was used to train linear regression, random forest regression, and support vector regression models. Model performance was evaluated using the coefficient of determination (R2) and Pearson's correlation coefficient. Significant correlations were identified between breast volume or weight and key patient characteristics, such as BMI, breast cup size, ptosis severity, and anthropometric measurements.
Results
The optimal linear regression model, which incorporated both domain-expert and statistically selected features, achieved R2 values of 81.8% for breast volume and 72% for breast weight.
Conclusion
The results indicate that integrating demographic and anthropometric data with machine learning yields an accurate, interpretable, and accessible method for preoperative breast assessment. In contrast to conventional imaging or mathematical models, this approach eliminates costs related to imaging equipment, relies on routinely collected clinical data, reduces the need for specialized equipment and training, and enables rapid integration into existing clinical workflows. By overcoming the limitations of traditional methods, the proposed model provides a practical, efficient, and cost-effective solution for clinical practice.
Keywords: Machine learning, Breast, Reconstruction, Estimation, Volume
Introduction
Breast cancer is the most common malignant disease among women. In 2020, it represented 24% of new cancer cases and caused 15% of cancer-related deaths worldwide [1]. A standard mastectomy removes the entire breast to eliminate tumors. This surgery often leads to permanent scarring, which can affect self-confidence and well-being, especially in younger women. Reconstructive surgery is commonly performed to restore breast function, improve body image, and enhance quality of life [1].
Accurate estimation of breast volume and weight is crucial for selecting implants and determining the amount of fat to inject during reconstruction. Imaging techniques such as magnetic resonance imaging (MRI), mammography, and 3D laser scanning are considered the gold standards for these assessments. However, their use is limited by high costs and lengthy procedures [2]. Three-dimensional breast models can be created using laser scanners [3, 4] or by acquiring photographs from multiple angles [5, 6]. For volume estimation, breast boundaries are typically outlined; however, this approach often overlooks tissue composition, as subcutaneous fat and fibroglandular tissue are frequently disregarded. Mammography-based methods usually assume the breast is cone-shaped [7, 8] or modeled as a half-elliptic cylinder for calculations [7, 9]. Deep-learning techniques have been applied to mammograms for volume prediction [10]. However, mammography may also cause pain or discomfort for some patients [11, 12].
Compared to breast volume prediction, breast weight estimation has received less attention. Breast resection weight, the weight of tissue removed during reduction mammoplasty (breast reduction surgery), is typically estimated in clinical practice, yet no computational models exist for this purpose [21–24]. Identifying features that are strongly correlated with breast volume and weight can enhance the decision-making of machine learning models. For example, menopausal status, body mass index (BMI), and breastfeeding are significantly associated with breast volume (p < 0.01) [20, 25]. Additionally, Avşar et al. (2010) and Kim et al. (2014) found body mass, ptosis severity, and breast anthropometric measurements to be significant predictors (p < 0.001) [26, 27]. Notably, body mass and BMI are also strongly correlated with breast weight [28]. Multivariate logistic regression analyses have shown that BMI, breast cup size, and age are highly correlated with breast density (p < 0.01) [29]. Pearson correlation analysis similarly reveals significant associations between body mass, BMI, and breast anthropometric measurements (p < 0.05) [30]. Furthermore, breast anthropometric measurements correlate with breast resection weight (p < 0.05) [21, 24]. BMI is highly associated with subcutaneous fat (p < 0.001), fibroglandular tissue (p < 0.0001), mammographic density (p < 0.0001), and background parenchymal enhancement (p = 0.005) [31–33]. Finally, a recent prospective study estimated breast volume using preoperative anthropometric measurements from 78 breasts in 39 female-to-male transgender patients, producing promising results, though the small sample size and specific population limit generalizability [34].
A review of breast volume and weight estimation studies highlights a gap: key demographic and anthropometric variables, such as ptosis severity, BMI, and breast size, are often excluded from predictive models despite their established correlation with breast volume and weight. This likely reduces model accuracy relative to imaging-based techniques, as also noted in a recent review of ptosis measurement methods. Although these factors are linked to physical symptoms and morphometry, they remain underutilized in modelling [25]. Moreover, studies have yet to combine demographic and anthropometric data when training machine learning models for both breast volume and weight estimation, despite the recommendation for such integration to achieve greater clinical relevance. To address this gap, a machine learning framework was developed to predict breast volume and weight more accurately.
Materials and methods
The general flow of our proposed methodology, illustrated in Fig. 1, consists of five main modules: data collection (gathering raw information), data pre-processing (cleaning and organizing data), feature selection (choosing the most relevant variables), machine learning modeling (applying algorithms to learn from data), and evaluation (assessing model performance). The study was implemented in Python (3.7.10)[34], utilizing essential packages: Sklearn (0.22.1)[36] for machine learning, Pandas (1.2.4)[37] for data handling, Numpy (1.19.2)[38] for numerical operations, Scipy (1.4.1)[39] for scientific computing, and Seaborn (0.11.1)[40] for data visualization.
Fig. 1.
Methodology flow overview. First, a dataset comprising demographic data, breast anthropometric measurements, and the ground truth breast volume and weight was collected from 199 patients. This dataset was split into training (159) and validation (40) sets. After pre-processing, feature selection was performed to obtain two sets of discriminative features, which were then used to train the breast volume and breast weight prediction models. The trained models can then be used to infer breast volume and weight for new patients
Data collection
Data from a total of 199 patients were collected between 2021 and 2023 for this study. Trained medical staff collected the patients' demographic data, such as age, height, and BMI. In addition, breast anthropometric measurements were taken in centimetres (cm). The ground truth breast volume was obtained using the Archimedes water displacement method, while the breast weight was measured using a regularly calibrated digital scale. All patients were admitted to the Breast Surgery Clinic of the University of Malaya Medical Centre (UMMC) in Kuala Lumpur, Malaysia, and their data were obtained with informed consent. Permission for this study was granted by the hospital’s Medical Ethics Committee (UMMC MREC ID NO: 202032-8339).
Data pre-processing
Data pre-processing transforms the raw data into a standardised format through steps such as data type casting and transformation [41]. Missing values were handled according to the data's behaviour, for example by filling them with 0 or the mean value [42]. Some attributes, such as childbirth differences and menstruation period, were derived from existing data. A standard scaler was used to normalise the data, as it could handle outliers better than min–max normalisation and did not require the minimum and maximum values. The standard scaler formula is shown in Eq. (1), where x is the raw value, µ refers to the mean, and s refers to the standard deviation. The left and right breasts were considered identical when handling missing breast anthropometric measurements.

$$z = \frac{x - \mu}{s} \tag{1}$$
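As an illustration of the pre-processing described above, the following is a minimal, dependency-free sketch of mean imputation followed by the standardisation in Eq. (1). The function names and the BMI values are hypothetical and are not taken from the study, which used Sklearn for this step; the population form of the standard deviation is used here, which is also what Sklearn's `StandardScaler` computes.

```python
from math import sqrt


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def standardise(values):
    """Return z-scores, (x - mean) / standard deviation, as in Eq. (1)."""
    n = len(values)
    mu = sum(values) / n
    s = sqrt(sum((v - mu) ** 2 for v in values) / n)  # population standard deviation
    return [(v - mu) / s for v in values]


# Hypothetical BMI column with one missing value.
bmi = [22.0, None, 30.0, 26.0]
filled = impute_mean(bmi)   # the missing entry becomes the mean of 22, 30 and 26
z = standardise(filled)     # zero mean, unit variance
```

After standardisation every feature has zero mean and unit variance, so features measured in different units (for example kg and cm) contribute on a comparable scale.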
Feature selection
Three feature selection methods were used to select the features for training the machine learning models for the prediction of breast weight and volume, namely (1) domain expert, (2) Spearman Correlation Coefficient [43], and (3) Boruta algorithm [43]. The domain expert feature set was recommended by breast surgeons of UMMC based on their clinical knowledge and experience of factors that influence breast weight and volume. The Spearman Correlation Coefficient was used to compute the correlation between patients’ features and the measured breast volume and weight [43]. Significant features with p < 0.01 formed the statistical feature set. The Spearman Correlation Coefficient does not assume that volume is normally distributed, and it can handle both categorical and numerical features as independent variables. The dependent variables were breast volume and weight, which were continuous in nature. The third feature set was selected using the Boruta algorithm [43], a wrapper method built around the random forest classification algorithm [44]. Boruta identifies all relevant variables by considering the multivariable relationship and collinearity with the target variable; however, this approach is computationally expensive.
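The statistical feature set described above can be sketched as a simple filter: compute the Spearman coefficient of each feature against the target and keep the features with p < 0.01. The sketch below is dependency-free and rests on stated assumptions: the p-value uses the Fisher z-transformation, which is only an approximation (SciPy's `spearmanr` derives it from a t-distribution instead), and the feature names and values are hypothetical toy data.

```python
from math import atanh, sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho and an approximate two-sided p-value (Fisher z)."""
    rho = pearson(average_ranks(x), average_ranks(y))
    clipped = max(min(rho, 0.999999), -0.999999)  # keep atanh finite
    z = atanh(clipped) * sqrt((len(x) - 3) / 1.06)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return rho, p


def select_features(features, target, alpha=0.01):
    """Keep the features whose approximate p-value is below alpha."""
    selected = {}
    for name, values in features.items():
        rho, p = spearman(values, target)
        if p < alpha:
            selected[name] = (rho, p)
    return selected


# Hypothetical toy data: BMI rises with volume, height does not.
volume = [210, 250, 300, 340, 420, 480, 530, 610, 700, 820]
features = {
    "bmi": [18.5, 19.0, 21.2, 22.0, 24.1, 25.3, 27.0, 28.4, 31.0, 33.5],
    "height_cm": [158, 150, 161, 149, 163, 171, 152, 166, 163, 158],
}
selected = select_features(features, volume)
```

Because the coefficient is computed on ranks, any monotonic relationship is detected, not only a linear one, which is why no normality assumption is needed.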
Machine learning modelling
Three machine learning algorithms, (1) linear regression, (2) random forest regressor, and (3) support vector regressor (SVR), were selected and compared for training the breast volume and weight prediction models. Linear regression is a linear modelling approach that estimates dependent variables by learning from independent variables [46]. The random forest regressor is an ensemble machine learning technique that comprises multiple decision-tree regressors [47]. Each tree recursively partitions the feature space, choosing splits that minimise the mean-squared error within the resulting groups. The random forest regressor improves prediction accuracy and controls overfitting by averaging the results from each decision-tree regressor. SVR is the regression counterpart of the support vector machine (SVM), which identifies the support vectors within the data and attempts to maximise the margin between hyperplanes and data points [48]. In SVR, the fitted function is chosen so that as many training points as possible lie within a fixed margin of the prediction, and only the points on or outside that margin (the support vectors) determine the result.
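The comparison of the three algorithms can be sketched with scikit-learn, the library named in the methodology. This is an illustrative sketch only: the data are synthetic stand-ins with a near-linear target, the hyperparameters are library defaults (the study's settings are not stated), and wrapping each model in a pipeline with a standard scaler is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data: 199 samples, 5 features, near-linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(199, 5))
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 4.0]) + rng.normal(scale=0.5, size=199)

models = {
    "linear regression": LinearRegression(),
    "random forest regressor": RandomForestRegressor(n_estimators=100, random_state=0),
    "SVR (linear kernel)": SVR(kernel="linear"),
}

# Mean five-fold cross-validated R-squared for each model.
cv_r2 = {}
for name, model in models.items():
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="r2")
    cv_r2[name] = scores.mean()
```

Placing the scaler inside the pipeline means it is refitted on the training part of each fold, so no information from the held-out fold leaks into the scaling.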
The machine learning models were trained on a dataset of 199 patient records. The training samples consisted of data from 159 patients, while the validation samples consisted of data from 40 patients, resulting in a ratio of approximately 4:1 between the training and validation sets. A five-fold cross-validation (CV) was performed to compare and select an appropriate model with better generalizability.
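The 159/40 split and the five folds can be illustrated with a few lines of standard-library Python. This is a sketch under assumptions: the random seed is arbitrary, and applying the five folds to the 159 training records (rather than to all 199) is an assumption, since the text does not state which set was cross-validated.

```python
import random


def train_validation_split(n_samples, n_validation, seed=0):
    """Shuffle sample indices and hold out n_validation of them."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_validation:], indices[:n_validation]


def k_fold(indices, k=5):
    """Yield (fit, held_out) index lists; fold sizes differ by at most one."""
    base, extra = divmod(len(indices), k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        held_out = indices[start:start + size]
        fit = indices[:start] + indices[start + size:]
        yield fit, held_out
        start += size


train_idx, val_idx = train_validation_split(199, 40)
folds = list(k_fold(train_idx, k=5))
fold_sizes = [len(held_out) for _, held_out in folds]
```

With 159 training records, four folds hold out 32 records and one holds out 31; each record is held out exactly once.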
Performance evaluation
The performance of the model was evaluated using the coefficient of determination, also known as r-squared ($R^2$), a metric that compares the estimated and actual results while taking the variance of the data into consideration. The sum of squared residuals (SSR) penalises the $R^2$ value by squaring the prediction error, as shown in Eq. (2). Data variability is taken into account by dividing the SSR by the sum of squares total (SST, Eq. (4)). The total number of samples is denoted by $n$; $y_i$ is the ground truth value and $\hat{y}_i$ is the value predicted by a machine learning model for sample $i$; $\bar{y}$ and $\bar{\hat{y}}$ are the means of the ground truth and predicted values, respectively. $R^2$ is defined in Eq. (3).
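$$\mathrm{SSR} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{2}$$

$$R^2 = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}} \tag{3}$$

where

$$\mathrm{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{4}$$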
Additionally, the mean absolute error (MAE), as defined in Eq. (5), was used to assess the model's performance. MAE measures the average absolute difference between the actual and the predicted values. Its advantage over the root-mean-square error (RMSE) in assessing average model performance is that it provides an accurate and straightforward error measurement that is easily understood [48].

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \tag{5}$$
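Lastly, Pearson correlation analysis was performed to measure the linear relationship between actual and estimated values, and the associated p-value was used to determine the significance of the correlation between the prediction and the ground truth. Pearson correlation (r) is defined in Eq. (6).

$$r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}} \tag{6}$$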
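The three evaluation metrics described in this section (the coefficient of determination, MAE, and Pearson's r) can be written out directly from their standard definitions. The sketch below is dependency-free, and the sample values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSR / SST."""
    mean_true = sum(y_true) / len(y_true)
    ssr = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ssr / sst


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Linear correlation between actual and predicted values."""
    n = len(y_true)
    mean_true, mean_pred = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    s_true = sqrt(sum((t - mean_true) ** 2 for t in y_true))
    s_pred = sqrt(sum((p - mean_pred) ** 2 for p in y_pred))
    return cov / (s_true * s_pred)


# Hypothetical actual and predicted values.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

r2 = r_squared(y_true, y_pred)             # about 0.949
mae = mean_absolute_error(y_true, y_pred)  # 0.5
r = pearson_r(y_true, y_pred)              # about 0.985
```

On held-out predictions the coefficient of determination is generally not equal to the square of Pearson's r (here about 0.949 versus about 0.970), because r ignores systematic offsets or scaling in the predictions while the coefficient of determination penalises them.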
Results
The dataset was collected at UMMC and comprised the records of 199 patients. Each record contained an exhaustive list of attributes, along with written diagnoses, breast volume, and weight.
Breast volume and weight distribution
Breast data were collected from 199 patients (BMI = 25.33 ± 5.29), with a mean age of 60.3 years (range: 31–95 years) and a median age of 61 years. In terms of ethnicity, 111 patients were Chinese (55.8%), 58 were Malay (29.1%), and 30 were Indian (15.1%). The distributions for breast volume and weight were skewed to the right, as shown in Fig. 2. This trend was also observed in a study of 41,102 women (BMI = 25.4 ± 4.2) [49].
Fig. 2.
Histogram and kernel density plot for (a) breast volume and (b) breast weight. The plot is based on the data of 199 patients. The distributions of both breast volume and weight were skewed towards the right
Statistical analysis
A two-sided hypothesis test using the Spearman Correlation Coefficient was performed on each patient’s demographic and breast anthropometric variables, regarding breast volume and weight, where the correlation (r) and significance (p-value) were computed. The results are documented in Table 1.
Table 1.
Descriptive and statistical analysis (Spearman Correlation Coefficient) of patients’ demographic and breast anthropometric variables in relation to breast volume and weight
| Variable | Descriptive analysis | Breast volume (ml), 534.18 ± 286.52: p-value | Breast volume: r | Breast weight (g), 473.35 ± 244.78: p-value | Breast weight: r |
|---|---|---|---|---|---|
| Categorical variables (discrete) | | | | | |
| Alcohol consumption | Yes: 25; No: 174 | 0.703 | 0.027 | 0.767 | 0.021 |
| BIRADS recode | 5.0: 61; 6.0: 59; 4.0: 50; 0.0: 24; 3.0: 3; 2.0: 2 | 0.206 | 0.089 | 0.175 | 0.095 |
| Breast cup | B: 84; A: 50; C: 44; D: 18; F: 1; E: 1; 7.0:1 | < 0.001 | 0.578 | < 0.001 | 0.527 |
| Breastfeeding | Yes: 113; No: 86 | 0.897 | -0.009 | 0.811 | 0.017 |
| Co-morbidities | Yes: 138; No: 61 | ||||
| Hormonal contraception | Yes: 162; No: 37 | 0.045 | -0.140 | 0.050 | -0.137 |
| Menstruation status | Menopaused: 133; Having menses: 36; Hysterectomy: 21; Never had menses: 1; Unknown: 9 | 0.273 | 0.077 | 0.485 | 0.049 |
| Ptosis severity | No ptosis: 91; Severe: 40; Mild: 35; Moderate: 33 | < 0.001 | 0.310 | < 0.001 | 0.301 |
| Smoking | No: 190; Yes: 9 | 0.222 | -0.086 | 0.449 | 0.053 |
| Numerical variables (Mean ± standard deviation) | |||||
| Age (year) | 60.2915 ± 12.9072 | 0.193 | -0.091 | 0.244 | -0.082 |
| Age Menopaused (year) | 50.4735 ± 3.3615 | 0.414 | -0.057 | 0.419 | -0.057 |
| Age of first childbirth (year) | 20.92 ± 3.36 | < 0.001 | -0.228 | < 0.001 | -0.221 |
| Age of last childbirth (year) | 26.05 ± 15.08 | 0.106 | -0.113 | 0.098 | -0.116 |
| Age Menarche (year) | 13.03 ± 1.63 | 0.662 | 0.031 | 0.639 | 0.033 |
| BMI (kg/m²) | 25.34 ± 5.29 | < 0.001 | 0.6730 | < 0.001 | 0.6580 |
| Height (cm) | 155.25 ± 6.51 | 0.869 | -0.012 | 0.891 | -0.01 |
| No. of childbirths | 4.141 ± 2.434 | 0.01 | 0.179 | 0.012 | 0.0175 |
| Childbirth difference | 51.12 ± 5.28 | 0.85 | 0.013 | 0.936 | -0.006 |
| Cigarettes smoked per day | 0.39 ± 2.21 | 0.387 | -0.061 | 0.282 | -0.076 |
| Menstruation period (year) | 37.45 ± 3.54 | 0.385 | -0.061 | 0.34 | -0.067 |
| Weight (kg) | 60.961 ± 13.75 | < 0.001 | < 0.001 | < 0.001 | 0.6440 |
| Breast anthropometric measurements (Mean ± standard deviation) | |||||
| Breast projection (cm) | 11.25 ± 2.71 | < 0.001 | 0.6600 | < 0.001 | 0.6480 |
| Lateral sternal anterior axillary line(cm) | 20.40 ± 6.07 | < 0.001 | 0.4880 | < 0.001 | 0.4550 |
| Mid-clavicular nipple (cm) | 22.42 ± 4.63 | < 0.001 | 0.7000 | < 0.001 | 0.6900 |
| Nipple areolar complex diameter (cm) | 4.06 ± 1.39 | < 0.001 | 0.5540 | < 0.001 | 0.5700 |
| Nipple inframammary fold (cm) | 7.21 ± 2.31 | < 0.001 | 0.6810 | < 0.001 | 0.6890 |
| Nipple lateral sternal (cm) | 10.71 ± 3.01 | < 0.001 | 0.6190 | < 0.001 | 0.6230 |
| Sternal nipple (cm) | 22.5 ± 4.57 | < 0.001 | 0.6580 | < 0.001 | 0.6520 |
In breast volume prediction, surgeons at UMMC identified the following clinically significant features: BMI, ptosis severity, cup size, age at first childbirth, weight, hormonal contraception use, and seven breast measurements (including nipple-areolar complex diameter, lateral sternal–anterior axillary line, sternal–nipple, mid-clavicular–nipple, and lateral sternal–nipple distances). Chosen for their clinical relevance and impact on breast shape, these features encompass both physiological and geometric factors that affect breast volume. By integrating these elements, the expert-derived set enhances model interpretability and may help explain why expert feature selection sometimes surpasses algorithmic approaches in capturing clinically meaningful variations in breast volume.
Interestingly, several highly significant features (p ≤ 0.01) were shared between breast volume and breast weight, namely BMI, ptosis severity, breast cup, age at first childbirth, body weight, and all seven breast anthropometric measurements (including nipple areolar complex diameter, lateral sternal anterior axillary line, sternal nipple, mid-clavicular nipple, and lateral sternal nipple). Hormonal contraception was moderately significant for breast volume (p = 0.045) but not for weight, although its p-value of 0.05 was close to the threshold of significance. A moderate correlation could be observed for breast volume and weight with breast cup, BMI, body weight, and five of the seven breast anthropometric measurements [51]. Notably, most features that were statistically significant for one target were also significant for the other, which suggests that the models for both tasks rely on highly similar indicators.
Feature selection
The important features from Boruta, the statistical test (Spearman Correlation Coefficient), and the domain experts are documented in Table 2. Notably, BMI, breast cup, and all seven breast anthropometric measurements were shown to be associated with breast volume and weight. Additionally, the number of childbirths, age at first childbirth, and body weight were found to be useful predictors of breast volume and weight that met the criteria imposed by Boruta (importance score ≥ 0.80) and statistical tests (p < 0.01). Features identified by breast surgeons at UMMC were BMI, ptosis severity, breast cup size, age at first childbirth, body weight, hormonal contraception use, and seven breast anthropometric measurements (including nipple-areolar complex diameter, lateral sternal–anterior axillary line, sternal–nipple, mid-clavicular–nipple, and lateral sternal–nipple distances). These features were selected based on their clinical relevance and known influence on breast morphology, integrating both physiological and geometric determinants of breast volume. By explicitly incorporating features with mechanistic and morphometric significance, the domain expert feature set enhances the interpretability of the predictive model. This approach may also help explain why expert-guided feature selection can outperform purely algorithmic methods, particularly in capturing clinically meaningful variations in breast volume.
Table 2.
List of important features in determining breast volume and weight. ‘*’ indicates the attribute is selected by the respective feature selection methods

Breast volume estimation
Table 3 compares the results of models trained on the three feature sets using the three machine learning algorithms: linear regression, SVR (with a linear kernel), and random forest regression. The linear regression model trained on features selected by domain experts performed best across all three validation metrics, with an R-squared value of 81.81%, the lowest MAE (94.70 ml), and a correlation of 0.91, indicating a strong and dependable statistical model. The random forest regressor came closest, reaching an R-squared value of 74.22% and an MAE of 105.61 ml when trained on the statistically selected features. One could hypothesize that expert-defined features, informed by genuine medical expertise, yield a more generalizable model than statistically selected features, which are determined from a limited training set. Furthermore, a clear linear relationship existed between the attributes and breast volume, as evidenced by the linear regression model outperforming its more complex counterparts by up to 17.77%.
Table 3.
Performance of machine learning models for breast volume prediction
| Metric | Machine learning model | Feature set: Boruta | Feature set: Statistical | Feature set: Domain expert |
|---|---|---|---|---|
| Coefficient of determination / r-squared (R²) | Random forest regressor | 73.79% | 74.22% | 68.46% |
| Coefficient of determination / r-squared (R²) | SVR (linear kernel) | 73.45% | 73.54% | 70.19% |
| Coefficient of determination / r-squared (R²) | Linear regression | 70.51% | 73.77% | 81.81% |
| Mean absolute error (MAE, ml) | Random forest regressor | 102.8898 | 105.6107 | 123.9683 |
| Mean absolute error (MAE, ml) | SVR (linear kernel) | 112.9393 | 114.3105 | 121.4820 |
| Mean absolute error (MAE, ml) | Linear regression | 111.2440 | 104.8979 | 94.70 |
| Pearson correlation (r) | Random forest regressor | 0.8995 | 0.8929 | 0.8547 |
| Pearson correlation (r) | SVR (linear kernel) | 0.8763 | 0.8799 | 0.8700 |
| Pearson correlation (r) | Linear regression | 0.8405 | 0.8607 | 0.9100 |
The best model for breast volume estimation, Mvolume, a linear regression model trained with the domain expert features, is given by:
| 7 |
Breast weight estimation
Table 4 details the results for breast weight estimation. For breast weight, the linear regression model trained on statistically selected features outperformed its counterparts, with an R-squared value of 72.02%. The SVR model using the same feature set, however, displayed the strongest correlation, 0.8701. In terms of MAE, the linear regression model trained on Boruta features performed best (107.78 g). In contrast to the prediction of breast volume, models trained on features selected by domain experts produced the worst overall performance.
Table 4.
Performance of machine learning models for breast weight prediction
| Metric | Machine learning model | Feature set: Boruta | Feature set: Statistical | Feature set: Domain expert |
|---|---|---|---|---|
| Coefficient of determination (R²) | Random forest regressor | 64.11% | 65.38% | 62.34% |
| Coefficient of determination (R²) | SVR (linear kernel) | 68.62% | 68.50% | 64.61% |
| Coefficient of determination (R²) | Linear regression | 69.69% | 72.02% | 69.71% |
| Mean absolute error (MAE, g) | Random forest regressor | 110.6372 | 110.9340 | 128.2156 |
| Mean absolute error (MAE, g) | SVR (linear kernel) | 112.9078 | 112.8959 | 120.2414 |
| Mean absolute error (MAE, g) | Linear regression | 107.7791 | 108.6024 | 112.9223 |
| Pearson correlation (r) | Random forest regressor | 0.8673 | 0.8667 | 0.8104 |
| Pearson correlation (r) | SVR (linear kernel) | 0.8675 | 0.8701 | 0.8586 |
| Pearson correlation (r) | Linear regression | 0.8357 | 0.8492 | 0.8363 |
The linear regression model trained on the statistically selected features for breast weight estimation, Mweight, is given by:
| 8 |
Interestingly, we find a high correlation between the coefficients of the breast anthropometric features {f12, f13, … f18} in the breast volume and breast weight models. Specifically, the features {f13, f16} have a positive sign, while the features {f12, f15, f17, f18} have a negative sign.
More results analysis
To gain further insights into the model's performance, we conducted additional analyses on the prediction results for the full dataset (comprising both training and validation sets) of 199 samples. For prediction of breast volume on the whole dataset, the r-squared (r2) was 91.00% and MAE was 94.7 ml. The 95% Confidence Interval (CI) [lower, upper] for the ground truth value is [-16.66 ml, 14.949 ml].
From the graph in Fig. 3a, it could be observed that the volume predicted by the machine learning model was highly correlated (r = 0.91) with the Archimedes breast volume. Interestingly, Fig. 3b shows that the range between the first and third quartiles of the predicted volume for each ptosis severity type was within ± 100 ml. Notably, there were only a total of six outliers in the mild, moderate, and severe ptosis categories, but within the no ptosis category itself, there were 12 outliers.
Fig. 3.
Performance analysis of breast volume estimation using the best machine learning model. a Actual-prediction plot showed that the volume predicted by the machine learning model was highly correlated (r = 0.91) with the Archimedes breast volume. b Volume-difference boxplot showed that the range between the first- and third-quartile of the predicted volume for each ptosis severity type was within ± 100 ml
For predicting breast weight, the r-squared (r2) on the whole dataset was 87.01% and the MAE was 112.8959 g. The 95% CI [lower, upper] for the ground truth value was [-15.38 g, 14.95 g]. In the actual-prediction plot in Fig. 4a, the weight predicted by the machine learning model showed a strong correlation with the ground-truth weight (r = 0.87). Furthermore, the machine learning model kept the error well within ± 100 g. Additionally, the boxplot in Fig. 4b indicated that the range between the first and third quartiles of the predicted weight for each ptosis severity type was within ± 100 g. As in breast volume prediction, there were considerably more outliers in the no ptosis category.
Fig. 4.
Performance of breast weight estimation using the best machine learning model. a Actual-prediction plot shows that the weight predicted by machine learning was highly correlated (r = 0.87) with the ground truth weight. b Weight-difference boxplot showed that the range between the first- and third quartile of the predicted weight for each ptosis severity type was within ± 100 g
Discussion
A total of 199 patients underwent breast surgery, some of them bilateral mastectomy, and 80 breast volume and weight measurements were obtained. Their ages ranged from 32 to 95, with a median of 58. They had an average BMI of 25.47 ± 5.74. The median ages at menarche and menopause were 13 (8 to 18) and 50.15 (40 to 60), respectively, which were consistent with those reported in a local study [52]. The cohort demography reflects the patient population at our single center rather than the national distribution, where Malays represent the majority. While this relative overrepresentation of Chinese patients may limit the model's national representativeness, the age and ethnic composition are unlikely to have significantly affected its predictive performance.
The two-tailed Spearman Correlation Coefficient, a rank-based statistic that measures how strongly two variables are monotonically related, showed that weight, BMI (Body Mass Index), breast cup size, ptosis severity (the extent of breast sagging), and anthropometric (body measurement) indicators are highly significant (p < 0.01) and moderately correlated with both breast volume and weight. This is consistent with previous research [20, 25–27]. Boruta, a feature selection algorithm, and statistical tests also identified age at first childbirth and number of births as significant (p < 0.01) and relevant (Boruta importance score ≥ 0.80) predictors, though these features have not been used in prior studies. The Boruta threshold of ≥ 0.80 was chosen to balance the number of false positives and missed relevant features.
We evaluated three machine learning models: a random forest regressor, a support vector regressor with a linear kernel, and linear regression. Each model was trained on three feature sets: Boruta-selected, statistical-test-selected, and domain-expert-selected. The linear regressor with features selected by statistics and domain expertise (n = 11) performed best for both breast volume and weight on the validation set. This model achieved an R2 of 81.8% for breast volume (domain expert features) and 72% for breast weight (statistical test features). It also yielded the lowest MAE for both breast volume (94.7 mL) and weight (107.78 g). The linear regression coefficients provide interpretable insights into the influence of individual features, enabling clinicians to assess the relative importance of anthropometric and clinical variables. This interpretability complements the model's predictive performance and supports its clinical relevance.
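A minimal sketch of this kind of model comparison is given below, assuming scikit-learn and SciPy. The data are synthetic, the three feature names are hypothetical, and the hyperparameters (number of trees, C, the scaling step for the SVR) are illustrative choices rather than the settings used in the study.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(42)
n = 200  # synthetic sample size (assumption)
feature_names = ["bmi", "chest_circumference_cm", "notch_to_nipple_cm"]  # hypothetical
X = np.column_stack([
    rng.normal(25.5, 5.7, n),
    rng.normal(90.0, 10.0, n),
    rng.normal(22.0, 3.0, n),
])
# Synthetic "volume" target: linear in the features plus Gaussian noise.
y = 25.0 * X[:, 0] + 6.0 * X[:, 1] + 10.0 * X[:, 2] - 900.0 + rng.normal(0, 80, n)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    # Scaling is added here so the linear-kernel SVR converges quickly.
    "svr_linear": make_pipeline(StandardScaler(), SVR(kernel="linear", C=100.0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_val)
    r, _ = pearsonr(y_val, pred)  # Pearson correlation between truth and prediction
    results[name] = {
        "r2": r2_score(y_val, pred),
        "mae": mean_absolute_error(y_val, pred),
        "pearson_r": r,
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})

# Linear regression coefficients: change in predicted target per unit of each feature.
for fname, coef in zip(feature_names, models["linear_regression"].coef_):
    print(f"{fname}: {coef:.2f}")
```

Reporting R2, MAE, and Pearson's r side by side is useful because R2 and r are unitless, whereas MAE is expressed in the units of the target (mL or g) and is therefore easier to relate to a clinical tolerance.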
The study found that the optimal feature selection method for breast volume prediction was domain expert selection, in which specialists in breast anatomy or measurement chose the features. In contrast, for breast weight prediction, the SVR algorithm performed best and the domain expert features performed worst. This difference may result from variable interdependence. Expert-driven feature selection may better capture the geometric complexity of breast volume than the mass-related characteristics of breast weight. Volume estimation relies on spatial and morphological features, which makes domain expertise valuable; breast weight prediction, however, may depend on subtle nonlinear relationships. Machine learning algorithms such as SVR may capture these relationships better because of their sensitivity to feature dimensionality and interaction effects. These findings underscore the need to tailor feature selection strategies to each prediction target.
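One way to check whether the preferred feature set differs by prediction target is to score every feature set against every target under cross-validation. The sketch below assumes scikit-learn; the feature names, the membership of each feature set, and the synthetic targets are all hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 200  # synthetic sample size (assumption)
names = ["bmi", "cup_size", "ptosis_grade", "chest_circ", "parity"]  # hypothetical
X = rng.normal(size=(n, len(names)))  # standardized synthetic features

# Two synthetic targets that depend on different subsets of the features.
targets = {
    "volume": 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0 * X[:, 3] + rng.normal(0, 1.5, n),
    "weight": 3.0 * X[:, 0] + 1.0 * X[:, 2] + 1.0 * X[:, 4] + rng.normal(0, 1.5, n),
}

# Hypothetical feature sets standing in for the three selection strategies.
feature_sets = {
    "domain_expert": ["bmi", "cup_size", "chest_circ"],
    "statistical": ["bmi", "ptosis_grade", "parity"],
    "boruta_like": ["bmi", "cup_size", "ptosis_grade"],
}

best = {}
for target_name, y in targets.items():
    scores = {}
    for set_name, cols in feature_sets.items():
        idx = [names.index(c) for c in cols]
        # Mean 5-fold cross-validated R2 for this (target, feature set) pair.
        scores[set_name] = cross_val_score(
            LinearRegression(), X[:, idx], y, cv=5, scoring="r2"
        ).mean()
    best[target_name] = max(scores, key=scores.get)
    print(target_name, {k: round(v, 3) for k, v in scores.items()},
          "-> best:", best[target_name])
```

Scoring each target separately, rather than choosing a single feature set for both, keeps the comparison aligned with the observation that volume and weight may respond to different predictors.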
Our approach applies a machine learning framework that integrates medical and demographic factors to estimate breast volume and weight. It achieves high replicability and reasonable accuracy when compared to 3D CT/MRI reconstruction and mammography. This demonstrates practical value in medical settings. These models can serve as a screening tool, reducing the need for direct clinician involvement in diagnoses. Incorporating intraoperative breast volume measurements, patient demographic data, and detailed anthropometric measurements allowed the model to learn from real surgical data and patient-specific factors. This improved prediction accuracy and supports future refinement. While MRI remains the gold standard for breast volume assessment, its cost and technical demands limit widespread use. By combining selected clinical and anthropometric features with machine learning, our method provides a non-invasive, cost-effective, and clinically applicable solution for breast volume estimation.
Conclusion
Our research demonstrates the potential of machine learning in estimating breast parameters. The current best validation R2 values for breast volume and weight are 81.8% and 72%, respectively. Larger, richer, and more diverse training sets could improve these results. Performance should also be tested at scale across medical communities. To meet the needs of different demographic groups, specialized machine learning models can be developed, although their robustness depends on the diversity of the training data. The main constraint is acquiring training data, a process complicated by ethical issues, especially with medical data. The promising correlation scores, together with the performance of domain expert features in breast volume estimation, suggest that demographic and anthropometric features have a quantifiable relationship with key breast measurements. Future work should include pilot studies in various healthcare settings, protocols for data ethics and privacy, and collaborations with clinical partners to scale these models responsibly.
Acknowledgements
The authors have nothing to report.
Authors' contributions
SPT collected and analyzed the data, developed the technique and software, performed the experiments, and drafted the manuscript. MHS conceptualized, supervised, collected data, and edited the manuscript. LLL collected data and edited the manuscript. LKW conceptualized, supervised data analysis, and edited the manuscript. SN performed the experiments and edited the manuscript. JS supervised data analysis and edited the manuscript. KR collected data. TAO supervised data analysis and edited the manuscript. MISMHT analyzed the data and edited the manuscript. KHN conceptualized, supervised the technique of data collection, supervised data analysis, and edited the manuscript. All authors approved the manuscript.
Funding
Open access funding provided by The Ministry of Higher Education Malaysia and Universiti Malaya. This study was funded by the UMSC C.A.R.E Grant (Grant Number: PV041-2021) and MMU Internal Research Fund (Grant Number: MMUI/220030).
Data availability
The data were obtained from University Malaya Medical Center (UMMC) in Malaysia and are available upon request for research purposes.
Declarations
Conflict of interest
The authors declare no conflicts of interest.
Ethical Approval
The study was approved by the University of Malaya Medical Centre’s Medical Ethics Committee (UMMC MREC ID NO: 202032–8339) and conformed to the Declaration of Helsinki 1975. Informed consent was obtained from all patients who participated in this study. The patients’ details were also protected under Malaysia’s Personal Data Protection Act 2010.
Consent to Participate
Informed consent was obtained from all patients who participated in this study.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Sheng-Pu Teo and Mee-Hoong See are co-first authors.
Contributor Information
Lai-Kuan Wong, Email: lkwong@mmu.edu.my.
Kwan-Hoong Ng, Email: ngkh@ummc.edu.my.
References
- 1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424. 10.3322/caac.21492.
- 2. Xi W, et al. Objective breast volume, shape, and surface area assessment: a systematic review of breast measurement methods. Aesthet Plast Surg. 2014;38(6):1116–30. 10.1007/s00266-014-0412-5.
- 3. Tomita K, Yano K, Hata Y, Nishibayashi A, Hosokawa K. DIEP flap breast reconstruction using 3-dimensional surface imaging and a printed mold. Plast Reconstr Surg Glob Open. 2015;3(3):e316. 10.1097/GOX.0000000000000288.
- 4. Tepper OM, Small K, Rudolph L, Choi M, Karp N. Virtual 3-dimensional modeling as a valuable adjunct to aesthetic and reconstructive breast surgery. Am J Surg. 2006;192(4):548–51. 10.1016/j.amjsurg.2006.06.026.
- 5. Ju X, Henseler H, Peng MJ, Khambay BS, Ray AK, Ayoub AF. Multi-view stereophotogrammetry for post-mastectomy breast reconstruction. Med Biol Eng Comput. 2016;54(2–3):475–84. 10.1007/s11517-015-1334-3.
- 6. de Heras Ciechomski P, et al. Development and implementation of a web-enabled 3D consultation tool for breast augmentation surgery based on 3D-image reconstruction of 2D pictures. J Med Internet Res. 2012;14(1):e21. 10.2196/jmir.1903.
- 7. Bulstrode N, Bellamy E, Shrotria S. Breast volume assessment: comparing five different techniques. Breast. 2001;10(2):117–23. 10.1054/brst.2000.0196.
- 8. Katariya RN, Forrest AP, Gravelle IH. Breast volumes in cancer of the breast. Br J Cancer. 1974;29(3):270–3. 10.1038/bjc.1974.66.
- 9. Kalbhen CL, McGill JJ, Fendley PM, Corrigan KW, Angelats J. Mammographic determination of breast volume: comparing different methods. AJR Am J Roentgenol. 1999;173(6):1643–9. 10.2214/ajr.173.6.10584814.
- 10. Warren, L., et al. Deep learning to calculate breast density from processed mammography images. In 15th International Workshop on Breast Imaging (IWBI2020), 11513, 352–358. 10.1117/12.2561278. (2020).
- 11. Whelehan P, Evans A, Wells M, Macgillivray S. The effect of mammography pain on repeat participation in breast cancer screening: a systematic review. Breast. 2013;22(4):389–94. 10.1016/j.breast.2013.03.003.
- 12. Rutter DR, Calnan M, Vaile MS, Field S, Wade KA. Discomfort and pain during mammography: description, prediction, and prevention. BMJ. 1992;305(6851):443–5. 10.1136/bmj.305.6851.443.
- 13. Ivanovska T, Jentschke TG, Daboul A, Hegenscheid K, Völzke H, Wörgötter F. A deep learning framework for efficient analysis of breast volume and fibroglandular tissue using MR data with strong artifacts. Int J Comput Assist Radiol Surg. 2019;14(10):1627–33. 10.1007/s11548-019-01928-y.
- 14. Yoo A, Minn KW, Jin US. Magnetic resonance imaging-based volumetric analysis and its relationship to actual breast weight. Arch Plast Surg. 2013;40(3):203–8. 10.5999/aps.2013.40.3.203.
- 15. El-Oteify M, Megeed HA, Ahmed B, El-Shazly MM. Assessment of the breast volume by a new simple formula. Indian J Plast Surg. 2006;39:13–6. 10.4103/0970-0358.26897.
- 16. Qiao Q, Zhou G, Ling Y. Breast volume measurement in young Chinese women and clinical applications. Aesthet Plast Surg. 1997;21(5):362–8. 10.1007/s002669900139.
- 17. Sigurdson LJ, Kirkland SA. Breast volume determination in breast hypertrophy: an accurate method using two anthropomorphic measurements. Plast Reconstr Surg. 2006;118(2):313–20. 10.1097/01.prs.0000227627.75771.5c.
- 18. Karwasra, I. (2018). Breast Volume Estimation by Anthropometry. Journal of Medical Science and Clinical Research. 6. 10.18535/jmscr/v6i1.35
- 19. Longo B, Farcomeni A, Ferri G, Campanale A, Sorotos M, Santanelli F. The BREAST-V: a unifying predictive formula for volume assessment in small, medium, and large breasts. Plast Reconstr Surg. 2013;132(1):1e–7e. 10.1097/PRS.0b013e318290f6bd.
- 20. Huang NS, et al. A prospective study of breast anthropomorphic measurements, volume, and ptosis in 605 Asian patients with breast cancer or benign breast disease. PLoS ONE. 2017;12(2):e0172122. 10.1371/journal.pone.0172122.
- 21. Kececi Y, Sir E. Prediction of resection weight in reduction mammaplasty based on anthropometric measurements. Breast Care (Basel). 2014;9(1):41–5. 10.1159/000358753.
- 22. Bilgen F, Ural A, Bekerecioğlu M. Preoperative estimation of breast resection weight in patients undergoing inferior pedicle reduction mammoplasty: the Bilgen formula. Turk J Med Sci. 2020;50(4):817–23. 10.3906/sag-1905-7.
- 23. Descamps M, Landau AG, Lazarus D, Hudson DA. A formula determining resection weights for reduction mammaplasty. Plast Reconstr Surg. 2008;121(2):397–400. 10.1097/01.prs.0000298319.01574.02.
- 24. Hernanz F, Muñoz P, Fidalgo M. Breast reduction surgery—an easy formula to estimate the resection weight to be removed. Eur J Plast Surg. 2014;37:373–80. 10.1007/s00238-014-0958-0.
- 25. See MH, Yip KC, Teh MS, Teoh LY, Lai LL, Wong LK, et al. Classification and assessment techniques of breast ptosis: a systematic review. J Plast Reconstr Aesthet Surg. 2023;1(83):380–95.
- 26. Avşar DK, Aygit AC, Benlier E, Top H, Taşkinalp O. Anthropometric breast measurement: a study of 385 Turkish female students. Aesthet Surg J. 2010;30(1):44–50. 10.1177/1090820X09358078.
- 27. Kim SJ, Kim M, Kim MJ. The affecting factors of breast anthropometry in Korean women. Breastfeed Med. 2014;9(2):73–8. 10.1089/bfm.2013.0068.
- 28. Brown N, White J, Milligan A, Risius D, Ayres B, Hedger W, et al. The relationship between breast size and anthropometric characteristics. Am J Hum Biol. 2012;24(2):158–64. 10.1002/ajhb.22212.
- 29. Hashemi EA, Haghighat S, Olfatbakhsh A, Harandi HT, Beheshtian T. Investigating the factors affecting the mammographic density of breast tissue in patients referred to the Breast Cancer Research Center, Iran. Multidiscip Cancer Investig. 2017;1:27–31. 10.21859/mci-010212.
- 30. Al-Qattan, M. M., Aldakhil, S. S., Al-Hassan, T. S., & Al-Qahtani, A. Anthropometric Breast Measurement: Analysis of the Average Breast in Young Nulliparous Saudi Female Population. Plastic and Reconstructive Surgery Global Open, 7(8), e2326. 10.1097/GOX.0000000000002326
- 31. Nadeem B, Bacha R, Gilani SA. Correlation of subcutaneous fat measured on ultrasound with body mass index. J Med Ultrasound. 2018;26(4):205–9. 10.4103/JMU.JMU_34_18.
- 32. Gillman J, Chun J, Schwartz S, Schnabel F, Moy L. The relationship of obesity, mammographic breast density, and magnetic resonance imaging in patients with breast cancer. Clin Imaging. 2016;40(6):1167–72. 10.1016/j.clinimag.2016.08.009.
- 33. Chun J, Refinetti AP, Leite AK, Schnabel FR, Hochman T, Moy L. The relationship of breast density, BMI, and menopausal status in mammography and MRI. J Clin Oncol. 2012;10:36–36. 10.1200/jco.2012.30.27_suppl.36.
- 34. Akhoondinasab, M., Shafaei, Y., Rahmani, A., & Keshavarz, H. A Machine Learning-Based Model for Breast Volume Prediction Using Preoperative Anthropometric Measurements. Aesthetic Plastic Surgery, 1–7. 10.1007/s00266-022-02937-0 (2022).
- 35. Van Rossum, G., & Drake, F. L. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace. https://dl.acm.org/doi/book/10.5555/1593511 (2009).
- 36. Pedregosa, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12:2825–30.
- 37. McKinney, W. Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference, Vol. 445(1), pp. 51–56 (2020).
- 38. Harris CR, et al. Array programming with NumPy. Nature. 2020;585(7825):357–62. 10.1038/s41586-020-2649-2.
- 39. Virtanen P, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17(3):261–72. 10.1038/s41592-019-0686-2.
- 40. Waskom, M., et al. mwaskom/seaborn: v0.8.1 (September 2017). Zenodo. 10.5281/zenodo.883859 (2017).
- 41. Nemethova, A. & Michalconok, G. Preprocessing Raw Data in Clinical Medicine for a Data Mining Purpose. Research Papers Faculty of Materials Science and Technology, Slovak University of Technology. 24(39) 10.1515/rput-2016-0025 (2016).
- 42. Hassler AP, Menasalvas E, García-García FJ, Rodríguez-Mañas L, Holzinger A. Importance of medical data preprocessing in predictive modeling and risk factor discovery for the frailty syndrome. BMC Med Inform Decis Mak. 2019;19(1):33. 10.1186/s12911-019-0747-6.
- 43. Kokoska, S. and Zwillinger, D. CRC Standard Probability and Statistics Tables and Formulae. CRC Press. Section 14.7 (2000).
- 44. Kursa M, Rudnicki W. Feature selection with Boruta package. J Stat Softw. 2010;36:1–13. 10.18637/jss.v036.i11.
- 45. Sanchez-Pinto LN, Venable LR, Fahrenbach J, Churpek MM. Comparison of variable selection methods for clinical predictive modeling. Int J Med Inform. 2018;116:10–7. 10.1016/j.ijmedinf.2018.05.006.
- 46. David A. Freedman. Statistical Models: Theory and Practice. Cambridge University Press. (2009).
- 47. Wiener, A. L. Classification and Regression by Random Forest. R News, 18–22. Retrieved from https://CRAN.R-project.org/doc/Rnews/ (2002).
- 48. Harris Drucker, C. J. Support vector regression machines. In Proceedings of the 9th International Conference on Neural Information Processing Systems, 155–161. https://dl.acm.org/doi/10.5555/2998981.2999003 (1996).
- 49. Willmott C, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res. 2005;30:79. 10.3354/cr030079.
- 50. Brand JS, Czene K, Shepherd JA, Leifland K, Heddson B, Sundbom A, et al. Automated measurement of volumetric mammographic density: a tool for widespread breast cancer risk assessment. Cancer Epidemiol Biomarkers Prev. 2014;23(9):1764–72. 10.1158/1055-9965.EPI-13-1219.
- 51. Schober P, Boer C, Schwarte LA. Correlation coefficients: appropriate use and interpretation. Anesth Analg. 2018;126(5):1763–8. 10.1213/ANE.0000000000002864.
- 52. Ismael NN. A study on the menopause in Malaysia. Maturitas. 1994;19(3):205–9. 10.1016/0378-5122(94)90073-6.