Abstract
Background:
Post-mastectomy post-traumatic stress disorder (PTSD) is a serious mental health issue that remains understudied, particularly in low-resource settings such as Bangladesh. This study aimed to predict PTSD among breast cancer survivors using machine learning (ML) models and to identify significant predictors through the Boruta feature selection algorithm, offering a scalable approach to early detection and intervention.
Methods:
A cross-sectional study of 138 post-mastectomy breast cancer patients was conducted across 3 hospitals in Bangladesh. Data on sociodemographic characteristics, health history, social experience, and treatment were collected using validated tools, including the PTSD Checklist for DSM-5 (PCL-5). The Boruta algorithm identified key predictors, and 10 ML models were evaluated for PTSD prediction using accuracy, sensitivity, specificity, and area under the curve (AUC).
Results:
Random Forest (RF) outperformed other models (accuracy: 88.9%, AUC: 0.914). Significant predictors included education, monthly income, and changes in family behaviour. Factors like marital status, having chronic diseases, and hormone therapy were not statistically significant. PTSD prevalence was 34.1%, with urban residents and younger patients facing higher risks.
Conclusion:
ML models, particularly RF, demonstrated strong predictive performance and identified critical PTSD predictors. These findings highlight the potential for cost-effective PTSD screening in resource-constrained settings. Future research should focus on broader validation and longitudinal studies to refine predictive models.
Keywords: breast cancer, mastectomy, post-traumatic stress disorder, machine learning, socio-demographic characteristics
Background
Post-Traumatic Stress Disorder (PTSD) is a severe mental health condition triggered by traumatic events, such as interpersonal violence (eg, sexual assault or domestic abuse), serious accidents or natural disasters, medical trauma like a cancer diagnosis or major surgery, and the sudden or violent death of a loved one. Among breast cancer patients, particularly those undergoing mastectomy, PTSD is a growing but underexplored concern, especially in low-resource settings. 1 Characterized by intrusive symptoms like flashbacks, nightmares, emotional numbness, and hypervigilance, PTSD significantly impairs daily functioning and overall quality of life. 2 The American Psychiatric Association recognizes a cancer diagnosis as a traumatic stressor capable of triggering PTSD, with symptoms often including re-experiencing the event, avoiding reminders, and negative shifts in mood and cognition. 3 Among breast cancer patients, PTSD prevalence is alarmingly high, particularly among those undergoing mastectomy, a common treatment for advanced disease stages. 4 The psychological toll of mastectomy—altered body image, perceived loss of femininity, and diminished self-esteem—coupled with fear of recurrence and societal stigma, often leads to PTSD, making it a critical concern in survivorship care. 5
A PRISMA-based review of postoperative delirium (POD) and postoperative cognitive dysfunction (POCD) identified shared phenotypes, risk factors, and mechanisms involving inflammation, cellular stress, and neural injury in older adults. From 24 354 records, 176 studies were examined, 24 of them in depth, for biomarkers tied to postsurgical cognitive decline. No universal biomarkers emerged, and because definitions of POCD are still evolving, the review called for further genetic and neurobiological research. 6 In the trauma-care literature, Kim et al operationalized Swanson’s caring theory as a 6-session, one-to-one, nurse-led programme for sexually exploited women. Using the PCL-5, CES-D, HPLP-II, and RSES, they reported significant improvements from pre-test to post-test and at 1-month follow-up: PTSD and depression decreased markedly and health-promoting behaviours rose, while self-esteem improved but waned by 1 month. 7 A randomized controlled trial (RCT) evaluated an online programme based on Roy’s Adaptation Model for trauma-exposed female college students (n = 16) over 2 months. Compared with controls, the programme reduced PTSD and depression and improved functional health and adjustment, with gains sustained for 1 month, indicating that nurse-led internet-based mental healthcare can be effective. 8 Another PRISMA-guided systematic review examined coping strategies and PTSD among forcibly displaced people. Maladaptive strategies, including other-blame and emotion-focussed disengagement, were linked to higher PTSD. The findings support culturally sensitive mental health care and call for replication. 9
Globally, breast cancer is the most commonly diagnosed cancer, with over 2.3 million new cases and approximately 685 000 deaths reported in 2020. 10 Although it affects women across all socioeconomic groups, the burden is disproportionately high in low- and middle-income countries (LMICs), where limited access to early detection, delayed diagnoses, and inadequate treatment options exacerbate morbidity and mortality. 11 In Bangladesh, breast cancer is the leading cause of cancer-related death among women, accounting for 19% of all female cancer cases. Most patients are diagnosed at advanced stages (Stage III or IV) due to sociocultural stigma, low health literacy, and the absence of organized screening programmes. 12 While mastectomy can be life-saving, it imposes significant psychological strain. Up to 24.1% of breast cancer survivors develop PTSD in China, and as many as 90% report symptoms of post-traumatic stress during treatment in Italy.13,14
Despite these alarming figures, mental health challenges in LMICs like Bangladesh remain underexplored, with limited research on PTSD predictors or pathways for early detection and intervention. Systemic healthcare barriers further compound the issue, including underfunding, a shortage of mental health professionals, and widespread stigma surrounding psychological disorders. In Bangladesh, only 0.44% of the national health budget is allocated to mental health, contributing to a significant treatment gap for PTSD and other conditions. 15 This lack of infrastructure underscores the urgent need for innovative, cost-effective, and scalable solutions to identify and manage PTSD in resource-constrained environments.
Being female is closely linked with sociodemographic factors such as age, education, income, marital status, occupation, and place of residence. For women, lower education and income, rural living, and young or older age are often associated with increased vulnerability and traditional gender roles, affecting empowerment, health, and life opportunities. 16 While studies in high-income countries have identified numerous PTSD predictors—such as socio-demographic factors, psychological distress, fear of cancer progression, and biological mechanisms—research in LMICs remains limited. For example, studies in China have highlighted menopause, blood cholesterol levels, and social support as significant factors, while research in Nigeria has emphasized the roles of religiosity, chemotherapy, and illness perception.17,18 Advances in biological research have further linked PTSD risk to inflammation and hormonal changes, particularly those driven by stress-activated inflammatory pathways and anti-endocrine therapies. 4
Recent advances in predictive modelling, particularly those utilizing Machine Learning (ML), provide powerful tools for exploring complex associations. ML excels at analysing high-dimensional data, uncovering nonlinear relationships, and modelling intricate feature interactions, capabilities that are often beyond traditional statistical approaches.19,20 One study built ML models to predict PTSD in refugees using PCL-5 symptom clusters and sociodemographic data. Among 77 refugees in Portugal, random forest models implemented in R and Python achieved AUC values ranging from roughly 0.50 to 0.93, with pooled sensitivity of 33% to 70%. 21 Another notable application is Elastic Net modelling, which has been used to develop methylation-based PTSD risk scores with high predictive accuracy and precision. 22 ML offers substantial advantages in PTSD research, particularly in LMIC settings. Bangladesh faces late-stage breast cancer diagnosis, low screening rates, and stigma that delay care. 12
Mental health services are limited and underfunded, leaving a treatment gap for PTSD. 15 Household role shifts after mastectomy affect support and stress. In this context, practical ML screening based on routine data could flag high-risk patients for timely referral within oncology clinics. 23 Algorithms like Random Forest (RF) effectively handle noisy or imbalanced datasets and reduce overfitting, while feature selection methods such as the Boruta algorithm improve model accuracy by isolating the most relevant predictors.24,25 These tools are especially valuable where data may be limited, incomplete, or inconsistent, which are common challenges in LMIC healthcare systems. By identifying key socio-demographic, behavioural, and treatment-related predictors, ML enables the development of targeted interventions for PTSD prevention and care.
This study aims to address these critical gaps by developing and validating ML-based predictive models for PTSD among post-mastectomy breast cancer patients in Bangladesh. Specifically, it employs the Boruta algorithm for feature selection and evaluates the performance of 10 ML algorithms to identify significant predictors. By focussing on a resource-constrained setting, this research provides actionable insights to support early intervention, mental health resource planning, and improved outcomes. Furthermore, it is expected that variables such as education level, household income, and changes in family behaviour will emerge as significant predictors. The use of the Boruta algorithm is also hypothesized to enhance model performance by selecting the most relevant features and minimizing overfitting. Ultimately, this study contributes to global efforts to reduce the burden of PTSD among breast cancer survivors, particularly in LMICs where the need for data-driven, scalable mental health solutions is most imperative.
Materials and Methods
Study Area and Population
This cross-sectional study was conducted from February 2024 to October 2024 among breast cancer patients attending the National Institute of Cancer Research and Hospital (NICRH), Dhaka Medical College Hospital, and Ahsania Mission Cancer & General Hospital. The study included patients who had undergone mastectomy and were attending the outpatient departments of these hospitals. All participants were aged 18 years or older. The study followed EQUATOR Network guidance and is reported in accordance with the STROBE checklist for cross-sectional studies.
Sample Size Determination
The sample size was calculated based on the following parameters:
Prevalence of PTSD: 10% (P = .10). 26
Confidence Level: 95% (Z-score = 1.96).
Margin of Error: 5% (E = 0.05).
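Using the formula 27 for sample size determination in proportion studies:
$$n = \frac{Z^{2}\,P\,(1 - P)}{E^{2}}$$
Substituting the values:
$$n = \frac{(1.96)^{2} \times 0.10 \times (1 - 0.10)}{(0.05)^{2}} = \frac{0.3457}{0.0025} \approx 138.3$$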
Thus, the required sample size was approximately 139. Due to time constraints and patient availability, 138 patients were enrolled, which is sufficiently close to the calculated sample size, allowing for robust statistical analysis. This sample size is adequate to ensure that the study findings are representative of breast cancer patients undergoing mastectomy in the selected hospitals.
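The calculation above can be checked with a few lines of code. This is a minimal sketch in Python (the study itself used R), assuming the standard single-proportion formula n = Z²P(1 − P)/E² with the parameters listed above; the function name is illustrative only.

```python
import math


def sample_size_for_proportion(p: float, z: float = 1.96, e: float = 0.05) -> float:
    """Unrounded sample size for estimating a single proportion.

    p: anticipated prevalence, z: Z-score for the confidence level,
    e: margin of error.
    """
    return (z ** 2) * p * (1 - p) / (e ** 2)


# Parameters taken from the text: P = 0.10, Z = 1.96, E = 0.05.
n_raw = sample_size_for_proportion(p=0.10)

# Sample sizes are rounded up to the next whole participant.
n_required = math.ceil(n_raw)

print(f"unrounded n = {n_raw:.2f}, required n = {n_required}")
```

Rounding up rather than to the nearest integer is the usual convention, which is why 138.3 becomes 139.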
Data Collection Tools and Methods
A pre-tested semi-structured questionnaire, which included the PTSD Checklist for DSM-5 (PCL-5), was administered to patients attending the outpatient departments. The questionnaire, including the PCL-5, was administered in Bengali following a forward–back translation process: 2 bilingual public-health researchers independently translated the items into Bengali; a third bilingual expert back-translated them into English; and discrepancies were reconciled by a panel including a clinical psychologist and an oncologist to maintain conceptual, not just literal, equivalence. We piloted the translated instrument with 15 post-mastectomy patients (11% of total sample) to check comprehensibility and cultural relevance; minor wording refinements were made (eg, clarifying idioms related to re-experiencing and hyperarousal).
Respondents were asked to rate how bothered they had been by each of 20 items in the past month on a 5-point scale. Each item is scored from 0 to 4, and the PTSD Checklist total score is the sum of the 20 item ratings, giving a maximum of 80 points. A total score at or above a cut-off of 31 to 33 suggests that the patient is likely to be diagnosed with PTSD; lower scores may indicate that the patient does not meet criteria for PTSD.
Ethical Considerations
Ethical approval was obtained from the Ethical Review Committee of the Faculty of Health and Life Sciences (FHLS), Daffodil International University (DIU), Bangladesh (reference number FHLS-REC/DIU/2024/0016). Official permission was obtained from the selected hospital authorities prior to data collection. The semi-structured questionnaire (including the PCL-5) and the informed consent form were translated into Bengali, reviewed by 2 public health experts with oncology and psychology expertise, and approved for administration, together with the protocol, by the same ethics committee. Written informed consent was obtained from literate participants. For participants who were unable to read or sign, native-speaking interviewers explained the study purpose and procedures in Bengali and collected thumbprints on the consent form.
Sample
Of the 150 patients initially assessed for eligibility, 11 were excluded for not meeting the inclusion criteria. We approached the 139 eligible post-mastectomy patients; 1 did not consent to publication, so 138 participants were included in the study: 46 from the National Institute of Cancer Research and Hospital (NICRH), 46 from the Dhaka Medical College Hospital (DMCH), and 46 from the Ahsania Mission Cancer & General Hospital (AMCGH). Figure 1 presents the sampling framework used in this study.
Figure 1.
Sampling framework of this study.
Variables
The survey included a questionnaire assessing sociodemographic data, health behaviours, treatment information, and PTSD symptoms as measured by the PTSD Checklist for DSM-5 (PCL-5). 28 Diagnostic terminology follows the DSM-5 Text Revision (2022), which updates DSM-5 and clarifies criteria and descriptors. 29 The PCL-5 demonstrated strong internal consistency, with a Cronbach’s alpha of .91 indicating good reliability. 30
Dependent Variable
In this study, the response variable was Post-Traumatic Stress Disorder (PTSD) status. To determine PTSD status, each respondent completed the PCL-5, a 20-item self-report measure that employs a 5-point Likert scale ranging from 0 to 4, where 0 represents “not at all,” 1 “a little bit,” 2 “moderately,” 3 “quite a bit,” and 4 “extremely.” The PCL-5 assesses DSM-5 PTSD symptoms across 4 clusters: intrusion (Cluster B, 5 items), covering unwanted memories, nightmares, and flashbacks; avoidance (Cluster C, 2 items), covering avoidance of trauma-related thoughts or reminders; negative alterations in cognition and mood (Cluster D, 7 items), covering negative beliefs, emotions, and detachment; and arousal and reactivity (Cluster E, 6 items), covering irritability, hypervigilance, startle, concentration, and sleep problems. The total severity score can range from 0 to 80. For the current study, a cut-off score of ⩾33 was applied to identify probable PTSD. 28 Participants with a total score of 33 or higher were classified as having probable PTSD; those scoring below 33 were classified as not meeting the criteria for PTSD.
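The scoring rule described above can be sketched as follows. This is an illustrative Python example, not the authors' code; it assumes responses are supplied in the standard PCL-5 item order (items 1 to 5 for Cluster B, 6 to 7 for Cluster C, 8 to 14 for Cluster D, and 15 to 20 for Cluster E), and the function and variable names are hypothetical.

```python
# Zero-based index ranges for each DSM-5 symptom cluster, assuming the
# standard PCL-5 item order (an assumption; the text gives only item counts).
CLUSTERS = {
    "B_intrusion": range(0, 5),         # items 1-5
    "C_avoidance": range(5, 7),         # items 6-7
    "D_cognition_mood": range(7, 14),   # items 8-14
    "E_arousal": range(14, 20),         # items 15-20
}
CUTOFF = 33  # study cut-off for probable PTSD


def score_pcl5(responses):
    """Return total score, cluster subscores, and probable-PTSD flag."""
    if len(responses) != 20:
        raise ValueError("PCL-5 requires exactly 20 item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each item must be rated 0, 1, 2, 3, or 4")
    total = sum(responses)
    clusters = {
        name: sum(responses[i] for i in items)
        for name, items in CLUSTERS.items()
    }
    return {"total": total, "clusters": clusters, "probable_ptsd": total >= CUTOFF}


# Hypothetical respondent who rates every item "moderately" (2).
example = score_pcl5([2] * 20)
print(example["total"], example["probable_ptsd"])
```

A respondent scoring exactly 33 is classified as probable PTSD, while a score of 32 is not, which matches the ⩾33 rule stated in the text.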
Independent Variables
In this analysis, we considered 4 types of predictor variables: Sociodemographic Information (Age, Education, Occupation, Monthly Income, Marital Status, Age of Last Child, and Residence), Health History (Chronic Disease, Mental Illness, Performing Household Activity, Tumour Grade), Social Experience (Husbands’ Behaviour, Husbands’ Behaviour Change, Family Members’ Behaviour, and Family Members’ Behaviour Change), and Treatment Information (Mastectomy Operation, Chemotherapy, Radiotherapy, and Hormone Therapy).
All variables were assessed using uniform procedures across the study population, and as the design did not include comparison groups, issues of inter-group comparability were not applicable. A detailed description of each variable, including its categorization and distribution, is presented in Tables 1 to 4.
Table 1.
Sociodemographic Characteristics of the Participants.
| Sociodemographic variables | Description | Categorization | Frequency (%) |
|---|---|---|---|
| Age | The age group of the individual. | Below 40 | 30 (21.739) |
| | | 41-50 | 47 (34.058) |
| | | 51-60 | 35 (25.362) |
| | | More than 60 | 26 (18.841) |
| Education | The highest level of education achieved by the individual. | Illiterate | 75 (54.348) |
| | | Primary | 24 (17.391) |
| | | Secondary | 11 (7.971) |
| | | Higher secondary or more | 28 (20.290) |
| Occupation | The type of employment or activity in which the individual is engaged. | Homemaker | 103 (74.638) |
| | | Job employee | 27 (19.565) |
| | | Others (students, business) | 8 (5.797) |
| Monthly income | The monthly income of the individual’s family, in thousands of BDT. | Less than 10 | 26 (18.841) |
| | | 10-20 | 45 (32.609) |
| | | 20-30 | 28 (20.290) |
| | | 30-40 | 18 (13.043) |
| | | More than 40 | 21 (15.217) |
| Marital status | The individual’s marital situation. | Married | 118 (85.507) |
| | | Widow or Divorce | 20 (14.493) |
| Age of last child | The age group of the individual’s youngest child in years. | Below 15 | 46 (33.333) |
| | | 15-25 | 43 (31.159) |
| | | 25-35 | 35 (25.362) |
| | | 35-45 | 14 (10.145) |
| Residence | The type of area in which the individual resides. | Urban | 66 (47.826) |
| | | Rural | 72 (52.174) |
Table 3.
Social Experience of the Participants.
| Social experience variables | Description | Categorization | Frequency (%) |
|---|---|---|---|
| Husbands’ behaviour | Whether there was a change in the husband’s behaviour after the mastectomy. | Yes | 60 (43.478) |
| | | No | 78 (56.522) |
| Husbands behaviour change | Type of change in husband’s behaviour after mastectomy. | Not applicable | 79 (57.246) |
| | | Sympathized | 42 (30.435) |
| | | Avoiding | 8 (5.797) |
| | | Misbehave | 9 (6.522) |
| Family members behaviour | Whether there was a change in the behaviour of other family members after the mastectomy. | Yes | 66 (47.826) |
| | | No | 72 (52.174) |
| Family members behaviour change | Type of change in other family members’ behaviour after mastectomy. | Not applicable | 73 (52.899) |
| | | Sympathized | 16 (11.594) |
| | | Avoiding | 49 (35.507) |
| | | Misbehave | 73 (52.899) |
Data Analysis and Feature Selection Method
This study employed the Boruta algorithm 31 to identify the most relevant PTSD-related features and their importance scores. The Boruta algorithm uses an RF classifier to iteratively compare the importance of each original feature with that of shadow features, which are created by shuffling the values of the original features across the dataset and therefore represent random noise. In the simplified flowchart shown in Figure 2, the process begins with data preparation, in which the original dataset is extended with one shadow feature for each original attribute. An RF classifier is then trained on the extended dataset, and it assigns importance scores to all original and shadow features. The highest importance score among the shadow features is taken as the Maximum Z-Score Among Shadow Attributes (MZSA).
Figure 2.
Flowchart of Boruta Algorithm.
Each original feature is then compared with the MZSA. A feature whose importance is higher than the MZSA is recognized as important; a feature whose importance is significantly lower than the MZSA is confirmed as unimportant; and a feature whose importance is similar to the MZSA remains tentative. Features confirmed as unimportant are removed, and the RF model is retrained. The algorithm iterates, generating new shadow features for the remaining tentative attributes. This process continues until all attributes are categorized as either important or unimportant, or until the preset maximum number of iterations is reached. Boruta returns a ranked list of features and their importance scores, categorized as confirmed important, confirmed unimportant, or tentative.
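The shadow-feature comparison at the core of this procedure can be illustrated with a deliberately simplified sketch. It is written in Python with the standard library only and is not the Boruta implementation used in the study: as loudly stated assumptions, RF importance Z-scores are replaced by the absolute correlation between a feature and the outcome, and the statistical test that Boruta applies to accumulated hits is replaced by a plain hit count. The toy data and all names are hypothetical.

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Absolute Pearson correlation; a stand-in for RF feature importance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def shadow_hits(features, target, n_iter=20, seed=0):
    """Count how often each feature beats the best shadow feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        # Shadow features: shuffled copies of each original feature.
        shadow_scores = []
        for values in features.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_scores.append(abs_correlation(shadow, target))
        best_shadow = max(shadow_scores)  # analogue of the MZSA
        for name, values in features.items():
            if abs_correlation(values, target) > best_shadow:
                hits[name] += 1
    return hits


# Toy data: one feature identical to the outcome, one unrelated to it.
target = [0] * 20 + [1] * 20
features = {"informative": target[:], "uninformative": [0, 1] * 20}
hits = shadow_hits(features, target)
print(hits)
```

In this toy setting the informative feature beats the best shadow in every iteration and the uninformative one never does, which is the pattern Boruta formalizes when it labels features as confirmed or rejected.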
Prediction Using Machine Learning Model
ML models can address the need for accurate prediction of PTSD status. ML approaches have gained recognition for their capacity to manage complex relationships among variables, which is especially advantageous for predicting health outcomes such as PTSD. 32 Considering the complex interaction of sociodemographic, health, behaviour change, and treatment information in predicting PTSD status, we applied several ML models to improve predictive performance: Random Forest (RF), Partial Least Squares (PLS), Support Vector Machine (SVM), Gradient Boosting Machine (GBM), Decision Tree (DT), Linear Discriminant Analysis (LDA), Bagged Trees (BT), Logistic Regression (LR), Extreme Gradient Boosting (XGB), and K-Nearest Neighbours (KNN).
This selection represents a balanced mix of ensemble learners, linear classifiers, kernel-based methods, and distance-based approaches, each offering distinct strengths in handling varied data distributions and decision boundaries. Each model was trained and validated using a dataset partitioned into an 80% training set and a 20% test set. We further applied 10-fold cross-validation across all models to support generalizability and reduce model variance. This process allowed for reliable estimation of test accuracy while mitigating overfitting. Model performance was evaluated primarily on classification accuracy; sensitivity, specificity, and area under the receiver operating characteristic curve (AUC-ROC) were also examined to capture different aspects of predictive performance.
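The four evaluation metrics named above can be computed from test-set labels, predicted classes, and predicted scores. The following Python sketch uses only the standard library and hypothetical toy values; it is meant to show the definitions, not to reproduce the study's R pipeline or its results. AUC is computed here as the proportion of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = PTSD)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def auc_roc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, round(auc, 3))
```

Because AUC depends only on how cases are ranked, it does not change with the classification threshold, whereas accuracy, sensitivity, and specificity do.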
Technical Implementation and Model Development for PTSD Prediction
The technical implementation and model development for predicting PTSD involved several key steps, including data preprocessing, feature engineering, feature selection, and hyperparameter optimization, all conducted in R (version 4.4.3). During preprocessing, all categorical variables were encoded using label encoding. For feature selection, the Boruta algorithm was run with an RF estimator 33 comprising 500 trees and a maximum of 100 iterations to identify significant predictive features from the initial dataset. To optimize model performance, grid search with 10-fold cross-validation was used to determine the best hyperparameters for each ML model.
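The partitioning scheme described in this section (an 80%/20% hold-out split followed by 10-fold cross-validation on the training portion) can be sketched as below. This Python example is an illustration under stated assumptions: the split is a plain random shuffle without stratification, the test-set size is obtained by rounding, and the seed is arbitrary. The source does not specify these details, and the study's own R code may differ.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into (train, test) lists."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)  # rounding rule is an assumption
    return indices[n_test:], indices[:n_test]


def k_fold(train_indices, k=10):
    """Yield (fit, validation) index lists for each of the k folds."""
    folds = [train_indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# 138 participants, as in the study sample.
train_idx, test_idx = holdout_split(138)
fold_sizes = [len(validation) for _, validation in k_fold(train_idx)]
print(len(train_idx), len(test_idx), fold_sizes)
```

In a grid search, each candidate hyperparameter setting would be fitted on the `fit` rows and scored on the `validation` rows of every fold, and the setting with the best average score would then be evaluated once on the untouched test rows.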
Results
Frequency Distribution
Among the 138 participants, 47 (34.1%) met the PCL-5 cut-off for probable PTSD and 91 (65.9%) did not. Figure 3 shows the proportion of participants with PTSD in the dataset. After selecting and processing the relevant data, we analysed the characteristics of the 138 female patients, focussing on sociodemographic factors, health history, social experience, and treatment information.
Figure 3.
Percentage of people having PTSD in the dataset.
Table 1 presents the sociodemographic characteristics of the 138 participants. The largest age group was 41 to 50 years (34.1%), followed by 51 to 60 years (25.4%). Over half of the participants (54.3%) were illiterate, and most (79.7%) had less than a higher secondary education. Regarding occupation, a large majority (74.6%) were homemakers, while others were employed (19.6%) or involved in other roles such as business or student life (5.8%). In terms of income, 32.6% of participants reported a monthly household income between 10 000 and 20 000 BDT, with smaller proportions in each of the other income bands. Most of the participants were married (85.5%), and 33.3% had a youngest child under 15 years of age. A slightly higher proportion of respondents lived in rural areas (52.2%) than in urban areas (47.8%).
Table 2 focuses on the participants’ physical and mental health status prior to or during treatment. Approximately 26.1% of the participants had a pre-existing chronic disease, while the remaining 73.9% did not. Notably, none of the participants reported a history of mental illness, which may reflect a lack of awareness, diagnosis, or willingness to disclose such conditions. When asked about their ability to perform household tasks, only 23.2% indicated that they were capable, whereas 76.8% reported that they were not, suggesting a degree of physical or functional limitation. Regarding tumour grade, 60.9% had Grade 2 tumours, 34.1% had Grade 3 tumours, and only 5.1% had Grade 1 tumours.
Table 2.
Health History of the Participants.
| Health-related variables | Description | Categorization | Frequency (%) |
|---|---|---|---|
| Chronic disease | Whether the individual had any chronic disease before the mastectomy. | Yes | 36 (26.087) |
| | | No | 102 (73.913) |
| Mental illness | Whether the individual had a history of mental illness. | Yes | – |
| | | No | 138 (100) |
| Household work | Whether the individual engages in household activities or can perform household tasks. | Yes | 32 (23.188) |
| | | No | 106 (76.812) |
| Tumour grade | The classification of the tumour based on severity. | Grade-1 | 7 (5.072) |
| | | Grade-2 | 84 (60.870) |
| | | Grade-3 | 47 (34.058) |
The social experience variables, presented in Table 3, examine how interpersonal relationships were affected following mastectomy. A total of 43.5% of the participants reported a change in their husband’s behaviour after the surgery: 30.4% of all participants experienced sympathetic behaviour, 5.8% faced avoidance, and 6.5% reported being mistreated. Meanwhile, 47.8% noticed a change in the behaviour of other family members. As listed in Table 3, 35.5% experienced avoidance and a notable 52.9% reported being mistreated, revealing significant social challenges and potential emotional strain within family environments during recovery.
Table 4 describes the treatment history of participants following mastectomy. The largest group of respondents (39.9%) had undergone mastectomy 10 to 12 months before the survey, with smaller groups in the earlier postoperative intervals. Notably, none of the participants had received chemotherapy, which may reflect treatment limitations or individual medical decisions. About half of the sample (50.7%) had received radiotherapy, and only a small fraction (8.7%) reported receiving hormone therapy. These treatment-related variables provide important context for understanding the medical exposure and potential psychological impacts experienced by the patients.
Table 4.
Treatment Characteristics of the Participants.
| Treatment variables | Description | Categorization | Frequency (%) |
|---|---|---|---|
| Mastectomy operation | The duration since the individual underwent a mastectomy. | 1-3 mo | 26 (18.841) |
| | | 4-6 mo | 33 (23.913) |
| | | 7-9 mo | 24 (17.391) |
| | | 10-12 mo | 55 (39.855) |
| Chemotherapy | Whether the individual received chemotherapy treatment. | Yes | – |
| | | No | 138 (100) |
| Radiotherapy | Whether the individual received radiotherapy treatment. | Yes | 70 (50.725) |
| | | No | 68 (49.275) |
| Hormone therapy | Whether the individual received hormone therapy. | Yes | 12 (8.696) |
| | | No | 126 (91.304) |
Table 5 presents the bivariate distribution of PTSD status in relation to a range of sociodemographic, familial, and clinical variables among post-mastectomy patients. Age showed a marked gradient, with the highest proportion of PTSD observed in individuals below 40 years (70.0%) and substantially lower proportions in older groups (eg, 14.3% among those aged 51-60, and 15.4% in those over 60). Educational attainment showed a clear positive relationship with PTSD prevalence: while only 12.0% of illiterate participants had PTSD, the proportion rose sharply to 89.3% among those with higher secondary education or more.
Table 5.
Bivariate Characteristics of the Participants with Having PTSD.
| Variables | Having PTSD, N (%) a | No PTSD, N (%) a |
|---|---|---|
| Age | ||
| Below 40 | 21 (70.000) | 9 (30.000) |
| 41-50 | 17 (36.170) | 30 (63.830) |
| 51-60 | 5 (14.286) | 30 (85.714) |
| More than 60 | 4 (15.385) | 22 (84.615) |
| Education | ||
| Illiterate | 9 (12.000) | 66 (88.000) |
| Primary | 8 (33.333) | 16 (66.667) |
| Secondary | 5 (45.455) | 6 (54.545) |
| Higher secondary or more | 25 (89.286) | 3 (10.714) |
| Occupation | ||
| Homemaker | 20 (19.417) | 83 (80.583) |
| Job employee | 22 (81.481) | 5 (18.519) |
| Others | 5 (62.500) | 3 (37.500) |
| Monthly income | ||
| Less than 10 | 2 (7.692) | 24 (92.308) |
| 10-19 | 9 (20.000) | 36 (80.000) |
| 20-29 | 8 (28.571) | 20 (71.429) |
| 30-39 | 10 (55.556) | 8 (44.444) |
| More than 40 | 18 (85.714) | 3 (14.286) |
| Marital status | ||
| Married | 43 (36.441) | 75 (63.559) |
| Widow or divorce | 4 (20.000) | 16 (80.000) |
| Age of last child | ||
| Below 15 | 28 (60.870) | 18 (39.130) |
| 15-25 | 13 (30.233) | 30 (69.767) |
| 25-35 | 5 (14.286) | 30 (85.714) |
| 35-45 | 1 (7.143) | 13 (92.857) |
| Residence | ||
| Urban | 37 (56.061) | 29 (43.939) |
| Rural | 10 (13.889) | 62 (86.111) |
| Chronic disease | ||
| Yes | 12 (33.333) | 24 (66.667) |
| No | 35 (34.314) | 67 (65.686) |
| Mental illness | ||
| Yes | – | – |
| No | 47 (34.057) | 91 (65.942) |
| Household work | ||
| Yes | 3 (9.375) | 29 (90.625) |
| No | 44 (41.509) | 62 (58.491) |
| Tumour grade | ||
| Grade-1 | 4 (57.143) | 3 (42.857) |
| Grade-2 | 26 (30.952) | 58 (69.048) |
| Grade-3 | 17 (36.170) | 30 (63.830) |
| Husbands’ behaviour | ||
| Yes | 33 (55.000) | 27 (45.000) |
| No | 14 (17.949) | 64 (82.051) |
| Husbands behaviour change | ||
| Not applicable | 15 (18.987) | 64 (81.013) |
| Sympathetic | 20 (47.619) | 22 (52.381) |
| Avoiding | 7 (87.500) | 1 (12.500) |
| Misbehave | 5 (55.556) | 4 (44.444) |
| Family members behaviour | ||
| Yes | 36 (54.545) | 30 (45.455) |
| No | 11 (15.278) | 61 (84.722) |
| Family members behaviour change | ||
| Not applicable | 13 (17.808) | 60 (82.192) |
| Misbehave | 11 (68.750) | 5 (31.250) |
| Sympathetic | 23 (46.939) | 26 (53.061) |
| Mastectomy operation | ||
| 1-3 mo | 16 (61.538) | 10 (38.462) |
| 4-6 mo | 19 (57.576) | 14 (42.424) |
| 7-9 mo | 4 (16.667) | 20 (83.333) |
| 10-12 mo | 8 (14.545) | 47 (85.455) |
| Chemotherapy | ||
| Yes | – | – |
| No | 47 (34.057) | 91 (65.942) |
| Radiotherapy | ||
| Yes | 12 (17.143) | 58 (82.857) |
| No | 35 (51.471) | 33 (48.529) |
| Hormone therapy | ||
| Yes | 2 (16.667) | 10 (83.333) |
| No | 45 (35.714) | 81 (64.286) |
a Frequency (%); percentages are calculated within each row.
Occupational status was another differentiating factor, where PTSD prevalence was markedly higher among job employees (81.5%) compared to homemakers (19.4%). A similar trend was noted with monthly income, where higher income groups (eg, >40 000 BDT/month) showed disproportionately higher PTSD rates (85.7%), contrasting with only 7.7% in the lowest income category (<10 000 BDT/month).
Marital status, age of the youngest child, and place of residence were also associated with PTSD status. Widowed or divorced participants had a lower PTSD prevalence (20.0%) than married participants (36.4%), while those with younger children (<15 years) had a notably higher PTSD rate (60.9%). Urban residents had a PTSD prevalence of 56.1%, markedly higher than that of their rural counterparts (13.9%). Participants reporting negative changes in their husband’s behaviour (eg, avoidance or misbehaviour) or similar changes in other family members had substantially higher PTSD rates (eg, 87.5% among those reporting avoidance by their husband; 68.8% among those reporting misbehaviour by family members).
Regarding clinical variables, no substantial differences in PTSD prevalence were observed between those with and without chronic disease. However, participants who could not perform household work were more likely to have PTSD (41.5%) than those who could (9.4%). Tumour grade also demonstrated variation, with PTSD more prevalent in Grade 1 (57.1%) compared to Grades 2 and 3 (Table 5). Timing since mastectomy appeared influential, with PTSD most prevalent in those 1 to 3 months post-surgery (61.5%) and lowest among those 10 to 12 months post-mastectomy (14.5%). Treatment modalities also reflected PTSD variations: radiotherapy recipients had lower PTSD prevalence (17.1%) than non-recipients (51.5%), and hormone therapy was associated with a lower PTSD prevalence (16.7%) compared to non-recipients (35.7%).
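The subgroup prevalences quoted above follow directly from the counts in Table 5. As a minimal sketch, the snippet below recomputes the urban and rural prevalences from the Residence rows and adds a Pearson chi-square test for the 2 × 2 table. The test is an illustrative assumption and is not an analysis reported in this section; the function names are hypothetical.

```python
import math

def prevalence(ptsd: int, no_ptsd: int) -> float:
    """Share of a subgroup classified as having PTSD, in percent."""
    return 100.0 * ptsd / (ptsd + no_ptsd)

def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) and p-value for [[a, b], [c, d]].

    Rows are groups and columns are PTSD yes/no.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Counts copied from the Residence rows of Table 5: (PTSD, no PTSD).
urban = (37, 29)
rural = (10, 62)

urban_prev = prevalence(*urban)
rural_prev = prevalence(*rural)
stat, p_value = chi_square_2x2(*urban, *rural)
print(f"urban {urban_prev:.1f}%, rural {rural_prev:.1f}%, chi2 = {stat:.2f}, p = {p_value:.2g}")
```

The same two helpers can be applied to any other two-level variable in Table 5 by substituting the corresponding counts.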
Boruta Algorithm
In this study, the Boruta algorithm was employed as a feature selection method to identify the most relevant predictors of PTSD. Table 6 summarizes the results. It lists each evaluated feature with its mean importance score, its normalized importance score (normHits, the proportion of Boruta runs in which the feature outperformed the best shadow feature), and its final classification as either “Confirmed” or “Rejected.” Features classified as “Confirmed” were deemed statistically significant and relevant for predictive modelling, whereas those classified as “Rejected” exhibited insufficient importance.
Table 6.
Overview of Boruta Algorithm Outcomes.
| Features | Mean importance | Normalized importance | Status |
|---|---|---|---|
| Age | 4.421001 | 0.7941 | Confirmed |
| Education | 18.3976344 | 1.0000 | Confirmed |
| Occupation | 12.059879 | 1.0000 | Confirmed |
| Monthly income | 13.183537 | 1.0000 | Confirmed |
| Marital status | −2.2879437 | 0.0000 | Rejected |
| Age of last child | 6.3160386 | 0.9706 | Confirmed |
| Residence | 9.4589131 | 1.0000 | Confirmed |
| Chronic disease | −0.9121031 | 0.0000 | Rejected |
| Household work | 7.2057635 | 0.9412 | Confirmed |
| Husbands’ behaviour | 5.9081259 | 0.9412 | Confirmed |
| Husbands behaviour change | 14.6142682 | 0.9706 | Confirmed |
| Family members behaviour | 5.2386789 | 0.9412 | Confirmed |
| Family members’ behaviour change | 9.4191189 | 1.0000 | Confirmed |
| Mastectomy operation | 5.1785529 | 0.9706 | Confirmed |
| Tumour grade | 5.186441 | 0.9412 | Rejected |
| Radiotherapy | −0.2374663 | 0.0000 | Confirmed |
| Hormone therapy | 4.421001 | 0.7941 | Rejected |
Of the 17 evaluated features, 13 were confirmed as statistically significant, including Education, Occupation, Monthly Income, Family Members’ Behaviour Change, and Husbands’ Behaviour Change, all with normalized importance scores of 1.0000 or close to it. In contrast, Marital Status, Chronic Disease, Tumour Grade, and Hormone Therapy were rejected, indicating their limited relevance to the predictive model.
Figure 4 provides a graphical summary of the feature importance scores identified by the Boruta algorithm. The boxplot shows the distribution of importance scores for each feature, with confirmed features in green, rejected features in red, and the shadow features used as controls in blue. The x-axis represents the individual features and the y-axis their importance scores. Education, Occupation, and Monthly Income stand out as the most important predictors, with markedly higher importance scores than the other features. In contrast, rejected features such as Marital Status, Chronic Disease, and Tumour Grade are positioned lower on the graph, with importance scores close to or below zero. The shadow features (ShadowMin, ShadowMean, and ShadowMax) serve as benchmarks for distinguishing meaningful predictors from noise. Together, Table 6 and Figure 4 summarize the Boruta results and identify the predictors carried forward into the predictive modelling.
Figure 4.
Feature selection using the Boruta algorithm.
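The role of the shadow features can be illustrated with a small sketch. It assumes that, in each Boruta run, a feature scores a "hit" when its importance exceeds the highest shadow importance, and that the normalized importance is the share of runs with a hit. The importance values below are hypothetical and are not taken from the study; the actual Boruta package additionally applies a statistical test to the hit counts before labelling a feature as confirmed or rejected.

```python
from statistics import mean

def boruta_summary(feature_runs, shadow_max_runs):
    """Summarize one feature across runs.

    feature_runs: importance of the real feature in each run.
    shadow_max_runs: highest shadow-feature importance in the same runs.
    """
    hits = sum(1 for f, s in zip(feature_runs, shadow_max_runs) if f > s)
    return {
        "mean_importance": mean(feature_runs),
        "norm_hits": hits / len(feature_runs),
    }

# Hypothetical importance scores over five runs (not data from the study).
shadow_max = [2.1, 1.8, 2.5, 2.0, 2.3]
strong_feature = [9.0, 8.5, 9.4, 8.8, 9.1]   # beats the shadows in every run
weak_feature = [-0.5, 0.3, 2.6, -1.0, 0.1]   # beats the shadows in one run only

strong = boruta_summary(strong_feature, shadow_max)
weak = boruta_summary(weak_feature, shadow_max)
print(strong["norm_hits"], weak["norm_hits"])
```

A feature that consistently beats the best shadow feature ends up with a normalized importance near 1, while a feature that rarely does so stays near 0.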
Predicting PTSD with Machine Learning
The study conducted a comprehensive evaluation of several ML models for predicting PTSD status. These models were assessed using 13 statistically significant features chosen via the Boruta feature selection method. The models’ performance was compared based on key metrics, including accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC), with 95% confidence intervals (CIs) provided for accuracy. Table 7 summarizes the performance of the models, while Figure 5 illustrates the corresponding Receiver Operating Characteristic (ROC) curves, offering a visual representation of their discriminatory capabilities.
Table 7.
Evaluating the Performance of Traditional and ML Models with Boruta-Selected Features.
| Model name | Accuracy | 95% CI | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|
| RF | 0.889 | [0.708, 0.976] | 0.889 | 0.889 | 0.914 |
| PLS | 0.852 | [0.663, 0.958] | 0.889 | 0.778 | 0.895 |
| SVM | 0.815 | [0.619, 0.937] | 0.778 | 0.889 | 0.889 |
| GBM | 0.778 | [0.577, 0.914] | 0.778 | 0.778 | 0.873 |
| DT | 0.778 | [0.577, 0.914] | 0.778 | 0.778 | 0.864 |
| LDA | 0.778 | [0.577, 0.914] | 0.833 | 0.667 | 0.846 |
| BT | 0.778 | [0.577, 0.914] | 0.778 | 0.778 | 0.836 |
| LR | 0.741 | [0.537, 0.889] | 0.722 | 0.778 | 0.815 |
| XGB | 0.704 | [0.498, 0.863] | 0.667 | 0.778 | 0.852 |
| KNN | 0.704 | [0.498, 0.863] | 0.778 | 0.556 | 0.802 |
Figure 5.
ROC curve for different models.
The RF model demonstrated the highest overall performance. RF achieved an accuracy of 0.889 (95% CI: [0.708, 0.976]), with balanced sensitivity (0.889) and specificity (0.889). Its AUC value of 0.914 was the highest among all models, underscoring its strong ability to discriminate between PTSD-positive and PTSD-negative cases. RF’s superior performance can be attributed to its capacity to handle complex feature interactions and its robustness against overfitting, which is particularly advantageous for small datasets. The PLS regression model followed closely, achieving an accuracy of 0.852 (95% CI: [0.663, 0.958]), sensitivity of 0.889, specificity of 0.778, and an AUC of 0.895. PLS’s ability to address multicollinearity among features likely contributed to its competitive performance, although its slightly lower specificity compared to RF indicates limitations in distinguishing PTSD-negative cases. The SVM model also performed well, with an accuracy of 0.815 (95% CI: [0.619, 0.937]), a specificity of 0.889, and an AUC of 0.889. However, its sensitivity was slightly lower (0.778), suggesting reduced capability in identifying PTSD-positive cases.
Models such as the GBM and LDA achieved accuracies of 0.778 (95% CI: [0.577, 0.914]) with moderate AUC values of 0.873 and 0.846, respectively. GBM’s iterative boosting approach yielded balanced sensitivity and specificity (both 0.778), while LDA’s higher sensitivity (0.833) but lower specificity (0.667) indicated an increased rate of false positives. Simpler models, such as LR, DT, and BT, exhibited moderate performance, with accuracies ranging from 0.741 to 0.778 and AUC values between 0.815 and 0.864. The Extreme Gradient Boosting (XGB) and KNN models had the lowest accuracy (both 0.704; 95% CI: [0.498, 0.863]), and KNN also had the lowest AUC (0.802).
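The width of the confidence intervals in Table 7 can be made concrete with a worked example. The sketch below computes an exact (Clopper-Pearson) binomial interval for an accuracy of 24 correct predictions out of 27. Both the count and the interval method are assumptions inferred from the reported accuracy of 0.889; the test-set size and the CI method are not stated in this section. Under these assumptions the result should land close to the reported interval of [0.708, 0.976].

```python
from math import comb

def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided binomial confidence interval for k successes in n trials."""

    def solve(successes: int, target: float) -> float:
        # upper_tail is increasing in p, so bisection finds the p that hits the target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if upper_tail(successes, n, mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k) = alpha / 2. Upper bound: P(X <= k) = alpha / 2.
    lower = 0.0 if k == 0 else solve(k, alpha / 2)
    upper = 1.0 if k == n else solve(k + 1, 1 - alpha / 2)
    return lower, upper

# Assumption: 24 correct predictions out of 27 test cases (inferred, not reported).
correct, total = 24, 27
accuracy = correct / total
ci_low, ci_high = clopper_pearson(correct, total)
print(f"accuracy = {accuracy:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With so few evaluation cases, a single additional misclassification shifts the accuracy by almost four percentage points, which is why the intervals of neighbouring models overlap heavily.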
Figure 5 displays the ROC curves for the evaluated models, providing a graphical representation of their sensitivity (true positive rate) versus 1-specificity (false positive rate) across varying thresholds. The AUC values derived from these curves reflect the models’ overall discriminatory capabilities and complement the metrics reported in Table 7.
The RF model, represented by the red curve, has the highest AUC (0.914), demonstrating its strong ability to distinguish PTSD-positive from PTSD-negative cases. Similarly, the PLS regression and SVM models exhibit high AUC values of 0.895 and 0.889, respectively, aligning with their strong quantitative results. The GBM achieves an AUC of 0.873, slightly lower than RF, PLS, and SVM, but still indicative of competitive performance. Models such as LDA (yellow curve), LR, and KNN show relatively lower AUC values of 0.846, 0.815, and 0.802, respectively, consistent with their moderate performance metrics. Lastly, the DT and BT models exhibit moderate AUC values of 0.864 and 0.836, respectively, reflecting their limitations in generalizing to unseen data.
The results presented in Table 7 and Figure 5 collectively highlight the comparative strengths and limitations of the evaluated ML models. While the table provides a detailed breakdown of the key performance metrics, the graph visually confirms these findings by illustrating the models’ ROC curves and associated AUC values. Together, they emphasize the superior performance of the RF model, followed by PLS and SVM, in predicting PTSD status. The high AUC values for these models reinforce their strong discriminatory capabilities, particularly in identifying PTSD-positive cases. In contrast, simpler models such as LR, DT, and KNN demonstrated comparatively lower performance in quantitative metrics and visual ROC curve assessments, reflecting their limited ability to capture complex feature interactions.
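As a reminder of what the AUC values measure, the following minimal sketch computes the AUC as the probability that a randomly chosen PTSD-positive case receives a higher predicted score than a randomly chosen PTSD-negative case, with ties counted as one half. The labels and scores are hypothetical and serve only to illustrate the calculation.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = PTSD) and model scores, not data from the study.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, 11 of the 12 positive-negative pairs are ranked correctly, so the AUC is 11/12. An AUC of 0.5 would correspond to random ranking and 1.0 to perfect separation.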
Discussion
This study demonstrates the utility of several ML models in predicting PTSD among post-mastectomy breast cancer patients. The RF model showed the highest overall performance, achieving an accuracy of 88.9% and an area under the curve (AUC) of 0.914. This strong performance is consistent with the model’s ability to manage complex feature interactions and its robustness against overfitting, even in small datasets. The PLS and SVM models also performed well. However, PLS had lower specificity (0.778) and SVM lower sensitivity (0.778) than RF, suggesting that neither is as reliable as RF across both PTSD-positive and PTSD-negative cases. Other models, such as KNN and XGB, exhibited comparatively lower performance, highlighting their limitations in handling the intricate relationships among the predictors in this dataset. A previous study used an RF model and achieved AUC scores of 0.77 for predicting PTSD 2.5 years post-deployment and 0.78 for 6.5 years post-deployment in Danish soldiers. 34 By comparison, the RF model in the present study achieved a higher AUC, although the two studies differ in population and setting.
Beyond prediction, ML offers concrete prevention advantages in low-resource oncology settings. Models trained on routinely collected variables, for example education, income, and family behaviour change, can flag high-risk patients at the point of care, enabling same-day psycho-oncology referral and brief counselling before symptoms worsen. Feature selection with Boruta prioritizes modifiable or monitorable factors, which helps clinicians target upstream supports, for example family education or social work, rather than waiting for severe distress. Probabilistic outputs allow threshold tuning to favour sensitivity when the goal is early detection, which is appropriate at initial post-operative visits, and retuning for specificity during survivorship follow-up, which reduces false positives when resources are tight. Ensemble methods like Random Forest are robust to small, noisy, and mixed-type datasets, which are common in low- and middle-income country (LMIC) clinics, and this makes preventive screening feasible without expensive biomarkers. Finally, integration into a two-stage workflow, in which brief ML triage is followed by focused clinician assessment, can shorten time to support, reduce missed cases, and align limited counselling capacity with those most likely to benefit.
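The threshold tuning described above can be sketched as follows. The example evaluates the same set of predicted probabilities at a low threshold, which favours sensitivity for early screening, and at a higher threshold, which favours specificity for follow-up. The labels, probabilities, and the two threshold values are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(labels, probs, threshold):
    """Return (sensitivity, specificity) when probabilities >= threshold are flagged as PTSD."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels (1 = PTSD) and predicted probabilities, not study data.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.85, 0.65, 0.45, 0.30, 0.55, 0.40, 0.25, 0.20, 0.15, 0.05]

# A low threshold for initial screening, a higher one for survivorship follow-up.
screening = sensitivity_specificity(labels, probs, threshold=0.30)
follow_up = sensitivity_specificity(labels, probs, threshold=0.60)
print("screening (sens, spec):", screening)
print("follow-up (sens, spec):", follow_up)
```

In this toy data the low threshold catches every PTSD case at the cost of two false positives, while the higher threshold removes the false positives but misses half of the cases. In practice the thresholds would need to be chosen on validation data and in view of local counselling capacity.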
The study’s findings indicate that Education and Monthly Income were among the most significant socio-demographic predictors, likely reflecting their role in shaping patients’ access to resources, coping mechanisms, and overall resilience. Higher education and income levels may improve patients’ understanding of their condition and provide better access to mental health support, whereas lower levels may exacerbate vulnerability to PTSD through financial and informational constraints. Other studies have similarly identified education and income status as significant risk factors for PTSD.35,36
The study reveals that Family Members’ Behaviour and Behaviour Change factors emerged as critical predictors, highlighting the importance of social support in mental health outcomes. Negative changes in family dynamics, such as avoidance or misbehaviour, can worsen psychological stress, while supportive family behaviours may act as protective factors. This finding aligns with prior studies emphasizing the role of close relationships in mitigating PTSD risk.21,37 Urban or rural residence was a significant factor, potentially reflecting disparities in access to healthcare services and mental health resources. Urban residents may have better access to specialized care, whereas rural residents might face additional barriers. Previous studies also support this result.38-40
The type and timing of treatment were also influential. Patients undergoing mastectomy and radiotherapy are exposed to significant physical and emotional stressors, which may increase their PTSD risk. The findings suggest that the psychological impact of these invasive treatments should be carefully monitored. Employment status was predictive of PTSD, with homemakers showing lower PTSD rates than employed individuals. This could reflect occupational stress or differences in social roles and expectations, which may influence the psychological burden experienced by patients. This result is consistent with the findings reported by previous research.41-43
Marital status was rejected as a significant predictor. While marriage is often considered a source of social support, this variable did not show substantial predictive value in this study. This non-significance may be attributed to the greater influence of qualitative aspects of relationships, such as emotional and behavioural responses from family members, which were captured more specifically through other variables. This result differs from that of an earlier study. 44
Chronic disease, despite being a recognized health burden, was also found to be a non-significant predictor of post-mastectomy PTSD among breast cancer patients in this study. This may be due to its relatively low prevalence in the sample (26.1%) and the possibility that its impact was overshadowed by more acute and socially sensitive experiences such as treatment-related stress or changes in family dynamics. Previous literature has provided mixed findings—while some studies identified chronic comorbidities as contributors to PTSD, 17 others have shown weak or no associations. 18
Hormone therapy was also rejected as a significant predictor, likely due to its low utilization among the study population (8.7%). The insufficient sample size for this subgroup may have hindered the detection of a meaningful association. Moreover, the psychological impact of hormone therapy may manifest over longer periods and might not be readily observable in a cross-sectional design. Other studies have similarly reported inconclusive results regarding hormone therapy’s relationship to PTSD. 4 Tumour grade, though clinically relevant for disease progression, did not emerge as a significant predictor. Patients may be more psychologically affected by visible treatment consequences (such as mastectomy) and perceived social reactions than by the histopathological details of the tumour. This aligns with prior studies where psychological distress was more closely tied to perceived threat and treatment experience rather than disease stage or grade. 21
Taken together, chronic disease, although a known stressor, was not a significant predictor in this study, and the clinical factors Tumour Grade and Hormone Therapy were likewise rejected. While tumour grade reflects the severity of cancer, its psychological impact may be overshadowed by the emotional and social stressors associated with treatment and recovery. Similarly, the limited use of hormone therapy in the sample could explain its lack of significance. These results align with an earlier investigation. 17 However, other studies have identified different predictors of PTSD.4,18,45
Conclusions
This study demonstrates the effectiveness of ML, particularly RF algorithms, in predicting PTSD among post-mastectomy breast cancer patients in Bangladesh. With an accuracy of 88.9% and an AUC of 0.914, our model shows promising potential for clinical application. Identifying key sociodemographic and behavioural predictors provides valuable insights for targeted intervention strategies. Our findings have significant implications for mental health screening in resource-limited settings. The model’s strong performance using readily available predictors makes it a practical tool for identifying high-risk patients in routine clinical practice. This could enable early intervention and improved mental health support for breast cancer survivors. We recommend implementing ML-based screening tools in post-mastectomy care protocols, particularly in resource-limited settings. Further validation through larger, multicentre studies will be crucial for establishing the broader applicability of this approach in different healthcare contexts.
Limitations
Despite the promising findings, this study has several limitations. First, the relatively small sample size (n = 138) and the single geographic location may restrict the generalizability of the results to broader populations. Future studies involving larger, more diverse cohorts from multiple centres are needed to validate the model’s applicability in different contexts. Second, the cross-sectional design limits the ability to infer causality or assess the temporal progression of PTSD symptoms, which are known to evolve over time. A longitudinal design would provide deeper insight into the trajectory and predictors of PTSD.
Additionally, the dataset exhibited class imbalance, with only 34.1% of patients classified as having PTSD, which could influence the performance and generalizability of the machine learning models, especially in clinical settings. Oversampling techniques such as SMOTE (Synthetic Minority Over-sampling Technique) could be used in future work to mitigate this issue and better represent the heterogeneity of real PTSD cases. Furthermore, the absence of external validation remains a critical limitation. All models were trained and tested on the same dataset using cross-validation, which may not adequately reflect real-world scenarios. Lastly, the study relied on self-reported data, including the PCL-5 PTSD checklist, so reporting biases or underreporting of symptoms, especially in low-literacy or stigmatized populations, could have affected the findings. Future research should also explore the integration of clinical, biological, or behavioural data to enhance model robustness and predictive capability.
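For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines: each synthetic minority case is placed at a random point on the line segment between an existing minority case and one of its nearest minority neighbours. This is a simplified illustration on hypothetical numeric data, not the procedure used in this study. Most predictors here are categorical, so a variant designed for categorical features would likely be needed in practice, and oversampling should be applied to training data only.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature minority class (not data from the study).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
synthetic = smote(minority, n_new=6, k=2, seed=42)
for point in synthetic:
    print([round(v, 3) for v in point])
```

Because every synthetic point is an interpolation between two observed cases, the method cannot create patterns that are absent from the original minority class.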
Supplemental Material
Supplemental material, sj-docx-1-cix-10.1177_11769351251401330 for Prediction and Feature Selection of Mastectomy-Related Post Traumatic Stress Disorder (PTSD) Using Machine Learning Among Breast Cancer Patients in Bangladesh by Syed Billal Hossain, Md. Mizanoor Rahman, Kapashia Binte Giash, Md. Hazrat Ali, Mst. Asma Akter and A.B.M. Alauddin Chowdhury in Cancer Informatics
Supplemental material, sj-docx-2-cix-10.1177_11769351251401330 for Prediction and Feature Selection of Mastectomy-Related Post Traumatic Stress Disorder (PTSD) Using Machine Learning Among Breast Cancer Patients in Bangladesh by Syed Billal Hossain, Md. Mizanoor Rahman, Kapashia Binte Giash, Md. Hazrat Ali, Mst. Asma Akter and A.B.M. Alauddin Chowdhury in Cancer Informatics
Acknowledgments
We acknowledge the support and assistance from the authorities of the National Institute of Cancer Research and Hospital (NICRH), Dhaka Medical College Hospital (DMCH), and Ahsania Mission Cancer & General Hospital (AMCGH) for the research.
Footnotes
ORCID iDs: Syed Billal Hossain https://orcid.org/0000-0003-4903-3690; Kapashia Binte Giash https://orcid.org/0009-0003-7556-5102; Md. Hazrat Ali https://orcid.org/0009-0000-2054-8434
Ethics Considerations: Ethical approval was obtained from the Ethical Review Committee of the Faculty of Health and Life Sciences (FHLS), Daffodil International University (DIU), Bangladesh. Ethics approval reference number is FHLS-REC/DIU/2024/0016. Official permission was obtained from the selected hospital authorities prior to data collection.
Consent to Participate: Written informed consent was obtained from literate participants. For illiterate participants, the study purpose and procedure were described in Bengali by native interviewers, and thumbprints were collected on the consent form from those who were unable to sign.
Consent for Publication: Not applicable.
Author Contributions: S.B.H. conceptualized the manuscript and contributed to supervision, data preparation and analysis, writing – original draft, and writing – review and editing; M.M.R. contributed to data preparation and analysis, and writing – original draft; K.B.G. contributed to data preparation and analysis, and writing – original draft; M.H.A. contributed to data preparation and analysis, and writing – original draft; M.A.A. contributed to data collection and preparation, and visualization; A.B.M.A.C. contributed to supervision, and writing – review and editing.
Funding: The authors received no financial support for the research, authorship, and/or publication of this article.
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement: The dataset is available in the Mendeley Data Repository at the following link: https://data.mendeley.com/datasets/r7vfpy9ckg/1 (DOI: 10.17632/r7vfpy9ckg.1).
Clinical Trial Number: Not applicable.
Supplemental Material: Supplemental material for this article is available online.
References
- 1. Shapira R, Baris Ginat YJ, Lipskaya-Velikovsky L. Daily life participation in PTSD: pilot study on patterns and correlators. Front Psychiatry. 2024;15:1429647. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2. El-Solh AA. Management of nightmares in patients with posttraumatic stress disorder: current perspectives. Nat Sci Sleep. 2018;10:409-420. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Arnaboldi P, Lucchiari C, Santoro L, Sangalli C, Luini A, Pravettoni G. PTSD symptoms as a consequence of breast cancer diagnosis: clinical implications. Springerplus. 2014;3(1):1-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Brown LC, Murphy AR, Lalonde CS, Subhedar PD, Miller AH, Stevens JS. Posttraumatic stress disorder and breast cancer: risk factors and the role of inflammation and endocrine function. Cancer. 2020;126(14):3181-3191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Roy N, Downes MH, Ibelli T, Amakiri UO, Li T, Tebha SS, et al. The psychological impacts of post-mastectomy breast reconstruction: a systematic review. Ann Breast Surg. 2024;8:19. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Figueiredo S, Devezas M. Bridging the gaps in understanding POD and POCD: a thorough examination of genetic and clinical biomarkers. Perioper Care Oper Room Manag. 2024;35:100401. [Google Scholar]
- 7. Kim G, Kim H, Park J, Kang HS, Kim S, Kim S. A Caring Program for health promotion among women who have experienced trauma: a QuasiExperimental Pilot Study. J Korean Acad Nurs. 2023;53(5):500-513. [DOI] [PubMed] [Google Scholar]
- 8. Kim S, Lee K. Development and evaluation of an online mental health program for traumatized female college students: a randomized controlled trial. Arch Psychiatr Nurs. 2023;43:118-126. [DOI] [PubMed] [Google Scholar]
- 9. Figueiredo S, Petravičiūtė A. Examining the relationship between coping strategies and post-traumatic stress disorder in forcibly displaced populations: a systematic review. Eur J Trauma Dissociation. 2025;9(2):100535. [Google Scholar]
- 10. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209-249. [DOI] [PubMed] [Google Scholar]
- 11. Yatham S, Sivathasan S, Yoon R, da Silva TL, Ravindran AV. Depression, anxiety, and post-traumatic stress disorder among youth in low and middle income countries: a review of prevalence and treatment interventions. Asian J Psychiatr. 2018;38:78-91. [DOI] [PubMed] [Google Scholar]
- 12. Islam RM, Bell RJ, Billah B, Hossain MB, Davis SR. Awareness of breast cancer and barriers to breast screening uptake in Bangladesh: a population based survey. Maturitas. 2016;84:68-74. [DOI] [PubMed] [Google Scholar]
- 13. Wang J, Kang D-X, Zhang A-J, Li B-R. Effects of psychological intervention on negative emotions and psychological resilience in breast cancer patients after radical mastectomy. World J Psychiatry. 2024;14(1):8-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14. Oliveri S, Arnaboldi P, Pizzoli SFM, et al. Ptsd symptom clusters associated with short- and long-term adjustment in early diagnosed breast cancer patients. Ecancermedicalscience. 2019;13:917. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Hasan MT, Anwar T, Christopher E, et al. The current state of mental healthcare in Bangladesh: part 1 - an updated country profile. BJPsych Int. 2021;18(4):78-82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Sarkar S, Ghosh D, Mahata S, et al. Sociodemographic factors and clinical presentation of women attending Cancer Detection Centre, Kolkata for breast examination. J Clin Transl Res. 2020;5(3):132-139. [PMC free article] [PubMed] [Google Scholar]
- 17. Zhou B, Kang J, Zhao L, Zhang L, He Q. Development and validation of a risk prediction model for post-traumatic stress disorder among Chinese breast cancer survivors. Sci Rep. 2025;15(1):9175. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Lebimoyo AA, Sanni MO. A prospective longitudinal study of post-traumatic stress symptoms and its risk factors in newly diagnosed female breast cancer patients. Middle East Curr Psychiatry. 2023;30(1):1-11. [Google Scholar]
- 19. Liaw A, Wiener M. Classification and Regression by randomForest. R News 2002;2(3):18-22. [Google Scholar]
- 20. Cortes C, Vapnik V, Saitta L. Support-vector networks. Mach Learn. 1995;20(3):273-297. [Google Scholar]
- 21. Figueiredo S, Ndiaye L. Predicting PTSD with machine learning: forecasting refugees’ trauma and tailored intervention. European Journal of Trauma & Dissociation. 2025;9(1):100502. [Google Scholar]
- 22. Wani AH, Katrinli S, Zhao X, et al. Blood-based DNA methylation and exposure risk scores predict PTSD with high accuracy in military and civilian cohorts. BMC Med Genomics. 2024;17(1):235. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Peterson DJ, Ostberg NP, Blayney DW, Brooks JD, Hernandez-Boussard T. Machine learning applied to electronic health records: Identification of chemotherapy patients at high risk for preventable emergency department visits and hospital admissions. JCO Clin Cancer Inform. 2021;5(5):1106-1126. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Kursa MB, Rudnicki WR. Feature selection with the Boruta Package. J Stat Softw. 2010;36(11):1-13. [Google Scholar]
- 25. Karstoft KI, Statnikov A, Andersen SB, Madsen T, Galatzer-Levy IR. Early identification of posttraumatic stress following military deployment: application of machine learning methods to a prospective study of Danish soldiers. J Affect Disord. 2015;184:170-175. [DOI] [PubMed] [Google Scholar]
- 26. Wu X, Wang J, Cofie R, Kaminga AC, Liu A. Prevalence of posttraumatic stress disorder among breast cancer patients: a meta-analysis. Iran J Public Health. 2016;45(12):1533-1544. [PMC free article] [PubMed] [Google Scholar]
- 27. Ahmed SK. How to choose a sampling technique and determine sample size for research: a simplified guide for researchers. Oral Oncology Reports. 2024;12:100662. [Google Scholar]
- 28. Blevins CA, Weathers FW, Davis MT, Witte TK, Domino JL. The Posttraumatic Stress Disorder Checklist for DSM-5 (PCL-5): development and initial psychometric evaluation. J Trauma Stress. 2015;28(6):489-498. [DOI] [PubMed] [Google Scholar]
- 29. American Psychiatric Association. DSM-5-TR Update. 2025. https://www.psychiatry.org/getmedia/b68a5776-f88c-45c7-9535-fd219d7aa5cb/APA-DSM5TR-Update-September-2025.pdf
- 30. Taber KS. The use of Cronbach’s alpha when developing and Reporting Research Instruments in science education. Res Sci Educ. 2018;48(6):1273-1296. [Google Scholar]
- 31. Kursa MB, Jankowski A, Rudnicki WR. Boruta – a system for feature selection. Fundam Inform. 2010;101(4):271-285. [Google Scholar]
- 32. Rajkomar A, Dean J, Kohane I. Machine learning in medicine. N Engl J Med. 2019;380(14):1347-1358.
- 33. Breiman L. Random forests. Mach Learn. 2001;45(1):5-32.
- 34. Karstoft K-I, Tsamardinos I, Eskelund K, Andersen SB, Nissen LR. Applicability of an automated model and parameter selection in the prediction of screening-level PTSD in Danish soldiers following deployment: development study of transferable predictive models using automated machine learning. JMIR Med Inform. 2020;8(7):e17119.
- 35. Golin CE, Haley DF, Wang J, et al. Post-traumatic stress disorder symptoms and mental health over time among low-income women at increased risk of HIV in the U.S. J Health Care Poor Underserved. 2016;27(2):891-910.
- 36. Gebresenbet EA, Zegeye S, Biratu TD. Prevalence and associated factors of depression and posttraumatic stress disorder among trauma patients: multi-centered cross-sectional study. Front Psychiatry. 2025;16:1447232.
- 37. Horesh D, Brown AD. Editorial: post-traumatic stress in the family. Front Psychol. 2018;9:40.
- 38. Forrest LN, Waschbusch DA, Pearl AM, et al. Urban vs. rural differences in psychiatric diagnoses, symptom severity, and functioning in a psychiatric sample. PLoS One. 2023;18(10):e0286366.
- 39. Peen J, Schoevers RA, Beekman AT, Dekker J. The current status of urban-rural differences in psychiatric disorders. Acta Psychiatr Scand. 2010;121(2):84-93.
- 40. GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry. 2022;9(2):137-150. doi: 10.1016/S2215-0366(21)00395-3
- 41. Auvinen P, Mäntyselkä P, Koponen H, et al. Elevation of tumor necrosis factor alpha levels is associated with restless legs symptoms in clinically depressed patients. J Psychosom Res. 2018;115:1-5.
- 42. Henshall C, Davey Z. Development of an app for lung cancer survivors (iEXHALE) to increase exercise activity and improve symptoms of fatigue, breathlessness and depression. Psychooncology. 2020;29(1):139-147. doi: 10.1002/pon.5252
- 43. Bosmans MWG, Van der Velden PG. The effect of employment status in postdisaster recovery: a longitudinal comparative study among employed and unemployed affected residents. J Trauma Stress. 2018;31(3):460-466.
- 44. Allen E, Knopp K, Rhoades G, Stanley S, Markman H. Between- and within-subject associations of PTSD symptom clusters and marital functioning in military couples. J Fam Psychol. 2018;32(1):134.
- 45. Zhang J, Richardson J, Dunkley BT. Classifying post-traumatic stress disorder using the magnetoencephalographic connectome and machine learning. Sci Rep. 2020;10(1):5937.
Associated Data
Supplementary Materials
Supplemental material, sj-docx-1-cix-10.1177_11769351251401330 for Prediction and Feature Selection of Mastectomy-Related Post Traumatic Stress Disorder (PTSD) Using Machine Learning Among Breast Cancer Patients in Bangladesh by Syed Billal Hossain, Md. Mizanoor Rahman, Kapashia Binte Giash, Md. Hazrat Ali, Mst. Asma Akter and A.B.M. Alauddin Chowdhury in Cancer Informatics
Supplemental material, sj-docx-2-cix-10.1177_11769351251401330 for Prediction and Feature Selection of Mastectomy-Related Post Traumatic Stress Disorder (PTSD) Using Machine Learning Among Breast Cancer Patients in Bangladesh by Syed Billal Hossain, Md. Mizanoor Rahman, Kapashia Binte Giash, Md. Hazrat Ali, Mst. Asma Akter and A.B.M. Alauddin Chowdhury in Cancer Informatics




