PLOS One. 2025 Nov 25;20(11):e0336846. doi: 10.1371/journal.pone.0336846

Predicting IVF outcomes using a logistic regression–ABC hybrid model: A proof-of-concept study on supplement associations

Uğur Ejder 1,*, Pınar Uskaner Hepsağ 2
Editor: Ayman A Swelum 3
PMCID: PMC12646469  PMID: 41289279

Abstract

Machine learning models are increasingly applied to assisted reproductive technologies (ART), yet most studies rely on conventional algorithms with limited optimization. This proof-of-concept study investigates whether a hybrid Logistic Regression–Artificial Bee Colony (LR–ABC) framework can enhance predictive performance in in vitro fertilization (IVF) outcomes while producing interpretable, hypothesis-driven associations with nutritional and pharmaceutical supplement use. A retrospective dataset of 162 women undergoing IVF was analyzed. Clinical, demographic, and supplement variables were preprocessed into 21 predictors. Four algorithms (K-Nearest Neighbors, Classification and Regression Tree, Support Vector Machine, and Random Forest) were implemented alongside their LR–ABC hybrid counterparts. Model performance was evaluated using 5-fold cross-validation with Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. Local Interpretable Model-agnostic Explanations (LIME) were applied to improve interpretability. Across all algorithm models, LR–ABC hybrids outperformed their baseline models (e.g., Random Forest: 85.2% → 91.36% accuracy). LIME explanations identified omega-3, folic acid, and dietician support as influential features in individual predictions. However, given the small sample size, binary representation of supplements, and absence of external validation, the observed improvements and associations should be regarded as exploratory rather than definitive. The LR–ABC hybrid model demonstrates methodological potential for improving prediction and interpretability in IVF research. Findings regarding supplement associations are hypothesis-generating, not clinically directive. Future studies with larger, multi-center datasets including detailed dosage and dietary data are needed to validate and extend this framework.

1. Introduction

Subfertility is defined as the inability to become pregnant after one year of regular, unprotected sexual intercourse. According to the World Health Organization (WHO), about 10–15% of couples worldwide are affected by subfertility (World Health Organization, 2020). As the world’s population ages, subfertility is becoming an increasingly common problem and one of the biggest public health challenges of the 21st century [1]. The trend towards postponing childbearing, particularly in high-income countries, has contributed to the rise in subfertility cases and further increased the demand for effective treatment solutions [2]. One review of future directions for subfertility research and intervention focused on three key areas: supporting and offering alternatives to infertile people, targeting avoidable causes of subfertility, and promoting new LCIVF initiatives to improve the availability and acceptability of assisted reproductive technologies (ART) worldwide.

There are various approaches to treating subfertility, including pharmacological therapies, surgical interventions, and ART such as in vitro fertilization (IVF) and intrauterine insemination (IUI) [3]. However, these treatment methods are often costly, emotionally draining, and physically demanding, making the subfertility treatment process both stressful and lengthy [3]. [4] provide a comprehensive overview of subfertility, including its definition and its global impact.

The aim of Aoun’s review was to synthesize recent findings on the influence of nutritional factors, including specific food groups, nutrients, and dietary supplements, on sexual and reproductive function in both men and women. This effort identified relevant studies that provide insight into the potential role of nutrition in reproductive health [5]. Dietary education for infertile women is an important way to improve their awareness and treatment outcomes. With the growing use of smartphones, a mobile-based nutrition education application for women with subfertility problems, designed to suit local cultural conditions, can be of great benefit [6]. [7] present a machine learning model to detect the most relevant bioactive molecules and clinical drugs associated with genes underlying pathways known to be significant in predisposition to polycystic ovary syndrome [7,8]. [9] review existing data on how supplements, diet, and lifestyle changes affect weight and how they affect fertility in both men and women [9].

One promising solution is the incorporation of computer-aided diagnostic models into clinical practice. These models can significantly improve the accuracy of predicting subfertility treatment outcomes, which can lead to higher success rates [10,11]. By helping clinicians make more informed decisions, these tools allow treatment to be more precisely tailored to the individual needs of each patient.

The psychological and physiological effects of subfertility can be profound. Many couples suffer considerable stress, anxiety, and depression due to their inability to conceive. Studies have shown that subfertility can also increase the risk of other health problems, such as cancer, due to the emotional toll on individuals and couples [12]. Given these challenges, there is an urgent need not only to refine clinical approaches but also to incorporate predictive modeling tools that can improve treatment success rates while reducing the emotional and psychological burden on patients. Given the complex nature of subfertility, predicting treatment success has become a major concern. In this context, computational models have proven to be valuable tools for improving prediction accuracy and optimizing treatment outcomes. By incorporating clinical, hormonal, and demographic data, these models have the potential to improve success rates and assist clinicians in selecting the most appropriate treatment protocols for individual patients [13].

As ART and subfertility treatment strategies evolve, the integration of predictive models will play a critical role in improving patient outcomes and advancing the field of reproductive medicine. Various statistical and machine learning (ML) models have already been used to optimize subfertility treatments and increase success rates. Predictive analysis based on machine learning provides healthcare providers with deeper insights that enable patients to make more informed decisions, ultimately improving the success rates of subfertility treatments [14]. [15] applied LightGBM (and compared multiple models) to predict clinical pregnancy after IVF, including lifestyle factors such as BMI and subfertility etiology. They used feature optimization and cross-validation.

Especially when dealing with complex, high-dimensional data, machine learning models such as decision trees, random forests, and support vector machines (SVM) have proven to be effective in identifying non-linear relationships between variables and analyzing large data sets. [13] further confirmed the significance of clinical markers such as AMH, endometrial thickness, and covariates related to nutrition and lifestyle in classification performance through the use of genetic-algorithm-based feature selection. A systematic review published in the Nutrition Journal [16] evaluated women’s eating habits and their relationships to the success of IVF. Although this is a review of the literature rather than a direct application of an ML model, it provides important background information for models of nutritional support and connects nutrition to IVF outcomes. In addition, [17] investigated the use of hybrid models to improve IVF prediction accuracy by combining logistic regression with decision trees, providing a more robust framework for IVF success measurement. Among traditional statistical methods, logistic regression is still one of the most commonly used techniques for predicting IVF success. In a study by [18], logistic regression models were used to predict IVF outcomes, taking into account clinical and demographic data such as patient age, hormone levels, and oocyte quality.

The contribution of this study is as follows:

  • We propose a hybrid approach combining the Artificial Bee Colony (ABC) algorithm with the well-known data mining algorithms K-Nearest Neighbors (KNN), Random Forest (RF), SVM, and Classification And Regression Tree (CART).

  • In the literature, hybrid machine learning based on the ABC algorithm has not been used for IVF treatment. We investigate the importance of feature selection and of the ABC algorithm.

  • We evaluate the hybrid model using the Turkish subfertility dataset.

  • We present a detailed analysis of the relationship between subfertility and the use of nutritional and pharmaceutical supplements.

  • By separating the drugs used into their active ingredients, the study involves significant data mining and data preprocessing.

The remainder of this study is organized as follows. Section 2 describes the subfertility dataset and its features, the data preprocessing steps, the selected models, and the methods used to address the subfertility prediction problem. Section 3 presents our experimental results along with a detailed analysis of the hybrid model. Section 4 concludes the paper with insights on the impact of integrating the ABC optimization algorithm with data mining on the prediction results. Finally, Section 5 covers research limitations and future work, such as the sample size, the conversion of drug and supplement intake into ‘active ingredient’ variables, and the restriction of the cohort to Turkish nationals.

2. Materials and methods

The purpose of this study was to examine the relationship between the success of embryo transfers and patient-specific nutritional, lifestyle, and clinical factors using a retrospective observational cohort analysis. The KEVS Health Nutrition and Consulting Center provided the dataset, which included records of women who had undergone IVF treatment. The primary outcome was whether the embryo transfer was successful. To find important predictive variables, the study used a logistic regression-based feature selection framework, more precisely a hybrid ML–ABC model. In keeping with the study’s title, this design extracts interpretable predictors of embryo transfer success by combining real-world IVF data with computational intelligence techniques.

2.1. Data description

The data used in this study were obtained from the KEVS Health Nutrition and Consulting Center on 5 March 2025. The dataset includes patient data collected between May 22, 2022, and September 17, 2024. A dataset of 162 patients was used to identify predictors of IVF outcome in women. This study was performed in line with the principles of the Declaration of Adana Alparslan Science and Technology University. Approval was granted by the Ethics Committee of ATÜ (E-76907350-050.04-121154). For descriptive analyses of patients’ baseline clinical characteristics, the predictor variables, including both categorical and binary variables, were characterized using statistical techniques. In the raw dataset presented in Table 1, age and weight are continuous variables with a meaningful range. Patient ages range from 24 to 43 years, and the mean age (34.36 years) indicates a group of patients in their childbearing years. The weight of most individuals ranges from 46 to 95 kg, with an average of about 64.89 kg. Variables such as regular physical activity, work status, and diagnosed illness take values of 0 or 1, denoting “No” or “Yes”; the mean of such a variable is the proportion of patients answering “Yes”. For example, a mean of 0.72 for work status means that 72% of the patients are working. The number of oocytes gathered and the applied IVF process count are numerical but discrete counts. The variability in the number of oocytes retrieved (standard deviation = 9.81) reveals notable differences between patients in their response to IVF treatment.

Table 1. The statistical description of the original data set of the patients (n = 162).

No Parameter Mean Standard deviation Min Max
1 Age 34.36 4.58 24 43
2 Weight 64.89 10.48 46 96
3 Working status 0.72 0.44 0 1
4 Occupation type -- -- -- --
5 Diagnosing illness 0.35 0.48 0 1
6 Routinely used medications and multiple drug therapy 0.25 0.44 0 1
7 Regular physical exercise 0.24 0.43 0 1
8 Applied IVF process count 1.26 1.31 0 7
9 Number of oocytes gathered 7.58 9.81 0 58
10 Quality of embryos 0.44 0.498 0 1
11 The result of success 0.38 0.4890 0 1

Some parameters, such as type of occupation, reason for pregnancy failure, and number of embryos developed, lack statistical descriptions (mean, standard deviation, min, max). Due to the non-quantitative nature of the data, these statistics are not calculated. The binary parameter embryo quality (mean = 0.44) indicates that 44% of embryos were classified as high quality.

2.2. Data preprocessing

In this step, useful information is extracted from the raw data and then passed to the machine learning process. Our aim is to apply the best-performing machine learning models to investigate the effects of the active ingredients in the medications used in subfertility treatment. The clinical results contained in the dataset offer important evidence on the outcomes of fertility treatment for individuals. Fig 1 shows the transformation of drugs into active substances. The active ingredients that meet the daily requirement for individuals are presented in Table A1 of the S1 Appendix. As Fig 1 shows, if a person takes a pharmaceutical supplement that contains active ingredients meeting the full daily requirement, the corresponding variable is labeled as a factor affecting the result in machine learning. In this way, the attributes in Table 1 were transformed into the more meaningful form shown in Table 2, and the resulting active-substance variables were used by the machine learning models.

Fig 1. Flow diagram of the conversion of drugs into active substances.


Table 2. The list of parameters of the modified data set for machine learning (n = 162).

No Parameter Mean Standard deviation Min Max Description
1 Working Status 0.72 0.44 0 1 Does the patient work?
2 Diagnosing illness 0.35 0.48 0 1 Any long-term illnesses?
3 DHA 0.72 0.44 0 1 Is DHA used by the patient?
4 Omega 3 0.39 0.49 0 1 Is Omega 3 used by the patient?
5 Folic acid 0.10 0.30 0 1 Is folic acid used by the patient?
6 Coenzyme Q10 0.30 0.46 0 1 Is coenzyme Q10 used by the patient?
7 Vitamin B12 0.03 0.18 0 1 Is Vitamin B12 used by the patient?
8 Ferritin 0.04 0.21 0 1 Is ferritin used by the patient?
9 Vitamin C 0.03 0.17 0 1 Is vitamin C used by the patient?
10 Vitamin B6 0.03 0.18 0 1 Is vitamin B6 used by the patient?
11 Vitamin B5 0.006 0.07 0 1 Is vitamin B5 used by the patient?
12 Vitamin D 0.09 0.29 0 1 Is vitamin D used by the patient?
13 Phytoalexin 0.02 0.13 0 1 Is phytoalexin used by the patient?
14 Magnesium 0.04 0.20 0 1 Is magnesium used by the patient?
15 Selenium 0.03 0.17 0 1 Is selenium used by the patient?
16 Zinc 0.02 0.15 0 1 Is zinc used by the patient?
17 Melatonin 0.006 0.07 0 1 Is melatonin used by the patient?
18 Regular Physical Exercise 0.24 0.43 0 1 The patient exercises regularly?
19 Number of oocytes gathered 7.58 9.81 0 58 Total number of oocytes collected during the entire process
20 Quality of embryos 0.44 0.498 0 1 Quality embryos that are completed between 3 and 5 days after fertilization.
21 Dietician support 0.25 0.43 0 1 Does the dietician prescribe multiple supplements?
22 Age 34.36 4.58 24 43 Patient age
23 Transfer state 0.38 0.48 0 1 Has the embryo transfer been successful?

Table 2 shows statistical summaries of the active ingredients used in subfertility therapy, as prepared for the machine learning process. Twenty demographic, exercise, dietary supplement, and treatment result parameters for the hybrid machine learning models make up this modified data set. The variables are a mix of binary indicators and categorical values. The values in the modified dataset indicate the presence or absence of a given condition, supplement use, or behavior. Their descriptive statistics, especially the mean, show the proportion of “Yes” (value = 1) responses in the dataset. We looked at how working status and subfertility relate to one another: most patients (72%) have jobs. Of the patients, 35% have a long-term medical condition. Just 24% of patients exercise regularly, which may affect reproductive results and general health. In addition, 25% of patients reportedly used nutritional supplements prescribed under a dietician’s direction. Examining the correlation of these variables with fertility results might reveal important trends. Table 2 also shows how frequently different supplements are used in this patient population. With 72% of patients reporting its use, docosahexaenoic acid (DHA) is the most often used supplement; it is a vital nutrient recognized for its influence on brain and cardiovascular function.

Table 2 is derived from Table 1. The active ingredients in Table 2 were obtained as follows. If a patient reports item No. 6 in Table 1 (routinely used medications and multiple drug therapy), a list of the active ingredients in the medication the patient is using is generated, and the percentage of the daily requirement that each ingredient covers is determined. If the daily requirement for an ingredient is met, the patient is labeled as using that ingredient, that is, labeled as 1 in the dataset. Fig 1 explains how the active ingredients are labeled for the patients.
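The labeling rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's preprocessing code: the product names, ingredient amounts, and daily requirement values are invented for the example, and summing intake across several products is an assumption that the text does not confirm.

```python
# Hypothetical illustration: product names, ingredient amounts, and daily
# requirement values are invented for this example, not taken from the study.
DAILY_REQUIREMENT_MG = {"folic_acid": 0.4, "omega_3": 250.0, "zinc": 10.0}

PRODUCT_INGREDIENTS_MG = {
    "prenatal_multi": {"folic_acid": 0.4, "zinc": 5.0},
    "fish_oil": {"omega_3": 500.0},
}


def ingredient_flags(products_taken):
    """Return a 0/1 flag per ingredient: 1 if total daily intake meets the requirement."""
    totals = {name: 0.0 for name in DAILY_REQUIREMENT_MG}
    for product in products_taken:
        # Unknown products contribute nothing (assumption for this sketch).
        for ingredient, amount in PRODUCT_INGREDIENTS_MG.get(product, {}).items():
            if ingredient in totals:
                totals[ingredient] += amount
    return {
        name: int(totals[name] >= DAILY_REQUIREMENT_MG[name])
        for name in DAILY_REQUIREMENT_MG
    }


flags = ingredient_flags(["prenatal_multi", "fish_oil"])
print(flags)
```

In this toy case the zinc flag stays 0 because the assumed intake covers only half of the assumed requirement, which mirrors the "meets the daily intake" condition in the text.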

Conversely, only 10% of patients use folic acid, which is vital for cell development and metabolism; this is quite low uptake given its importance in relation to pregnancy and fertility. With a mean of only 0.006, melatonin, a supplement generally linked with sleep regulation, is rarely used in this patient population. These variations in supplement use could reflect different health objectives, different degrees of awareness, or medical advice tailored to individual needs. An average of 7.58 oocytes retrieved, with a range from 0 to 58, shows notable variation in ovarian response across individuals. Regarding embryo quality, 44% fall within the high-quality category; thus, a little under half of the embryos satisfy the criteria needed for further treatment. The success rate of embryo transfer was lower: 38% of the transfers were successful.

2.3. Machine learning models

In this study, four machine learning models (KNN, CART, SVM, and RF) were selected for comparison based on their different learning mechanisms, robustness, and effectiveness in handling structured data. The selection of these models allows for a balanced comparison between instance-based learning (KNN), tree-based methods (CART, RF), and a margin-based classifier (SVM). This ensures a comprehensive evaluation of different machine learning paradigms, improving the reliability of the findings.

2.3.1. K-nearest neighbors (KNN).

KNN is a simple, instance-based supervised learning algorithm used for both classification and regression tasks. The algorithm classifies a data point based on the majority class (for classification) or the average of the target values (for regression) of its ‘k’ nearest neighbors in the feature space, with “closeness” typically measured using a distance metric such as Euclidean distance. KNN does not require a training phase and makes predictions by comparing the new data point to all the existing data points in the training set [19].
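The majority-vote rule described above can be sketched in a few lines. The toy data and the choice of k = 3 are assumptions for illustration only; the value of k used in the study is not stated in this section.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])  # closest first (Euclidean distance)
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (invented): two features per sample, binary outcome.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
train_y = [0, 0, 1, 1, 1]
print(knn_predict(train_X, train_y, (0.85, 0.95), k=3))
```

Because there is no training phase, all the work happens at prediction time: every query is compared with every stored training point.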

2.3.2. Classification and regression tree (CART).

CART is a popular algorithm used for both classification and regression tasks. It builds decision trees by recursively splitting the data into subsets based on feature values, with the aim of minimizing a chosen impurity measure (such as the Gini index for classification or variance for regression). Each internal node of the tree represents a decision based on a feature, and each leaf node represents a predicted outcome. The key strength of CART lies in its ability to handle both numerical and categorical data, as well as its interpretability [20].
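To make the impurity criterion concrete, the sketch below computes the Gini index and searches one numeric feature for the threshold with the lowest weighted impurity, which is the basic step CART repeats recursively. The feature values and labels are invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    best = (None, float("inf"))
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data (invented): one numeric feature and a binary outcome.
oocytes = [2, 3, 5, 8, 12, 15]
success = [0, 0, 0, 1, 1, 1]
print(gini(success))
print(best_threshold(oocytes, success))
```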

2.3.3. Support vector machines (SVM).

SVM is a supervised machine learning algorithm designed for classification and regression. SVM seeks the optimal linear or nonlinear decision boundary that separates the data into two or more classes [21]. The most commonly used SVM classifier is a binary one, which assigns a test sample to one of two possible classes.
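For reference, the "optimal boundary" idea is commonly written as the soft-margin optimization problem below. This is the standard textbook formulation for the binary linear case, given here under the assumption that labels are coded as y_i in {-1, +1}; the penalty parameter C and any kernel used in this study are not reported in this section.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n .
```

A new sample x is then assigned to the class given by the sign of w^T x + b. Nonlinear boundaries are obtained by replacing the inner product with a kernel function.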

2.3.4. Random forest (RF).

RF is an ensemble classifier made up of a collection of decision trees, each of which is built by a tree-induction algorithm [22]. Random forests are based on bagging (bootstrap aggregation): each tree is grown on a bootstrap sample drawn at random, with replacement, from the training data, and the predictions of the individual trees are then combined.
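The bagging idea can be sketched as below. To keep the example short, each "tree" is replaced by a one-feature threshold stump, which is a deliberate simplification and not the tree learner used in the study; a full random forest also samples a random subset of features at each split, which is not shown here. The data are invented.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Stand-in base learner: threshold at the midpoint between the class means."""
    xs0 = [x for x, y in rows if y == 0]
    xs1 = [x for x, y in rows if y == 1]
    if not xs0 or not xs1:  # the bootstrap sample contained a single class
        constant = 1 if xs1 else 0
        return lambda x: constant
    mean0, mean1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    threshold = (mean0 + mean1) / 2
    higher = 1 if mean1 > mean0 else 0  # class found above the threshold
    return lambda x: higher if x > threshold else 1 - higher


def fit_bagged(rows, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict_bagged(trees, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy data (invented): (feature, label) pairs.
rows = [(1, 0), (2, 0), (3, 0), (4, 0), (10, 1), (11, 1), (12, 1), (13, 1)]
trees = fit_bagged(rows)
print(predict_bagged(trees, 2), predict_bagged(trees, 12))
```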

2.3.5. Feature selection based on logistic regression (LR).

Feature selection was performed on the dataset, which included various patient-related parameters such as work status, medical history, supplement use (e.g., DHA, omega-3, vitamins, and minerals), lifestyle factors (e.g., physical activity), and key reproductive indicators (e.g., number of oocytes retrieved, embryo quality, and transfer success). The feature selection process aimed to determine the most relevant features contributing to the results, reducing dimensionality while retaining important information. Focusing on the most significant variables that affect fertility and treatment success improves model efficiency and accuracy. In this study, LR was used because it is a widely used method for feature selection with many advantages. By using LR with L1 regularization (lasso), we can eliminate irrelevant features while retaining the most informative ones. This improves prediction accuracy and reduces overfitting, resulting in simpler, more extensible, and more powerful models [23]. In our study, the feature set determined by LR is composed of omega-3, folic acid, coenzyme Q10, vitamins B12 and C, vitamin B6, vitamin D, phytoalexin, selenium and zinc, and dietician support.
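The selection mechanism of L1-regularized logistic regression can be stated as the optimization problem below. This is the standard lasso-penalized formulation, shown as a sketch; the regularization strength λ and any feature scaling used in the study are not reported in this section and are treated here as unspecified.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{n}\sum_{i=1}^{n}\Bigl[\,y_i \log p_i + (1-y_i)\log(1-p_i)\Bigr]
+ \lambda \sum_{j=1}^{d} \lvert \beta_j \rvert ,
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + x_i^{\top}\beta)}} .
```

The absolute-value penalty drives some coefficients exactly to zero as λ grows. A feature j is retained when its fitted coefficient β_j is nonzero and discarded otherwise, which is how the reduced feature set listed above would be obtained.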

2.3.6. Proposed models.

This section describes the hybrid models and the proposed model. LR is a widely used technique, particularly for two-class classification problems. Hybrid machine learning models built with logistic regression provide more effective and flexible solutions by drawing on the strong statistical interpretability of logistic regression [24]. The results of the hybrid LR model are discussed in this study. The second hybrid component, the bee colony algorithm, is a meta-heuristic optimization method that mimics the behavior of honeybees in nature [25]. It is characterized by its global and local search capabilities. The logistic regression hybrid models were more successful than the classical models, and the bee colony hybrid models were the most successful predictive models overall. By combining the optimization power of the ABC algorithm with other machine learning methods, hybrid machine learning models aim to provide more effective predictive accuracy [26]. The proposed hybrid ML system, which connects the ABC algorithm with the machine learning model, is illustrated as a simple flowchart in Fig 2.

The Artificial Bee Colony (ABC) algorithm was selected as the optimization component due to its proven efficiency and simplicity in continuous parameter tuning problems. ABC balances exploration and exploitation through its employed and onlooker bee phases, making it robust for small to moderate-sized biomedical datasets [27,28]. Compared with alternative metaheuristics such as Genetic Algorithms (GA) and Particle Swarm Optimization (PSO), ABC offers fewer control parameters and faster convergence, making it a suitable choice for optimizing logistic regression coefficients in limited-sample biomedical prediction tasks such as IVF outcome modeling. ABC can achieve competitive or superior performance to GA and PSO while maintaining a simpler structure and requiring fewer hyperparameters [29,30]. Subsequent studies have further highlighted ABC’s adaptability and search efficiency, emphasizing its robustness and minimal parameter tuning demands in continuous optimization problems [31]. These characteristics make ABC particularly appropriate for small, noisy, and heterogeneous biomedical datasets, where over-parameterization of the optimization process may lead to overfitting. In this study, the ABC parameters were the number of bees (15), the number of iterations (30), and the abandonment limit (0.01). These values were chosen to balance computational efficiency and convergence, in line with prior studies reporting similar ranges in biomedical optimization tasks [27,28]. A formal sensitivity analysis was conducted to evaluate the effect of parameter variation on model performance; it is reported in Table A3 of the S1 Appendix. The mathematical formulation of the ABC algorithm is given in the S1 Appendix.
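The employed, onlooker, and scout phases of ABC can be sketched as below. This is a generic, minimal version of the algorithm and not the authors' Algorithm 1 from the S1 Appendix, which may differ. The 15 food sources and 30 iterations follow the values reported in the text. The integer abandonment limit of 5 trials is an assumption made for this sketch, since the reported value of 0.01 cannot be used directly as a trial count. The sphere function is a stand-in objective; in the study the objective would be tied to the model being optimized.

```python
import random


def abc_minimize(objective, dim, lower, upper,
                 n_sources=15, n_iter=30, limit=5, seed=0):
    """Minimal Artificial Bee Colony search for a minimum of `objective` in a box."""
    rng = random.Random(seed)

    def random_source():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    def fitness(value):
        # Standard ABC fitness transform: larger means better.
        return 1.0 / (1.0 + value) if value >= 0 else 1.0 + abs(value)

    sources = [random_source() for _ in range(n_sources)]
    values = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_value = min(values)
    best_source = list(sources[values.index(best_value)])

    def try_neighbour(i):
        # Perturb one coordinate of source i relative to a random partner k != i.
        k = rng.choice([idx for idx in range(n_sources) if idx != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        candidate[j] = min(max(candidate[j], lower), upper)  # stay inside the box
        value = objective(candidate)
        if value < values[i]:  # greedy selection
            sources[i], values[i], trials[i] = candidate, value, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_sources):  # employed bee phase
            try_neighbour(i)
        weights = [fitness(v) for v in values]
        # Onlooker bee phase: better sources are visited more often.
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbour(i)
        current = min(values)  # record the best before scouts may discard it
        if current < best_value:
            best_value = current
            best_source = list(sources[values.index(current)])
        for i in range(n_sources):  # scout bee phase
            if trials[i] > limit:
                sources[i] = random_source()
                values[i] = objective(sources[i])
                trials[i] = 0
    return best_source, best_value


def sphere(x):
    """Stand-in objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_val = abc_minimize(sphere, dim=2, lower=-5.0, upper=5.0)
print(best_x, best_val)
```

The only control parameters are the number of sources, the number of iterations, and the abandonment limit, which is the "few hyperparameters" property the text refers to.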

Fig 2. Flow diagram of the proposed model.


In this study, hybrid LR-based ABC algorithms were generated to examine the performance of the proposed model. The pseudo-code is shown in Algorithm 1 in the S1 Appendix.

2.3.7. Cross-validation strategy and imbalanced data handling.

Overfitting is a frequent issue when using machine learning for real-world applications. Cross-validation is one way to deal with the overfitting problem [32]. The k-fold cross-validation criterion was used to determine the model weights for the averaging prediction; the weights were selected by minimizing the sum of the squared prediction errors from every group [33]. To evaluate a model’s performance, k-fold cross-validation divides the sample into k equal-sized subsets, each of which serves once as the validation sample. In each round, the model is trained on the other k−1 subsets and tested on the held-out subset, and the process is repeated k times until every subset has been used for validation once [34]. The accuracy of each of the k models is obtained, and their average is used to evaluate the performance of the classifier. Other metrics, including sensitivity, precision, and F1-score, are likewise computed in each of the k rounds and then summarized. The most common k values are 5 and 10; in the limiting case k = n, the procedure becomes leave-one-out cross-validation [33]. In this study, k is set to 5, which is thought to provide an unbiased estimate of the test error rate [35].
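The fold construction can be sketched as follows for the sample size used in this study (n = 162, k = 5). The shuffling seed and the non-stratified split are assumptions of this sketch; the text does not state how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(162, k=5))
print([(len(train), len(test)) for train, test in folds])
# [(129, 33), (129, 33), (130, 32), (130, 32), (130, 32)]
```

With 162 samples, a fold holds 32 or 33 patients, so each round trains on 129 or 130 patients and tests on the rest.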

In the context of supervised learning, imbalanced datasets can present a significant challenge, as models trained on such datasets may exhibit bias towards the majority class. This can result in suboptimal performance on minority instances [36]. In our study, SMOTE was used within each fold of cross-validation to avoid data leakage and ensure reliable performance estimation, following the methods outlined in the literature [37]. It is thought that proper handling of class imbalance not only improves predictive performance but also enhances fairness and robustness of classification models in real-world applications [38].
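Applying oversampling inside each fold means that synthetic samples are generated from the training part only, so no information from the held-out fold leaks into training. The sketch below shows the idea with a simplified SMOTE-style interpolation written from scratch. It is not the implementation used in the study (a library implementation such as the one in imbalanced-learn would normally be used), and the neighbour count k = 3 and the toy data are assumptions.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points (requires at least two minority samples).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to partner
        synthetic.append(tuple(b + gap * (p - b) for b, p in zip(base, partner)))
    return synthetic


def oversample_training_fold(train_X, train_y):
    """Balance the classes in the training fold only; the test fold is never touched."""
    by_class = {0: [], 1: []}
    for x, y in zip(train_X, train_y):
        by_class[y].append(x)
    minority_label = min(by_class, key=lambda label: len(by_class[label]))
    deficit = abs(len(by_class[0]) - len(by_class[1]))
    new_points = smote(by_class[minority_label], deficit) if deficit else []
    return list(train_X) + new_points, list(train_y) + [minority_label] * len(new_points)


# Toy training fold (invented): three samples of class 0, five of class 1.
train_X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
           (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (1.0, 0.8)]
train_y = [0, 0, 0, 1, 1, 1, 1, 1]
X_bal, y_bal = oversample_training_fold(train_X, train_y)
print(y_bal.count(0), y_bal.count(1))
```

In a cross-validation loop, `oversample_training_fold` would be called once per round on that round's training indices, and the classifier would then be scored on the untouched test indices.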

3. Results and discussion

3.1. Evaluation metrics

We assess the classification performance of the integrated model for subfertility prediction using metrics such as accuracy, precision, recall, and F-score. In this context, TP (True Positive) and TN (True Negative) refer to the samples in the positive class (class = YES) and the negative class (class = NO), respectively, that were correctly classified. On the other hand, FP (False Positive) and FN (False Negative) represent the instances in the negative class incorrectly predicted as positive and the instances in the positive class incorrectly predicted as negative, respectively. Accuracy is the proportion of correctly identified cases (including true positives and negatives) to all cases. As the accuracy value gets closer to 1, the model becomes better at predicting [39].

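$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{1}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$

$$\text{F1-score} = 2 \times \frac{\text{Sensitivity} \times \text{Precision}}{\text{Sensitivity} + \text{Precision}} \tag{4}$$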
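Equations (1) to (4) translate directly into code. The confusion-matrix counts below are hypothetical values for a single 33-patient test fold, chosen only to illustrate the arithmetic; they are not results from the study. The function does not guard against empty denominators.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute Eqs. (1)-(4) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)  # also called recall
    precision = tp / (tp + fp)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for one test fold of 33 patients.
metrics = classification_metrics(tp=11, fp=2, fn=2, tn=18)
print({name: round(value, 4) for name, value in metrics.items()})
```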

3.2. Compare with benchmark and proposed models

This study used data from a total of 162 patients treated for subfertility, together with information on their diet and medication. A total of 129 patients were used to train the machine learning models, and the remaining 33 patients were used for testing. A total of 23 frequently used clinical features, listed in Table 2, were used to construct the models. The experimental results are shown in Table 3 and Figs 4–7. As Table 3 shows, both simple and hybrid models were evaluated: KNN, CART, SVM, RF, LR-KNN, LR-CART, LR-SVM, LR-RF, ABC-LR-KNN, ABC-LR-CART, ABC-LR-SVM, and ABC-LR-RF. Of the 162 patients, 63 had successful transfers and 99 had unsuccessful transfers. Although the class distribution is close to balanced, the synthetic minority over-sampling technique (SMOTE) was applied to balance the classes. This approach improved the evaluation accuracy. This does not imply, however, that the accuracy metric is likely to give incorrect results. According to Table 3, the hybrid ABC-LR-RF model achieved the highest accuracy, whereas KNN had the lowest accuracy among the machine learning models. According to Table 4, the ABC–LR–RF hybrid model achieved the best overall performance among the tested algorithms, with higher recall and F1-scores compared to its baseline counterparts. These findings indicate that combining LR feature selection with the ABC optimizer yielded consistent improvements in predictive performance across a range of algorithm families. LIME-based interpretability analyses, which go beyond performance metrics, identified dietician support, folic acid, and omega-3 as the most influential factors in individual predictions. Rather than functioning as treatment recommendations, these findings offer hypothesis-generating insights into potential factors that merit investigation in larger, prospective studies.

Table 3. Comparison of baseline and hybrid machine learning models in predicting IVF outcomes (%).

No Stage Model Acc. F-Score Recall Precision Model Type
1 1 RF 85.19 84.79 85.95 84.74 Simple
2 1 KNN 63.56 61.38 62.43 62.54 Simple
3 1 SVM 84.55 83.80 83.84 84.00 Simple
4 1 CART 81.44 80.70 81.28 80.68 Simple
5 2 LR – RF 89.49 89.17 90.19 89.18 Hybrid
6 2 LR – KNN 86.40 85.87 86.55 85.76 Hybrid
7 2 LR – SVM 82.67 81.34 81.34 82.45 Hybrid
8 2 LR – CART 85.80 84.81 84.81 85.71 Hybrid
9 3 ABC-LR-RF 91.36 90.57 96.92 85.62 Hybrid
10 3 ABC-LR-KNN 87.67 85.07 88.97 89.34 Hybrid
11 3 ABC-LR-CART 90.13 90.20 95.38 87.11 Hybrid
12 3 ABC-LR-SVM 88.26 84.55 89.10 85.63 Hybrid

Fig 4. The accuracy score change matrix following model development.


Fig 7. The most effective active substances for subfertility treatment.


Table 4. Top-performing model configuration and performance by algorithm type.

| Algorithm Type | Best Model | Accuracy (%) | F1-score (%) | Recall (%) | Precision (%) |
|---|---|---|---|---|---|
| CART | ABC–LR–CART | 90.13 | 90.20 | 95.38 | 87.11 |
| RF | ABC–LR–RF | 91.36 | 90.57 | 96.92 | 85.62 |
| SVM | ABC–LR–SVM | 88.26 | 84.55 | 89.10 | 85.63 |
| KNN | ABC–LR–KNN | 87.67 | 85.07 | 88.97 | 89.34 |

Although the hybrid LR–ABC models achieved relatively high predictive performance, these findings should be interpreted with caution given the limited sample size and the potential for overfitting. The application of five-fold cross-validation and SMOTE helped mitigate, but could not fully eliminate, this risk. Consequently, the observed performance metrics may reflect model behavior specific to the present dataset rather than a generalizable predictive pattern. Future studies using larger, independent, and multicenter datasets are required to validate the framework’s robustness.

A sensitivity analysis was conducted by contrasting baseline Logistic Regression (LR) models with their ABC-optimized counterparts under the same cross-validation conditions in order to assess the independent contribution of the Artificial Bee Colony (ABC) optimizer. The hybrid ABC models consistently produced improved recall and 3–6% higher F1-scores, suggesting that optimization, rather than chance, is the source of the performance gains. Although external validation data were not available, the framework was tested using stratified resampling to approximate external generalization behavior.

To assess the robustness and significance of the observed improvements, an ablation and sensitivity analysis was conducted comparing baseline Logistic Regression (LR) with the LR-ABC hybrid using identical 5-fold cross-validation splits. For each fold, performance metrics (accuracy, recall, F1-score) were recorded, and paired t-tests were employed to evaluate statistical differences. The hybrid LR-ABC model showed a consistent mean F1-score increase of 5.2% (95% CI: 2.1–8.3%, p = 0.004) and a recall increase of 4.7% (95% CI: 1.9–7.2%, p = 0.006) over baseline LR across folds. Accuracy also increased from 84.2% ± 3.9 to 89.1% ± 4.6 (p < 0.01). These experiments indicate that the observed performance gains were statistically significant rather than attributable to stochastic variation from data resampling, thereby supporting the independent contribution of the ABC optimizer to the hybrid model’s predictive performance.
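The fold-wise paired comparison described above can be illustrated with a minimal sketch. It assumes five folds, so the two-sided 95% critical value of Student's t with 4 degrees of freedom (2.776) is hard-coded. The per-fold scores and the names `paired_t`, `lr_f1`, and `lr_abc_f1` are hypothetical and are not the values or code used in the study.

```python
import math
import statistics

def paired_t(baseline, hybrid, t_crit):
    """Paired t statistic and confidence interval for per-fold differences."""
    diffs = [h - b for b, h in zip(baseline, hybrid)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    se = statistics.stdev(diffs) / math.sqrt(n)
    t_stat = mean_diff / se
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return mean_diff, t_stat, ci

# Hypothetical per-fold F1-scores (%); NOT the study's values.
lr_f1 = [80.0, 83.0, 85.0, 82.0, 86.0]
lr_abc_f1 = [85.0, 87.0, 91.0, 88.0, 90.0]

# Two-sided 95% critical value of Student's t with 4 degrees of freedom (5 folds - 1).
T_CRIT_DF4 = 2.776

mean_diff, t_stat, ci = paired_t(lr_f1, lr_abc_f1, T_CRIT_DF4)
print(f"mean gain = {mean_diff:.2f}, t = {t_stat:.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because both models are scored on identical splits, the test operates on the per-fold differences, which removes between-fold variability from the comparison. With only five folds the interval is wide, so such a test supports rather than settles the question of significance.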

Moreover, while the hybrid LR–ABC framework demonstrated improved predictive accuracy for embryo transfer outcomes, these findings should be regarded as exploratory and methodological rather than clinical. The study did not measure pregnancy or live birth rates, and therefore cannot inform treatment efficacy or reproductive prognosis. As such, the model’s scope is limited to predicting the likelihood of embryo transfer success within the observed dataset and does not extend to broader clinical outcomes.

According to Table 3, ABC-LR-RF achieved the highest accuracy among all models (91.36%), whereas the baseline KNN performed the weakest (63.56%). The F-scores of all models are closely aligned with their accuracy values (approximately a 1:1 ratio), suggesting that class balance was effectively maintained through the application of SMOTE. After logistic regression-based feature selection (Stage 2), all models except SVM showed an increase in performance. The ABC optimization algorithm in Stage 3 contributed positively to all benchmark models. The ABC–LR–RF model reached the best overall performance, with the highest accuracy (91.36%) and recall (96.92%), indicating that the ABC optimizer effectively fine-tuned the model’s parameters and improved sensitivity for predicting successful embryo transfers. Although SVM initially experienced a slight decrease after logistic regression–based feature selection, its performance improved substantially with ABC optimization (accuracy = 88.26%), suggesting that metaheuristic optimization can recover and enhance model capacity in non-linear classification tasks. Detailed information is provided in Table A2 of the S1 Appendix.
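The stage-to-stage changes discussed above can be recomputed directly from the accuracy column of Table 3. The short sketch below does only that arithmetic; the dictionary layout and the function name `stage_gains` are illustrative choices, not part of the study's code.

```python
# Accuracy (%) per stage, copied from Table 3.
accuracy = {
    "RF":   {"baseline": 85.19, "LR": 89.49, "ABC-LR": 91.36},
    "KNN":  {"baseline": 63.56, "LR": 86.40, "ABC-LR": 87.67},
    "SVM":  {"baseline": 84.55, "LR": 82.67, "ABC-LR": 88.26},
    "CART": {"baseline": 81.44, "LR": 85.80, "ABC-LR": 90.13},
}

def stage_gains(scores):
    """Return the percentage-point change for each stage transition."""
    return {
        "baseline->LR": round(scores["LR"] - scores["baseline"], 2),
        "LR->ABC-LR": round(scores["ABC-LR"] - scores["LR"], 2),
        "baseline->ABC-LR": round(scores["ABC-LR"] - scores["baseline"], 2),
    }

gains = {name: stage_gains(s) for name, s in accuracy.items()}
for name, g in gains.items():
    print(name, g)
```

Expressing the changes in percentage points keeps them directly comparable with the table; relative changes (for example, gain divided by baseline accuracy) would give different figures.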

The calibration plot compares the predicted probabilities of IVF success with the outcomes observed in the test data. The dashed diagonal line represents perfect agreement between predicted and observed values, and the orange curve shows the empirical calibration of the ABC–LR–RF hybrid model. The close alignment of the curve with the diagonal suggests that the predicted IVF success probabilities are reliable and well calibrated.

The accompanying Brier score quantifies this agreement and supports the conclusion that the model provides reliable probabilistic predictions; lower Brier scores denote better calibration.

The diagonal represents perfect calibration, where predicted IVF success probabilities exactly match the observed outcomes. In Fig 3, the orange curve (ABC-LR-RF) represents the proposed model’s empirical calibration, with a Brier score of 0.089, which indicates high probabilistic accuracy. The Brier score takes values between 0 and 1: values close to 0 indicate accurate probabilistic predictions, whereas values close to 1 indicate highly inaccurate ones. At low predicted probabilities (< 0.4), the curve falls below the diagonal; with predicted probability on the horizontal axis and observed frequency on the vertical axis, this indicates that the model slightly overestimates the likelihood of success in lower-probability regions. Between 0.5 and 0.8, the curve lies above the diagonal, indicating that the model’s mid-range predictions are slightly conservative. At the upper end (> 0.8), the predicted and observed probabilities converge again, suggesting that high-confidence predictions can be trusted. Overall, the general alignment of the curve with the diagonal and the low Brier score indicate that the ABC-LR–RF hybrid framework provides trustworthy probability estimates of IVF success.
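For reference, the Brier score and the points of a calibration (reliability) curve can be computed as in the minimal sketch below. The outcomes and probabilities are hypothetical, and the function names `brier_score` and `reliability_bins` are illustrative; the study's own figures were produced with its published notebook, not with this code.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def reliability_bins(y_true, y_prob, n_bins=5):
    """Group predictions into equal-width probability bins and return
    (mean predicted probability, observed success fraction, count)
    for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            rows.append((mean_pred, observed, len(members)))
    return rows

# Hypothetical outcomes (1 = successful transfer) and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.75, 0.5, 0.9, 0.95]

print(f"Brier score: {brier_score(y_true, y_prob):.5f}")
for mean_pred, observed, count in reliability_bins(y_true, y_prob):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

Plotting the observed fraction against the mean predicted probability for each bin gives the calibration curve. A bin whose observed fraction is lower than its mean prediction lies below the diagonal, meaning the model is overconfident in that range.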

Fig 3. Calibration curve of the ABC–LR–RF hybrid model for IVF outcome prediction.


Fig 4 shows the models before and after each development step, that is, which hybrid model each baseline model became as a result of the development process. The outcome of developing a model that can make more accurate predictions in subfertility treatment is shown in Fig 3. According to Fig 4, the accuracy score of KNN increased by 23.5% when it was developed into the hybrid ABC-LR-KNN model, which is the largest increase observed. Increases were observed in all models generated with the ABC algorithm. According to Figs 4 and 5, the largest gains were obtained by the ABC-based hybrid models, and Fig 5 shows that the highest accuracy scores were achieved at Stage 3. The prediction accuracy scores of Stage 1 and Stage 2 are close. SVM was adversely affected by LR-based feature selection: the comparison of SVM and LR-SVM shows that an improvement cannot always be assumed after a processing step.

Fig 5. The prediction accuracy of simple versus hybrid models.


The hybrid models were built on the traditional models, as illustrated in Fig 6. The model most affected by development was KNN, and the least affected models were RF and CART. According to Fig 6, ABC-LR-RF achieved the highest accuracy score among the machine learning models; among the LR-based models, LR-RF reached the highest accuracy score and LR-SVM the lowest.

Based on the model with the highest F-score in Table 3 and the LIME-based local interpretability analysis, omega-3, folic acid, dietician support, phytoalexin, vitamin C, and vitamin B6 emerged as the most influential predictors associated with embryo-transfer outcomes. In the LIME explanation of an individual prediction for embryo transfer outcome, the model predicted the class “Transfer” with 97% confidence. The features that contributed most positively to the prediction were the presence of omega-3 (0.47), folic acid (0.34), and dietician support (0.20). Minor positive contributions were observed from phytoalexin, vitamin C, and vitamin B6. Conversely, the absence of vitamin D (−0.29) and coenzyme Q10 (−0.25) opposed the prediction. This interpretability approach highlights the relative importance of individual features beyond black-box model accuracy.

According to Fig 7, omega-3 appeared as one of the most influential predictive variables in the model. Higher recorded omega-3 use was statistically associated with higher embryo-transfer success rates in this dataset, but this relationship should not be interpreted as causal. Higher omega-3 intake has been reported to be associated with a lower risk of subfertility regardless of age [40]. In a recent human study, supplementation with phytoalexin before IVF in women of advanced age with poor ovarian reserve led to a significant increase in the number of fertilized high-quality oocytes. Follicular fluid miRNome analysis revealed modulation of microRNAs associated with mitochondrial biogenesis and oxidative stress response, implicating improved oocyte competence and implantation potential. Previous studies have reported that resveratrol supplementation shows statistical associations with reproductive outcomes; however, these findings should be regarded as observational and not indicative of therapeutic efficacy [41].

Fig 6. The comparison of the accuracy score of the benchmark and the proposed models.


This study also examined pharmaceutical supplements taken under the supervision of a dietitian and their association with IVF outcomes. The results revealed a positive correlation between dietitian support and IVF treatment [42]. Another important substance is folic acid. In particular, women who consumed more than 800 µg/day of supplemental folate had a 20 percent higher rate of live births compared to women who consumed less than 400 µg/day [43]. Another study [44] reported that women who took 1,000 mg of oral vitamin C daily immediately after embryo transfer experienced significantly higher rates of term pregnancy (65% vs. 45%, p = 0.0219) and a notably lower incidence of low birth weight (46.7% vs. 76.7%, p = 0.0007). Earlier research has thus identified vitamin C use as being statistically associated with IVF-related outcomes, though this relationship should not be interpreted as evidence of a treatment effect.

In a retrospective cohort study, women receiving a vitamin B-complex supplement (including 5-MTHF, B12, and B6) demonstrated significantly higher clinical pregnancy (60.4% vs 44.9%, p = 0.01) and live birth rates (48.6% vs 35.4%, p = 0.02), as well as improved oocyte quality and fertilization outcomes. Some studies have observed correlations between preconceptional vitamin B-complex use and IVF success indicators; however, such associations are exploratory and may be influenced by unmeasured confounding factors rather than reflecting a causal or therapeutic effect [45].

The proposed ABC–LR–RF hybrid model achieved an ROC–AUC of 0.96 and a PR–AUC of 0.95. An ROC–AUC close to 1 indicates that the model discriminates very well between the classes. In Fig 8, the ROC–AUC value indicates that the model can effectively distinguish between successful and unsuccessful IVF outcomes across all possible classification thresholds. This strong discriminative ability suggests that both sensitivity and specificity remain high under varying decision cutoffs, which is an important consideration in medical applications. The Precision-Recall (PR) AUC further supports this result, especially under class imbalance, indicating that the model maintains high precision while keeping strong recall. In summary, these results indicate that the ABC–LR–RF hybrid model produces reliable probabilistic discrimination and robust generalization under internal validation, balancing accuracy and clinical interpretability.
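The ROC–AUC has a direct probabilistic reading: it is the probability that a randomly chosen successful case receives a higher predicted score than a randomly chosen unsuccessful case. The sketch below computes it from that definition on hypothetical data; the function name `roc_auc` and the example values are illustrative and unrelated to the study's results.

```python
def roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes are required to compute ROC-AUC.")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = successful transfer) and predicted scores.
y_true = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.75, 0.5, 0.9, 0.95]

auc = roc_auc(y_true, y_score)
print(f"ROC-AUC: {auc:.4f}")
```

Because the measure depends only on how cases are ranked, it is unaffected by the choice of decision threshold, which is why it is reported alongside threshold-dependent metrics such as accuracy and recall.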

Fig 8. ROC and PR–AUC curves of the ABC–LR–RF hybrid model for IVF outcome prediction.


The confusion matrix summarizes the classification results of the ABC–LR–RF hybrid model for IVF outcome prediction. The diagonal elements represent correctly classified IVF outcomes (true positives and true negatives), whereas the off-diagonal elements represent misclassifications. The concentration of cases along the diagonal shows that the model performs well, with high sensitivity and specificity across the outcome classes. In Fig 9, the confusion matrix shows that the ABC–LR–RF hybrid model achieved strong predictive performance, with an overall accuracy of 91.36%. The model correctly classified 88 unsuccessful and 60 successful IVF outcomes, while 3 false negatives and 11 false positives were observed.
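The counts reported for Fig 9 are sufficient to recompute the overall accuracy and the per-class rates. The following sketch does so, treating a successful transfer as the positive class; the function name `confusion_metrics` is an illustrative choice.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Standard rates derived from a 2x2 confusion matrix."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # recall for successful transfers
        "specificity": tn / (tn + fp),   # recall for unsuccessful transfers
        "precision": tp / (tp + fp),
    }

# Counts reported in the text for Fig 9.
metrics = confusion_metrics(tp=60, tn=88, fp=11, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

The accuracy recovered this way matches the reported 91.36% (148 of 162 cases). The per-class rates from this single matrix need not coincide with the recall and precision in Table 3, which may have been averaged across folds or classes; that is an assumption, since the averaging scheme is not stated here.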

Fig 9. Confusion matrix of the ABC–LR–RF hybrid model for IVF outcome prediction.


This study’s practical decision to standardize diverse patient-reported data involved representing supplements as binary indicators (“active ingredient = 1 if ≥100% of the daily requirement”). However, biological validity is significantly limited by this encoding. It leaves out important factors like baseline dietary intake, adherence, dosage intensity, and supplementation duration. As a result, even though some supplements (like folic acid and omega-3) showed up as predictive variables, these results should be viewed as associative patterns rather than proof of therapeutic effectiveness. The binary representation might mask dose-response relationships that are biologically significant to reproductive outcomes and probably underrepresents actual nutrient exposure.

Despite the fact that the hybrid models attained comparatively higher levels of predictive performance, the absence of an ablation or sensitivity analysis hinders the capacity to ascertain the independent contribution of the Artificial Bee Colony (ABC) optimizer with respect to the Logistic Regression (LR) baseline. The observed performance improvements (e.g., an increase in Random Forest accuracy from 85.2% to 91.36%) may therefore be attributable to stochastic variation rather than a distinct optimization advantage. Moreover, it is imperative to refrain from interpreting the associations identified in this study as causal. A number of variables were found to be predictive of outcomes, including omega-3 use, folic acid intake, and dietician support. However, these effects may be confounded by factors such as prescriber bias, socioeconomic status, access to private healthcare, or underlying clinical prognosis. The use of nutritional supplements has frequently been observed to correlate with unmeasured health behaviors and social determinants, which in turn may influence treatment outcomes. Consequently, the relationships documented in this study should be regarded as exploratory, hypothesis-generating associations that require validation through prospective, multicenter studies incorporating ablation analysis, larger cohorts, and detailed nutritional and clinical data.

To reduce the impact of potential confounders such as age and subfertility etiology, we included these variables as model features. Although AMH levels and BMI were not available in our dataset, the LIME analysis indicated that nutritional and lifestyle variables retained predictive importance even after accounting for the available clinical covariates. This suggests that these factors may contribute independently to the likelihood of successful embryo transfer.

Although the model shows high predictive performance, its generalizability is constrained by the limited cohort size and by potential confounders that were not accounted for. This necessitates further validation in larger, more diverse datasets. To mitigate this limitation, cross-validation was applied to the dataset.

This study was informed by a period of extensive data collection. During the course of this data collection, patients underwent the requisite diagnostic tests. These tests identified deficiencies in the body’s levels of active substances determined to be essential for the healthy formation of ovaries. It was hypothesized that addressing these deficiencies would result in healthy embryos. The intake of these active substances was thus continued until a healthy embryo was formed. The period of time that active substances are used by each patient varies. The significance of active substances was then highlighted through the utilization of the most accurate model, which was derived from the proposed objective models. It is acknowledged that there are a number of alternative methods and that embryos have different relationships with nutritional and pharmaceutical supplements. However, it is important to acknowledge the significant time and effort invested in the data collection process, which was both exhaustive and valuable. It is evident that the limited dataset available was optimized in terms of efficiency through the implementation of stratified K-fold cross-validation.

It is important to note that the findings presented in this study reflect correlations between clinical and demographic features and IVF treatment outcomes. Due to the observational and retrospective nature of the dataset, the model cannot infer causality. For instance, while variables such as age and omega-3 levels were found to be predictive, these associations do not imply that these features directly cause the success or failure of treatment. Further prospective studies and clinical trials are required to determine causal relationships and validate these findings.

The ABC–LR–RF hybrid model contributes a clinically interpretable framework for predicting embryo transfer success probabilities in IVF. By combining feature selection with ensemble learning, it captures multidimensional embryological and procedural factors that influence implantation probability. Calibrated probability outputs can assist clinicians throughout embryo development and complement traditional morphology-based assessments. Because the model uses routinely available data, it could be readily integrated into existing electronic IVF management systems as a decision-support tool. Its calibration and sensitivity are encouraging, but generalisation across clinics remains to be confirmed by external validation. The most important point to note is that the model does not predict pregnancy or treatment outcome; it estimates the probability of a successful embryo transfer.

This study was designed to develop a predictive model applicable to early implantation signals in order to estimate the probability of embryo transfer success rather than long-term pregnancy or live birth outcomes. This distinction is important because the model was trained and validated using embryo-level, situation-specific features rather than pregnancy variables. Accordingly, the model’s discrimination and calibration performance (ROC–AUC = 0.96, PR–AUC = 0.95, Brier = 0.089) should be interpreted strictly with regard to forecasting the success of embryo transfers. Within this scope, the model reflects embryological factors influencing transfer outcomes and does not extend to downstream clinical endpoints such as pregnancy progression or live birth.

4. Conclusion

Subfertility is a worldwide problem, affecting between 8 and 15 percent of couples in their reproductive years. The information on medical treatment and nutrition collected from patients was structured and used to train the machine learning models. In the first part of the process, 11 variables were identified as necessary for predicting subfertility in women, for which a dataset consisting of information from 162 patients was available. After data preprocessing, 23 meaningful variables were generated from the raw data. The predictive models were built using four supervised machine learning algorithms, which were implemented as simple and hybrid models to find the best solution for determining the relationship between subfertility and nutritional and pharmaceutical supplements. Across all algorithms tested, the hybrid ABC-based models consistently achieved incremental improvements over their baseline counterparts. This finding underscores the potential of metaheuristic optimization for enhancing predictive modelling. Rather than concentrating on a particular accuracy value, the contribution of this work is to demonstrate the relative benefit of hybrid models and how they can improve reproducibility in small, complex datasets.

In contrast to earlier research that only looked at clinical or hormonal predictors for IVF success, our work uses a hybrid LR-ABC framework to integrate nutrition and a few clinical variables. The use of a bio-inspired metaheuristic (ABC) algorithm to optimize feature selection based on logistic regression fitness and the interpretability of the resulting model through LIME analyses, which provide transparency in identifying key predictors, are the two primary areas of this study’s novelty. To the best of our knowledge, this is one of the first studies to integrate these approaches in order to investigate the potential independent contributions of modifiable patient behaviors to the success of embryo transfers.
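To make the idea of ABC-driven feature selection concrete, the following is a minimal sketch of a binary Artificial Bee Colony search over feature masks. It is not the authors' implementation: the fitness function here is a toy stand-in (the study describes a fitness based on logistic regression), and every name and parameter (`abc_feature_selection`, `toy_fitness`, `INFORMATIVE`, `n_sources`, `limit`) is hypothetical.

```python
import random

def abc_feature_selection(n_features, fitness, n_sources=10, limit=5,
                          n_iter=50, seed=0):
    """Minimal binary ABC: each food source is a 0/1 feature mask."""
    rng = random.Random(seed)

    def random_mask():
        return [rng.randint(0, 1) for _ in range(n_features)]

    def neighbour(mask):
        # Flip one randomly chosen bit to explore a nearby feature subset.
        new = mask[:]
        j = rng.randrange(n_features)
        new[j] = 1 - new[j]
        return new

    sources = [random_mask() for _ in range(n_sources)]
    scores = [fitness(m) for m in sources]
    trials = [0] * n_sources
    best_i = max(range(n_sources), key=lambda i: scores[i])
    best_mask, best_score = sources[best_i][:], scores[best_i]

    def try_improve(i):
        nonlocal best_mask, best_score
        cand = neighbour(sources[i])
        cand_score = fitness(cand)
        if cand_score > scores[i]:           # greedy selection
            sources[i], scores[i], trials[i] = cand, cand_score, 0
        else:
            trials[i] += 1
        if scores[i] > best_score:
            best_mask, best_score = sources[i][:], scores[i]

    for _ in range(n_iter):
        for i in range(n_sources):           # employed bee phase
            try_improve(i)
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]
        for _ in range(n_sources):           # onlooker bee phase
            i = rng.choices(range(n_sources), weights=weights)[0]
            try_improve(i)
        for i in range(n_sources):           # scout bee phase
            if trials[i] > limit:            # abandon an exhausted source
                sources[i] = random_mask()
                scores[i] = fitness(sources[i])
                trials[i] = 0
                if scores[i] > best_score:
                    best_mask, best_score = sources[i][:], scores[i]
    return best_mask, best_score

# Toy fitness (an assumption): reward hypothetical informative features,
# penalise the rest. A real fitness would be a cross-validated model score.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(mask[j] for j in INFORMATIVE)
    noise = sum(mask) - hits
    return hits - 0.5 * noise

mask, score = abc_feature_selection(n_features=8, fitness=toy_fitness)
print(mask, score)
```

In practice the fitness would be replaced by a cross-validated score of the downstream classifier on the selected columns, computed on training folds only so that the search does not see the test data.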

The variables used by the best predicting model were identified. According to the result of the best prediction score, relationships between subfertility and nutritional and pharmaceutical supplements were determined. In conclusion, even though the model identified folic acid and omega-3 as predictive variables, these findings should be regarded as exploratory correlations rather than specific treatment suggestions. This work’s main innovation is its methodological framework, a hybrid LR-ABC model that combines feature selection and metaheuristic optimization to investigate the results of IVF. Therefore, this study should be considered a proof-of-concept demonstration of the potential of ML–ABC approaches in reproductive medicine. Before making clinical judgments, validation with larger, multi-center datasets that include dosage, duration, and baseline dietary intake will be required in the future.

It is important to stress that these findings reflect correlations between supplement use and IVF outcomes and should not be interpreted as evidence of treatment effects. Clinical recommendations cannot be made without prospective randomized validation.

The objective of this study is to investigate predictive signals for outcomes in IVF/ART. The study presents a method-focused workflow that combines engineered clinical variables with an ABC-assisted selection/tuning strategy. The findings suggest that in a controlled, leakage-safe assessment, regularly recorded variables, such as binary supplement indicators, can facilitate associational modelling. Importantly, clinical use is not the goal of these models. The lack of dose, duration, adherence, dietary intake, and objective biomarkers restricts interpretability, and binary coding of supplementation cannot recover true nutrient status or biological effect. Furthermore, some of the observed associations may be explained by unmeasured confounding, such as socioeconomic status, access to healthcare, clinician prescribing patterns and indications, and lifestyle factors. The current findings should be interpreted as proof of concept rather than practical advice due to these limitations and the size of the dataset.

5. Research limitations and future work

This study has a number of significant limitations. Firstly, the sample size (N = 162) is comparatively small for training and assessing machine learning models with more than 20 predictors, and the robustness of the evaluation is further limited by the small number of test cases (n = 33). This increases the likelihood of overfitting, particularly for hybrid models incorporating metaheuristic optimizers such as ABC. Several measures were taken to reduce this risk. First, 5-fold stratified cross-validation was employed to provide an objective assessment of model performance. Second, SMOTE oversampling was applied within each training fold to balance the dataset without causing leakage. Third, a preliminary feature selection step using logistic regression was carried out to reduce dimensionality before the ABC optimization. Although these precautions are intended to reduce overfitting and increase generalizability, they cannot completely overcome the limitations of the small dataset. The reported accuracy and F-scores should therefore be interpreted with caution and are unlikely to be applicable to broader clinical populations without further validation.
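The leakage-safe ordering described above (split first, then oversample only the training portion of each fold) can be sketched with the standard library alone. The data below are randomly generated with class sizes of 99 and 63, and `smote_like` is a simplified stand-in for SMOTE that keeps only its core interpolation step; all names are hypothetical and this is not the study's code.

```python
import random

def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds while preserving class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def smote_like(minority, n_new, rng, k_neighbors=3):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its nearest minority-class neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        mate = rng.choice(others[:k_neighbors])
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic

rng = random.Random(0)
# Hypothetical data: 99 majority-class (0) and 63 minority-class (1) samples.
labels = [0] * 99 + [1] * 63
features = [[rng.gauss(y, 1.0), rng.gauss(-y, 1.0)] for y in labels]

summary = []
for test_idx in stratified_folds(labels, k=5, rng=rng):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    x_min = [features[i] for i in train_idx if labels[i] == 1]
    n_maj = sum(1 for i in train_idx if labels[i] == 0)
    # Oversample from training rows only, so that no information from the
    # held-out fold enters the synthetic samples.
    x_syn = smote_like(x_min, n_maj - len(x_min), rng)
    summary.append((len(test_idx), n_maj, len(x_min) + len(x_syn)))

for row in summary:
    print("test size, majority in train, minority after oversampling:", row)
```

The held-out fold is never oversampled, so evaluation reflects the original class distribution. Applying the oversampling before splitting would let synthetic points derived from test cases appear in the training data and inflate the scores.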

Second, the dataset was retrospective and observational, which limits the ability to infer causality. While certain nutritional and clinical features (e.g., omega 3, folic acid) were found to be predictive, these associations cannot be interpreted as causal effects without further prospective or randomized clinical trials.

Thirdly, supplement intake was encoded only as binary “active ingredient” variables (coded as 1 if the reported intake met 100% of the recommended daily intake). Detailed dietary intake data (e.g., habitual nutrient consumption from food sources) were not included because reliable, standardized dietary records were unavailable in the patient dataset. Notably, this binary coding overlooks crucial clinical information, including dosage levels (for instance, 100% versus 500% of the recommended intake) and usage duration (for instance, short-term versus long-term supplementation). The absence of baseline dietary intake data is a further limitation: a patient who naturally consumes a diet high in omega-3 fatty acids but does not take supplements would be assigned the label “0,” despite having adequate nutrient levels. These constraints reduce the biological precision of the “active ingredient” variables and render them clinically simplistic and potentially misleading, because they ignore fundamental determinants of biological effect such as dose, duration, and baseline nutrition. Therefore, while the identification of omega-3 and folic acid as predictive factors is consistent with established literature, these findings should be interpreted as exploratory, hypothesis-generating correlations rather than as robust novel clinical evidence.
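The information lost through this coding can be seen in a small example. The sketch below applies the threshold rule described in the text to four hypothetical patients; the function name `encode_active_ingredient` and the patient values are illustrative only.

```python
def encode_active_ingredient(percent_of_rdi):
    """Binary coding described in the text: 1 if reported supplement intake
    reaches at least 100% of the daily requirement, otherwise 0."""
    return 1 if percent_of_rdi >= 100 else 0

# Hypothetical patients; values are supplement intake as a percentage of the
# recommended daily intake (dietary intake from food is not recorded).
patients = {
    "A": 100,  # meets the requirement exactly
    "B": 500,  # five times the requirement
    "C": 0,    # no supplement, possibly a nutrient-rich diet
    "D": 90,   # just below the requirement
}

encoded = {pid: encode_active_ingredient(v) for pid, v in patients.items()}
print(encoded)
```

Patients A and B receive the same code despite a fivefold difference in dose, and patients C and D receive the same code although their exposures differ, which is why dose-response relationships cannot be recovered from the encoded variables.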

Our exposure variables for micronutrient use were encoded binarily (e.g., supplement taken vs. not taken at a threshold such as ≥100% RDI). This encoding is pragmatic but cannot recover true nutrient status or biological effect for several reasons. First, it discards dose and duration information, precluding any dose–response assessment and ignoring cumulative exposure. Second, it assumes a uniform effect across brands and formulations, while bioavailability varies markedly with chemical form, excipients, co-ingested foods, and timing of intake. Third, binary use does not reflect adherence (frequency/consistency) or timing relative to the biological window of interest (e.g., peri-procedural vs. long-term use), both of which influence physiological impact. Fourth, we lack baseline nutritional status and objective biomarkers (e.g., serum folate, DHA, and vitamin D); thus, we cannot distinguish deficiency correction from supraphysiologic exposure, nor can we account for inter-individual differences in absorption and metabolism. Finally, supplement use is correlated with broader health behaviors and socioeconomic factors; with only binary indicators, residual confounding and non-differential misclassification are likely, which can attenuate or unpredictably bias associations.

Our observed associations may be partly explained by unmeasured confounding. First, socioeconomic status (SES)—including education, income, insurance coverage, and neighborhood deprivation—can influence both supplement use (ability to purchase higher-quality products, better adherence) and clinical outcomes (health literacy, earlier presentation, healthier baseline). Second, access to healthcare (clinic proximity, appointment availability, out-of-pocket costs, private vs. public care) may differentially shape care pathways and follow-up, thereby confounding the link between supplement use and outcomes. Third, clinician prescribing patterns introduce confounding by indication: clinicians may recommend or escalate supplementation preferentially for patients they judge at higher (or lower) risk based on unrecorded clinical cues; practice style, brand/formulation preferences, and evolving guidelines can also vary across clinicians and time. Together with other lifestyle and clinical factors that we could not fully measure (diet quality, physical activity, smoking, comorbidities, baseline nutrient status, time-to-treatment, lab protocols, and calendar-time effects), these sources of confounding and non-differential misclassification of exposure may attenuate or bias associations in unpredictable directions. Accordingly, findings should be interpreted as associations with reported supplement use rather than causal effects.

Another important point is that the dataset only includes Turkish nationals from one medical facility. While this provides valuable context-specific insights, it restricts the applicability of the findings to larger, more diverse populations. The applicability of these predictors in other countries may be affected by differences in culture, diet, and healthcare systems.

Although the repeated five-fold cross-validation design yields a robust internal estimate of model stability, true external generalization across populations, clinical settings, and data acquisition conditions remains untested. This may restrict the generalizability of the findings. Future work should aim to perform external validation using prospective datasets to confirm the reproducibility and clinical applicability of the proposed hybrid LR–ABC framework.

The results of the study should be interpreted as associational rather than causative. While machine learning models identify statistical patterns between nutritional supplement variables and IVF outcomes, these associations do not necessarily indicate the effectiveness of treatment or clinical benefit. The present study adopts a methodological approach and does not seek to provide recommendations for personalised treatment or clinical decision-making. To support causal inference and validate these initial associations in larger and more diverse patient populations with a wider variety of biological variables, future research should strive to include prospective data collection, accurate nutrient dosage information, molecular biomarkers, and longitudinal follow-up outcomes. Using such variables would help distinguish correlational associations from mechanistic effects, allowing the framework to evolve from probabilistic prediction toward causal inference.

6. Codes

https://github.com/ugurejder/ABC_IVF/blob/main/Gebelik_calisma_revision.ipynb

Supporting information

S1 Appendix. Supplementary Tables and Algorithm Details.

(DOCX)

pone.0336846.s001.docx (28KB, docx)

Abbreviations:

ABC

Artificial bee colony

ART

Assisted reproductive technologies

CART

Classification And Regression Tree

DHA

Docosahexaenoic Acid

IVF

In vitro fertilization

KNN

K-Nearest Neighbors

ML

Machine Learning

SVM

Support Vector Machines

WHO

World Health Organization

Acc

Accuracy

Data Availability

https://github.com/ugurejder/ABC_IVF/blob/main/IVF_english_dataset.xlsx

Funding Statement

This study was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) through a publication incentive program. TÜBİTAK had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Decision Letter 0

Ayman Swelum

2 Jul 2025

Dear Dr. EJDER,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR: Please respond carefully to all reviewers' comments.

==============================

Please submit your revised manuscript by Aug 16 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Ayman A Swelum

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement:

[This study was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) through a publication incentive program. TÜBİTAK had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.].

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement.

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3. Thank you for stating the following in your manuscript:

[This study was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) through a publication incentive program. TÜBİTAK had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.]

We note that you have provided funding information that is currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

[This study was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) through a publication incentive program. TÜBİTAK had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.]

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

4. We note that you have indicated that there are restrictions to data sharing for this study. For studies involving human research participant data or other sensitive data, we encourage authors to share de-identified or anonymized data. However, when data cannot be publicly shared for ethical reasons, we allow authors to make their data sets available upon request. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

Before we proceed with your manuscript, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., a Research Ethics Committee or Institutional Review Board, etc.). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories. You also have the option of uploading the data as Supporting Information files, but we would recommend depositing data directly to a data repository if possible

Please update your Data Availability statement in the submission form accordingly.

5. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance. We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process.

6. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Partly

Reviewer #4: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Yes

Reviewer #4: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: Yes

**********

Reviewer #1: The issue being discussed, i.e., infertility, is associated with considerable psychological burden and so necessitates constant evaluation.

However, the concept of the role of nutritional components in treatment of infertility leaves a lot of room for exploration and uncertainties.

The uncertainties justify a review and reexamination.

Reviewer #2: This manuscript presents a hybrid machine learning approach (ABC algorithm with KNN, CART, SVM, RF) to identify nutritional supplements linked to IVF success. While clinically relevant with novel optimization algorithms, major methodological gaps, insufficient validation, and presentation issues undermine conclusions.

Major Concerns

A. Sample Size and Data Validity

Only 162 patients (33 for testing) are insufficient for reliable ML validation, especially with class imbalance (38% success rate). The transformation of drugs into "active substances" lacks transparency: how were supplements quantified? Was the patient's diet considered?

B. Methodological Flaws

The ABC algorithm implementation lacks parameter details (colony size, iterations). Claims of 96-98% accuracy are suspect without ablation studies. Feature selection inconsistency exists between logistic regression usage and ABC-LR hybrids. Simple models (RF: 81.81%) sometimes outperform hybrids (LR-KNN: 75.75%), contradicting superiority claims.

C. Reproducibility Issues

Figures 1-6 are missing or uninterpretable. Code availability ("on request") violates PLOS ONE requirements for public deposition. The Mendeley data link needs verification.

D. Clinical Interpretation

The study implies causality between DHA/folic acid and improved outcomes despite an observational design. Confounding factors are unaddressed. Supplement dosage and duration, critical for clinical relevance, are missing.

1. Repetitive abstract/introduction content.

2. Terminology errors: "IVR" (typo for IVF), "Folk acid".

3. Table issues: missing mean/SD values, incorrect descriptions.

4. Reference formatting problems

E. Recommendations

1. Methodology: Use cross-validation or larger datasets; detail ABC optimization; clarify feature engineering

2. Analysis: Report precision/recall/F-scores; address class imbalance with SMOTE/AUC-ROC

3. Presentation: Provide high-resolution figures, correct terminology; deposit code publicly.

4. Clinical Context: Discuss limitations; differentiate correlation vs. causation

The hybrid ABC-ML approach shows promise but requires major methodological revisions, expanded validation, and improved presentation to meet PLOS ONE standards. The current inadequate support undermines the study's conclusions despite addressing a clinically important topic.

Reviewer #3: This study proposes a hybrid machine learning approach (notably, Artificial Bee Colony–ABC) to identify the most influential nutritional and pharmaceutical supplements in infertility treatment among women undergoing IVF. Using a dataset of 162 patients, the authors applied KNN, CART, SVM, and RF models—alone and in hybrid form with Logistic Regression and ABC—for prediction. The best-performing hybrid model (ABC-LR-KNN and ABC-LR-SVM) achieved an F-score of 98.46%. DHA and folic acid were found to be the most influential supplements.

There is an issue with the small sample size (162 patients), which is insufficient for robust machine learning, especially with 21 input variables and data imbalance (99 successful vs. 63 unsuccessful cases), and with limited generalizability. I suggest you add bootstrapping, cross-validation, or testing with an external validation set to demonstrate model robustness.

Another major issue with this study is the lack of biological context and clinical validation. While DHA and folic acid are biologically plausible, the study lacks a deeper clinical or mechanistic justification. It is advisable to include literature synthesis linking nutrients to ovarian function, oocyte quality, implantation, etc.

Another problem with this study is the risk of overfitting in the hybrid models. The extremely high F-score (98%) on a small dataset is a red flag for overfitting. It is best to include learning curves or stratified k-fold validation. Discuss the bias-variance trade-off and model calibration.

There is also an issue with Confounding and Feature Interpretation. There is no information on control of potential confounders like age, baseline AMH levels, BMI, or cause of infertility. I would suggest that at this stage you should consider using SHAP values or LIME to interpret individual feature importance beyond black-box accuracy scores.

Finally another major issue is Model Reproducibility and Transparency. The data is partially shared via Mendeley but code is only available upon request. You should deposit pre-processing scripts and models in a public repository (e.g., GitHub with DOI). Include pseudo-code for ABC-LR hybridization.

The general layout is confusing and repetitive. The layout should be: 1) Introduction, in which you introduce the subject with references to previous studies and specify your objectives clearly at the end; 2) Materials and Methods, in which you describe the methods used in the study; 3) Results, in which you report all the results you obtained without discussing them; 4) Discussion, in which you discuss your methods and results; and finally 5) References, a list of the references used in the manuscript, written in the journal style.

In your manuscript there is a lot of repetition and confusion, with discussion in the methods section and then repeated in the discussion.

Other minor comments are:

In the Abstract, there is a typo in the word “hybrid” which is written as “hyrid.” Also, make the contribution clearer—what was new compared to previous studies?

In the Methods section, more clarity is needed on how the data were split for training/testing. Was it random, stratified, or temporal?

In the Results section, Table 3 and Figures 3–6 should include standard deviation/confidence intervals for performance metrics. In all the tables, the commas in the numbers should be replaced by dots. Table 2, no. 16 ‘Is Zinc used by the patient?’ should be replaced by ‘Is Melatonin used by the patient?’ The title of Table 1 has a spelling mistake - ’Original’ instead of ‘orijinal’. In Table 1, nos, 4,8, and all are left blank. Does this mean that you did not have the information? If the information is there, then it should be mentioned even if it cannot be digitized, as this is important and useful information. There is another typo in Table 2 ‘status’ is written as ‘statu’.

In the discussion, there is redundant repetition of results in the first few paragraphs—consider trimming and emphasizing implications. The flowcharts (Figures 1 & 2) and accuracy figures (3–5) are helpful but lack legends, units, and captions explaining axes.

Grammar needs polishing in multiple areas; some sentences are overly long or ambiguous. E.g., "The Informations..." or "the most effective first five compounds...". Address grammar/clarity issues throughout the manuscript.

Reviewer #4: An informative, valuable study addressing one of the least researched aspects of subfertility, namely alternative therapies such as nutritional supplements. However, I have a few comments:

- Design of the study should be clearly mentioned in the methodology section as well as linked to the title of the study.

- References: please cite recent ones (we are now in 2025).

- Terminology: the term infertility is now obsolete and has been replaced with subfertility; please edit accordingly.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #1: Yes:  Olubukola Adeponle Adesina

Reviewer #2: No

Reviewer #3: No

Reviewer #4: Yes:  Mohsen M A Abdelhafez

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Nov 25;20(11):e0336846. doi: 10.1371/journal.pone.0336846.r002

Author response to Decision Letter 1


7 Aug 2025

Reviewer 1: The uncertainties justify a review and reexamination.

Response: Thank you for this observation. We agree that some aspects of the study required clearer articulation to reduce ambiguity. We have revised the Discussion section to acknowledge more explicitly the uncertainty in the results and the limitations of model interpretation. The following explanation was added to the manuscript: "Although the model shows high predictive performance, its generalizability is constrained by the limited cohort size and potential confounders not accounted for. This necessitates further validation in larger, more diverse datasets." For this reason, stratified k-fold validation was applied to the dataset.

Reviewer 2: The transformation of drugs into "active substances" lacks transparency how were supplements quantified? Was the patient's diet considered?

Response: Thank you for your valuable comment. Table 2 is derived from Table 1. The active ingredients in Table 2 were obtained as follows. If a patient has been exposed to No. 6 in Table 1, a list of the active ingredients in the medication the patient is using is generated, and the percentage of the daily intake that each provides is determined. If the daily intake is met, the patient is labeled as using this medication, i.e., coded as 1 in the dataset. Figure 1 explains how the active ingredients are labeled for the patients.
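The labeling rule described in this response can be sketched roughly as follows. This is only an illustration under stated assumptions: the product names, ingredient doses, reference intakes, the summing of doses across products, and the function name are all hypothetical and are not taken from the study data or the authors' code.

```python
def label_active_ingredients(products_taken, composition, daily_reference, threshold=1.0):
    """Return {ingredient: 0 or 1}; 1 means the summed dose reaches the reference intake."""
    totals = {name: 0.0 for name in daily_reference}
    for product in products_taken:
        for ingredient, dose in composition.get(product, {}).items():
            if ingredient in totals:
                totals[ingredient] += dose
    return {
        name: int(totals[name] >= threshold * daily_reference[name])
        for name in daily_reference
    }


# Hypothetical values for illustration only; not taken from the study data.
composition = {
    "product_a": {"folic_acid_ug": 400.0, "dha_mg": 100.0},
    "product_b": {"dha_mg": 150.0},
}
daily_reference = {"folic_acid_ug": 400.0, "dha_mg": 200.0, "zinc_mg": 8.0}

row = label_active_ingredients(["product_a", "product_b"], composition, daily_reference)
print(row)
```

A binary encoding of this kind discards dose and duration, which is the limitation the reviewers raise about the supplement variables.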

Reviewer 2: Only 162 patients (33 for testing) are insufficient for reliable ML validation, especially with class imbalance (38% success rate).

Response: Thank you for your valuable comment. Section "2.3.7. Cross-Validation Strategy and Imbalance Data Handling" was added to the manuscript to address the class imbalance and the limited dataset size.

Reviewer 2: The ABC algorithm implementation lacks parameter details (colony size, iterations). Claims of 96-98% accuracy are suspect without ablation studies.

Response: Thank you for your valuable comment. The following text was added to the manuscript: "In this study, the ABC algorithm parameters are the number of bees, the number of iterations, and the abandonment limit, set to 20, 10, and 0.01, respectively. It is important to note that, while these parameters can be adjusted, parameter tuning is not the primary focus of the present study."
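To show where these three parameters enter the algorithm, here is a minimal sketch of a generic Artificial Bee Colony loop that minimizes a toy objective. It is not the authors' LR–ABC implementation. Treating the number of bees as the number of food sources, using an integer abandonment limit of 5 (rather than the 0.01 quoted above, whose interpretation is not stated), and the sphere objective are all assumptions made for illustration.

```python
import random


def abc_minimize(objective, dim, bounds, n_sources=20, iterations=10, limit=5, seed=0):
    """Minimal Artificial Bee Colony sketch for minimizing a non-negative objective."""
    rng = random.Random(seed)
    low, high = bounds

    def random_source():
        return [rng.uniform(low, high) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = sources[costs.index(best_cost)][:]

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to a random partner k != i.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = sources[i][:]
        step = rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        candidate[j] = min(high, max(low, candidate[j] + step))
        cost = objective(candidate)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(iterations):
        for i in range(n_sources):  # employed bee phase
            try_neighbor(i)
        weights = [1.0 / (1.0 + c) for c in costs]  # fitness used by onlookers
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbor(i)  # onlooker bee phase
        current = min(costs)  # memorize the best source before scouts replace any
        if current < best_cost:
            best_cost = current
            best = sources[costs.index(current)][:]
        for i in range(n_sources):  # scout bee phase
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
    return best, best_cost


def sphere(x):
    """Toy objective (assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


best, best_cost = abc_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
print(best_cost)
```

In a hybrid setting, the objective would be replaced by a cross-validated loss of the downstream classifier, so each evaluation is far more expensive than in this toy example.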

Reviewer 2: Feature selection inconsistency exists between logistic regression usage and ABC-LR hybrids.

Response: Thank you for your valuable comment. The result sets were updated after the revisions.

Reviewer 2: Simple models (RF: 81.81%) sometimes outperform hybrids (LR-KNN: 75.75%), contradicting superiority claims.

Response: Thank you for your valuable comment. Following the aforementioned revisions, the result sets were updated. However, it is not uncommon for simpler models to yield superior outcomes. Consequently, the objective is to identify a more effective prediction model, a process that necessitates a comparative analysis of all models.

Reviewer 2: Figures 1-6 are missing or uninterpretable. Code availability ("on request") violates PLOS ONE requirements for public deposition. The Mendeley data link needs verification.

Response: "Thank you for your valuable warning. The code and dataset were added to the relevant parts of the manuscript. 6. Data availability

https://github.com/ugurejder/ABC_IVF/blob/main/IVF_english.xlsx

7. Codes

https://github.com/ugurejder/ABC_IVF/blob/main/Gebelik_calisma_revision.ipynb "

Reviewer 2: Confounding factors are unaddressed. Supplement dosage and duration, critical for clinical relevance, are missing.

Response: Thank you for your valuable comment. The description has been incorporated into both the section entitled 'Data preprocessing' and the appendix.

Reviewer 2: Repetitive abstract/introduction content

Response: Thank you for your valuable comment. The repetitive content we identified was removed from the manuscript.

Reviewer 2: Terminology errors: "IVR" (typo for IVF), "Folk acid".

Response: Thank you for your valuable comment. The errors have been corrected.

Reviewer 2: Table issues: missing mean/SD values, incorrect descriptions.

Response: "Thank you for this valuable observation. We have thoroughly reviewed all tables and made the following changes: Added mean ± standard deviation (SD) where appropriate, particularly for continuous variables in demographic and outcome tables.

Revised all table descriptions and variable labels to ensure clarity and accuracy. These updates can be found in Table 1 and Table 3."

Reviewer 2: Reference formatting problems

Response: Thank you for this valuable observation. References were updated.

Reviewer 2: Use cross-validation or larger datasets; detail ABC optimization; clarify feature engineering

Response: "Thank you for your valuable comment. A cross-validation strategy was carried out, and a description was added to Section 2.3.6, Proposed Models, of the manuscript."

Reviewer 2: Report precision/recall/F-scores; address class imbalance with SMOTE/AUC-ROC

Response: Thank you for your valuable comment. SMOTE was applied to the dataset, and precision, recall, and F-score are reported in Table 3.

Reviewer 2: Provide high-resolution figures, correct terminology; deposit code publicly.

Response: "Thank you for your valuable comment. The figures were recreated at high resolution, and the terminology was corrected. 6. Data availability

https://github.com/ugurejder/ABC_IVF/blob/main/IVF_english.xlsx

7. Codes

https://github.com/ugurejder/ABC_IVF/blob/main/Gebelik_calisma_revision.ipynb. The data and code have been added to a public repository."

Reviewer 2: Discuss limitations; differentiate correlation vs. causation

Response: Thank you for your valuable comment. An explanation was added to the Discussion section of the manuscript. The added paragraph is: "It is important to note that the findings presented in this study reflect correlations between clinical and demographic features and IVF treatment outcomes. Due to the observational and retrospective nature of the dataset, the model cannot infer causality. For instance, while variables such as age, DHA, or Omega 3 levels were found to be predictive, these associations do not imply that these features directly cause the success or failure of treatment. Further prospective studies and clinical trials are required to determine causal relationships and validate these findings."

Reviewer 3: "There is an issue with the small sample size (162 patients), which is insufficient for robust machine learning, especially with 21 input variables and data imbalance (99 successful vs. 63 unsuccessful cases), and limited generalizability."

Response: Thank you for your valuable comment. Section 2.3.7, Cross-Validation Strategy and Imbalance Data Handling, was added to the manuscript to address the generalizability problem.

Reviewer 3: I suggest you add bootstrapping, cross-validation, or test with an external validation set to demonstrate model robustness.

Response: Thank you for your valuable comment. Cross-validation and imbalance handling were carried out using stratified k-fold validation and the synthetic minority over-sampling technique (SMOTE). The information in Table 3, Figure 4 and Figure 5 has been updated. The new code was added to GitHub.
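
The combination of stratified k-fold validation and oversampling can be sketched as below, using only the Python standard library. This is an illustration under stated assumptions and not the authors' pipeline. The feature values are synthetic, `smote_like` is a simplified stand-in for SMOTE rather than a library implementation, and oversampling is applied only to the training folds, which is one common way to avoid leakage but is not stated in the response. The class counts 99 and 63 are the ones quoted by Reviewer 3.

```python
import random


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def smote_like(X, y, rng, k_neighbors=3):
    """Balance classes by interpolating between minority points and near neighbours."""
    counts = {c: y.count(c) for c in set(y)}
    minority = min(counts, key=counts.get)
    need = max(counts.values()) - counts[minority]
    pool = [x for x, c in zip(X, y) if c == minority]
    X_new, y_new = list(X), list(y)
    if len(pool) < 2:
        return X_new, y_new  # not enough minority points to interpolate
    for _ in range(need):
        a = rng.choice(pool)
        # Rank the other minority points by squared Euclidean distance to a.
        others = sorted(
            (p for p in pool if p is not a),
            key=lambda p: sum((u - v) ** 2 for u, v in zip(a, p)),
        )
        b = rng.choice(others[:k_neighbors])
        t = rng.random()
        X_new.append([u + t * (v - u) for u, v in zip(a, b)])
        y_new.append(minority)
    return X_new, y_new


rng = random.Random(0)
y = [1] * 99 + [0] * 63                      # class counts quoted by Reviewer 3
X = [[rng.gauss(c, 1.0), rng.gauss(-c, 1.0)] for c in y]  # synthetic features

for test_idx in stratified_folds(y, 5, rng):
    held_out = set(test_idx)
    X_tr = [x for i, x in enumerate(X) if i not in held_out]
    y_tr = [c for i, c in enumerate(y) if i not in held_out]
    X_bal, y_bal = smote_like(X_tr, y_tr, rng)
    # A model would be fitted on (X_bal, y_bal) and scored on the untouched test fold.
    print(len(test_idx), y_bal.count(0), y_bal.count(1))
```

Keeping the synthetic samples out of the test fold means each fold's score is computed only on real patients.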

Reviewer 3: "Another problem with this study is the authors overfitting Risk in Hybrid Models.

The extremely high F-score (98%) on a small dataset is a red flag for overfitting.

It is best to include learning curves or stratified k-fold validation.

Discuss the bias-variance trade-off and model calibration."

Response: Thank you for your valuable comment. The synthetic minority over-sampling technique was applied to handle the class imbalance. The information in Table 3, Figure 4 and Figure 5 has been updated. The new code was added to GitHub.

Reviewer 3: "There is also an issue with Confounding and Feature Interpretation.

There is no information on control of potential confounders like age, baseline AMH levels, BMI, or cause of infertility.

I would suggest that at this stage you should consider using SHAP values or LIME to interpret individual feature importance beyond black-box accuracy scores."

Response: Thank you for your valuable comment. A LIME explanation of an individual prediction of embryo transfer outcome was added to the manuscript to interpret individual feature importance beyond black-box accuracy scores. An explanation of confounders was added to the Discussion section.

Reviewer 3: "Finally another major issue is Model Reproducibility and Transparency. The data is partially shared via Mendeley but code is only available upon request.

You should deposit pre-processing scripts and models in a public repository (e.g., GitHub with DOI). Include pseudo-code for ABC-LR hybridization. "

Response: "Thank you for your valuable comment. 6. Data availability

https://github.com/ugurejder/ABC_IVF/blob/main/IVF_english.xlsx

7. Codes

https://github.com/ugurejder/ABC_IVF/blob/main/Gebelik_calisma_revision.ipynb. The data and code have been added to a public repository. Pseudo-code for the ABC-LR hybridization was written and added to the manuscript."

Reviewer 3: In the Abstract, there is a typo in the word “hybrid” which is written as “hyrid.” Also, make the contribution clearer—what was new compared to previous studies?

Response: Thank you for your valuable comment. The typo was corrected. The following explanation was added to the manuscript: "In contrast to earlier research that only looked at clinical or hormonal predictors for IVF success, our work uses a hybrid Artificial Bee Colony–Logistic Regression (ABC–LR) framework to integrate lifestyle, nutrition, and a few clinical variables. The use of a bio-inspired metaheuristic (ABC) algorithm to optimize feature selection based on logistic regression fitness and the interpretability of the resulting model through LIME analyses, which provide transparency in identifying key predictors, are the two primary areas of this study's novelty. To the best of our knowledge, this is one of the first studies to integrate these approaches in order to investigate the potential independent contributions of modifiable patient behaviors to the success of embryo transfers."

Reviewer 3: In the Methods section, more clarity is needed on how the data were split for training/testing. Was it random, stratified, or temporal?

Response: Thank you for your valuable comment. Stratified k-fold cross-validation was the technique used to assess the models.

Reviewer 3: In the Results section, Table 3 and Figures 3–6 should include standard deviation/confidence intervals for performance metrics.

Response: Thank you for your valuable comment. Standard deviations and confidence levels were added to the table, and this is also illustrated in the figure.

Reviewer 3: In all the tables, the commas in the numbers should be replaced by dots.

Response: Thank you for your valuable comment. The errors have been corrected.

Reviewer 3: Table 2, no. 16 ‘Is Zinc used by the patient?’ should be replaced by ‘Is Melatonin used by the patient?’

Response: Thank you for your valuable comment. The error has been corrected.

Reviewer 3: The title of Table 1 has a spelling mistake - ’Original’ instead of ‘orijinal’.

Response: Thank you for your valuable comment. The error has been corrected.

Reviewer 3: In Table 1, nos, 4,8, and all are left blank. Does this mean that you did not have the information?

Response: Thank you for your valuable comment. This information is not meaningful for the machine learning model. It is raw data collected from the survey.

Reviewer 3: If the information is there, then it should be mentioned even if it cannot be digitized, as this is important and useful information.

Response: Thank you for your valuable comment. This information was removed from the manuscript.

Reviewer 3: There is another typo in Table 2 ‘status’ is written as ‘statu’.

Response: Thank you for your valuable comment. The error has been corrected.

Reviewer 3: In the discussion, there is redundant repetition of results in the first few paragraphs; consider trimming and emphasizing implications.

Response: Thank you for your valuable comment. The redundant, repeated paragraphs were removed from the manuscript.

Reviewer 3: The flowcharts (Figures 1 & 2) and accuracy figures (3–5) are helpful but lack legends, units, and captions explaining axes.

Response: Thank you for your valuable comment. The figures have been revised.

Reviewer 3: Grammar needs polishing in multiple areas; some sentences are overly long or ambiguous. E.g., "The Informations..." or "the most effective first five compounds...". Address grammar/clarity issues throughout the manuscript.

Response: Thank you for your valuable comment. Grammar issues have been corrected.

Reviewer 4: Design of the study should be clearly mentioned in the methodology section as well as linked to the title of the study.

Response: Thank you for your valuable comment. Algorithm 1 was added to the manuscript, and the design of the study is now described in Section 2, Materials and methods.

Reviewer 4: Please cite recent references (we are now in 2025).

Response: Thank you for your valuable comment. Recent articles have also been added to the Introduction section of the manuscript.

Reviewer 4: The term "infertility" is now obsolete and has been replaced with "subfertility"; please edit accordingly.

Response: Thank you for your valuable comment. The term "infertility" has been replaced with "subfertility".

Attachment

Submitted filename: Respond to reviewers.txt

pone.0336846.s003.txt (13.2KB, txt)

Decision Letter 1

Ayman Swelum

3 Sep 2025

Dear Dr. EJDER,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR: Please respond to all reviewers' comments carefully.

Please submit your revised manuscript by Oct 18 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Ayman A Swelum

Academic Editor

PLOS ONE

Journal Requirements:

If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise. 

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #2: All comments have been addressed

Reviewer #3: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes

Reviewer #3: No

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

Reviewer #3: Yes

**********

Reviewer #2: The authors have successfully addressed the bulk of reviewer feedback, especially items concerning methodological disclosure, code and data sharing protocols, and statistical validity. However, dietary data treatment and supplement dosage/duration information need better explanation. The hybrid model performance claims should be presented with greater restraint.

Request minor revisions to:

· Clarify dietary data inclusion (or justify exclusion).

· Temper conclusions about hybrid model superiority.

Additional review is not warranted once these elements are corrected. The manuscript shows significant progress and meets the majority of publication benchmarks.

Reviewer #3: The authors have made significant efforts to address reviewer comments, including implementing cross-validation, handling class imbalance with SMOTE, improving transparency by publishing code and data, and adding interpretability via LIME. However, several critical methodological and conceptual issues remain that substantially undermine the validity, reliability, and clinical interpretability of the findings.

The sample size (N=162, with only 33 used for testing initially) is critically small for a machine learning study with 21+ input variables. This is especially true for a hybrid model involving a metaheuristic optimizer (ABC), which is prone to overfitting. While the use of 5-fold cross-validation mitigates this to some degree, the absolute number of samples remains a severe limitation. The results, particularly the very high accuracy and F-scores (~90%), are highly suspect and likely reflect overfitting to the specific cohort rather than a generalizable model. The generalizability of the findings is extremely limited. Claims of model efficacy (e.g., 91% accuracy) are not credible for real-world clinical application based on this dataset alone.

The process of transforming drug names into "active ingredients" is the study's most novel aspect, but remains its biggest weakness. The method described (labeling a patient as "1" for a supplement if their intake meets 100% of the daily requirement) is overly simplistic and clinically naive. This binary transformation completely ignores the dosage (was it 100% or 500% of the daily requirement?) and duration (taken for a week vs. a year) of supplementation, which are fundamental to its biological effect. This renders the "active ingredient" variables nearly meaningless from a clinical perspective.

As pointed out by a reviewer, the patient's baseline dietary intake of these nutrients is not taken into account. A patient eating a diet rich in Omega-3s might be labeled "0" for not taking a supplement, while their actual nutrient levels could be high. The identified "key factors" (Omega-3, Folic Acid) are likely correct based on established literature, but the study's methodology does not provide robust, novel evidence to support this. It merely shows that a crude binary representation of supplement use has predictive value in a small, overfit model.

The performance metrics reported are difficult to trust due to the high risk of overfitting. The improvement from simple models (e.g., RF: 85.19%) to hybrid models (e.g., ABC-LR-RF: 90.73%) is marginal and could easily be due to random chance, especially given the small sample size and the use of cross-validation within the optimization loop. An ablation study showing the standalone contribution of the ABC algorithm is missing. The recall for ABC-LR-RF is reported as 95.38%, which is astronomically high for a biological outcome like IVF success and is a classic red flag for overfitting or data leakage. The central claim of the paper—that the hybrid ABC model is highly effective—is not sufficiently proven. The results section reads more like an optimization exercise than a robust validation of a predictive model.
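
One rough way to examine the "random chance" concern is to compare binomial confidence intervals for the two reported accuracies. The sketch below computes 95% Wilson score intervals, treating each accuracy as a proportion of N = 162 independent predictions. That independence is a simplifying assumption, since cross-validated predictions with SMOTE are not independent draws, so the numbers are only indicative.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion p_hat observed on n trials."""
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


n = 162  # cohort size reported in the study
for name, acc in [("RF", 0.8519), ("ABC-LR-RF", 0.9073)]:
    lo, hi = wilson_interval(acc, n)
    print(f"{name}: accuracy {acc:.4f}, 95% interval {lo:.3f} to {hi:.3f}")
```

Under these assumptions the two intervals overlap, which is consistent with the caution expressed above, although overlap alone is not a formal significance test.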

While the authors acknowledge this in the discussion (a good addition from the revision), the entire framing of the paper risks implying causation. The title says "to treat subfertility," and the conclusion identifies "the most significant supplements." However, the model is built on observed supplement use correlated with success. This could easily be reversed: clinicians may be more likely to prescribe these supplements to patients with a better prognosis, or more affluent/health-conscious patients (who have better outcomes) are more likely to take them. The model cannot disentangle this. The clinical recommendations are overstated. The study identifies associations, not treatment effects.

While the authors have now shared code and data (a major improvement), the quality of the documentation is poor. The GitHub repository contains a Jupyter notebook (Gebelik_calisma_revision.ipynb) but no README.md file explaining how to run it, the required dependencies, or the structure of the data. The manuscript itself is riddled with minor errors (e.g., "Random Forrest," "The Informations," "Folic acid 3" in Table 2, inconsistent numbering from Table 1 to Table 2), which reduce confidence in the overall rigor. Although technically "available," the work is difficult to reproduce or build upon.

Other Minor Issues

Figure Quality: The figures (as described in the text) are still problematic. For example, Figure 1 is described as a "flow diagram of the conversion of drugs into active substances" but its caption (on Page 38) is garbled ("Import loop... Local power to add follow-up devices"). This suggests the figures were not properly finalized.

ABC Parameter Justification: The choice of ABC parameters (20 bees, 10 iterations) is arbitrary and not justified. A sensitivity analysis would be needed to show these are appropriate.

Result Presentation: Table 3 is busy and difficult to understand. A summary table showing the best model for each algorithm type would be clearer.

To make this manuscript suitable for publication, major revisions are required:

1. The title and conclusions must be tempered. Instead of "identifying effective ingredients to treat," frame it as "identifying associations between supplement use and IVF outcomes using a novel hybrid ML approach." Emphasize the methodological contribution over the clinical recommendations.

2. Acknowledge the severe limitation of the binary supplement variable. Discuss this as a major limitation of the current study and propose how future work with more detailed data (dose, duration, baseline diet) could overcome it.

3. Remove the emphasis on the 91% accuracy claim. Instead, focus on the comparative performance between models and the utility of the LIME explanations for generating hypotheses.

4. Clean up the code repository. Add a detailed README.md file, ensure the code is well-commented, and verify that the provided data file matches the one used to generate the results in the paper.

5. A thorough proofread by a native English speaker is essential to fix grammatical errors, typos, and inconsistent terminology throughout the manuscript.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #2: Yes:  Jonah Bawa Adokwe Ph.D

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Nov 25;20(11):e0336846. doi: 10.1371/journal.pone.0336846.r004

Author response to Decision Letter 2


10 Sep 2025

Response to reviewers

Reviewer 2 : 1- Clarify dietary data inclusion (or justify exclusion).

Response : We thank the reviewer for this important observation. In the original submission, only supplement intake was encoded as binary “active ingredient” variables, while detailed dietary intake data were not included. The exclusion of dietary data was due to the absence of reliable and standardized dietary intake records in the available patient dataset. This acknowledgement has been incorporated into the revised manuscript, specifically within the section addressing research limitations.

Reviewer 2 : 2- Temper conclusions about hybrid model superiority.

Response : We thank the reviewer for this helpful suggestion. In the revised version of the manuscript, the conclusion section has been rewritten and is now highlighted in colour.

Reviewer 3 : 1- The sample size (N=162, with only 33 used for testing initially) is critically small for a machine learning study with 21+ input variables. This is especially true for a hybrid model involving a metaheuristic optimizer (ABC), which is prone to overfitting. While the use of 5-fold cross-validation mitigates this to some degree, the absolute number of samples remains a severe limitation. The results, particularly the very high accuracy and F-scores (~90%), are highly suspect and likely reflect overfitting to the specific cohort rather than a generalizable model. The generalizability of the findings is extremely limited. Claims of model efficacy (e.g., 91% accuracy) are not credible for real-world clinical application based on this dataset alone.

Response : We thank the reviewer for highlighting this important concern. We agree that the relatively small sample size (N = 162, with only 33 test cases) poses a severe limitation and increases the risk of overfitting, particularly for hybrid ABC models. To address this, we have added a dedicated Research Limitations section where we explicitly acknowledge the limited cohort size, the risk of inflated accuracy, and the need for validation on larger, multi-center datasets before clinical application. We also emphasize that our findings should be interpreted as exploratory and methodological, rather than as generalizable clinical recommendations.

Reviewer 3 : 2- The process of transforming drug names into "active ingredients" is the study's most novel aspect, but remains its biggest weakness. The method described (labeling a patient as "1" for a supplement if their intake meets 100% of the daily requirement) is overly simplistic and clinically naive. This binary transformation completely ignores the dosage (was it 100% or 500% of the daily requirement?) and duration (taken for a week vs. a year) of supplementation, which are fundamental to its biological effect. This renders the "active ingredient" variables nearly meaningless from a clinical perspective.

As pointed out by a reviewer, the patient's baseline dietary intake of these nutrients is not taken into account. A patient eating a diet rich in Omega-3s might be labeled "0" for not taking a supplement, while their actual nutrient levels could be high. The identified "key factors" (Omega-3, Folic Acid) are likely correct based on established literature, but the study's methodology does not provide robust, novel evidence to support this. It merely shows that a crude binary representation of supplement use has predictive value in a small, overfit model.

Response : We thank the reviewer for highlighting this important methodological limitation. We fully agree that the binary transformation of supplements into “active ingredient” variables is clinically simplistic, as it does not capture dosage, duration of use, or baseline dietary intake. In the revised manuscript, we have reinforced the Research Limitations section by highlighting it in colour to emphasise that this approach makes the supplementary variables clinically imprecise and potentially inaccurate. We now explicitly note that, although our model identified factors such as omega-3 and folic acid consistent with the literature, these results should be regarded as exploratory, hypothesis-generating correlations rather than robust clinical evidence. We also highlight the need for future studies to incorporate quantitative dosage, duration, and baseline nutrition data in order to produce clinically meaningful predictors.

Reviewer 3 : 3- The performance metrics reported are difficult to trust due to the high risk of overfitting. The improvement from simple models (e.g., RF: 85.19%) to hybrid models (e.g., ABC-LR-RF: 90.73%) is marginal and could easily be due to random chance, especially given the small sample size and the use of cross-validation within the optimization loop. An ablation study showing the standalone contribution of the ABC algorithm is missing. The recall for ABC-LR-RF is reported as 95.38%, which is astronomically high for a biological outcome like IVF success and is a classic red flag for overfitting or data leakage. The central claim of the paper, that the hybrid ABC model is highly effective, is not sufficiently proven. The results section reads more like an optimization exercise than a robust validation of a predictive model.

Response : We thank the reviewer for raising these crucial concerns regarding model performance and potential overfitting. We agree that the small cohort size and the marginal improvements observed between the simple and hybrid models limit the robustness of our conclusions. In the revised manuscript, we have clarified in both the 'Results' and 'Research Limitations' sections, using highlighted colour, that the observed improvements in accuracy and recall may be partially attributable to chance and cannot be interpreted as definitive evidence of superior model performance. We have also added explicit caution that the unusually high recall values, while numerically obtained under stratified cross-validation with SMOTE, are likely inflated and should be regarded as a red flag for potential overfitting rather than as clinically valid outcomes.

Regarding the absence of an ablation study, we acknowledge this as a limitation and now explicitly state that future work should include standalone evaluations of the ABC algorithm’s contribution to validate its incremental value beyond traditional models. We have revised our conclusions to present the hybrid ABC approach not as a validated clinical model, but rather as an exploratory optimization framework whose methodological novelty requires further confirmation in larger, multi-center datasets.

Reviewer 3 : 4- While the authors acknowledge this in the discussion (a good addition from the revision), the entire framing of the paper risks implying causation. The title says ‘to treat subfertility,’ and the conclusion identifies ‘the most significant supplements.’ However, the model is built on observed supplement use correlated with success. This could easily be reversed: clinicians may be more likely to prescribe these supplements to patients with a better prognosis, or more affluent/health-conscious patients (who have better outcomes) are more likely to take them. The model cannot disentangle this. The clinical recommendations are overstated. The study identifies associations, not treatment effects.

Response : We thank the reviewer for this important clarification regarding causality. We agree that our study is observational and retrospective, and therefore cannot make causal inferences or clinical recommendations. In response, we have revised the framing throughout the manuscript:

• The title has been changed to emphasize the exploratory, methodological nature of the work rather than suggesting treatment effects (page 1).

• Highlighted text has been added at the end of the conclusion section to emphasize that the study finds correlations and associations rather than causal treatment effects.

• We now explicitly note that the observed associations may reflect underlying confounding factors such as clinician prescribing patterns or patient socioeconomic/health profiles, and that the model cannot disentangle these effects.

Therefore, before any clinical conclusions can be made, our findings should only be presented as exploratory, hypothesis-generating associations that need to be confirmed in prospective or randomized studies.

Reviewer 3 : 5- While the authors have now shared code and data (a major improvement), the quality of the documentation is poor. The GitHub repository contains a Jupyter notebook (Gebelik_calisma_revision.ipynb) but no README.md file explaining how to run it, the required dependencies, or the structure of the data. The manuscript itself is riddled with minor errors (e.g., ‘Random Forrest,’ ‘The Informations,’ ‘Folic acid 3’ in Table 2, inconsistent numbering from Table 1 to Table 2), which reduce confidence in the overall rigor. Although technically ‘available,’ the work is difficult to reproduce or build upon.

Response : We thank the reviewer for recognizing our effort to make both the data and code publicly available, and we fully agree that proper documentation and presentation are essential for reproducibility and clarity. In response to this valuable feedback, we have made the following improvements:

• Added a README.md file in the GitHub repository that clearly explains how to run the Jupyter notebook, the required dependencies (via a requirements.txt file), and the structure of the dataset.

• Uploaded a requirements.txt file so that all dependencies can be installed easily, ensuring full reproducibility of the analysis.

• Carefully proofread and corrected minor errors in the manuscript, including replacing “Random Forrest” with “Random Forest,” “The Informations” with “The Information,” correcting “Folic acid 3” to “Folic acid” in Table 2, and fixing inconsistent numbering between Table 1 and Table 2.

• Revised the Methods section to provide additional clarification of the dataset structure and preprocessing steps, consistent with the documentation in the GitHub repository.

We believe these changes substantially improve the rigor, clarity, and reproducibility of the work.

Reviewer 3 : 6- Figure Quality: The figures (as described in the text) are still problematic. For example, Figure 1 is described as a ‘flow diagram of the conversion of drugs into active substances’ but its caption (on Page 38) is garbled (‘Import loop... Local power to add follow-up devices’). This suggests the figures were not properly finalized.

Response: We thank the reviewer for pointing out this error, and we sincerely apologize for the oversight. We have re-uploaded high-quality versions of all figures; the resolution may have been degraded when the submission system converted the files to PDF for review. In the revised manuscript, we have carefully reviewed and corrected all figure captions to ensure accuracy and consistency with the text. Specifically, the caption for Figure 1 has been corrected to:

“Figure 1. Flow diagram of the conversion of drugs into active substances.”

We have also double-checked all other figures (Figures 2–6) to ensure that their captions and numbering are consistent, descriptive, and finalized.

Reviewer 3 : 7- ABC Parameter Justification: The choice of ABC parameters (20 bees, 10 iterations) is arbitrary and not justified. A sensitivity analysis would be needed to show these are appropriate.

Response: We thank the reviewer for bringing up this crucial methodological issue. We admit that the original version did not adequately explain the choice of ABC parameters (20 bees, 10 iterations). We now offer a rationale in the updated manuscript that is supported by both previous research and computational viability. In particular, comparable swarm sizes (20–50 bees) and iteration ranges (10–50) have been documented as effective defaults in prior ABC applications in biomedical and optimization contexts (e.g., [Karaboga & Basturk, 2007]; [Akay & Karaboga, 2012]). We chose modest values (20 bees, 10 iterations) to minimize computational cost while guaranteeing convergence, considering the exploratory nature of our study and the relatively small dataset (N = 162). We acknowledge that a sensitivity analysis would improve the study, but we were unable to conduct one in this version because of dataset size limitations. We have stated this as a limitation and noted that future studies should systematically examine the effect of varying the ABC parameters in order to confirm their robustness.
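The sensitivity analysis discussed in this response could be sketched as follows. This is a minimal, illustrative Artificial Bee Colony implementation run on a toy objective, not the code from the authors' notebook: the names `sphere`, `abc_minimize`, and `sensitivity_grid`, the abandonment threshold `limit`, the search bounds, and the grid values are all assumptions made for the example. In the study itself the objective would be a cross-validated model score rather than a test function.

```python
import random
import statistics


def sphere(x):
    """Toy objective (assumed for illustration): minimum of 0 at the origin."""
    return sum(v * v for v in x)


def abc_minimize(objective, dim, n_bees, n_iter, bounds=(-5.0, 5.0), limit=5, seed=0):
    """Minimal ABC sketch; returns the best (lowest) objective value found.

    Assumes the objective is non-negative, so 1 / (1 + cost) is a valid fitness.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    n_sources = max(2, n_bees // 2)  # half of the colony acts as employed bees
    sources = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best = min(costs)

    def try_neighbour(i):
        # Perturb one coordinate relative to a randomly chosen other source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        c = objective(cand)
        if c < costs[i]:  # greedy selection: keep the candidate only if it improves
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_sources):  # employed-bee phase
            try_neighbour(i)
        weights = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):  # onlooker-bee phase, biased toward good sources
            i = rng.choices(range(n_sources), weights=weights)[0]
            try_neighbour(i)
        best = min(best, min(costs))  # record the best before scouts replace sources
        for i in range(n_sources):  # scout phase: abandon exhausted sources
            if trials[i] > limit:
                sources[i] = [rng.uniform(lo, hi) for _ in range(dim)]
                costs[i] = objective(sources[i])
                trials[i] = 0
        best = min(best, min(costs))
    return best


def sensitivity_grid(bee_counts, iter_counts, repeats=5, dim=4):
    """Median best cost over several seeds for each (bees, iterations) pair."""
    return {
        (b, t): statistics.median(
            abc_minimize(sphere, dim, b, t, seed=s) for s in range(repeats)
        )
        for b in bee_counts
        for t in iter_counts
    }


if __name__ == "__main__":
    grid = sensitivity_grid([10, 20, 50], [10, 30, 50])
    for (b, t), cost in sorted(grid.items()):
        print(f"bees={b:2d} iterations={t:2d} median best cost={cost:.4f}")
```

Reporting the median over several seeds, rather than a single run, separates the effect of the parameter setting from run-to-run stochastic variation, which is the concern the reviewer raises.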

Reviewer 3 : 8- Result Presentation: Table 3 is busy and difficult to understand. A summary table showing the best model for each algorithm type would be clearer.

Response: We thank the reviewer for this helpful suggestion. We agree that the original Table 3 was overly detailed. In the revised manuscript, we have addressed this in two ways:

1- For the sake of transparency, we have marked the important parts of Table 3 in bold.

2- Furthermore, we have added a new summary table (Table 4) that displays only the top-performing model for each kind of algorithm (e.g., baseline vs. LR-hybrid vs. ABC-hybrid). The full table remains available for readers who want every detail, while the new table offers a concise, easy-to-read overview of the key findings.

Reviewer 3 : 9- The title and conclusions must be tempered. Instead of ‘identifying effective ingredients to treat,’ frame it as ‘identifying associations between supplement use and IVF outcomes using a novel hybrid ML approach.’ Emphasize the methodological contribution over the clinical recommendations.

Response : We thank the reviewer for this constructive feedback. We agree that the previous version of our manuscript overstated the clinical implications. Following your suggestion, both the title and the conclusion have been revised to emphasize the methodological contribution and to frame the findings as exploratory associations rather than treatment recommendations.

Original Title:

“Identifying Effective Ingredients to Treat Subfertility with a Hybrid Machine Learning–Metaheuristic Model”

Revised Title:

“Predicting IVF Outcomes Using a Logistic Regression–ABC Hybrid Model: A Proof-of-Concept Study on Supplement Associations”

Original Conclusion (excerpt):

“Omega 3 and folic acid have been identified as the most important dietary supplements in the treatment of subfertility, and the proposed model provides strong evidence for their clinical application.”

Revised Conclusion (excerpt):

“In conclusion, even though the model identified folic acid and omega-3 as predictive variables, these findings should be regarded as exploratory correlations rather than specific treatment suggestions. This work's main innovation is its methodological framework, a hybrid Logistic Regression–Artificial Bee Colony (LR-ABC) model that combines feature selection and metaheuristic optimization to investigate the results of IVF. Therefore, this study should be considered a proof-of-concept demonstration of the potential of ML–ABC approaches in reproductive medicine. Before making clinical judgments, validation with larger, multi-center datasets that include dosage, duration, and baseline dietary intake will be required in the future.”

Reviewer 3 : 10- Acknowledge the severe limitation of the binary supplement variable. Discuss this as a major limitation of the current study and propose how future work with more detailed data (dose, duration, baseline diet) could overcome it.

Response : We have now explicitly acknowledged this as a major limitation in the Conclusion and Research Limitations sections. In addition, we have proposed that future studies should collect more detailed data on supplement dosage, duration of use, and baseline dietary intake, ideally through prospective or randomized designs. Incorporating these factors will allow for more accurate modeling of nutrient effects and strengthen the biological and clinical validity of the approach.

Reviewer 3 : 11- Remove the emphasis on the 91% accuracy claim. Instead, focus on the comparative performance between models and the utility of the LIME explanations for generating hypo

Attachment

Submitted filename: Responds to reviewer_2.docx

pone.0336846.s004.docx (22.8KB, docx)

Decision Letter 2

Ayman Swelum

1 Oct 2025

Dear Dr. EJDER,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please respond carefully to all reviewers' comments.

Please submit your revised manuscript by Nov 15 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Ayman A Swelum

Academic Editor

PLOS ONE

Journal Requirements:

If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise. 

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #2: All comments have been addressed

Reviewer #3: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes

Reviewer #3: No

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

Reviewer #3: Yes

**********

Reviewer #2:  I have reviewed the manuscript with the authors' responses closely; I have found that some incomplete remarks need to be addressed.

1. Transformation of drug to active ingredient: Although the authors added some detail in Figure 1 and the text, the binary coding (1 if the intake meets 100% of the daily requirement, 0 otherwise) is too simplistic. The authors admit this limitation, but the essential methodological weakness remains.

2. Sample size issues: The authors admit that the sample is small, but N=162 with 33 test cases is not enough for strong validation of ML models with 21 or more variables. Cross-validation can only partially overcome this limitation.

Concerns:

1. Overfitting risk: The very high performance rates (90-95% recall) on such a small amount of data are questionable, regardless of the methodological improvements. The authors recognize this fact, but the findings are probably overstated.

2. Interpretation: Although the authors have softened their conclusions and included disclaimers about causality, the binary supplement coding leaves the clinical findings in doubt.

3. Confounding factors: The manuscript does not discuss possible confounding factors such as socioeconomic status, healthcare access patterns, or clinician prescribing patterns.

The authors have substantially improved most of the technical and methodological issues. The work can now be reproduced with publicly available code and data, the statistical analysis is handled properly, and the limitations are properly framed. Nevertheless, the small sample size and crude feature engineering remain fundamental limitations.

Revisions that are necessary before acceptance:

1. Make the limitations section stronger to indicate more clearly that the binary supplement coding cannot reveal the real nutrient status or biological outcome.

2. Include some commentary on possible confounding variables (socioeconomic status, access to healthcare, clinician prescribing patterns) that might be the cause of the observed associations.

3. Once more, update the abstract and conclusion to make it clear that this is a methodological demonstration, but not clinically practical results.

The research has a sensible methodological contribution to hybrid ML methods in reproductive medicine, although the clinical implications must be viewed with caution due to the data constraints. Having made these last clarifications, it would be appropriate to publish as an exploratory methodological study.

Reviewer #3:  The authors have partially addressed the reviewer's concerns by improving documentation, adding a README file, and creating requirements files. Code and data have been shared (GitHub), which is commendable and aligns with open science principles. The manuscript shows substantial revisions based on reviewer feedback, particularly regarding overstatement of clinical conclusions, figure clarity, reproducibility, and limitations. However, there are still some major weaknesses and concerns.

The sample size of N=162 (with 33 test cases) is critically small for ML with 21+ predictors. The reported performance (accuracy ~91%, recall ~95%) is likely inflated due to overfitting, even with SMOTE and cross-validation. Lack of external validation or an independent test dataset limits the credibility of generalization.

Binary transformation of supplements into “active ingredient = 1 if ≥100% daily requirement” is clinically simplistic and potentially misleading. The paper ignores dosage, duration, and baseline diet/nutritional status, which undermines biological validity.

No ablation study was performed to isolate the contribution of the ABC optimizer vs. LR baseline. Reported marginal improvements (e.g., RF 85% → 90.7%) may be due to chance.

Despite revisions, some language still risks implying causality (e.g., describing omega-3 as “effective” for subfertility). Confounding factors (prescriber bias, socioeconomic status, and underlying prognosis) are not adequately accounted for.

Figures and tables remain dense and sometimes unclear (especially Table 3). Typos, formatting errors, and awkward English phrasing persist in multiple places (e.g., “higest,” “phytolexin,” “vitamine c”). Method descriptions (e.g., ABC algorithm pseudocode) are overly technical for a biomedical journal and lack clarity on practical implications.

ABC parameters (20 bees, 10 iterations) remain inadequately justified. No sensitivity analysis was performed.

Findings are exploratory, yet sections of the discussion still suggest clinical implications. The study does not measure actual pregnancy/live birth outcomes—only embryo transfer success—limiting clinical utility.

My recommendation for the author's improvement of the manuscript is to reframe the entire paper as a methodological proof-of-concept, not as clinical evidence of supplement efficacy. Explicitly remove any implication that omega-3 or folic acid are treatments; present them only as correlates.

There should be a stronger emphasis on overfitting risk and lack of generalizability. Stress that binary supplement coding is a major weakness.

Improve validation by adding ablation or sensitivity analyses (e.g., performance of ABC vs. baseline LR without optimization). Consider external dataset testing (even partial) if available.

Simplify tables: keep full data in appendices, present concise summary tables in the main text. Improve figure quality and captions (ensure alignment with text). Proofread thoroughly for grammar, terminology consistency, and readability.

Provide stronger justification for ABC parameter settings, citing prior literature in biomedical ML applications. Explain why ABC was chosen over other optimizers (e.g., GA, PSO) in the IVF context.

Highlight that results are associational and cannot inform treatment decisions. Suggest that future studies incorporate prospective data, nutrient dosages, and longitudinal outcomes.

The title and abstract, which are just a guide for the authors, should be written in the following manner:

Predicting IVF Outcomes Using a Logistic Regression–ABC Hybrid Model: A Proof-of-Concept Study on Supplement Associations

(This emphasizes prediction, methodology, and associations — not treatment or causality.)

Background

Machine learning models are increasingly applied to assisted reproductive technologies (ART), but most studies rely on conventional algorithms with limited optimization. This proof-of-concept study investigates whether a hybrid Logistic Regression–Artificial Bee Colony (LR–ABC) framework can improve predictive performance in in vitro fertilization (IVF) outcomes, while generating interpretable, hypothesis-driven associations with nutritional and pharmaceutical supplement use.

Methods

A retrospective dataset of 162 women undergoing IVF was analyzed. Clinical, demographic, and supplement variables were pre-processed into 21 predictors. Four algorithms (K-Nearest Neighbors, Classification and Regression Tree, Support Vector Machine, and Random Forest) were implemented alongside their LR–ABC hybrid counterparts. Model performance was evaluated using 5-fold cross-validation with SMOTE to address class imbalance. Local Interpretable Model-agnostic Explanations (LIME) were used to provide interpretability.

Results

Across all algorithm families, LR–ABC hybrids outperformed baseline models (e.g., Random Forest: 85.2% → 90.7% accuracy). LIME explanations identified omega-3, folic acid, and dietician support as influential features in individual predictions. However, given the small sample size, binary representation of supplements, and absence of external validation, the observed improvements and associations should be regarded as exploratory rather than definitive.

Conclusion

The LR–ABC hybrid model demonstrates methodological potential for enhancing prediction and interpretability in IVF research. Findings regarding supplement associations are hypothesis-generating and not clinically directive. Future studies with larger, multi-center datasets, including detailed dosage and dietary data, are needed to validate and extend this framework.

Keywords: Hybrid Machine Learning, IVF Prediction, Nutritional Supplements, Metaheuristic Optimization, Artificial Bee Colony

In this version, emphasis is on the methodological proof of concept.

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: Yes:  Dr. Jonah Bawa Adokwe

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Nov 25;20(11):e0336846. doi: 10.1371/journal.pone.0336846.r006

Author response to Decision Letter 3


7 Oct 2025

Response to reviewers

Reviewer 2 : 1- Make the limitations section stronger to indicate more clearly that the binary supplement coding cannot reveal the real nutrient status or biological outcome.

Response : We thank the reviewer for this important observation. We have expanded the Research Limitations section and added the following detail:

“ Our exposure variables for micronutrient use were encoded binarily (e.g., supplement taken vs. not taken at a threshold such as ≥100% RDI). This encoding is pragmatic but cannot recover true nutrient status or biological effect for several reasons. First, it discards dose and duration information, precluding any dose–response assessment and ignoring cumulative exposure. Second, it assumes a uniform effect across brands and formulations, while bioavailability varies markedly with chemical form, excipients, co-ingested foods, and timing of intake. Third, binary use does not reflect adherence (frequency/consistency) or timing relative to the biological window of interest (e.g., peri-procedural vs. long-term use), both of which influence physiological impact. Fourth, we lack baseline nutritional status and objective biomarkers (e.g., serum folate, DHA, vitamin D); thus, we cannot distinguish deficiency correction from supraphysiologic exposure, nor can we account for inter-individual differences in absorption and metabolism. Finally, supplement use is correlated with broader health behaviors and socioeconomic factors; with only binary indicators, residual confounding and non-differential misclassification are likely, which can attenuate or unpredictably bias associations. ”

Reviewer 2 : 2- Include some commentary on possible confounding variables (socioeconomic status, access to healthcare, clinician prescribing patterns) that might be the cause of the observed associations.

Response : We thank the reviewer for this helpful suggestion. In the revised version of the manuscript, the Research Limitations section has been redesigned and is now highlighted in colour.

“ Our observed associations may be partly explained by unmeasured confounding. First, socioeconomic status (SES), including education, income, insurance coverage, and neighborhood deprivation, can influence both supplement use (ability to purchase higher-quality products, better adherence) and clinical outcomes (health literacy, earlier presentation, healthier baseline). Second, access to healthcare (clinic proximity, appointment availability, out-of-pocket costs, private vs. public care) may differentially shape care pathways and follow-up, thereby confounding the link between supplement use and outcomes. Third, clinician prescribing patterns introduce confounding by indication: clinicians may recommend or escalate supplementation preferentially for patients they judge at higher (or lower) risk based on unrecorded clinical cues; practice style, brand/formulation preferences, and evolving guidelines can also vary across clinicians and time. Together with other lifestyle and clinical factors that we could not fully measure (diet quality, physical activity, smoking, comorbidities, baseline nutrient status, time-to-treatment, lab protocols, and calendar-time effects), these sources of confounding and non-differential misclassification of exposure may attenuate or bias associations in unpredictable directions. Accordingly, findings should be interpreted as associations with reported supplement use rather than causal effects. ”

Reviewer 2 : 3- Once more, update the abstract and conclusion to make it clear that this is a methodological demonstration, but not clinically practical results.

Response :

“ The objective of this study is to investigate predictive signals for outcomes in IVF/ART. The study presents a method-focused workflow that combines engineered clinical variables with an ABC-assisted selection/tuning strategy. The findings suggest that in a controlled, leakage-safe assessment, regularly recorded variables, such as binary supplement indicators, can facilitate associational modelling. Importantly, clinical use is not the goal of these models. The lack of dose, duration, adherence, dietary intake, and objective biomarkers restricts interpretability, and binary coding of supplementation cannot recover true nutrient status or biological effect. Furthermore, some of the observed associations may be explained by unmeasured confounding, such as socioeconomic status, access to healthcare, clinician prescribing patterns and indications, and lifestyle factors. The current findings should be interpreted as proof-of-concept rather than practical advice due to these limitations and the size of the dataset. ”

Reviewer 3 : 1- The sample size of N=162 (with 33 test cases) is critically small for ML with 21+ predictors. The reported performance (accuracy ~91%, recall ~95%) is likely inflated due to overfitting, even with SMOTE and cross-validation. Lack of external validation or an independent test dataset limits the credibility of generalization.

Response : Thank you for your valuable comment. Data collection for this study was protracted and laborious, and it was carried out with the limited resources available in a small area.

We have added the following explanation to the Research limitations and future works section:

“ Notwithstanding these methodological precautions, the limited sample size indicates that the accuracy and F-scores reported should be interpreted with caution and are unlikely to be applicable to broader clinical populations without further validation. The modest dataset size restricts the generalisability of the findings, and the reported performance metrics should be interpreted with caution. ”
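The methodological precaution at stake here, resampling only inside the training portion of each fold, can be illustrated with a small sketch. It is not the authors' pipeline: it uses synthetic data, a simplified SMOTE-like interpolation between random pairs of minority samples (real SMOTE interpolates toward nearest neighbours), and hypothetical helper names (`kfold_indices`, `oversample_minority`, `leakage_safe_folds`). The class ratio of 40 to 122 is an assumption chosen only to create imbalance.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def oversample_minority(X, y, rng):
    """Balance classes by adding synthetic minority rows.

    Simplified stand-in for SMOTE: each synthetic row lies on the segment
    between two randomly chosen minority rows (no nearest-neighbour search).
    """
    counts = {c: y.count(c) for c in set(y)}
    minority = min(counts, key=counts.get)
    need = max(counts.values()) - counts[minority]
    pool = [row for row, label in zip(X, y) if label == minority]
    X_new, y_new = list(X), list(y)
    for _ in range(need):
        a, b = rng.choice(pool), rng.choice(pool)
        t = rng.random()
        X_new.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
        y_new.append(minority)
    return X_new, y_new


def leakage_safe_folds(X, y, k=5, seed=0):
    """Yield (X_train, y_train, X_test, y_test) with oversampling on train only."""
    rng = random.Random(seed)
    for test_idx in kfold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_tr, y_tr = oversample_minority(
            [X[i] for i in train_idx], [y[i] for i in train_idx], rng
        )
        # The test fold contains only original, untouched observations.
        yield X_tr, y_tr, [X[i] for i in test_idx], [y[i] for i in test_idx]


# Synthetic stand-in data: 162 rows, 2 features, imbalanced labels (assumed ratio).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(162)]
y = [1] * 40 + [0] * 122

for fold, (X_tr, y_tr, X_te, y_te) in enumerate(leakage_safe_folds(X, y), start=1):
    print(f"fold {fold}: train={len(y_tr)} (balanced), test={len(y_te)} (original rows)")
```

If oversampling were applied before splitting, synthetic rows derived from a test observation could appear in the training data and inflate the reported scores; keeping it inside the loop avoids that particular source of optimism, although it does not address the small sample size itself.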

Reviewer 3 : 2- Binary transformation of supplements into “active ingredient = 1 if ≥100% daily requirement” is clinically simplistic and potentially misleading. The paper ignores dosage, duration, and baseline diet/nutritional status, which undermines biological validity.

Response : Thank you for your valuable suggestion. We have revised the Research limitations and future works section. The following paragraph constitutes our response to the proposal:

“ Thirdly, only supplement intake was encoded as binary “active ingredient” variables (coded as 1 if reported intake met 100% of the recommended daily intake). Detailed dietary intake data (e.g., habitual nutrient consumption from food sources) were not included. This exclusion was due to the unavailability of reliable, standardized dietary records in the patient dataset. The conversion of drug and supplement intake into "active ingredient" variables was streamlined into a binary coding system, designating patients as "1" if their reported intake satisfied all daily requirements. Notably, this method overlooks crucial clinical information, including dosage levels (for instance, 100% versus 500% of the recommended intake) and usage duration (for instance, short-term versus long-term supplementation). Furthermore, the absence of baseline dietary intake data is a notable limitation. To illustrate, a patient who naturally consumes a diet high in omega-3 fatty acids but does not take supplements would be assigned the label "0," despite their actual nutrient levels being adequate. These constraints serve to reduce the biological precision of the "active ingredient" variables. This simplification renders the supplement variables clinically simplistic and potentially misleading, as it ignores fundamental determinants of biological effect such as dose, duration, and baseline nutrition. Therefore, while the identification of omega-3 and folic acid as predictive factors is consistent with established literature, these findings should be interpreted as exploratory, hypothesis-generating correlations rather than as robust novel clinical evidence. ”
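The information loss described in this paragraph can be made concrete with a short sketch. The function name `encode_active_ingredient`, the patient labels, and all intake values are hypothetical and chosen only to mirror the cases named in the text (100% versus 500% of the recommended intake, and a nutrient-rich diet without supplements); they are not taken from the study dataset.

```python
def encode_active_ingredient(percent_of_rdi, threshold=100.0):
    """Return 1 if reported supplement intake meets the threshold (% of RDI), else 0."""
    return 1 if percent_of_rdi >= threshold else 0


# Hypothetical patients: (label, supplement intake as % of RDI, dietary intake as % of RDI)
patients = [
    ("A", 100.0, 20.0),  # supplement meets the threshold exactly
    ("B", 500.0, 20.0),  # supplement at five times the threshold
    ("C", 0.0, 150.0),   # no supplement, but a nutrient-rich diet
    ("D", 80.0, 10.0),   # supplement below the threshold
]

# The binary feature sees only the supplement column; diet is never consulted.
encoded = {label: encode_active_ingredient(supp) for label, supp, _diet in patients}

for label, supp, diet in patients:
    print(f"patient {label}: supplement={supp:5.1f}% diet={diet:5.1f}% -> coded {encoded[label]}")
```

Patients A and B receive the same code despite a fivefold difference in dose, and patient C is coded 0 although the combined intake exceeds that of patient D, which is the misclassification the paragraph describes.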

Reviewer 3 : 3- No ablation study was performed to isolate the contribution of the ABC optimizer vs. LR baseline. Reported marginal improvements (e.g., RF 85% → 90.7%) may be due to chance. Despite revisions, some language still risks implying causality (e.g., describing omega-3 as “effective” for subfertility). Confounding factors (prescriber bias, socioeconomic status, and underlying prognosis) are not adequately accounted for.

Response : We sincerely thank the reviewer for these insightful comments regarding both methodological clarification and interpretive precision.

1. Ablation analysis:

We acknowledge that an ablation or sensitivity analysis to isolate the specific contribution of the Artificial Bee Colony (ABC) optimizer relative to the Logistic Regression (LR) baseline was not performed. In response, we have explicitly addressed this limitation in the revised Discussion. The text now clarifies that the modest performance improvements observed (e.g., RF 85.2% → 90.7%) may partly reflect stochastic variation rather than a definitive optimization advantage. The revised manuscript also notes that future work should include a controlled ablation design to evaluate the independent effect of the ABC component under identical resampling conditions.

2. Causality and confounding:

We carefully reviewed all phrasing that might imply causal inference and revised the manuscript accordingly. Expressions such as “omega-3 is effective” have been replaced with non-causal, association-oriented phrasing (e.g., “omega-3 use was identified as a predictive variable”). Furthermore, we have added a new paragraph in the Discussion emphasizing that supplement-related associations may be influenced by confounding factors such as prescriber bias, socioeconomic status, healthcare access, and underlying prognosis. The revised text explicitly states that the findings are exploratory and hypothesis-generating rather than indicative of therapeutic efficacy.

We believe these changes directly and comprehensively address the reviewer’s concerns, ensuring that the study’s methodological and interpretive scope is presented with appropriate caution and clarity.

The following explanation was added to the Discussion section.

“ Despite the fact that the hybrid models attained comparatively higher levels of predictive performance, the absence of an ablation or sensitivity analysis hinders the capacity to ascertain the independent contribution of the Artificial Bee Colony (ABC) optimizer with respect to the Logistic Regression (LR) baseline. This is evidenced by the observed performance improvements (e.g., an enhancement in Random Forest accuracy from 85.2% to 90.7%) that may be attributable to stochastic variation rather than a distinct optimization advantage. Moreover, it is imperative to refrain from interpreting the associations identified in this study as causal. A number of variables have been found to be predictive of outcomes, including omega-3 use, folic acid intake, and dietician support. However, these effects may be confounded by factors such as prescriber bias, socioeconomic status, access to private healthcare, or underlying clinical prognosis. The utilisation of nutritional supplements has frequently been observed to correlate with unmeasured health behaviours and social determinants, which in turn may influence the outcomes of treatment. Consequently, the relationships documented in this study should be regarded as exploratory, hypothesis-generating associations that require validation through prospective, multicentre studies incorporating ablation analysis, larger cohorts, and detailed nutritional and clinical data. ”
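A rough sense of how much stochastic variation is plausible at this sample size can be obtained from binomial confidence intervals. The sketch below computes 95% Wilson score intervals under a stated assumption: that the quoted accuracies are pooled over all 162 cross-validated predictions, so that 85.2% and 90.7% correspond to about 138 and 147 correct classifications. These counts are back-calculated for illustration and are not reported in the correspondence.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


N = 162                 # dataset size stated in the manuscript
baseline_correct = 138  # assumed: round(0.852 * 162)
hybrid_correct = 147    # assumed: round(0.907 * 162)

base_lo, base_hi = wilson_interval(baseline_correct, N)
hyb_lo, hyb_hi = wilson_interval(hybrid_correct, N)

print(f"baseline RF: {baseline_correct / N:.3f}  95% CI [{base_lo:.3f}, {base_hi:.3f}]")
print(f"hybrid RF:   {hybrid_correct / N:.3f}  95% CI [{hyb_lo:.3f}, {hyb_hi:.3f}]")
print("intervals overlap:", hyb_lo < base_hi)
```

Overlapping intervals do not by themselves show that the two models perform equally, and because both models are scored on the same patients a paired comparison (for example McNemar's test on the discordant predictions) would be the more appropriate follow-up. The sketch only illustrates why a difference of this size on 162 cases is hard to separate from chance.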

Reviewer 3 : 4- Figures and tables remain dense and sometimes unclear (especially Table 3). Typos, formatting errors, and awkward English phrasing persist in multiple places (e.g., “higest,” “phytolexin,” “vitamine c”). Method descriptions (e.g., ABC algorithm pseudocode) are overly technical for a biomedical journal and lack clarity on practical implications.

Respond : Thank you for this valuable suggestion. We have carefully revised the manuscript to correct grammatical errors. In addition, the paper has undergone thorough proofreading to ensure that it meets academic writing standards in terms of clarity, style, and consistency. These improvements enhance the overall readability and professionalism of the manuscript.

Reviewer 3 : 5- ABC parameters (20 bees, 10 iterations) remain inadequately justified. No sensitivity analysis was performed.

Findings are exploratory, yet sections of the discussion still suggest clinical implications. The study does not measure actual pregnancy/live birth outcomes—only embryo transfer success—limiting clinical utility.

Respond : Thank you for this valuable suggestion. We now report the values and state that they were “chosen to balance computational efficiency and convergence, in line with prior studies reporting similar ranges” (citing Karaboga & Basturk, 2007; Akay & Karaboga, 2012). The explanation below was added to the Discussion section.

“These values were chosen to balance computational efficiency with convergence stability. However, no formal sensitivity analysis was conducted to evaluate the effect of parameter variation on model performance”
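
To make the two parameters under discussion concrete, the following is a minimal sketch of a generic Artificial Bee Colony loop. It is not the authors' implementation: the toy objective (`sphere`), the function name `abc_minimize`, the abandonment threshold `limit`, the search bounds, and the reading of "20 bees" as 20 food sources are all illustrative assumptions. Only the values 20 and 10 come from the text.

```python
import random

def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def abc_minimize(objective, dim, n_sources=20, n_iter=10, limit=5,
                 lower=-5.0, upper=5.0, seed=0):
    """Generic ABC sketch with employed, onlooker, and scout phases.

    Assumes a non-negative objective so that 1 / (1 + cost) is a valid fitness.
    """
    rng = random.Random(seed)

    def random_source():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    def neighbour(i):
        # Perturb one coordinate of source i relative to a random partner k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        phi = rng.uniform(-1.0, 1.0)
        cand[d] += phi * (sources[i][d] - sources[k][d])
        cand[d] = min(max(cand[d], lower), upper)
        return cand

    def try_improve(i):
        # Greedy selection: keep the candidate only if it lowers the cost.
        cand = neighbour(i)
        cost = objective(cand)
        if cost < costs[i]:
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    i_best = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[i_best][:], costs[i_best]
    history = [best_cost]

    for _ in range(n_iter):
        # Employed bees: one local search per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: choose sources with probability proportional to fitness.
        fitness = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_improve(i)
        # Scout bees: replace sources that have stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
        # Remember the best solution seen so far (a scout may discard it).
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:
            best, best_cost = sources[i_best][:], costs[i_best]
        history.append(best_cost)
    return best, best_cost, history

best, best_cost, history = abc_minimize(sphere, dim=3)
print(best_cost, len(history))
```

In a sketch like this, a sensitivity analysis amounts to rerunning `abc_minimize` over a grid of `n_sources` and `n_iter` values (and several seeds) and comparing the resulting `best_cost` values.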

It is imperative to emphasise the following point concerning the clinical implications. The following paragraph was incorporated into the discussion section with the objective of clarifying that the findings should be interpreted within the methodological context rather than the clinical context.

“ Moreover, while the hybrid LR–ABC framework demonstrated improved predictive accuracy for embryo transfer outcomes, these findings should be regarded as exploratory and methodological rather than clinical. The study did not measure pregnancy or live birth rates, and therefore cannot inform treatment efficacy or reproductive prognosis. As such, the model’s scope is limited to predicting the likelihood of embryo transfer success within the observed dataset and does not extend to broader clinical outcomes. ”

Reviewer 3 : 6- My recommendation for the author's improvement of the manuscript is to reframe the entire paper as a methodological proof-of-concept, not as clinical evidence of supplement efficacy. Explicitly remove any implication that omega-3 or folic acid are treatments; present them only as correlates.

Respond : Thank you for your valuable comments. All sentences implying clinical evidence or treatment effects were replaced with sentences describing statistical associations with reproductive outcomes. The following replacements were made:

Original 1: These findings suggest a beneficial role of resveratrol supplementation on reproductive outcomes.

Revised 1: Previous studies have reported that resveratrol supplementation shows statistical associations with reproductive outcomes; however, these findings should be regarded as observational and not indicative of therapeutic efficacy.

Original 2: These findings suggest a beneficial role of resveratrol supplementation on reproductive outcomes.

Revised 2: Previous studies have reported that resveratrol supplementation shows statistical associations with reproductive outcomes; however, these findings should be regarded as observational and not indicative of therapeutic efficacy.

Original 3: These findings underscore the potential benefit of preconceptional vitamin B supplementation for enhancing IVF success.

Revised 3: Some studies have observed correlations between preconceptional vitamin B-complex use and IVF success indicators; however, such associations are exploratory and may be influenced by unmeasured confounding factors rather than reflecting a causal or therapeutic effect.

Original 4: Based on the highest F-score … the most effective for subfertility treatment are omega 3, folic acid, dietician support, phytoalexin, vitamin C, and vitamin B6.

Revised 4: Based on the highest F-score and LIME interpretability results, omega-3, folic acid, dietician support, phytoalexin, vitamin C, and vitamin B6 emerged as the most influential predictors associated with embryo-transfer outcomes.

Original 5: According to Fig. 6, omega-3 is the most effective active substance for subfertility treatment. The higher the omega-3 intake, the lower the risk of subfertility regardless of age

Revised 5: According to Fig. 6, omega-3 appeared as one of the most influential predictive

Attachment

Submitted filename: Responds to reviewer_3.docx

pone.0336846.s005.docx (28.4KB, docx)

Decision Letter 3

Ayman Swelum

12 Oct 2025

Dear Dr. EJDER,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR: Please respond carefully to the reviewers' comments.

==============================

Please submit your revised manuscript by Nov 26 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Ayman A Swelum

Academic Editor

PLOS ONE

Journal Requirements:

If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise. 

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: No

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: No

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

Reviewer #3: Yes

**********

Reviewer #2: Significant Problems to Work On (Prior to Acceptance)

1. Ablation/Sensitivity Analysis:

Clearly report the findings of a comparison between baseline logistic regression and the LR-ABC hybrid (on identical CV splits) and present confidence intervals or p-values.

2. Generalizability: N/A.

Provide external or more intensive internal validation (e.g., nested or repeated CV) and state the limitations clearly.

3. ABC Parameter Justification:

Provide explanation and rationale of hyperparameters; provide a short sensitivity analysis.

4. Feature Coding & Confounding:

Elaborate on the derivation of the supplement variables and their frequencies, and do not use causal language.

5. Model Reporting:

Provide calibration measures (e.g. Brier score, calibration plots) and display uncertainty in performance measures (mean & SD or CI).

6. Reproducibility:

Ensure that the GitHub repository can reproduce all important results and includes all essential paths and instructions.

7. Figures/Tables:

Make performance tables easier to understand, enhance figure definitions and consistency with the amended text.

Reviewer #3: The manuscript shows significant improvement after revisions: fewer grammatical errors, clarified tables, and better figure captions. However, dense tables (Table 3 onward) remain difficult to interpret, and redundant numeric data could be shifted to appendices. The paper still includes awkward phrasing and occasional translation artifacts (“the result of success”, “add here”, etc.), indicating incomplete final editing.

The authors now rightly present this as a proof-of-concept, not a clinically actionable tool, emphasizing methodological innovation rather than biological causality. This was what the reviewers had suggested. The paper is now transparent, and includes detailed responses to reviewer comments that improved the clarity and caution of interpretation. The authors explicitly acknowledge limitations (e.g., binary supplement coding, lack of dosage data, small sample size, absence of external validation) which is again a considerable improvement over the previous resubmission.

The mathematical sections (e.g., ABC pseudocode, formulae) are still prolonged and reduce accessibility for biomedical audiences. There is limited discussion of why this optimization improves logistic regression behaviour in this specific clinical context. The dependent variable is “embryo transfer success”, not pregnancy or live birth. This limits the clinical significance; embryo transfer success is an intermediate outcome, not the patient-relevant endpoint.

There still persists, despite careful rewording, some residual implication of supplement “benefit”. The interpretation of feature importance as biological relevance (e.g., omega-3 as “influential”) may unintentionally suggest causation.

My suggestions for improvement of the manuscript to be acceptable for publication are the following:

Include additional metrics like ROC-AUC, PR-AUC, calibration, and confusion matrices to contextualize accuracy.

Clarify the clinical scope by emphasizing embryo transfer success and not pregnancy; explicitly distinguish model domain from treatment prognosis.

Reduce the amount of technical information by moving pseudocode and derivations to supplementary materials and expand discussion on biomedical implications and usability.

Future work should collect quantitative dosage data, biomarkers, and longitudinal outcomes to strengthen causal interpretability.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #2: Yes:  Dr. Jonah Bawa Adokwe

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Nov 25;20(11):e0336846. doi: 10.1371/journal.pone.0336846.r008

Author response to Decision Letter 4


29 Oct 2025

Responds to reviewer

Reviewer 2 : 1- Ablation/Sensitivity Analysis: Clearly report the findings of a comparison between baseline logistic regression and the LR-ABC hybrid (on identical CV splits) and present confidence intervals or p-values.

Response : We thank the reviewer for this important observation. We expanded the Results and Discussion section with the explanation below, which reports statistical significance (confidence intervals and p-values) for a comparison of the two models (LR vs. LR-ABC) on identical cross-validation (CV) splits.

“To assess the robustness and significance of the observed improvements, an ablation and sensitivity analysis was conducted comparing baseline Logistic Regression (LR) with the LR-ABC hybrid using identical 5-fold cross-validation splits. For each fold, performance metrics (accuracy, recall, F1-score) were recorded, and paired t-tests were employed to evaluate statistical differences. The hybrid LR-ABC model showed a consistent mean F1-score improvement of 5.2% (95% CI: 2.1–8.3%, p = 0.004) and a recall improvement of 4.7% (95% CI: 1.9–7.2%, p = 0.006) over baseline LR across folds. Accuracy also increased from 84.2% ± 3.9 to 89.1% ± 4.6 (p < 0.01). These experiments indicate that the observed performance gains were statistically significant rather than attributable to stochastic variation from data resampling, thereby supporting the independent contribution of the ABC optimizer to the hybrid model’s predictive performance.”
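
The paired comparison described in this response can be sketched as follows. This is an illustration only: the per-fold F1-scores are made-up placeholders rather than the study's results, the helper name `paired_t_summary` is hypothetical, and the critical value 2.776 is the two-sided 95% Student's t value for 4 degrees of freedom (five folds).

```python
import math
import statistics

# Hypothetical per-fold F1-scores measured on the SAME five CV splits
# (placeholders for illustration, not the study's results).
f1_lr = [0.82, 0.85, 0.80, 0.86, 0.83]
f1_lr_abc = [0.88, 0.89, 0.86, 0.90, 0.89]

def paired_t_summary(baseline, hybrid, t_crit=2.776):
    """Paired t statistic and 95% CI for the mean per-fold difference.

    t_crit is the two-sided 95% critical value of Student's t with
    4 degrees of freedom; it must change if the number of folds changes.
    """
    diffs = [h - b for b, h in zip(baseline, hybrid)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean
    t_stat = mean_diff / se
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return mean_diff, t_stat, ci

mean_diff, t_stat, ci = paired_t_summary(f1_lr, f1_lr_abc)
print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.2f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the folds of a single cross-validation share most of their training data, fold-level scores are not fully independent, so a test of this kind is best read as approximate.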

Reviewer 2 : 2- Generalizability: N/A. Give external or more intensive internal validation (e.g. nested or repeated CV) and make limitations clear.

Response : We thank the reviewer for this important observation. In this study, the data collection process was very long and difficult. We therefore plan to conduct experiments on other populations in future studies, and our data collection is still ongoing. We added the explanation below to the Research limitations and future work section.

“Although the repeated five-fold cross-validation design yields a robust internal estimate of model stability, true external generalization across populations, clinical settings, and data acquisition conditions remains untested. This may restrict the generalizability of the findings. Future work should aim to perform external validation using prospective datasets to confirm the reproducibility and clinical applicability of the proposed hybrid LR–ABC framework.”

Reviewer 2 : 3- ABC Parameter Justification: Provide explanation and rationale of hyperparameters; provide a short sensitivity analysis.

Response : Thank you for your valuable comment. We determined the hyperparameters for the ABC algorithm by varying the number of iterations and the bee count. The experimental results can be found on GitHub, along with an additional explanatory table in the appendix. We have updated the manuscript accordingly.

Table A3. Determination of the hyperparameters used in ABC algorithms: accuracy (%) by bee count and number of iterations. Fitness function: Random Forest.

Bee count    10 iterations    20 iterations    30 iterations    40 iterations
5            88.30            84.37            85.63            85.50
10           88.30            88.30            88.30            88.30
15           89.51            90.13            91.36            90.13
20           88.30            86.19            85.63            86.19
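
As a reading aid for Table A3, the short sketch below loads the reported accuracies and summarises them. The values are copied from the table; the variable names are illustrative, and the summary statistics (best setting, spread, mean per bee count) are simple descriptive choices rather than an analysis reported by the authors.

```python
# Accuracy (%) from Table A3, keyed by (bee_count, iterations).
table_a3 = {
    (5, 10): 88.30, (10, 10): 88.30, (15, 10): 89.51, (20, 10): 88.30,
    (5, 20): 84.37, (10, 20): 88.30, (15, 20): 90.13, (20, 20): 86.19,
    (5, 30): 85.63, (10, 30): 88.30, (15, 30): 91.36, (20, 30): 85.63,
    (5, 40): 85.50, (10, 40): 88.30, (15, 40): 90.13, (20, 40): 86.19,
}

# Setting with the highest reported accuracy.
best_config = max(table_a3, key=table_a3.get)
best_acc = table_a3[best_config]

# Gap between the best and worst settings, in percentage points.
spread = max(table_a3.values()) - min(table_a3.values())

# Mean accuracy per bee count, averaged over the four iteration settings.
mean_by_bees = {}
for bees in sorted({b for b, _ in table_a3}):
    vals = [acc for (b, _), acc in table_a3.items() if b == bees]
    mean_by_bees[bees] = sum(vals) / len(vals)

print(best_config, best_acc, round(spread, 2))
print({b: round(m, 2) for b, m in mean_by_bees.items()})
```

On these figures the highest accuracy (91.36%) occurs at 15 bees and 30 iterations, and the gap between the best and worst settings is 6.99 percentage points.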

Reviewer 2 : 4- Feature Coding & Confounding:

Elaborate on derivation of supplement variables, frequencies and do not use causal language.

Response : Thank you for your contribution. We have removed all causal language from the manuscript. We discuss the derivation of the supplement variables in the section on research limitations and future work, which previously lacked a detailed paragraph on this point. The explanation below details the derivation of the nutritional and pharmaceutical supplement variables.

“ Thirdly, only supplement intake was encoded as binary “active ingredient” variables (coded as 1 if reported intake met 100% of the recommended daily intake). Detailed dietary intake data (e.g., habitual nutrient consumption from food sources) were not included. This exclusion was due to the unavailability of reliable, standardized dietary records in the patient dataset. The conversion of drug and supplement intake into "active ingredient" variables was streamlined into a binary coding system, designating patients as "1" if their reported intake satisfied all daily requirements. Notably, this method overlooks crucial clinical information, including dosage levels (for instance, 100% versus 500% of the recommended intake) and usage duration (for instance, short-term versus long-term supplementation). Furthermore, the absence of baseline dietary intake data is a notable limitation. To illustrate, a patient who naturally consumes a diet high in omega-3 fatty acids but does not take supplements would be assigned the label "0," despite their actual nutrient levels being adequate. These constraints serve to reduce the biological precision of the "active ingredient" variables. This simplification renders the supplement variables clinically simplistic and potentially misleading, as it ignores fundamental determinants of biological effect such as dose, duration, and baseline nutrition. Therefore, while the identification of omega-3 and folic acid as predictive factors is consistent with established literature, these findings should be interpreted as exploratory, hypothesis-generating correlations rather than as robust novel clinical evidence.”
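
The information loss described in this paragraph can be illustrated with a small sketch. The function name `encode_supplement`, the `>=` comparison at the 100% threshold, and the four patients are hypothetical; they are not taken from the study's code or data.

```python
def encode_supplement(intake_pct_of_rdi):
    """Binary "active ingredient" coding as described in the text.

    Returns 1 if reported supplement intake reaches 100% of the recommended
    daily intake (RDI), else 0. The >= comparison is an assumption.
    """
    return 1 if intake_pct_of_rdi >= 100 else 0

# Hypothetical patients: supplement intake as a percentage of the RDI.
patients = {
    "A": 100,  # meets the RDI exactly
    "B": 500,  # five times the RDI
    "C": 0,    # no supplement, but (hypothetically) an omega-3-rich diet
    "D": 60,   # partial supplementation
}

encoded = {pid: encode_supplement(pct) for pid, pct in patients.items()}
print(encoded)
```

Patients A and B receive the same code despite a fivefold difference in dose, and patient C is coded 0 regardless of dietary intake, which is the limitation the authors describe.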

Reviewer 2 : 5- Model Reporting:

Provide calibration measures (e.g. Brier score, calibration plots) and display uncertainty in performance measures (mean & SD or CI).

Response : Thank you for your contribution. We now explain the calibration measures. Figure 3 and the explanation below were added to the manuscript.

According to Table 3, ABC–LR–RF achieved the highest accuracy among the models (91.36%), whereas KNN performed the weakest (63.56%). The F-scores of all models are closely aligned with their accuracy values (approximately a 1:1 ratio), suggesting that class balance was effectively maintained through the application of the SMOTE technique. After incorporating logistic regression–based feature selection, all models except SVM demonstrated a noticeable increase in performance. The integration of the ABC optimization algorithm in Stage 3 led to additional improvements across all models. The ABC–LR–RF model achieved the best overall performance, with the highest accuracy (91.36%) and recall (96.92%), confirming that the ABC optimizer effectively fine-tuned the model’s parameters and enhanced sensitivity for predicting successful embryo transfers. Although SVM initially experienced a slight decrease after logistic regression–based feature selection, its performance improved substantially with ABC optimization (accuracy = 88.26%), suggesting that metaheuristic optimization can recover and enhance model capacity in non-linear classification tasks.

The dashed diagonal line in the calibration plot represents perfect agreement between predicted values and test values. The orange curve shows the empirical calibration of the ABC–LR–RF hybrid model. Its close alignment with the diagonal suggests that the predicted IVF success values are reliable and well-calibrated.

The accompanying Brier score quantifies this relationship further, confirming that the model provides reliable probabilistic predictions. Lower Brier scores denote better calibration.
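
For readers unfamiliar with these measures, the sketch below computes a Brier score and the binned points that a calibration plot displays. The labels and predicted probabilities are invented for illustration, and the helper names and the choice of five equal-width bins are assumptions, not the authors' settings.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def calibration_bins(y_true, y_prob, n_bins=5):
    """Points of a calibration curve using equal-width probability bins.

    Returns (mean predicted probability, observed positive rate, count)
    for every non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    points = []
    for members in bins:
        if members:
            mean_p = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            points.append((mean_p, observed, len(members)))
    return points

# Hypothetical outcomes (1 = successful transfer) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
y_prob = [0.9, 0.8, 0.3, 0.7, 0.2, 0.95, 0.6, 0.4, 0.85, 0.55]

score = brier_score(y_true, y_prob)
points = calibration_bins(y_true, y_prob)
print(round(score, 5))
for mean_p, observed, count in points:
    print(f"predicted {mean_p:.2f}  observed {observed:.2f}  n={count}")
```

A perfectly calibrated model would place every bin on the diagonal, where the mean predicted probability equals the observed rate. A constant prediction of 0.5 yields a Brier score of 0.25, which gives a rough reference point for reading the reported value.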

Reviewer 2 : 6- Reproducibility:

Make the GitHub repository able to reproduce all important results, having all essential paths and instructions.

Response : Thank you for your suggestion. We added the updated files to the GitHub repository. Researchers can access the files at this link: https://github.com/ugurejder/ABC_IVF

Reviewer 2 : 7- Figures/Tables:

Make performance tables easier to understand, enhance figure definitions and consistency with the amended text.

Response : Thank you for your valuable comment. We moved the complex table to the appendix, replaced it with a simplified version, and checked the figure and table names.

Reviewer 3 : 1- There still persists, despite careful rewording, some residual implication of supplement “benefit”. The interpretation of feature importance as biological relevance (e.g., omega-3 as “influential”) may unintentionally suggest causation.

Respond : Thank you for your valuable comment. Although some wording may appear to suggest causality, we have tried to emphasize throughout the manuscript that the active ingredients contribute to the prediction process through association and correlation, not through causation.

Reviewer 3 : 2- Include additional metrics like ROC-AUC, PR-AUC, calibration, and confusion matrices to contextualize accuracy.

Respond : Thank you for your valuable contribution. We added ROC-AUC, PR-AUC, calibration, and confusion matrix figures and explanations to contextualize accuracy.
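
Two of the requested metrics can be sketched without any library. The labels and scores below are invented for illustration, the function names are hypothetical, and the 0.5 decision threshold is an assumption. ROC-AUC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded 0/1."""
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    return tn, fp, fn, tp

def roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = successful transfer) and model scores.
y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
y_score = [0.9, 0.8, 0.3, 0.7, 0.6, 0.95, 0.4, 0.2, 0.85, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
auc = roc_auc(y_true, y_score)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)
print(tn, fp, fn, tp, round(auc, 3), accuracy, round(recall, 3))
```

Reporting the confusion matrix alongside accuracy shows how errors split between false positives and false negatives, which a single accuracy figure hides.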

Reviewer 3 : 3- Clarify the clinical scope by emphasizing embryo transfer success and not pregnancy; explicitly distinguish model domain from treatment prognosis.

Respond : Thank you for your valuable contribution. We added the explanation below to the Results and Discussion section. We emphasize that the scope of the model covers embryological factors influencing transfer outcomes rather than pregnancy variables.

“This study was designed to develop a predictive model applicable to early implantation signals to determine the probability of embryo transfer success rather than long-term pregnancy or live birth outcomes. This distinction is important because the model was trained and validated using embryo-level, situation-specific features rather than pregnancy variables. Given the model’s discrimination and calibration performance (e.g., ROC–AUC = 0.96, PR–AUC = 0.95, Brier = 0.089), the model should be interpreted strictly with regard to forecasting the success of embryo transfers. This scope confirms that the model reflects embryological factors influencing transfer outcomes, without extending to downstream clinical endpoints such as pregnancy progression or live birth.”

Reviewer 3 : 4- Reduce the amount of technical information by moving pseudocode and derivations to supplementary materials and expand discussion on biomedical implications and usability.

Respond : Thank you for your valuable suggestion. A previous reviewer asked us to add pseudocode. We have now moved the technical information to the appendix and expanded the discussion of biomedical implications and usability by adding the explanation below.

“ The ABC–LR–RF hybrid model contributes a clinically interpretable framework for predicting embryo transfer success probabilities in IVF. By combining feature selection with ensemble learning, it captures multidimensional embryological and procedural factors that influence implantation probability. Calibrated probability outputs can assist clinicians throughout embryo development and complement traditional morphology-based assessments. Because the model uses routinely available enlarging data, it can be easily executed within remaining electronic IVF management systems, and using decision support. Its robust calibration and sensitive performance demonstrates reliable generalisation across clinics. The most important point to note here is that the model does not predict pregnancy or treatment outcome, but can quantitatively determine the instantaneous probability of a successful embryo transfer. “

Reviewer 3 : 5- Future work should collect quantitative dosage data, biomarkers, and longitudinal outcomes to strengthen causal interpretability.

Respond : Thank you for your valuable suggestion. The last paragraph of the Research limitations and future work section addresses dosage and causality.

Attachment

Submitted filename: Responds to reviewer_4.docx

pone.0336846.s006.docx (25.1KB, docx)

Decision Letter 4

Ayman Swelum

2 Nov 2025

Predicting IVF Outcomes Using a Logistic Regression–ABC Hybrid Model: A Proof-of-Concept Study on Supplement Associations

PONE-D-25-18542R4

Dear Dr. EJDER,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ayman A Swelum

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

**********

Reviewer #2: Authors thoroughly revised the manuscript and provided justifications, methodological transparency, discussion of confounding factors, calibration analysis, and open-source reproducibility.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #2: Yes:  Jonah Bawa Adokwe Ph.D

**********

Acceptance letter

Ayman Swelum

PONE-D-25-18542R4

PLOS ONE

Dear Dr. EJDER,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Ayman A Swelum

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Appendix. Supplementary Tables and Algorithm Details.

    (DOCX)

    pone.0336846.s001.docx (28KB, docx)
    Attachment

    Submitted filename: Respond to reviewers.txt

    pone.0336846.s003.txt (13.2KB, txt)
    Attachment

    Submitted filename: Responds to reviewer_2.docx

    pone.0336846.s004.docx (22.8KB, docx)
    Attachment

    Submitted filename: Responds to reviewer_3.docx

    pone.0336846.s005.docx (28.4KB, docx)
    Attachment

    Submitted filename: Responds to reviewer_4.docx

    pone.0336846.s006.docx (25.1KB, docx)

    Data Availability Statement

    https://github.com/ugurejder/ABC_IVF/blob/main/IVF_english_dataset.xlsx.


    Articles from PLOS One are provided here courtesy of PLOS
