Application of machine learning to predict delayed fecundability among women in sub-Saharan Africa

Meron Asmamaw Alemayehu; Nebiyu Mekonnen Derseh; Tigist Kifle Tsegaw; Tilahun Yemanu Birhan; Banchlay Addis; Berhanie Addis Ayele; Emebet Birhanu Lealem; Eyob Akalewold Alemu; Fetlework Gubena Arage; Gebrie Getu Alemu; Getaneh Awoke Yismaw; Habtamu Abebe Getahun; Habtamu Wagnew Abuhay; Mekuriaw Nibret Aweke

2025 Oct 7;6(4):e250068. doi: 10.1530/RAF-25-0068

PMCID: PMC12927445 PMID: 40956608

Abstract

Delayed fecundability, defined as trying to conceive for ≥12 months without success, is a growing global concern due to the threat of fertility rates falling below the replacement level. This study aimed to predict delayed fecundability and identify influential predictors. Secondary data from recent Performance Monitoring for Action (PMA) surveys on fertility, contraception, and reproductive health in five sub-Saharan African countries were used. Preprocessing and feature engineering included imputation, encoding, and correlation filtering. Feature selection was done using the Boruta algorithm. Machine learning models, including random forest, XGBoost, and LightGBM, were developed and optimized via grid search with cross-validation. Models were compared under both default and tuned hyperparameters. Interpretability was enhanced through SHapley Additive exPlanations (SHAP) plots, and heterogeneity was explored with subgroup SHAP analysis to identify context-specific predictor effects. Delayed fecundability was present in 31.01% of women. Grid search optimization improved model performance, with random forest achieving the highest accuracy (79.2%) and AUC (0.94). SHAP analysis identified key predictors, including age 36–49 (0.211), being married (0.208), ovulation-inducing treatment (0.173), and herbal remedy use (0.118). Subgroup SHAP analysis revealed heterogeneity: younger age reduced risk in 15–25-year-olds, fertility treatment history was the main risk driver in treated women, and marital status and childbirth had variable effects across subgroups. The random forest model best predicted delayed fecundability, with age, marital status, and treatment history as key predictors. Subgroup SHAP analysis revealed distinct risk patterns across populations. Targeted screening and tailored fertility counseling, especially for couples with prior fertility treatments, are recommended to support timely conception.

Lay summary

Many women struggle to get pregnant even after trying for a year or more, a condition called delayed fecundability. This issue is becoming more common worldwide and can signal problems with fertility. We used data from surveys in five African countries to find out which factors may predict this delay. Using computer models that can learn from data, we found that age, marital status, and past use of fertility treatments were strong predictors. Our best model classified nearly 80% of women correctly. To make the findings easy to understand, we used a method that explains how each factor influences the result. We also found that the effects of these factors vary by age and treatment history. Our results can help health workers identify women at higher risk earlier, especially in places where fertility services are limited, and provide them with better, more personalized care.

Keywords: delayed fecundability, infertility, machine learning, SHAP analysis, subgroup analysis, interpretable AI, reproductive health, PMA data

Introduction

Delayed fecundability, defined as a prolonged time to achieve conception despite regular, unprotected sexual intercourse, affects a significant proportion of couples globally and serves as an early indicator of subfertility or infertility (Dunson & Colombo 2004, Wesselink et al. 2017). Understanding its predictors is increasingly critical in light of shifting fertility patterns, delayed childbearing, and rising exposure to modifiable risk factors such as obesity, environmental stressors, and endocrine-disrupting chemicals (Barbouni et al. 2025). Globally, individuals are postponing childbearing due to extended education, career priorities, economic pressures, and evolving social norms, a trend evident in both high-income countries and low- and middle-income countries (LMICs). In addition, global challenges such as the COVID-19 pandemic, geopolitical instability, and climate change may further exacerbate fertility declines (Sobotka & Zeman 2022, Gietel-Basten & Sobotka 2022). In response, several countries have adopted pronatalist policies as birth rates fall below replacement levels (Sobotka et al. 2019, Demoskop 2022, NHC-China 2022, Räsänen & Smajdor 2025).

These demographic changes are particularly concerning in LMICs, where access to fertility services remains limited or stigmatized. Early identification of women at risk of delayed fecundability is therefore crucial for enabling timely clinical intervention and informing national reproductive health strategies. While prior research has identified key biological and behavioral factors affecting fecundability, including maternal age, body mass index (BMI), menstrual regularity, and lifestyle behaviors (Sapra & Maisog 2016, Stanford et al. 2019, Liu & Shen 2021, Konishi et al. 2021), most studies have relied on traditional statistical models that may not adequately capture complex, nonlinear relationships or interactions among predictors.

In recent years, machine learning (ML) techniques have shown promise in modeling such complexities, offering improved predictive performance in various reproductive health applications (Oudshoorn & Steegers-Theunissen 2023, Khan & Khare 2024). However, a key limitation of many ML approaches is their ‘black-box’ nature, which limits transparency and practical utility in clinical and policy contexts. Interpretable machine learning (IML) methods address this gap by combining predictive accuracy with model explainability, making them more applicable to real-world decision-making (Molnar 2022). For example, SHapley Additive exPlanations (SHAP) have been used with tree-based models such as XGBoost to elucidate feature contributions in pregnancy prediction models (Cao & Song 2020).

Despite the growing interest in IML, there remains a significant gap in its application to delayed fecundability, especially using large, population-based data in LMICs. The Performance Monitoring for Action (PMA) surveys provide high-quality reproductive health data across multiple LMICs and offer an underutilized opportunity for predictive modeling in this area. While infertility affects both women and men, this study focuses on women due to the structure of PMA surveys, which collect detailed fertility and reproductive health data from female respondents. This focus also aligns with the current landscape of reproductive health interventions, which predominantly target women.

Therefore, this study aims to develop and evaluate IML models to predict delayed fecundability among conception-seeking, reproductive-aged women in selected PMA survey countries and to identify the most influential predictors associated with this outcome.

Methods

Study design and data source

This study employed a cross-sectional analytical design using secondary data from the Performance Monitoring for Action (PMA) project. PMA conducts nationally and sub-nationally representative surveys focused on reproductive health and family planning in low- and middle-income countries using standardized instruments and mobile-assisted personal interviews. We analyzed the most recent PMA survey rounds available as of 2024 from five countries: Nigeria, Niger, Kenya, the Democratic Republic of Congo, and Côte d'Ivoire. These datasets were selected because they included key variables on conception-seeking behavior and time to pregnancy, which are essential for assessing delayed fecundability. The surveys employed a stratified cluster sampling approach and collected detailed information from reproductive-aged women. All datasets are publicly accessible through the PMA DataLab platform and were used in accordance with data use agreements (PMA 2023, 2024a,b,c,d).

Study population

The study population comprised 2,206 women of reproductive age (15–49 years) who participated in the most recent PMA surveys conducted in Nigeria, Niger, Kenya, the Democratic Republic of Congo, and Côte d'Ivoire. The data used are cross-sectional, collected at a single point in time for each respondent, and do not include longitudinal follow-up. Women were included if they reported being in a union or partnership, were actively trying to become pregnant at the time of the survey, and had been having regular, unprotected sexual intercourse.

Study variables

The primary outcome variable in this study was delayed fecundability, defined as a binary outcome representing whether a woman had been actively trying to conceive for 12 months or longer through regular, unprotected sexual intercourse without achieving pregnancy.

A comprehensive set of predictor variables was included based on theoretical relevance, availability, and empirical evidence from prior research. These predictors encompassed sociodemographic characteristics such as age, country of residence, marital status, religion, education level, recent occupation, wealth index, and presence of co-wives in polygynous unions. Reproductive history variables included parity, history of childbirth, history of pregnancy loss, prior use of family planning methods, and age at first sexual intercourse. Health-related variables comprised self-reported general health status and health insurance coverage. Additional variables captured fertility-related behaviors and interventions, including prior use of fertility treatments or support such as in vitro fertilization, ovulation-inducing medications, sexually transmitted infection treatment, surgery, hormonal injections, oral pills, herbal medicine, traditional drinks, and home remedies. Contextual variables included media exposure and household size.

Data preprocessing

Missing data handling

Before model development, the data underwent a series of preprocessing steps to ensure analytical rigor and suitability for ML applications. Variables with missing values were first assessed for the extent (in percentage) and pattern of missingness. Features with low levels of missing data, namely religion (1.6%), marital status (0.45%), age (0.46%), and parity (7.3%), were imputed using mode imputation. For variables with moderate levels of missingness, namely whether the husband has other wives (16.1%) and wealth index (10.6%), multiple imputation by chained equations (MICE) was applied under the assumption that the data were missing at random. One feature, history of pregnancy loss, had a high level of missingness (58.6%). Given its theoretical relevance to the prediction task, we chose not to exclude this variable without empirical evaluation. Instead, we applied MICE to handle the missing data and assessed its predictive value. As the imputed variable did not contribute meaningfully to the model, it was excluded from the final analysis. The missingness pattern of the dataset is illustrated in Supplementary Fig. 1 (see the section on Supplementary materials at the end of the article).
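The mode-imputation step for low-missingness variables can be illustrated with a minimal Python sketch. The study itself was run in R; the function names and the toy column below are hypothetical and do not reproduce the PMA data or the authors' code.

```python
from collections import Counter


def missing_percent(values):
    """Share of missing (None) entries in a column, in percent."""
    return 100 * sum(v is None for v in values) / len(values)


def impute_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


# Hypothetical toy column, for illustration only.
religion = ["Muslim", "Catholic", None, "Muslim",
            "Muslim", None, "Catholic", "Muslim"]

pct_missing = missing_percent(religion)   # assessed before imputing
religion_imputed = impute_mode(religion)  # None entries become the mode
```

Mode imputation leaves the observed distribution's peak unchanged but shrinks variability, which is one reason it is usually reserved for variables with little missing data, as was done here.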

Feature engineering

A series of feature engineering steps were undertaken to prepare the dataset for ML modeling. These included feature construction, encoding, correlation analysis, and feature selection.

Feature construction

The variable media exposure was constructed by aggregating responses from three binary indicators: access to internet, radio, and television exposure. A composite score was generated as the sum of these indicators, which was subsequently recoded into a binary variable where any exposure was classified as ‘Yes’ and no exposure as ‘No’. Several continuous variables were discretized to facilitate analysis and capture nonlinear relationships. Age and age at first sexual intercourse were both initially continuous and were categorized into meaningful intervals based on distribution and theoretical relevance. Similarly, family size, originally a continuous variable, was recoded into grouped categories. The wealth index, a categorical variable with five original levels (poorest, poorer, middle, richer, richest), was lumped into three levels: poor, middle, and rich, to simplify the analysis and ensure adequate sample sizes within each category.
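A minimal Python sketch of these recoding rules follows. The age bands are those used in the article; the exact mapping of wealth quintiles to three levels is not stated in the text, so the mapping below (two lowest to poor, two highest to rich) is an assumption, and all function names are hypothetical.

```python
def media_exposure(internet, radio, television):
    """Sum three 0/1 indicators and recode any exposure as 'Yes'."""
    score = internet + radio + television
    return "Yes" if score > 0 else "No"


def age_group(age):
    """Discretize age in years into the three bands used in the article."""
    if 15 <= age <= 25:
        return "15-25"
    if 26 <= age <= 35:
        return "26-35"
    if 36 <= age <= 49:
        return "36-49"
    raise ValueError("age outside the 15-49 reproductive range")


# Assumed collapse of five wealth levels into three.
WEALTH_MAP = {
    "poorest": "poor", "poorer": "poor",
    "middle": "middle",
    "richer": "rich", "richest": "rich",
}


def collapse_wealth(quintile):
    """Map a five-level wealth index onto poor / middle / rich."""
    return WEALTH_MAP[quintile]
```

For example, `media_exposure(0, 1, 0)` returns `"Yes"` because radio exposure alone is enough to count as exposed.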

Encoding and correlation analysis

To prepare the data for ML algorithms, all features were encoded into numeric form using one-hot encoding. This was implemented with the dummyVars() function from the caret package in R, which expanded 26 categorical variables into 69 binary (0/1) features. This transformation ensures compatibility with algorithms that require numerical input and avoids introducing unintended ordinal relationships between categories.
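As an illustration of what one-hot encoding produces, the Python sketch below expands one categorical column into 0/1 indicators named as variable plus level, which mirrors feature names such as Media_ExposureNo that appear in the article. It is not the caret dummyVars() implementation; the helper name and toy rows are hypothetical.

```python
def one_hot(rows, column):
    """Expand one categorical column into 0/1 indicator columns."""
    levels = sorted({row[column] for row in rows})
    return [
        {f"{column}{level}": int(row[column] == level) for level in levels}
        for row in rows
    ]


# Hypothetical toy records, for illustration only.
rows = [
    {"Media_Exposure": "Yes"},
    {"Media_Exposure": "No"},
    {"Media_Exposure": "Yes"},
]

encoded = one_hot(rows, "Media_Exposure")
```

Each record receives exactly one 1 among the indicators of a variable, so the indicators of a two-level variable are perfect complements of each other.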

To reduce redundancy and prevent multicollinearity, we then conducted a correlation analysis on the encoded feature set. A Pearson correlation matrix was used to identify pairs of features with a correlation coefficient greater than 0.8. Features exhibiting high pairwise correlations were considered redundant and removed. In total, 19 such features were dropped, including variables such as Other_WivesNo, Family_Size1-6, Media_ExposureNo, and several indicators of the absence of fertility-related interventions (e.g. Preg_help_IVFNo, Preg_help_HerbsNo). Removing highly correlated features reduces overfitting and improves the stability of model estimates.
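The filtering logic can be sketched in Python as follows. Two assumptions are made here: the threshold is applied to the absolute correlation (complementary Yes/No indicators correlate at −1, which is consistent with the "No" indicators listed above being dropped), and a simple greedy rule decides which member of a pair is kept. The authors' exact rule is not described, and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.8):
    """Greedy filter: keep a feature only if |r| with every
    already-kept feature does not exceed the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical one-hot columns, for illustration only.
features = {
    "Media_ExposureYes":  [1, 0, 1, 1, 0, 1],
    "Media_ExposureNo":   [0, 1, 0, 0, 1, 0],  # complement of the first
    "Preg_help_HerbsYes": [0, 0, 1, 0, 1, 1],
}

kept = drop_correlated(features)
```

In this toy example the complementary indicator is removed and the unrelated feature is retained.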

The resulting feature set contained 50 predictors optimized for model training, improving both model interpretability and computational efficiency. A visual summary of the correlation structure is provided in Supplementary Fig. 2.

Feature selection

Feature selection was performed using the Boruta algorithm, a robust wrapper method built around random forests (RFs). This approach identified 17 features as significantly associated with delayed fecundability. Among the top predictors were a history of ovulation-inducing treatment (Boruta importance = 32.05), prior childbirth (22.93), use of herbal remedies (21.01), fertility pills (20.27), and traditional drinks (18.97). These findings suggest a multifaceted interplay of biomedical, behavioral, and sociodemographic factors influencing conception delays. The selected features informed downstream modeling and interpretation.

Class imbalance

The outcome variable showed a class distribution of approximately 68.99% not delayed and 31.01% delayed fecundability, resulting in a moderate imbalance (ratio of 2.2:1). Given this ratio, and as most methodological guidelines suggest that SMOTE is more appropriate when the minority class represents less than 20–30% of the total distribution, SMOTE was not used. Instead, a stratified train-test split and ten-fold cross-validation, along with appropriate performance metrics (such as AUC and F1-score), were applied to mitigate any potential bias arising from the imbalance.
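The imbalance ratio and the idea of a stratified split can be illustrated with the class counts from Table 1 (684 delayed, 1,522 not delayed). The sketch below is in Python rather than the R used in the study; the 30% test fraction, the seed, and the function name are assumptions, since the split proportion is not reported.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=42):
    """Split indices so each class keeps roughly the same share
    in the training and test sets."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


# Class counts from Table 1: 1 = delayed, 0 = not delayed.
labels = [1] * 684 + [0] * 1522

imbalance_ratio = 1522 / 684            # majority-to-minority ratio
prevalence = 684 / len(labels)          # share of delayed cases

train_idx, test_idx = stratified_split(labels)
test_prevalence = sum(labels[i] for i in test_idx) / len(test_idx)
```

Because sampling is done within each class, the prevalence of delayed fecundability in the test set stays close to the overall 31%.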

Machine learning algorithms

To develop predictive models for delayed fecundability, we employed nine well-established ML models, each representing a distinct analytical approach. These included logistic regression (LR), RF, support vector machine (SVM), K-nearest neighbors (KNN), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), Naive Bayes (NB), and decision tree (DT).

LR was used as a baseline due to its interpretability and capacity to model linear relationships between predictors and the likelihood of delayed fecundability. RF and DT models are particularly well-suited to capture nonlinear interactions and hierarchical decision rules that may underlie complex reproductive behaviors. SVM is effective in separating classes in high-dimensional spaces, which is advantageous given the large number of features derived from one-hot encoding. KNN, a non-parametric method, leverages local similarity patterns, which can help detect subtle behavioral or demographic clusters associated with conception delays. GBM, XGBoost, and LightGBM are powerful ensemble techniques that iteratively reduce prediction error by focusing on hard-to-classify instances, making them well-equipped to handle imbalanced or nuanced patterns in fertility-related data. NB, while based on strong independence assumptions, offers speed and simplicity, and can perform surprisingly well in structured health datasets. Together, these models provide a robust comparative framework for identifying the most effective predictive strategy for delayed fecundability.

Hyperparameter tuning and performance evaluation

Each ML model was trained under two configurations: default hyperparameters and hyperparameters tuned via grid search. Models were initially trained using their default parameter settings to establish baseline performance. Subsequently, a systematic grid search was implemented to identify the best combination of hyperparameters for each algorithm, using cross-validation to limit overfitting. The performance of models before and after hyperparameter optimization was compared to assess the impact of tuning on predictive accuracy and generalizability.
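The general pattern of grid search with cross-validation can be sketched with a deliberately small example: a one-dimensional k-nearest-neighbors classifier whose k is chosen from a grid by ten-fold cross-validated accuracy. This is a Python illustration on simulated data, not the caret workflow used in the study; the grid values, data, and function names are all hypothetical.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > k)


def cv_accuracy(data, k, folds=10):
    """Mean accuracy of k-NN over interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [p for i, p in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds


def grid_search(data, grid, folds=10):
    """Evaluate every candidate k and return the best one with all scores."""
    results = {k: cv_accuracy(data, k, folds) for k in grid}
    best = max(results, key=results.get)
    return best, results


# Simulated toy data: one feature, two overlapping classes.
rng = random.Random(0)
data = [(rng.gauss(1.0 if y else 0.0, 1.0), y) for y in [0, 1] * 100]
rng.shuffle(data)

best_k, results = grid_search(data, grid=[1, 5, 9, 19])
```

Every candidate is scored only on held-out folds, so the selected value reflects out-of-sample rather than training performance.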

Model performance was evaluated using five commonly used classification metrics: accuracy, area under the receiver operating characteristic curve (AUC), precision, recall, and F1-score. Accuracy measures the proportion of correctly classified instances out of all observations and serves as a general indicator of model performance. AUC reflects the model’s ability to discriminate between the positive (delayed fecundability) and negative classes across all classification thresholds, with higher values indicating better discrimination. Precision quantifies the proportion of true positive predictions among all positive predictions made by the model, highlighting its reliability in identifying actual cases of delayed fecundability. Recall (or sensitivity) measures the proportion of actual delayed fecundability cases correctly identified by the model, emphasizing its completeness. The F1-score is the harmonic mean of precision and recall, providing a balanced metric in scenarios with class imbalance. While all five metrics were considered, accuracy and AUC were used as the primary criteria for model comparison due to their interpretability and effectiveness in binary classification tasks.
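The five metrics can be written out explicitly. The Python sketch below computes accuracy, precision, recall, and F1-score from confusion-matrix counts and AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The counts and scores are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(scores_pos, scores_neg):
    """AUC as the share of positive-negative pairs ranked correctly
    (ties count as half)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and predicted probabilities, for illustration only.
metrics = classification_metrics(tp=90, fp=30, fn=10, tn=70)
auc_value = auc(scores_pos=[0.9, 0.8, 0.4], scores_neg=[0.5, 0.3, 0.2])
```

Precision, recall, and F1-score depend on which class is treated as positive, whereas accuracy and AUC do not.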

Subgroup analysis

To explore potential heterogeneity in feature contributions and assess the contextual behavior of key predictors, a subgroup SHAP analysis was conducted. Subgroups were defined based on the top three most influential features from the global SHAP beeswarm and bar plots: age category (15–25, 26–35, 36–49), marital status (never married, divorced/separated, living with a partner), and history of fertility treatment with ovulation induction. For each subgroup, individual-level SHAP values were subset using logical indexing based on the one-hot encoded group definitions. SHAP beeswarm plots were then generated separately for each group to visualize the relative magnitude and direction of predictor contributions within each subgroup. This approach allowed for a more granular, interpretable understanding of whether and how the model’s risk attribution varied across distinct demographic and clinical populations. The method also supports assessment of model fairness and the presence of context-specific predictor effects that may be masked in global importance summaries.
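The subsetting step can be sketched as follows: per-woman SHAP values are filtered with a 0/1 subgroup indicator, and the mean absolute SHAP value per feature is recomputed within the subgroup. This is a Python illustration with made-up SHAP values; the study used the shapviz package in R, and the feature names and numbers below are hypothetical.

```python
def mean_abs_shap(shap_rows, mask=None):
    """Mean absolute SHAP value per feature, optionally restricted
    to the rows where the subgroup mask is 1."""
    rows = [r for i, r in enumerate(shap_rows) if mask is None or mask[i]]
    return {f: sum(abs(r[f]) for r in rows) / len(rows) for f in rows[0]}


# Hypothetical per-woman SHAP values for two features.
shap_rows = [
    {"Age_36_49": 0.30, "Married": 0.10},
    {"Age_36_49": -0.05, "Married": 0.20},
    {"Age_36_49": 0.25, "Married": -0.15},
    {"Age_36_49": -0.04, "Married": 0.05},
]
in_age_36_49 = [1, 0, 1, 0]  # one-hot subgroup indicator

global_importance = mean_abs_shap(shap_rows)
subgroup_importance = mean_abs_shap(shap_rows, mask=in_age_36_49)
```

Comparing the two dictionaries shows how a feature's importance within a subgroup can differ from its global importance.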

Analytical software and packages

All analyses were conducted using R (version 4.4.1) within the RStudio (version 2024.04.2+764, ‘Chocolate Cosmos’ release) integrated development environment. ML models were trained and tuned using the caret, xgboost, lightgbm, rf, and gbm packages. SHAP values were computed and visualized using the shapviz package to enhance model interpretability.

Results

Participant characteristics

Among women with delayed fecundability, the largest age group was 36–49 years, comprising 283 women (41.4%), while the most common age group among those without delay was 26–35 years, with 617 women (40.5%) (χ² = 69.6, P < 0.001). Marital status also varied between groups; the majority of women in both groups were married, with 520 (76.0%) among those with delayed fecundability and 1,048 (68.9%) among those without. A smaller proportion of never-married women was observed in the delayed group (48 women, 7.0%) compared to the non-delayed group (197 women, 12.9%) (χ² = 25.4, P < 0.001). Regarding family planning history, 402 women (58.8%) with delayed fecundability reported ever using family planning, compared to 931 women (61.1%) without delayed fecundability (χ² = 1.13, P = 0.287). Table 1 presents the descriptive statistics and chi-square tests for all sociodemographic and reproductive characteristics of the study population.

Table 1.

Distribution of sociodemographic, reproductive, and health characteristics of women by fecundability status (n = 2,206). Data are presented as n (%).

Characteristics	Delayed	Not delayed	χ²	P
n	684	1,522
Age			69.6	<0.001
15–25	143 (20.9)	521 (34.2)
26–35	258 (37.7)	617 (40.5)
36–49	283 (41.4)	384 (25.2)
Marital status			25.4	<0.001
Divorced/separated	38 (5.6)	56 (3.7)
Living with a partner	73 (10.7)	208 (13.7)
Married	520 (76.0)	1,048 (68.9)
Never married	48 (7.0)	197 (12.9)
Widow/widower	5 (0.7)	13 (0.9)
Country			338.3	<0.001
Côte d'Ivoire	74 (10.8)	486 (31.9)
DRC	151 (22.1)	80 (5.3)
Kenya	238 (34.8)	220 (14.5)
Niger	185 (27.1)	553 (36.3)
Nigeria	36 (5.3)	183 (12.0)
Religion			101.3	<0.001
Animism	8 (1.2)	28 (1.8)
Catholic	223 (32.6)	379 (24.9)
Evangelical	19 (2.8)	123 (8.1)
Methodist	22 (3.2)	75 (4.9)
Muslim	283 (41.4)	780 (51.3)
No religion	24 (3.5)	49 (3.2)
Other	11 (1.6)	22 (1.4)
Other Christian	94 (13.7)	66 (4.3)
General health			10.5	0.032
Bad	17 (2.5)	21 (1.4)
Good	370 (54.1)	775 (50.9)
Moderate	116 (17.0)	247 (16.2)
Very bad	2 (0.3)	1 (0.1)
Very good	179 (26.2)	478 (31.4)
School			19.0	<0.001
Never	166 (24.3)	501 (32.9)
Primary	180 (26.3)	334 (21.9)
Secondary	208 (30.4)	450 (29.6)
Tertiary	130 (19.0)	237 (15.6)
Husband has other wives			0.006	0.935
No	539 (78.8)	1,207 (79.3)
Yes	145 (21.2)	315 (20.7)
Recent occupation			3.67	0.055
No	337 (49.3)	817 (53.7)
Yes	347 (50.7)	705 (46.3)
Ever gave birth			62.4	<0.001
No	95 (13.9)	67 (4.4)
Yes	589 (86.1)	1,455 (95.6)
History of pregnancy loss			0.204	0.651
No	651 (95.2)	1,420 (93.3)
Yes	33 (4.8)	102 (6.7)
History of FP use			1.13	0.287
No	282 (41.2)	591 (38.9)
Yes	402 (58.8)	931 (61.1)
Family size			5.70	0.058
1–6	498 (72.8)	1,031 (67.8)
7–12	160 (23.4)	421 (27.7)
>12	26 (3.8)	70 (4.6)
Media exposure			1.81	0.178
No	187 (27.3)	459 (30.2)
Yes	497 (72.7)	1,063 (69.8)
Parity			11.9	0.008
0	3 (0.4)	0 (0)
1–5	617 (90.3)	1,313 (86.3)
6–10	61 (8.9)	202 (13.3)
>11	3 (0.4)	7 (0.5)
Health insurance			6.03	0.014
No	568 (83.0)	1,324 (87.0)
Yes	116 (17.0)	198 (13.0)
Age at first sex			67.53	0.411
<20	582 (85.1)	1,274 (83.7)
>20	102 (14.9)	248 (16.3)
History of fertility treatment
IVF			15.4	<0.001
No	674 (98.5)	1,520 (99.9)
Yes	10 (1.5)	2 (0.1)
Treatment to increase ovulation			122.5	<0.001
No	594 (86.8)	1,491 (98.0)
Yes	90 (13.2)	31 (2.0)
STI treatment			36.37	<0.001
No	648 (94.7)	1,506 (99.0)
Yes	36 (5.3)	16 (1.0)
Surgery			33.6	<0.001
No	669 (97.8)	1,522 (100)
Yes	15 (2.2)	0 (0)
Injections			88.0	<0.001
No	626 (91.5)	1,509 (99.1)
Yes	58 (8.5)	13 (0.9)
Pills			102	<0.001
No	592 (86.5)	1,484 (97.6)
Yes	92 (13.5)	38 (2.5)
Medicinal drink			48.4	<0.001
No	637 (93.1)	1,502 (98.7)
Yes	47 (6.9)	20 (1.3)
Home remedy			35.04	<0.001
No	644 (94.2)	1,501 (98.6)
Yes	40 (5.8)	21 (1.4)
Herbs			84.4	<0.001
No	584 (85.4)	1,465 (96.3)
Yes	100 (14.6)	57 (3.7)
Wealth index			11.4	0.022
Middle	111 (16.2)	271 (17.8)
Poor	269 (39.3)	530 (34.8)
Rich	304 (44.5)	721 (47.4)

The categorical flow diagram (Fig. 1) reveals a clear trend in the distribution of delayed fecundability across age groups. While the majority of women aged 15–25 years did not experience delayed fecundability, the proportion of delayed cases increases notably in the 26–35 and 36–49 age groups. This pattern is further supported by the stacked bar chart (Supplementary Fig. 3), which visually demonstrates an upward trend in the proportion of delayed fecundability with increasing age. The observed pattern is statistically significant, as shown by the chi-square test (Table 1), which yielded a test statistic of χ² = 69.6 with a P-value <0.001. These visual and statistical findings are consistent with the model’s results, reinforcing that age is a strong predictor of delayed fecundability, with higher age groups contributing a disproportionately larger share to the delayed outcome.
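The chi-square statistic for age can be recomputed directly from the counts in Table 1. The Python sketch below is an independent illustration of the Pearson chi-square calculation rather than the authors' code; it yields a statistic of roughly 69.5 on 2 degrees of freedom, close to the reported 69.6.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom
    for a two-way contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Counts from Table 1: rows are age groups, columns are
# (delayed, not delayed).
age_table = [
    [143, 521],  # 15-25
    [258, 617],  # 26-35
    [283, 384],  # 36-49
]

statistic, dof = chi_square(age_table)

# Share of delayed fecundability within each age group.
delayed_share = [row[0] / sum(row) for row in age_table]
```

The within-group shares rise with each age band, which is the upward trend described above.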

Figure 1.

Flow-based visualization depicting the distribution of delayed fecundability across age groups.

Model building and hyperparameter optimization

Each ML model was first trained with default hyperparameters and then tuned through a systematic grid search with cross-validation, with parameter ranges tailored to the model’s learning structure. XGBoost achieved the best results with conservative settings, including a low learning rate (0.01) and moderate tree depth (8), helping mitigate overfitting. For RF, predictive accuracy peaked when six variables were sampled at each split (mtry = 6). LightGBM favored a deeper tree structure (max_depth = 20) with a higher number of leaves and a faster learning rate. In contrast, GBM performed optimally with a lower learning rate and shallower depth, emphasizing regularized growth. SVM showed improved generalization with a small penalty term (C = 0.01) and fine kernel width (sigma = 0.001). KNN performed best at k = 19, suggesting a broader neighborhood was beneficial, while LR favored a sparse solution via Lasso regularization. NB improved with kernel estimation enabled, and the DT model performed best under minimal pruning with deeper splits. Supplementary Table 1 summarizes the key hyperparameters before and after tuning across all models.

Model comparison

Performance at default hyperparameters

Among models trained using default hyperparameters, accuracy ranged from 0.7089 to 0.7481 across algorithms. The highest accuracy was observed in the SVM model at 0.7481 (95% CI: 0.7133, 0.7807), followed closely by the NB model at 0.7466 (95% CI: 0.7117, 0.7793) and LR at 0.7436 (95% CI: 0.7086, 0.7764). In contrast, the KNN model had the lowest accuracy at 0.7089 (95% CI: 0.6727, 0.7432). In terms of discriminative ability, as measured by the area under the curve (AUC), LR (0.6975), SVM (0.6973), and XGBoost (0.6931) showed the strongest performance, while DT (0.6014) and KNN (0.6322) had the lowest AUCs, indicating limited capacity to distinguish between delayed and non-delayed fecundability cases.

When evaluating overall classification performance using the F1-score, which balances precision and recall, the NB model achieved the highest F1-score (0.8396), followed closely by SVM (0.8363) and LR (0.8340). Models such as DT (0.8225) and KNN (0.8181) also performed reasonably well, but with lower AUC and accuracy values, their overall predictive robustness was more limited.

In summary, based on a combination of high accuracy, relatively high AUC, and strong F1-score, the SVM algorithm outperformed other models under default hyperparameter settings, making it the strongest initial performer before tuning (Table 2).

Table 2.

Performance metrics of machine learning models using default hyperparameters for predicting delayed fecundability in reproductive-aged women. Models are ordered by accuracy (highest to lowest).

Model	Accuracy (95%CI)	Recall	Precision	F1-score	AUC
Support vector machine	0.7481 (0.7133, 0.7807)	0.9387	0.7540	0.8363	0.6973
Naive Bayes	0.7466 (0.7117, 0.7793)	0.9562	0.7483	0.8396	0.6800
Logistic regression	0.7436 (0.7086, 0.7764)	0.9344	0.7531	0.8340	0.6975
Decision tree	0.7240 (0.6883, 0.7577)	0.9278	0.7387	0.8225	0.6014
Extreme gradient boosting	0.7221 (0.6862, 0.7559)	0.8462	0.7715	0.8071	0.6931
Random forest	0.7195 (0.6836, 0.7534)	0.9825	0.7161	0.8284	0.6683
Light gradient boosting	0.7145 (0.6784, 0.7486)	0.8462	0.7639	0.8029	0.6910
Gradient boosting machine	0.7104 (0.6742, 0.7447)	0.8753	0.7477	0.8065	0.6921
K-nearest neighbors	0.7089 (0.6727, 0.7432)	0.9497	0.7185	0.8181	0.6322

Performance at grid search hyperparameters

Following grid search-based hyperparameter tuning, all models showed improvements in at least one performance metric compared to their default configurations. The highest overall accuracy was achieved by the RF model at 0.7921 (95% CI: 0.7801, 0.8015), followed by LightGBM at 0.7711 (95% CI: 0.7656, 0.7846) and XGBoost at 0.7387 (95% CI: 0.7034, 0.7718). In contrast, the NB model, which had the highest F1-score under default settings, recorded a notably lower accuracy of 0.6938 (95% CI: 0.6572, 0.7287) after tuning.

In terms of discriminative ability, RF again demonstrated the strongest performance with an AUC of 0.9482, followed closely by LightGBM at 0.9383. These values were substantially higher than those of models such as DT and NB, which showed AUCs of 0.5593 and 0.7231, respectively (Fig. 2).

Figure 2.

AUC-ROC curves for prediction of delayed fecundability among reproductive-aged women.

For the F1-score, RF (0.8330), KNN (0.8340), and SVM (0.8327) performed comparably, with only slight variations. While models such as NB and DT showed acceptable F1-scores (0.8185 and 0.8034, respectively), their lower accuracy and AUC scores limited their overall performance.

Taken together, the RF model with tuned hyperparameters emerged as the top-performing algorithm, demonstrating the highest accuracy, the best AUC, and one of the strongest F1-scores. Across all default and tuned configurations evaluated, it achieved the highest accuracy and AUC, making it the most robust and reliable model for predicting delayed fecundability in this study (Table 3).

Table 3.

Performance metrics of machine learning models using grid search hyperparameter optimization for predicting delayed fecundability in reproductive-aged women. Models are ordered by accuracy (highest to lowest).

Model	Accuracy (95% CI)	Recall	Precision	F1-score
Random forest	0.7921 (0.7801, 0.8015)	0.9344	0.7518	0.8330
Light gradient boosting	0.7711 (0.7656, 0.7846)	0.8857	0.7618	0.8181
Extreme gradient boosting	0.7387 (0.7034, 0.7718)	0.9407	0.7456	0.8320
Support vector machine	0.7360 (0.7007, 0.7692)	0.9540	0.7390	0.8327
Gradient boosting machine	0.7345 (0.6992, 0.7678)	0.9322	0.7461	0.8283
Logistic regression	0.7330 (0.6976, 0.7664)	0.9584	0.7349	0.8315
K-nearest neighbors	0.7270 (0.6914, 0.7606)	0.9934	0.7184	0.8340
Decision tree	0.7059 (0.6696, 0.7403)	0.8753	0.7435	0.8034
Naive Bayes	0.6938 (0.6572, 0.7287)	0.9978	0.6930	0.8185

Model interpretability

The SHAP beeswarm and bar plots of the best-performing model, RF, illustrate the influence of individual features on the model’s output for predicting delayed fecundability. High feature values are represented in red, while low values appear in blue, with the SHAP values indicating the direction and magnitude of each feature’s effect. The SHAP values quoted below are all positive and describe the size of each feature’s contribution; the direction of each effect is read from the beeswarm plot.

Among the most influential predictors, women aged 36–49 exhibited the greatest contribution toward an increased likelihood of delayed fecundability (SHAP value ≈ 0.211), followed closely by being married (SHAP value ≈ 0.208). Women who had a history of receiving treatment to increase ovulation were also associated with an increased probability of delayed fecundability (SHAP value ≈ 0.173). Conversely, younger women aged 15–25 showed a reduced contribution toward delayed fecundability (SHAP value ≈ 0.156), reflecting the generally higher fecundability in this age group (Figs 3 and 4).

Figure 3. SHAP beeswarm plot illustrating the global importance and impact of features in the RF model predicting delayed fecundability among reproductive-aged women.

Figure 4. SHAP bar plot showing the global feature importance in the RF model for predicting delayed fecundability among reproductive-aged women.

A history of previous births (SHAP value ≈ 0.138) was similarly associated with a reduced likelihood of delayed fecundability, aligning with known reproductive patterns. Having a history of receiving fertility treatments, such as pills (SHAP value ≈ 0.124) and herbal remedies (SHAP value ≈ 0.118), indicated a higher tendency toward delayed conception. Notably, women who had never been married (SHAP value ≈ 0.109) contributed less to delayed fecundability, potentially reflecting lower exposure to conception risk (Figs 3 and 4).

Other contributing factors included having a husband with other wives (SHAP value ≈ 0.098), which was associated with an increased model prediction for delayed fecundability. In contrast, the contribution of prior use of family planning methods (SHAP value ≈ 0.097) should be interpreted with caution, as the feature exhibited a mixed distribution of high and low values across both classes. This variability suggests potential confounding influences or diverse contraceptive histories, making its isolated impact on delayed fecundability less definitive.

Overall, the SHAP analysis highlights how specific sociodemographic and reproductive health characteristics differentially shape the model’s predictions, offering a transparent view into the drivers of delayed fecundability in the population (Figs 3 and 4).
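The SHAP values quoted in this section are unsigned. Assuming they follow the usual bar-plot convention of averaging absolute per-woman SHAP values, a feature that lowers the prediction (such as age 15–25) still appears with a positive number, and its direction has to be read from the beeswarm plot. The sketch below uses a small, made-up SHAP matrix to show the difference between mean absolute and mean signed contributions; the numbers are illustrative and not from the study.

```python
from statistics import mean

# Hypothetical per-woman SHAP values (one dict per woman), not study data.
shap_rows = [
    {"age_36_49": 0.30, "age_15_25": -0.20, "married": 0.25},
    {"age_36_49": -0.10, "age_15_25": -0.15, "married": 0.20},
    {"age_36_49": 0.25, "age_15_25": 0.05, "married": -0.10},
    {"age_36_49": -0.15, "age_15_25": -0.20, "married": 0.20},
]

features = list(shap_rows[0])

# Bar plot quantity: mean absolute SHAP value (magnitude only, never negative).
mean_abs = {f: mean(abs(row[f]) for row in shap_rows) for f in features}

# Signed mean: keeps the average direction of the effect.
mean_signed = {f: mean(row[f] for row in shap_rows) for f in features}

ranking = sorted(features, key=lambda f: mean_abs[f], reverse=True)
for f in ranking:
    print(f"{f}: mean|SHAP| = {mean_abs[f]:.3f}, mean SHAP = {mean_signed[f]:+.3f}")
```

Here `age_15_25` has a positive mean absolute value but a negative signed mean, mirroring how a protective feature can still rank among the most important ones.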

Subgroup analysis

To further investigate heterogeneity in feature effects, a subgroup SHAP analysis was conducted to assess how the model’s key predictors varied in influence across distinct demographic and clinical groups.

Subgroup analysis based on SHAP values revealed nuanced differences in how the model interprets key predictors of delayed fecundability across population segments. Among individuals aged 15–25, the most influential feature was their age category itself, with a SHAP value of −0.2461. This indicates that being in this younger reproductive age group is a central reason the model assigns a lower likelihood of delayed fecundability for this subgroup. In addition, prior childbirth (−0.1643) further reinforced this lower-risk profile, while being married increased the predicted likelihood of delay (+0.2287), suggesting that, within this age group, marriage may signal a greater likelihood of attempting conception, possibly revealing latent fertility issues.

For individuals aged 26–35, prior use of ovulation-inducing fertility treatments emerged as a key signal associated with increased likelihood of delayed fecundability (+0.2059). Marriage also remained influential in the same direction (+0.1738). The indicator for the oldest age category (36–49) was associated with reduced predicted risk within this group (−0.1606); because age was one-hot encoded, women aged 26–35 take a value of zero on that indicator, so this contribution most likely reflects the model lowering predicted risk for women who are not in the 36–49 category rather than an effect of older age itself.
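One-hot encoding matters for reading the age indicators within a subgroup. In the sketch below, the category labels follow the age bands used in the text, and the helpers `age_band` and `one_hot_age` are hypothetical. Every woman aged 26–35 has a value of 0 on the 36–49 indicator, so any SHAP contribution attributed to that indicator within the 26–35 subgroup describes the effect of not being in the oldest band.

```python
AGE_BANDS = ("15-25", "26-35", "36-49")


def age_band(age):
    """Map age in years to the bands used in the text."""
    if 15 <= age <= 25:
        return "15-25"
    if 26 <= age <= 35:
        return "26-35"
    if 36 <= age <= 49:
        return "36-49"
    raise ValueError(f"age {age} is outside the 15-49 range")


def one_hot_age(age):
    """Return one 0/1 indicator per age band; exactly one indicator is 1."""
    band = age_band(age)
    return {f"age_{b}": int(b == band) for b in AGE_BANDS}


# Encoded rows for three hypothetical women in the 26-35 subgroup.
subgroup_26_35 = [one_hot_age(a) for a in (26, 30, 35)]
print(subgroup_26_35[0])
```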

Marital status subgroups showed distinct patterns. Among never-married individuals, never having been married itself was the most influential factor (−0.4811), suggesting that this characteristic aligns with lower predicted risk, likely due to a shorter duration of fertility attempts or lower clinical suspicion of infertility. Being married (−0.3940) and prior childbirth (−0.3547) also supported lower predicted risk in this group, reflecting demographic and reproductive patterns typically associated with lower fecundability concern. In contrast, for individuals who were divorced or separated, having that marital history was among the top reasons for higher predicted risk (+0.3186), along with prior use of contraceptive pills (+0.2068). Marriage in this group signaled a lower likelihood of delay (−0.3293), possibly reflecting prior successful conception within that relationship.

Those living with a partner also exhibited a protective pattern. In this group, marriage (−0.3781), being aged 15–25 (−0.2422), and cohabiting itself (−0.2031) were all linked with lower predicted risk. This suggests that for partnered but unmarried individuals, demographic and relational stability may align with fewer indicators of delayed fecundability from the model’s perspective. Finally, in individuals with a history of ovulation-related fertility treatment, that treatment was the most influential feature by far (+1.2847), with additional support from prior use of fertility pills (+0.2740). Younger age (15–25) was again associated with lower predicted risk (−0.2112). This pattern aligns with clinical intuition: the model identifies medical fertility intervention as a dominant signal of concern, even while adjusting for age (Fig. 5).

Figure 5. SHAP plots stratified by subgroups, illustrating feature contributions to the prediction of delayed fecundability among reproductive-aged women.
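A subgroup SHAP summary of the kind described in this section can be sketched as follows: restrict the per-woman SHAP values to the members of a subgroup, average the signed values per feature, and rank features by absolute mean. This is an assumption about the procedure, since the text does not spell out the exact aggregation. The records and the helper `subgroup_summary` are hypothetical.

```python
from statistics import mean


def subgroup_summary(records, in_subgroup, top_n=3):
    """Mean signed SHAP per feature among subgroup members, largest |mean| first."""
    members = [r["shap"] for r in records if in_subgroup(r)]
    if not members:
        raise ValueError("subgroup is empty")
    features = list(members[0])
    means = {f: mean(row[f] for row in members) for f in features}
    ranked = sorted(means.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_n]


# Hypothetical records: raw age plus per-feature SHAP values (not study data).
records = [
    {"age": 19, "shap": {"age_15_25": -0.30, "prior_birth": -0.20, "married": 0.25}},
    {"age": 23, "shap": {"age_15_25": -0.20, "prior_birth": -0.10, "married": 0.15}},
    {"age": 31, "shap": {"age_15_25": 0.05, "prior_birth": -0.05, "married": 0.20}},
    {"age": 42, "shap": {"age_15_25": 0.10, "prior_birth": 0.05, "married": 0.10}},
]

young = subgroup_summary(records, lambda r: 15 <= r["age"] <= 25)
for feature, value in young:
    print(f"{feature}: {value:+.3f}")
```

Keeping the sign in the subgroup means is what allows the same feature, such as marriage, to appear as risk-increasing in one subgroup and risk-decreasing in another.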

Discussion

This study demonstrates the promising application of ML algorithms to predict delayed fecundability among reproductive-aged women using sociodemographic and reproductive health data. Our findings reveal that while several models yielded moderate predictive accuracy, the RF algorithm consistently outperformed others after grid search hyperparameter tuning, achieving an accuracy of 79.2% and the highest discriminative ability (AUC of 0.9482). This underscores the utility of ensemble-based approaches in modeling complex, nonlinear relationships inherent in reproductive health outcomes. Importantly, the integration of IML techniques, specifically SHAP analysis, elucidated the nuanced roles of key predictors and their differential impacts across subpopulations. These insights not only enhance the transparency of ML predictions but also offer actionable guidance for targeted interventions and personalized fertility support. Collectively, our results highlight both the methodological advances and practical potential of ML-driven risk stratification in reproductive health, warranting further validation and implementation in clinical and public health settings.

Moreover, the subgroup analyses highlighted critical variations in how predictors influenced delayed fecundability across different demographic and clinical groups, underscoring the need for tailored risk stratification. For example, younger women with prior childbirth showed a markedly lower risk, reflecting established fertility potential. Conversely, marriage’s association with delayed fecundability varied by age and marital history, suggesting that social and biological factors interact differently across subpopulations, emphasizing the complexity of fertility determinants within marital contexts. These multifaceted differences emphasize the limitations of applying uniform predictive models and support the use of ML-driven personalized risk profiles to improve fertility counseling and intervention. Clinically, this enables more precise targeting of fertility support services, optimizing resource allocation and potentially improving reproductive outcomes. From a policy perspective, such stratified approaches can inform equitable health service planning and guide research toward understanding subgroup-specific mechanisms influencing fecundability. Ultimately, this subgroup insight advances a move toward personalized reproductive healthcare, ensuring interventions align with individual profiles for maximal effectiveness and impact.

Advanced maternal age had the highest SHAP contribution to the model’s prediction (SHAP ≈ 0.211), indicating that it was a key driver of the algorithm’s output. This finding is consistent with evidence that fecundability declines with increasing age due to diminished ovarian reserve, poorer oocyte quality, and higher prevalence of reproductive pathologies (Trawick et al. 2021). For instance, Delbaere et al. reported a significant reduction in conception probability after age 35, reinforcing the biological plausibility of our results (Delbaere & Tydén 2020). From a policy standpoint, these insights underscore the importance of promoting reproductive education on age-related fertility decline and expanding access to fertility preservation technologies for women who postpone childbirth. Health care providers should discuss reproductive life plans with patients, as recommended by the CDC. Further research is needed to explore behavioral modifiers that might mitigate age-related fertility decline.

Marriage was another attribute that contributed strongly to an increased likelihood of delayed fecundability (SHAP ≈ 0.208), consistent with prior studies showing married couples often report more attempts to conceive and higher fertility-related healthcare engagement (Chehreh et al. 2019). Greil et al. noted that married individuals may experience greater psychological stress related to conception expectations, potentially impacting fecundability (Greil et al. 2010). In addition, sociocultural dynamics within marriage, such as timing of childbearing or communication patterns, may influence reproductive outcomes. This suggests the need for integrated fertility counseling services within marital health programs, addressing psychosocial as well as biomedical aspects (Reisi et al. 2024). Research exploring how marital quality and partner support affect fecundability could further refine interventions. These findings affirm marriage as a social context that shapes reproductive behavior and access to fertility care.

Our analysis found that a history of ovulation-inducing fertility treatment was a strong predictor of delayed fecundability (SHAP ≈ 0.173), reflecting the known role of ovulatory disorders, such as polycystic ovary syndrome (PCOS), as major contributors to subfertility (Persson et al. 2019, Lentscher & Torrealday 2021). Ellakwa et al. documented similar findings, emphasizing that women requiring such treatments often represent a clinically distinct subpopulation with intrinsic reproductive difficulties. This relationship may also reflect treatment-seeking behavior as an indicator of underlying infertility. Policy efforts must focus on enhancing equitable access to diagnostic and therapeutic fertility services to enable early identification and management (Ellakwa et al. 2016). Further research is warranted to evaluate long-term reproductive outcomes following ovulation induction therapies. Ultimately, these results affirm the critical need for accessible fertility treatments tailored to ovulatory dysfunction.

In addition to clinical treatments, non-prescribed fertility practices such as herbal remedies also showed notable associations. Both fertility pills and herbal remedies were associated with increased predicted risk of delayed fecundability (SHAP ≈ 0.124 and 0.118, respectively). While fertility pills are standard medical treatments, the use of herbal remedies likely reflects attempts at self-management or barriers to formal healthcare. Unregulated use of herbal substances may delay proper diagnosis or treatment, potentially prolonging time to conception. Integrating traditional and modern reproductive health practices, alongside public education campaigns about safe fertility interventions, could improve outcomes (Akbaribazm & Rahimi 2021, Institute 2023). Research should investigate the efficacy and safety of common herbal fertility remedies to guide evidence-based recommendations. These findings highlight the importance of culturally sensitive, accessible fertility care that bridges traditional and biomedical approaches.

The presence of polygamous marital arrangements, indicated by husbands having other wives, was moderately associated with delayed fecundability (SHAP ≈ 0.098). Prior studies suggest that co-wife competition, reduced sexual frequency with individual wives, and psychosocial stress inherent in polygamous unions can negatively impact reproductive outcomes (Josephson 2002, Rahmanian et al. 2021). Cultural norms and relationship dynamics in such households may also limit women’s autonomy in seeking fertility care. Policy interventions should incorporate culturally competent reproductive health counseling and community engagement to address these complex social factors (Dyer 2007, Shaiful Bahari et al. 2021). Further qualitative research exploring the lived experiences of women in polygamous marriages can inform tailored support strategies.

Finally, prior use of family planning was modestly associated with delayed fecundability (SHAP ≈ 0.097), echoing mixed evidence from previous studies. While most contraceptive methods allow rapid return to fertility after discontinuation, some women may experience transient delays or may have used contraception in response to existing reproductive concerns (Farrow et al. 2002). Variability by contraceptive type, duration of use, and individual physiology complicates this relationship. Strengthening contraceptive counseling to set realistic expectations about fertility return and providing follow-up support for women experiencing delays could mitigate concerns (Yland et al. 2020). Future research should disaggregate contraceptive types and examine behavioral factors influencing post-contraceptive fecundability. Overall, family planning history represents a complex but modifiable factor influencing conception timing.

In summary, these findings illustrate the value of integrating interpretable ML methods into reproductive health research to uncover complex, non-obvious predictors of delayed fecundability. By bridging data-driven insights with clinical relevance, this study lays the groundwork for more personalized and equitable fertility care. However, to fully realize this potential, it is important to consider the study’s methodological and contextual limitations and strengths.

Limitations and strengths

Despite promising results, this study has notable limitations. First, reliance on secondary cross-sectional data limits control over important unmeasured confounders such as BMI, smoking status, mental health, detailed reproductive history, and biological markers (e.g., hormonal profiles or ovulatory health), all of which can influence fecundability. As such, the findings are based on sociodemographic and behavioral correlates available in the dataset, rather than direct biological measurements. Future studies integrating clinical or biological data could provide more comprehensive insights into the determinants of delayed fecundability. In addition, the data were self-reported, introducing potential recall and social desirability biases. While SHAP analysis enhanced the interpretability of the ML models, it does not establish causal relationships. As such, these findings should be interpreted as associations. The absence of external validation may limit the generalizability of the model to populations beyond the study cohort. Furthermore, this study focuses solely on women due to the structure of the PMA surveys, which collect reproductive health data from female respondents. As a result, male factors in delayed fecundability are not captured. Future primary data studies should aim to include male partners to enable a more comprehensive analysis of factors contributing to delayed fecundability.

Notwithstanding these limitations, the study has several strengths. It applied a comprehensive set of ML algorithms to a rich, multidimensional dataset combining sociodemographic and reproductive health indicators. Rigorous hyperparameter optimization enhanced model performance within the study sample. The use of SHAP values enabled interpretable, transparent modeling, crucial for clinical and public health applicability. Subgroup analyses provided additional granularity, uncovering potential effect modifiers and supporting the development of tailored fertility interventions. Collectively, these strengths enhance the practical and methodological value of the study and lay a foundation for further prospective validation and implementation.

Conclusion and recommendations

This study highlights the potential of ML, particularly the RF algorithm, to predict delayed fecundability in reproductive-age women using sociodemographic and reproductive health data. The model’s strong discriminative performance, combined with SHAP-based interpretability, provides clinically meaningful insights into key predictors such as maternal age, marital status, and history of fertility treatment. These results offer a data-driven approach to risk stratification in reproductive health.

To translate these findings into practice, several steps are recommended. First, further validation in prospective and diverse cohorts is essential to assess the generalizability and stability of the model. Second, health programs can consider piloting ML-based risk assessment tools to identify individuals at elevated risk and tailor fertility counseling accordingly. Clinicians should pay particular attention to age and ovulation history when discussing reproductive planning with patients. Finally, policymakers should invest in digital health infrastructure and analytic capacity to support the integration of predictive modeling into clinical workflows.

Supplementary materials

supplementary_materials.pdf (188.7 KB, pdf)

Declaration of interest

The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the work reported.

Funding

This research did not receive any specific grant from any funding agency in the public, commercial, or not-for-profit sector.

Author contribution statement

MAA was responsible for conceptualization and design. MAA, NMD, and TKT were responsible for data extraction and cleaning. MAA, TYB, BA, and BAA helped in investigation. MAA, EBL, HAG, and EAA contributed to methodology. MAA, FGA, GGA, and GAY were responsible for software and writing. MAA, HWA, and MNA provided validation. All the authors read and approved the manuscript. MAA is the guarantor of the study and accepts full responsibility for the work, had access to the data, and controlled the decision to publish.

Data availability

The data analyzed in this study are publicly available from the Performance Monitoring for Action (PMA) surveys at https://www.pmadata.org/. All datasets used can be accessed without restriction.

Acknowledgments

We sincerely thank the Performance Monitoring for Action (PMA) program for providing access to the survey data used in this study.

References

Akbaribazm M, Goodarzi N & Rahimi M. 2021. Female infertility and herbal medicine: an overview of the new findings. Food Sci Nutr 9 5869–5882. (doi: 10.1002/fsn3.2523)
Barbouni K, Jotautis V, Metallinou D, et al. 2025. When weight matters: how obesity impacts reproductive health and pregnancy – a systematic review. Curr Obes Rep 14 37. (doi: 10.1007/s13679-025-00629-9)
Cao SSLX & Song BT. 2020. Interpretable machine learning models for predicting clinical pregnancies associated with surgical sperm retrieval from testes of different etiologies: a retrospective study. BMC Urol 20 156. (doi: 10.1186/s12894-024-01537-1)
Chehreh ROG, Abolmaali K, Nasiri M, et al. 2019. Comparison of the infertility-related stress among couples and its relationship with infertility factors. IJWHR 7 313–318. (doi: 10.15296/ijwhr.2019.52)
Delbaere I, Verbiest S & Tydén T. 2020. Knowledge about the impact of age on fertility: a brief review. Upsala J Med Sci 125 167–174. (doi: 10.1080/03009734.2019.1707913)
Demoskop 2022. In Russia, The Honorary Title “Mother Heroine” Will Be Awarded Again [In Russian], vol 16, pp 497–498: Demoskop Weekly. (http://www.demoscope.ru/weekly/2022/0949/rossia01.php)
Dunson DBBD & Colombo B. 2004. Increased infertility with age in men and women. Obstet Gynecol 103 51–56. (doi: 10.1097/01.AOG.0000100153.24061.45)
Dyer 2007. The value of children in African countries–insights from studies on infertility. J Psychosom Obstet Gynecol 28 69–77. (doi: 10.1080/01674820701409959)
Ellakwa HE, Sanad ZF, Hamza HA, et al. 2016. Predictors of patient responses to ovulation induction with clomiphene citrate in patients with polycystic ovary syndrome experiencing infertility. Int J Gynecol Obstet 133 59–63. (doi: 10.1016/j.ijgo.2015.09.008)
Farrow A, Hull MGR, Northstone K, et al. 2002. Prolonged use of oral contraception before a planned pregnancy is associated with a decreased risk of delayed conception. Hum Reprod 17 2754–2761. (doi: 10.1093/humrep/17.10.2754)
Gietel-Basten SRA & Sobotka T. 2022. Changing the perspective on low birth rates: why simplistic solutions won’t work. BMJ 379 e072670. (doi: 10.1136/bmj-2022-072670)
Greil AL, Slauson-Blevins K & McQuillan J. 2010. The experience of infertility: a review of recent literature. Sociol Health Illness 32 140–162. (doi: 10.1111/j.1467-9566.2009.01213.x)
Institute I F G 2023. Herbal Remedies For Fertility: Do They Really Work? Inovi Fertility & Genetics Institute Published. (https://www.inovifertility.com/blog/herbal-remedies-for-fertility-do-they-really-work/)
Josephson 2002. Does polygyny reduce fertility? Am J Hum Biol 14 222–232. (doi: 10.1002/ajhb.10045)
Khan I & Khare BK. 2024. Exploring the potential of machine learning in gynecological care: a review. Arch Gynecol Obstet 309 2347–2365. (doi: 10.1007/s00404-024-07479-1)
Konishi S, Kariya F, Hamasaki K, et al. 2021. Fecundability and sterility by age: estimates using time to pregnancy data of Japanese couples trying to conceive their first child with and without fertility treatment. Int J Environ Res Publ Health 18 5486. (doi: 10.3390/ijerph18105486)
Lentscher JASB & Torrealday S. 2021. Polycystic ovarian syndrome and fertility. Clin Obstet Gynecol 64 65–75. (doi: 10.1097/GRF.0000000000000595)
Liu LWW & Shen X. 2021. Risk factors associated with infertility in couples attending fertility clinics in China. Reprod Health 18 73. (PMID: 33794936)
Molnar 2022. Interpretable Machine Learning, 2nd edn. Leanpub. (https://leanpub.com/interpretable-machine-learning)
NHC-China 2022. Guiding opinions on further improving and implementing active reproductive support measures [In Chinese]. (http://www.nhc.gov.cn/rkjcyjtfzs/s7785/202208/9247dd64744c42df9522c4fa2cb78e42.shtml)
Oudshoorn NVDWJ & Steegers-Theunissen R. 2023. Prediction of conception outcome using machine learning: a systematic review. Hum Reprod Update 29 295–311.
Persson S, Elenis E, Turkmen S, et al. 2019. Fecundity among women with polycystic ovary syndrome (PCOS) – a population-based study. Hum Reprod 34 2052–2060. (doi: 10.1093/humrep/dez159)
PMA PMFA 2023. Kenya PMA2022 Household and Female Questionnaire Dataset [PMA2022_KEP3_HQFQ_v4.0_12Jul2023]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (doi: 10.34976/ECRE-CF28)
PMA PMFA 2024a. Côte d’Ivoire PMA2024 Household and Female Questionnaire Dataset [PMA2024_CIP4_HQFQ_v1.0_1Sep2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (doi: 10.34976/DP52-4021)
PMA PMFA 2024b. Democratic Republic of Congo PMA2022 (Kinshasa) Household and Female Questionnaire Dataset [PMA2022_CDP3_Kinshasa_HQFQ_v2.0_1Sep2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (doi: 10.34976/DGPR-SZ30)
PMA PMFA 2024c. Niger PMA2024 Household and Female Questionnaire Dataset [PMA2024_NEP4_HQFQ_v1.0_1Sep2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (doi: 10.34976/7EX7-A912)
PMA PMFA 2024d. Nigeria PMA2024 (Lagos) Household and Female Questionnaire Dataset [PMA2024_NGP4_Lagos_HQFQ_v1.0_30Aug2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (doi: 10.34976/M935-XF71)
Rahmanian PMK, Mukhtar F & Choudhry FR. 2021. Prevalence of mental health problems in women in polygamous versus monogamous marriages: a systematic review and meta-analysis. Arch Womens Ment Health 24 339–351. (doi: 10.1007/s00737-020-01070-8)
Räsänen JSA & Smajdor A. 2025. Why not coercive pronatalism? J Med Ethics. (doi: 10.1136/jme-2025-110705)
Reisi M, Kazemi A, Maleki S, et al. 2024. Relationships between couple collaboration, well-being, and psychological health of infertile couples undergoing assisted reproductive treatment. Reprod Health 21 119. (doi: 10.1186/s12978-024-01857-3)
Sapra KJBD & Maisog JM. 2016. Time-to-pregnancy and exposure to persistent organic pollutants: findings from a prospective pregnancy study. Environ Health Perspect 124 1554–1560. (PMID: 27258598)
Shaiful Bahari I, Norhayati MN, Nik Hazlina NH, et al. 2021. Psychological impact of polygamous marriage on women and children: a systematic review and meta-analysis. BMC Pregnancy Childbirth 21 823. (doi: 10.1186/s12884-021-04301-7)
Sobotka TJA & Zeman K. 2022. From bust to boom? Birth and fertility responses to the covid-19 pandemic. SocArXiv [Preprint]. (doi: 10.31235/osf.io/87acb)
Sobotka T, Matysiak A & Brzozowska Z. 2019. Policy Responses to Low Fertility: How Effective Are They? UNFPA. (https://www.unfpa.org/publications/policy-responses-low-fertility-how-effective-are-they)
Stanford JB, Willis SK, Hatch EE, et al. 2019. Fecundability in relation to use of fertility awareness indicators in a North American preconception cohort study. Fertil Steril 112 892–899. (doi: 10.1016/j.fertnstert.2019.06.036)
Trawick E, Pecoriello J, Quinn G, et al. 2021. Guidelines informing counseling on female age-related fertility decline: a systematic review. J Assist Reprod Genet 38 41–53. (doi: 10.1007/s10815-020-01967-4)
Wesselink AK, Rothman KJ, Hatch EE, et al. 2017. Age and fecundability in a North American preconception cohort study. Am J Obstet Gynecol 217 667.e1–667.e8. (doi: 10.1016/j.ajog.2017.09.002)
Yland JJBK, Hatch EE, Wesselink AK, et al. 2020. Pregravid contraceptive use and fecundability: prospective cohort study. BMJ 371 m3966. (doi: 10.1136/bmj.m3966)

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

supplementary_materials.pdf^{(188.7KB, pdf)}

Data Availability Statement

The data analyzed in this study are publicly available from the Performance Monitoring and Action (PMA) surveys at https://www.pmadata.org/. All datasets used can be accessed without restriction.

[bib1] Akbaribazm M, Goodarzi N & Rahimi M. 2021. Female infertility and herbal medicine: an overview of the new findings. Food Sci Nutr 9 5869–5882. ( 10.1002/fsn3.2523) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib2] Barbouni K, Jotautis V, Metallinou D, et al. 2025. When weight matters: how obesity impacts reproductive health and pregnancy – a systematic review. Curr Obes Rep 14 37. ( 10.1007/s13679-025-00629-9) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib3] Cao SSLX & Song BT. 2020. Interpretable machine learning models for predicting clinical pregnancies associated with surgical sperm retrieval from testes of different etiologies: a retrospective study. BMC Urol 20 156. ( 10.1186/s12894-024-01537-1) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib4] Chehreh ROG, Abolmaali K, Nasiri M, et al. 2019. Comparison of the infertility-related stress among couples and its relationship with infertility factors. IJWHR 7 313–318. ( 10.15296/ijwhr.2019.52) [DOI] [Google Scholar]

[bib6] Delbaere I, Verbiest S & Tydén T. 2020. Knowledge about the impact of age on fertility: a brief review. Upsala J Med Sci 125 167–174. ( 10.1080/03009734.2019.1707913) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib7] Demoskop 2022. In Russia, The Honorary Title “Mother Heroine” Will Be Awarded Again [In Russian], vol 16, pp 497–498: Demoskop Weekly. (http://www.demoscope.ru/weekly/2022/0949/rossia01.php) [Google Scholar]

[bib8] Dunson DBBD & Colombo B. 2004. Increased infertility with age in men and women. Obstet Gynecol 103 51–56. ( 10.1097/01.AOG.0000100153.24061.45) [DOI] [PubMed] [Google Scholar]

[bib9] Dyer 2007. The value of children in African countries–insights from studies on infertility. J Psychosom Obstet Gynecol 28 69–77. ( 10.1080/01674820701409959) [DOI] [PubMed] [Google Scholar]

[bib10] Ellakwa HE, Sanad ZF, Hamza HA, et al. 2016. Predictors of patient responses to ovulation induction with clomiphene citrate in patients with polycystic ovary syndrome experiencing infertility. Int J Gynecol Obstet 133 59–63. ( 10.1016/j.ijgo.2015.09.008) [DOI] [PubMed] [Google Scholar]

[bib11] Farrow A, Hull MGR, Northstone K, et al. 2002. Prolonged use of oral contraception before a planned pregnancy is associated with a decreased risk of delayed conception. Hum Reprod 17 2754–2761. ( 10.1093/humrep/17.10.2754) [DOI] [PubMed] [Google Scholar]

[bib12] Gietel-Basten SRA & Sobotka T. 2022. Changing the perspective on low birth rates: why simplistic solutions won’t work. BMJ 379 e072670. ( 10.1136/bmj-2022-072670) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib13] Greil AL, Slauson-Blevins K & McQuillan J. 2010. The experience of infertility: a review of recent literature. Sociol Health Illness 32 140–162. ( 10.1111/j.1467-9566.2009.01213.x) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib14] Institute I F G 2023. Herbal Remedies For Fertility: Do They Really Work? Inovi Fertility & Genetics Institute Published. (https://www.inovifertility.com/blog/herbal-remedies-for-fertility-do-they-really-work/) [Google Scholar]

[bib15] Josephson 2002. Does polygyny reduce fertility? Am J Hum Biol 14 222–232. ( 10.1002/ajhb.10045) [DOI] [PubMed] [Google Scholar]

[bib5] Khan I & Khare BK. 2024. Exploring the potential of machine learning in gynecological care: a review. Arch Gynecol Obstet 309 2347–2365. ( 10.1007/s00404-024-07479-1) [DOI] [PubMed] [Google Scholar]

[bib16] Konishi S, Kariya F, Hamasaki K, et al. 2021. Fecundability and sterility by age: estimates using time to pregnancy data of Japanese couples trying to conceive their first child with and without fertility treatment. Int J Environ Res Publ Health 18 5486. ( 10.3390/ijerph18105486) [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib17] Lentscher JASB & Torrealday S. 2021. Polycystic ovarian syndrome and fertility. Clin Obstet Gynecol 64 65–75. (https://doi.org/10.1097/GRF.0000000000000595)

[bib18] Liu LWW & Shen X. 2021. Risk factors associated with infertility in couples attending fertility clinics in China. Reprod Health 18 73. (PMID: 33794936)

[bib19] Molnar 2022. Interpretable Machine Learning, 2nd edn. Leanpub. (https://leanpub.com/interpretable-machine-learning)

[bib20] NHC-China 2022. Guiding opinions on further improving and implementing active reproductive support measures [In Chinese]. (http://www.nhc.gov.cn/rkjcyjtfzs/s7785/202208/9247dd64744c42df9522c4fa2cb78e42.shtml)

[bib21] Oudshoorn NVDWJ & Steegers-Theunissen R. 2023. Prediction of conception outcome using machine learning: a systematic review. Hum Reprod Update 29 295–311.

[bib22] Persson S, Elenis E, Turkmen S, et al. 2019. Fecundity among women with polycystic ovary syndrome (PCOS) – a population-based study. Hum Reprod 34 2052–2060. (https://doi.org/10.1093/humrep/dez159)

[bib23] PMA PMFA 2023. Kenya PMA2022 Household and Female Questionnaire Dataset [PMA2022_KEP3_HQFQ_v4.0_12Jul2023]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (https://doi.org/10.34976/ECRE-CF28)

[bib24] PMA PMFA 2024a. Côte d’Ivoire PMA2024 Household and Female Questionnaire Dataset [PMA2024_CIP4_HQFQ_v1.0_1Sep2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (https://doi.org/10.34976/DP52-4021)

[bib25] PMA PMFA 2024b. Democratic Republic of Congo PMA2022 (Kinshasa) Household and Female Questionnaire Dataset [PMA2022_CDP3_Kinshasa_HQFQ_v2.0_1Sep2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (https://doi.org/10.34976/DGPR-SZ30)

[bib26] PMA PMFA 2024c. Niger PMA2024 Household and Female Questionnaire Dataset [PMA2024_NEP4_HQFQ_v1.0_1Sep2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (https://doi.org/10.34976/7EX7-A912)

[bib27] PMA PMFA 2024d. Nigeria PMA2024 (Lagos) Household and Female Questionnaire Dataset [PMA2024_NGP4_Lagos_HQFQ_v1.0_30Aug2024]. Baltimore, MD: Bill & Melinda Gates Institute for Population and Reproductive Health, Johns Hopkins Bloomberg School of Public Health. (https://doi.org/10.34976/M935-XF71)

[bib28] Rahmanian PMK, Mukhtar F & Choudhry FR. 2021. Prevalence of mental health problems in women in polygamous versus monogamous marriages: a systematic review and meta-analysis. Arch Womens Ment Health 24 339–351. (https://doi.org/10.1007/s00737-020-01070-8)

[bib29] Räsänen JSA & Smajdor A. 2025. Why not coercive pronatalism? J Med Ethics. (https://doi.org/10.1136/jme-2025-110705)

[bib30] Reisi M, Kazemi A, Maleki S, et al. 2024. Relationships between couple collaboration, well-being, and psychological health of infertile couples undergoing assisted reproductive treatment. Reprod Health 21 119. (https://doi.org/10.1186/s12978-024-01857-3)

[bib31] Sapra KJBD & Maisog JM. 2016. Time-to-pregnancy and exposure to persistent organic pollutants: findings from a prospective pregnancy study. Environ Health Perspect 124 1554–1560. (PMID: 27258598)

[bib32] Shaiful Bahari I, Norhayati MN, Nik Hazlina NH, et al. 2021. Psychological impact of polygamous marriage on women and children: a systematic review and meta-analysis. BMC Pregnancy Childbirth 21 823. (https://doi.org/10.1186/s12884-021-04301-7)

[bib34] Sobotka TJA & Zeman K. 2022. From bust to boom? Birth and fertility responses to the covid-19 pandemic. SocArXiv [Preprint]. (https://doi.org/10.31235/osf.io/87acb)

[bib33] Sobotka T, Matysiak A & Brzozowska Z. 2019. Policy Responses to Low Fertility: How Effective Are They? UNFPA. (https://www.unfpa.org/publications/policy-responses-low-fertility-how-effective-are-they)

[bib35] Stanford JB, Willis SK, Hatch EE, et al. 2019. Fecundability in relation to use of fertility awareness indicators in a North American preconception cohort study. Fertil Steril 112 892–899. (https://doi.org/10.1016/j.fertnstert.2019.06.036)

[bib36] Trawick E, Pecoriello J, Quinn G, et al. 2021. Guidelines informing counseling on female age-related fertility decline: a systematic review. J Assist Reprod Genet 38 41–53. (https://doi.org/10.1007/s10815-020-01967-4)

[bib37] Wesselink AK, Rothman KJ, Hatch EE, et al. 2017. Age and fecundability in a North American preconception cohort study. Am J Obstet Gynecol 217 667.e1–667.e8. (https://doi.org/10.1016/j.ajog.2017.09.002)

[bib38] Yland JJBK, Hatch EE, Wesselink AK, et al. 2020. Pregravid contraceptive use and fecundability: prospective cohort study. BMJ 371 m3966. (https://doi.org/10.1136/bmj.m3966)
