Abstract
Infertility is a growing concern in today’s technologically driven and mechanized world. Male-related factors contribute to nearly half of all cases, yet they often remain underdiagnosed because of societal misconceptions and stigma. Prolonged sedentary behaviour, environmental exposures, and psychosocial stress further exacerbate reproductive health disorders. This study presents a hybrid diagnostic framework that combines a multilayer feedforward neural network with a nature-inspired ant colony optimization algorithm, integrating adaptive parameter tuning based on ant foraging behaviour to enhance predictive accuracy and overcome the limitations of conventional gradient-based methods. Compared with conventional fertility diagnostic approaches, this hybrid strategy demonstrates improved reliability, generalizability, and efficiency. The model was evaluated on a publicly available dataset of 100 clinically profiled male fertility cases representing diverse lifestyle and environmental risk factors, with performance assessed on unseen samples. It achieved 99% classification accuracy and 100% sensitivity, with a computational time of 0.00006 seconds, highlighting its efficiency and real-time applicability. Clinical interpretability is achieved via feature-importance analysis, which emphasizes key contributory factors such as sedentary habits and environmental exposures, enabling healthcare professionals to readily understand and act upon the predictions. This cost-effective, time-efficient system has the potential to reduce diagnostic burden, enable early detection, and support personalized treatment planning, illustrating the effective synergy between machine learning and bio-inspired optimization in advancing male reproductive health diagnostics.
Keywords: Proximity Search Mechanism, Multilayer Feedforward Neural Network, Ant colony optimization, classification, Fertility dataset
Subject terms: Computational biology and bioinformatics, Engineering, Health care, Mathematics and computing
Introduction
Infertility has transitioned from a private struggle confined to households into a pressing global health crisis with profound clinical, social, and economic implications. Recent WHO estimates indicate that nearly one in six adults of reproductive age, equivalent to over 186 million individuals worldwide, experience infertility at some stage of their lives. Male factors contribute to approximately 50% of all cases1, overturning the long-standing misconception that infertility is primarily a women’s issue. However, male infertility often remains underdiagnosed and underreported, mainly due to social stigma, limited clinical precision, and lack of public awareness2. The etiology of infertility is multifactorial, encompassing genetic, hormonal, anatomical, systemic, and environmental influences. In men, risk factors such as chromosomal abnormalities, hypogonadism, varicocele, infections, and testicular dysfunction interact with lifestyle-related habits like smoking, alcohol use, obesity, and prolonged exposure to heat1,2. Environmental factors have also gained prominence, with air pollution, pesticides, heavy metals, and endocrine-disrupting chemicals emerging as major contributors to declining semen quality and sperm morphology3,4. Studies have shown that such toxic exposures impair sperm concentration, motility, and DNA integrity, underscoring a growing intersection between reproductive health and environmental degradation7,8. Moreover, infertility is now being linked to broader indicators of male health. Research has demonstrated that reduced sperm quality may serve as a biomarker for systemic disorders such as metabolic syndrome, endocrine dysfunction, and cardiovascular disease5,9. This connection emphasizes that infertility should not be viewed in isolation but as part of an integrated health continuum.
From a psychological perspective, infertility often results in heightened levels of anxiety, depression, and emotional distress, especially in sociocultural contexts where fertility defines gender identity and social status11. At the policy level, declining fertility rates in various regions, particularly across Europe, East Asia, and parts of the Global South, raise serious concerns regarding demographic sustainability and population aging. These demographic shifts have far-reaching implications for workforce stability and public health infrastructure13. Despite advancements in assisted reproductive technologies (ARTs), including in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI), accessibility remains limited due to cost, infrastructure gaps, and ethical constraints10,12. In recent years, Artificial Intelligence (AI) and Machine Learning (ML) have emerged as transformative tools in reproductive medicine. Studies have begun to explore their use in sperm morphology classification, motility analysis, and IVF success prediction, marking a paradigm shift in diagnostic and prognostic accuracy14. These realities highlight the urgent need for research that integrates medical, technological, and computational approaches to enhance infertility diagnosis, prediction, and treatment. Traditional diagnostic methods, though valuable, often fail to capture the complex interplay between biological and environmental factors. Consequently, there is growing demand for advanced, data-driven models capable of uncovering hidden patterns, stratifying risks, and providing more accurate, personalized insights into infertility. Such approaches hold the promise of bridging existing diagnostic gaps, reducing costs, and expanding accessibility to wider populations.
In this context, the present study is driven by the dual imperative of addressing a global health crisis and leveraging advanced analytical methodologies to deliver practical and scientific solutions. Ultimately, this research seeks to deepen the understanding of male infertility, enhance diagnostic precision through data centric innovation, and contribute to the development of accessible, sustainable, and equitable approaches in reproductive healthcare.
Literature review
Traditional diagnostic methods for male infertility, including semen analysis and hormonal assays, have long served as clinical standards. However, these methods are limited in capturing the complex interactions of biological, environmental, and lifestyle factors that contribute to infertility. Consequently, there has been increasing interest in computational approaches to improve predictive accuracy and objectivity in reproductive health assessment. Support Vector Machines (SVM) have been successfully applied to detect abnormal sperm morphology, offering robust classification performance15. Deep learning architectures, such as instance-aware segmentation networks, have further enhanced automated sperm morphology analysis by identifying subtle structural variations16. Similarly, TOD-CNN has demonstrated efficacy in detecting tiny objects in sperm videos, enabling precise evaluation of sperm motility and morphology17. These approaches reduce subjectivity, increase reproducibility, and allow high throughput analysis, addressing key limitations of traditional diagnostic methods. In addition, nature inspired optimization algorithms, particularly Ant Colony Optimization (ACO), have gained prominence in biomedical classification tasks. ACO leverages adaptive, self-organizing mechanisms to improve feature selection and model performance18. Hybrid frameworks combining deep learning with ACO have shown promising results in image based biomedical tasks, such as ocular OCT image classification19 and streaming medical feature selection20,21. Hybrid metaheuristic methods for neural network optimization have further enhanced convergence and predictive accuracy in biomedical imaging22,23.
In reproductive health applications, genetic algorithm-assisted machine learning has been successfully applied for clinical pregnancy prediction in IVF, demonstrating the utility of metaheuristic-augmented networks in complex biological prediction problems24. Explainable AI (XAI) frameworks ensure interpretability of model decisions, which is critical for clinical adoption and trust25. Furthermore, SHMC-Net, a mask-guided feature fusion network, has achieved high accuracy in sperm head morphology classification, highlighting the potential of deep learning pipelines for automated semen evaluation26. Recent studies have emphasized the growing relevance of AI and hybrid optimization methods in fertility diagnostics. Ref. 27 reviewed AI-based frameworks for fertility assessment, highlighting risk factor modelling for improved predictive accuracy and clinical decision-making. Ref. 28 emphasized ensemble learning techniques combined with sampling schemes to address imbalanced fertility datasets, demonstrating enhanced classification performance. Ref. 29 explored AI-driven tools for male fertility detection, illustrating the potential of automated systems in reproductive health diagnostics. Ref. 30 presented statistical and AI-based approaches for analyzing multivariate fertility datasets, highlighting methods to improve prediction reliability in clinical applications. Further, hybrid ML models integrating hyperparameter tuning and feature selection have demonstrated improved robustness and classification performance across complex clinical datasets31,39. The combination of metaheuristic algorithms with machine learning frameworks, including genetic algorithms and particle swarm optimization, facilitates effective feature selection and parameter optimization in diverse domains such as thyroid disorder diagnosis32, heart disease detection33,34, polycystic ovary syndrome identification36, and arrhythmia detection35.
These hybrid approaches have proven effective for real-time clinical applications, including thermographic diagnosis of maxillary sinusitis37, and enhancing the diagnostic capability of support vector machines and artificial neural networks38,39. Collectively, these studies highlight the utility of hybrid and bio-inspired optimization frameworks in improving predictive accuracy, interpretability, and generalization while handling high-dimensional and imbalanced biomedical datasets. The present study builds on this foundation by proposing a hybrid MLFFN–ACO framework for male infertility assessment, combining adaptive parameter tuning, feature importance analysis, and hybrid optimization to develop a robust, interpretable, and clinically relevant diagnostic tool. This approach addresses existing gaps by integrating predictive accuracy, model interpretability, and clinical applicability within a single framework, paving the way for more reliable and efficient computational solutions in reproductive health analytics.
The paper is organized as follows. Section “Introduction” highlights the motivation and significance of the study. Section “Literature review” surveys prior work on male infertility diagnostics and hybrid bio-inspired optimization techniques. Section “Key contributions” outlines the key contributions of the proposed study. Section “Proposed methodology” describes the proposed methodology. Section “Results and discussion” presents the results and discussion. Section “Conclusion” concludes the paper, and Section “Future scope” outlines the future scope.
Key contributions
The major contributions of this study are highlighted below:
Developed a machine learning framework for early prediction of male infertility using clinical, lifestyle, and environmental factors.
Introduced the Proximity Search Mechanism (PSM) to provide interpretable, feature level insights for clinical decision making.
Integrated Ant Colony Optimization (ACO) with neural networks to enhance learning efficiency, convergence, and predictive accuracy.
Addressed class imbalance in medical datasets, improving sensitivity to rare but clinically significant outcomes.
Proposed a non-invasive, personalized diagnostic approach for reproductive health assessment, facilitating proactive interventions.
Proposed methodology
The proposed methodology outlines a novel, ML based framework for early prediction of male infertility. It integrates clinical, lifestyle, and environmental factors with advanced computational techniques to enable accurate, interpretable, and non-invasive diagnostics. The framework incorporates the Proximity Search Mechanism (PSM) for feature level interpretability, while combining ACO with neural networks to enhance learning efficiency, convergence, and predictive performance. This section details the overall architecture, data preprocessing, feature selection, model training, optimization strategy, and evaluation protocols employed in the study.
Dataset description
The Fertility Dataset utilized in this study is publicly accessible through the UCI Machine Learning Repository. The dataset was originally developed at the University of Alicante, Spain, in accordance with WHO guidelines57, to examine the factors influencing male seminal quality. Following the removal of incomplete records, the final dataset comprised 100 samples collected from healthy male volunteers aged between 18 and 36 years. Each record in the dataset is described by 10 attributes encompassing socio-demographic characteristics, lifestyle habits, medical history, and environmental exposures. The target variable is a binary class label, indicating either Normal or Altered seminal quality. The dataset exhibits a moderate class imbalance, with 88 instances categorized as Normal and 12 instances categorized as Altered. A comprehensive summary of the dataset attributes and their corresponding value ranges is presented in Table 1.
Table 1.
Summary of the Fertility Dataset Attributes.
| S.No | Attribute | Value Range |
|---|---|---|
| 1 | Season | -1, -0.33, 0.33, 1 |
| 2 | Age | 0, 1 |
| 3 | Childhood Disease | 0, 1 |
| 4 | Accident / Trauma | 0, 1 |
| 5 | Surgical Intervention | 0, 1 |
| 6 | High Fever (in last year) | -1, 0, 1 |
| 7 | Alcohol Consumption | 0, 1 |
| 8 | Smoking Habit | -1, 0, 1 |
| 9 | Sitting Hours per Day | 0, 1 |
| 10 | Class Label | Normal (Fertile), Altered (Infertile) |
Range scaling
Data preprocessing plays a pivotal role in ensuring the integrity, consistency, and analytical reliability of the dataset. In this study, we employ range-based normalization to standardize the feature space and facilitate meaningful correlations across variables operating on heterogeneous scales, as formulated in (1). Although the Fertility Dataset obtained from the UCI Machine Learning Repository is already approximately normalized, an additional normalization step was applied to ensure uniform scaling across all features. This step was necessary because the dataset contains both binary (0, 1) and discrete multi-level attributes, which exhibit heterogeneous value ranges. All features were rescaled to the [0, 1] range to ensure consistent contribution to the learning process, prevent scale-induced bias, and enhance numerical stability during model training. Specifically, we utilize Min-Max normalization, a rescaling technique that linearly transforms each feature such that its minimum and maximum values map to a specified range [a, b]. The normalization is computed as follows:

$$x' = a + \frac{(x - X_{\min})\,(b - a)}{X_{\max} - X_{\min}} \tag{1}$$

where $x$ denotes the original feature value, $x'$ is the normalized value, and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the feature vector $X$, respectively. For the [0, 1] range used here, $a = 0$ and $b = 1$, so (1) reduces to $x' = (x - X_{\min}) / (X_{\max} - X_{\min})$. This process preserves the relative distribution of data points while constraining them within a fixed interval, thereby promoting numerical stability, accelerating convergence, and ensuring consistent feature contribution across all input variables.
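As an illustration of the Min-Max rescaling described above, the following minimal Python sketch maps a feature column onto a target interval. The function name `min_max_scale`, the sample values, and the handling of a constant feature (mapped to the lower bound) are assumptions made for this example and are not taken from the study.

```python
def min_max_scale(values, a=0.0, b=1.0):
    """Linearly rescale a list of numbers to the interval [a, b]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: map every entry to the lower bound
        # (a convention assumed for this sketch).
        return [a for _ in values]
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]


# Hypothetical three-level attribute coded as -1, 0, 1.
scaled = min_max_scale([-1, 0, 1, 0, -1])
print(scaled)  # [0.0, 0.5, 1.0, 0.5, 0.0]
```

Binary attributes already coded as 0 and 1 pass through unchanged, so applying the same function to every column gives all features a common [0, 1] range.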
Proximity search mechanism
The Proximity Search Mechanism (PSM) is introduced as a core component of the proposed methodology, serving as an integrated preprocessing strategy for outlier detection and data balancing. It operates as a data-driven module that identifies and organizes samples based on their structural and descriptive similarity. At its core, PSM determines which data points in the dataset lie within a specified similarity threshold of a given query point. A query point is a reference data instance, typically drawn from the test or evaluation set, for which the model seeks structurally and descriptively similar examples. This proximity-based grouping not only enhances training generalization but also mitigates the influence of outliers and class imbalance, ensuring a more consistent and representative data distribution. Each data point, including the query, is represented by a feature vector and a descriptor vector, capturing both its primary attributes and contextual information. PSM computes a normalized L1 distance between the query and each data point in the dataset, integrating both vectors into a unified similarity measure. If the computed distance falls below a user-defined threshold $\delta$, the data point is considered proximate and grouped with the query. Formally, the set of input patterns $X$ and the corresponding target output set $D$ are composed of discrete values. Sets $T_r$ and $T_s$ are defined as the training set and the testing set drawn from $(X, D)$ if and only if they meet the following conditions:

$$\frac{1}{m}\sum_{k=1}^{m}\lvert x_{ik} - x_{jk}\rvert > \delta \tag{2}$$

$$\frac{1}{n}\sum_{k=1}^{n}\lvert x_{ik} - x_{jk}\rvert \le \delta \tag{3}$$

In (2) and (3), $m$ and $n$ denote the dimensionalities of the training and testing samples, respectively. The parameter $\delta$ serves as a dissimilarity threshold that differentiates between sample pairs $x_i$ and $x_j$ based on their $\ell_1$-norm distances. Specifically, pairs satisfying inequality (2) are designated as training samples, ensuring sufficient variability for robust learning. Conversely, pairs that fulfill condition (3) are assigned to the testing set, signifying higher similarity and serving as proximity-focused instances. These criteria enforce a systematic separation between learning and evaluation datasets, thereby mitigating the risks of model overfitting or underfitting. Once data partitioning is completed under these constraints, the proposed algorithm is initialized for further processing. Algorithm 1 embodies the PSM.
Algorithm 1.
PSM for data sampling
PSM serves a dual role in the proposed framework. First, it facilitates data stratification by organizing training and testing samples based on their degree of similarity. Specifically, samples that exceed the predefined proximity threshold are used for training, thereby encouraging diversity and improving generalization, while those within the threshold are reserved for testing, ensuring proximity focused evaluation. This stratified grouping helps balance model performance and mitigates the risks of overfitting. Second, PSM contributes to interpretability enhancement by allowing the models predictions to be traced back to a group of structurally and descriptively similar data points. This similarity based linkage helps explain model decisions through relatable and neighbouring patterns, thereby aligning machine reasoning with human intuition and fostering greater trust in the models outputs. Through this mechanism, the model benefits from both enhanced learning structure and improved post hoc explainability, making PSM a vital element in both model training and evaluation phases.
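The threshold rule described above can be sketched in a few lines of Python. This is a simplified illustration rather than the study's Algorithm 1: it assumes a single feature vector per sample (the separate descriptor vector is omitted), takes the normalized L1 distance to be the mean absolute difference, and uses hypothetical names (`normalized_l1`, `proximity_split`) and toy data.

```python
def normalized_l1(u, v):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(p - q) for p, q in zip(u, v)) / len(u)


def proximity_split(samples, query, threshold):
    """Partition samples by their normalized L1 distance to a query point.

    Samples farther than `threshold` from the query go to the training pool;
    samples within the threshold go to the testing pool.
    """
    train, test = [], []
    for sample in samples:
        if normalized_l1(sample, query) <= threshold:
            test.append(sample)
        else:
            train.append(sample)
    return train, test


# Toy data: two samples close to the query, two far from it.
samples = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 0.6]]
train, test = proximity_split(samples, query=[0.0, 0.1], threshold=0.2)
```

Because the testing pool also serves as the set of neighbours of the query, the same grouping can be reused to show which similar cases sit near a given prediction.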
Multilayer feedforward neural network (MLFFN)
Multilayer Feedforward Neural Networks (MLFFNs) are a fundamental class of artificial neural networks characterized by a unidirectional flow of information from the input layer through one or more hidden layers to the output layer, without loops or cycles. Each neuron in a given layer is connected to every neuron in the subsequent layer through a weighted connection, and the network processes inputs through successive linear combinations and nonlinear transformations. A key property of MLFFNs is their capability to approximate highly complex and nonlinear mappings between input and output spaces. This strength is supported by the universal approximation theorem, which asserts that a feedforward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function on a compact subset of $\mathbb{R}^n$ to an arbitrary degree of accuracy, provided sufficient hidden units and appropriate activation functions are used49. Owing to this property, MLFFNs have been effectively applied across a wide range of tasks, including regression, classification, signal processing, and biomedical data analysis. In particular, they have shown considerable utility in medical diagnostic systems where the relationship between features and output is often nonlinear and not explicitly defined50. Moreover, their adaptability makes them suitable even for time series classification tasks, where capturing temporal dependencies is critical51.

The learning process in an MLFFN involves adjusting the weights of connections between neurons to minimize a predefined loss function, typically through optimization algorithms like gradient descent48. During training, the network receives input data, processes it through the layers using weighted sums and activation functions, and produces an output. The error between the predicted and actual output is then propagated backward to update the weights, a process known as backpropagation52. The structural simplicity and theoretical robustness of MLFFNs, along with their capacity for universal function approximation, make them a powerful tool for modelling complex input-output relationships in various real-world problems. The mathematical formulation of an MLFFN with a single hidden layer53,54 is defined by a function $f: X \to Y$, where the input space $X \subseteq \mathbb{R}^n$ consists of real-valued vectors $x = (x_1, \dots, x_n)$, and the output space $Y \subseteq \mathbb{R}^m$ consists of vectors $y = (y_1, \dots, y_m)$, with network weights initialized randomly within a prescribed range and the training process starting at the initial iteration.
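$$net^{h}_{j} = \sum_{i=0}^{n} w^{h}_{ji}\, x_i, \qquad j = 1, \dots, h \tag{4}$$

The input to the $j$-th hidden neuron, denoted as $net^{h}_{j}$, is calculated as a linear combination of the input feature vector $x = (x_0, x_1, \dots, x_n)$ and the corresponding weight vector $w^{h}_{j} = (w^{h}_{j0}, w^{h}_{j1}, \dots, w^{h}_{jn})$, as defined in (4). Specifically, the weight vector $w^{h}_{j}$ encapsulates the synaptic weights linking each input feature to the $j$-th hidden neuron, including an additional term $w^{h}_{j0}$ that corresponds to the bias. The input bias $x_0$ is appended to the input vector and is set to 1 to allow for flexible activation thresholds during training.

$$z_j = \varphi\!\left(net^{h}_{j}\right) \tag{5}$$

The output of the $j$-th hidden neuron, denoted as $z_j$, is obtained by applying a nonlinear activation function $\varphi$ to the neuron's input, as expressed in (5). This activation function introduces nonlinearity into the model, enabling it to capture complex patterns within the input data.

$$net^{o}_{k} = \sum_{j=0}^{h} w^{o}_{kj}\, z_j, \qquad k = 1, \dots, m \tag{6}$$

The input to the $k$-th output neuron, denoted as $net^{o}_{k}$, is computed as a linear combination of the hidden layer output vector $z = (z_0, z_1, \dots, z_h)$ and the corresponding weight vector, as defined in (6). Specifically, the weight vector $w^{o}_{k} = (w^{o}_{k0}, w^{o}_{k1}, \dots, w^{o}_{kh})$ represents the synaptic weights associated with the $k$-th output neuron, where $h$ denotes the number of hidden neurons. The term $z_0$ corresponds to the bias input, which is set to one. The final output of the $k$-th output neuron, denoted as $y_k$, is obtained by applying a nonlinear activation function to $net^{o}_{k}$, enabling the model to capture complex relationships in the data.

$$y_k = \varphi\!\left(net^{o}_{k}\right) \tag{7}$$

Substituting (4) to (6), it can be expressed as $y_k = \varphi\!\left(\sum_{j=0}^{h} w^{o}_{kj}\, \varphi\!\left(\sum_{i=0}^{n} w^{h}_{ji}\, x_i\right)\right)$ for $j \ge 1$, with $z_0 = 1$, where $y_k$ represents the final output of the $k$-th component as in (7). The output weights $w^{o}_{kj}$ and hidden weights $w^{h}_{ji}$ are then updated using the following equations:

$$w^{o}_{kj}(t+1) = w^{o}_{kj}(t) + \Delta w^{o}_{kj}(t) \tag{8}$$

$$w^{h}_{ji}(t+1) = w^{h}_{ji}(t) + \Delta w^{h}_{ji}(t) \tag{9}$$

In (8) and (9), $\Delta w^{o}_{kj}(t)$ and $\Delta w^{h}_{ji}(t)$ represent the weight update increments for the output and hidden layers, respectively. These updates are computed using the gradient descent optimization technique, which iteratively adjusts the synaptic weights in the direction that minimizes the network's error function. The magnitude of each update is governed by the gradient of the loss with respect to the corresponding weight, thereby facilitating convergence toward an optimal solution.

$$\Delta w^{o}_{kj}(t) = -\alpha\, \frac{\partial L}{\partial w^{o}_{kj}(t)} \tag{10}$$

$$\Delta w^{h}_{ji}(t) = -\alpha\, \frac{\partial L}{\partial w^{h}_{ji}(t)} \tag{11}$$

Here $w^{o}_{kj}(t)$ and $w^{h}_{ji}(t)$ represent the weights at iteration $t$, $\alpha$ is the learning rate, and $\partial L / \partial w^{o}_{kj}(t)$ and $\partial L / \partial w^{h}_{ji}(t)$ are the gradients of the loss function $L$ with respect to the weights $w^{o}_{kj}$ and $w^{h}_{ji}$ at iteration $t$.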
MLFFNs have been widely used across various domains including image classification, signal processing, medical diagnosis, and time series prediction, owing to their flexibility and generalization capability. However, despite their effectiveness, traditional MLFFNs can face limitations such as slow convergence and the risk of getting trapped in local minima. These challenges have led to the development of hybrid and optimized training approaches to enhance their performance.
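The forward pass and gradient-descent update described in this section can be sketched in plain Python as follows. This is an illustrative sketch, not the study's implementation: the sigmoid activation, the squared-error loss, the layer sizes, the initialization range, and the learning rate `lr` are all assumptions chosen for the example.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, W_h, W_o):
    """Forward pass through one hidden layer; returns inputs, hidden and final outputs."""
    xb = [1.0] + list(x)  # bias input fixed at 1
    z = [1.0] + [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in W_h]
    y = [sigmoid(sum(w * v for w, v in zip(row, z))) for row in W_o]
    return xb, z, y


def loss(x, d, W_h, W_o):
    """Squared-error loss L = 0.5 * sum_k (d_k - y_k)^2 (assumed for this sketch)."""
    _, _, y = forward(x, W_h, W_o)
    return 0.5 * sum((dk - yk) ** 2 for dk, yk in zip(d, y))


def gd_step(x, d, W_h, W_o, lr=0.5):
    """One gradient-descent update of both weight matrices via backpropagation."""
    xb, z, y = forward(x, W_h, W_o)
    # Error term of each output neuron: dL/dnet_k.
    delta_o = [(yk - dk) * yk * (1.0 - yk) for yk, dk in zip(y, d)]
    # Error term of each hidden neuron (z[0] is the bias unit and is skipped).
    delta_h = [
        z[j + 1] * (1.0 - z[j + 1])
        * sum(delta_o[k] * W_o[k][j + 1] for k in range(len(W_o)))
        for j in range(len(W_h))
    ]
    new_W_o = [[w - lr * delta_o[k] * z[j] for j, w in enumerate(row)]
               for k, row in enumerate(W_o)]
    new_W_h = [[w - lr * delta_h[j] * xb[i] for i, w in enumerate(row)]
               for j, row in enumerate(W_h)]
    return new_W_h, new_W_o


random.seed(0)
n_in, n_hidden, n_out = 9, 4, 1  # hypothetical layer sizes
W_h = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
W_o = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

x, d = [0.5] * n_in, [1.0]  # one made-up training pattern and its target
before = loss(x, d, W_h, W_o)
for _ in range(50):
    W_h, W_o = gd_step(x, d, W_h, W_o)
after = loss(x, d, W_h, W_o)
```

Repeating the update on a single pattern drives the loss down steadily, which also illustrates the limitation noted above: each step follows only the local gradient, so the result depends on where the random initialization happens to start.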
Proposed methodology
In this study, we propose a novel hybrid optimization framework, the Neuro-Ant Fusion (NAF) Algorithm, which integrates Ant Colony Optimization (ACO) with a multilayer feedforward network (MLFFN) to enhance classification performance in male infertility prediction. The NAF framework leverages the global search capability of ACO alongside the local gradient-based refinement of gradient descent (GD). While GD effectively fine-tunes weights via local error minimization, it is prone to slow convergence and entrapment in local minima. By contrast, the ACO component explores the solution space adaptively through pheromone-inspired heuristics, guiding the network toward promising regions and promoting convergence toward a globally optimal set of weights. This fusion allows the model to improve generalization, convergence stability, and optimization robustness. The MLFFN, capable of modeling complex nonlinear relationships, serves as the predictive backbone, while ACO provides a bio-inspired optimization layer. A schematic representation of the proposed algorithm is shown in Fig. 1.
Fig. 1.
Schematic diagram of proposed NAF Algorithm.
Rationale for using ACO in NAF
The choice of ACO is motivated by its adaptive, feedback-driven search mechanism and its ability to converge on optimal solutions through collective intelligence, mirroring the learning and reinforcement principles of neural networks. Ants exhibit complex social learning and cooperative foraging behaviors, including trail marking, pheromone communication, and recruitment of nest mates, which inspire the algorithm’s decentralized solution search40–45. Integrating ACO with MLFFN enhances optimization by:
Mitigating local minima: ACO guides weight adjustments globally, complementing the local refinements of GD.
Improving convergence and stability: Pheromone-based exploration promotes adaptive and robust search across the solution space.
Enhancing generalization: By balancing exploration and exploitation, the model avoids overfitting while capturing complex feature interactions.
Leveraging biological inspiration: The feedback loops in ACO mirror neural learning dynamics, making the fusion with NN naturally compatible.
This hybrid approach, NAF, thus combines the adaptive learning capability of neural networks with the exploratory robustness of ACO, resulting in a highly accurate, stable, and computationally efficient predictive model for male infertility diagnosis.
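To make the division of labour between global exploration and local refinement concrete, the sketch below combines a pheromone-weighted sampling step with a gradient-descent step on a toy error function. It is a conceptual illustration only and should not be read as the study's NAF algorithm: the continuous sampling scheme, the central-difference gradient, the deposit rule `1 / (1 + E)` (which assumes a non-negative error), and every name and parameter value are assumptions made for this example.

```python
import random


def numerical_gradient(E, w, step=1e-5):
    """Central-difference estimate of the gradient of E at w."""
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + step] + w[i + 1:]
        down = w[:i] + [w[i] - step] + w[i + 1:]
        grad.append((E(up) - E(down)) / (2 * step))
    return grad


def hybrid_search(E, dim, n_ants=10, iterations=30, rho=0.2, lr=0.1, sigma=0.5, seed=0):
    """Pheromone-guided sampling followed by one gradient step per candidate."""
    rng = random.Random(seed)
    # Each ant holds a candidate weight vector and a pheromone level.
    ants = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_ants)]
    pheromone = [1.0] * n_ants
    best = min(ants, key=E)
    history = [E(best)]
    for _ in range(iterations):
        for p in range(n_ants):
            # Global step: choose a guide ant in proportion to its pheromone
            # and sample a new candidate in its neighbourhood.
            guide = rng.choices(range(n_ants), weights=pheromone, k=1)[0]
            candidate = [g + rng.gauss(0.0, sigma) for g in ants[guide]]
            # Local step: refine the candidate with one gradient-descent update.
            grad = numerical_gradient(E, candidate)
            candidate = [c - lr * g for c, g in zip(candidate, grad)]
            if E(candidate) < E(ants[p]):
                ants[p] = candidate
        # Evaporate pheromone, then deposit more on ants with lower error.
        pheromone = [(1.0 - rho) * t + 1.0 / (1.0 + E(a))
                     for t, a in zip(pheromone, ants)]
        best = min(ants + [best], key=E)
        history.append(E(best))
    return best, history


def quadratic_error(w):
    """Toy error surface with its minimum at w = (1, ..., 1)."""
    return sum((wi - 1.0) ** 2 for wi in w)


best, history = hybrid_search(quadratic_error, dim=3)
```

The best error recorded in `history` never increases, because an ant is replaced only by a better candidate and the incumbent best is always retained. In a real setting `E` would be the network's training error as a function of its weights.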
Ant colony optimization
ACO is a population based metaheuristic inspired by the foraging behaviour of real ants, first introduced by Dorigo et al.46,47,55,56. In nature, ants discover the shortest path between their nest and food sources by laying and sensing pheromones. These pheromone trails influence the movement of other ants, leading to emergent optimized path finding behaviour over time. Computationally, ACO translates this principle to solve combinatorial optimization problems, where the objective is to identify the best solution from a finite set of candidates. Artificial ants construct solutions by traversing a graph representing the problem space and depositing synthetic pheromones proportional to solution quality. The probability of selecting a path increases with pheromone intensity, enabling reinforcement of high quality solutions across iterations. ACO incorporates pheromone evaporation to prevent premature convergence on suboptimal paths, while reinforcement strengthens paths associated with optimal solutions. This balance of exploration and exploitation enables ACO to approximate global optima effectively. Traditional ACO relies on stochastic path selection, which may lead to inefficient information transfer when pheromone differences are subtle. Enhancing pheromone updating and decision-making strategies is therefore crucial for faster convergence and higher solution quality, particularly in complex search spaces.
Mathematical formulation of ACO
In traditional ACO, ants rely solely on probability-based selection without considering pheromone concentration. They move along paths blindly, transferring information regardless of path quality. However, pheromone left on suboptimal paths hinders decision making, making it difficult for ants to differentiate between good and poor paths. This study aims to maximize optimal solutions as in (12), initially allowing ants to move erratically and deposit a random amount of pheromone as they move.
| 12 |
where $E(w)$ represents the error function, that is, the cost of superfluous nodes visited by the $p$ ants, and $E_p(w)$ is the value of the objective function evaluated for the $p$-th ant. Objective function values are the best measure of the quality of candidate solutions. The candidate solution that yields the best value of the objective function is known as the best leader ant, whereas the candidate solution that yields the worst value is called the worst scout ant. The best and worst members are updated in each iteration according to the updated values of the objective function. Pheromone deposited in proportion to $E(w)$ may confuse the ants and influence their decision on which direction to move. Therefore, every ant chooses its path at time $t$ according to the state transition probability rule.
$$P^{p}_{ij}(t) = \frac{a_{ij}\,[\tau_{ij}(t)]^{e}\,[\eta_{ij}]^{f}}{\sum_{l} a_{il}\,[\tau_{il}(t)]^{e}\,[\eta_{il}]^{f}} \tag{13}$$

$$\eta_{ij} = \frac{1}{d_{ij}} \tag{14}$$

$$d_{pq} = \sqrt{\sum_{k}\left(s_{pk} - s_{qk}\right)^{2}} \tag{15}$$

Here $a_{ij}$ denotes the node accessibility term, where a value of 1 indicates an accessible node and 0 denotes an inaccessible one. The parameter $\tau_{ij}$ represents the pheromone concentration, specifically the quantity of food pheromone deposited by the leader ant during its foraging process. The term $\eta_{ij}$ refers to the heuristic desirability or visibility function, quantifying the relative attractiveness of transitioning to node $j$ from node $i$. Parameters $e$ and $f$ serve as control exponents that modulate the influence of pheromone intensity and heuristic information, respectively. The state transition probability governing ant movement is defined by (13), which guides each ant's decision making at every iteration. The spatial separation or path cost $d_{ij}$ between nodes is captured by (14), which inversely affects the pheromone concentration: as the distance between nodes increases, the corresponding pheromone intensity diminishes. After each ant completes a tour across all nodes, the Euclidean distance between artificial ants $s_p$ and $s_q$ is computed using (15). Here, each ant $s_p$ adjusts its path by referencing the trajectory of its neighbouring ant $s_q$, thus incorporating local learning into the global optimization process. Because obtaining a complete optimal path in a single traversal is inherently difficult, a set of constraints is applied to enable partial solution dropping, facilitating efficient convergence. The maximum path length is defined as the longest cumulative distance traversed by the leader ants during the exploration phase. This metric not only characterizes the planned route but also provides a quantitative indicator of the ants' migration efficiency under learning constraints. While conventional ACO algorithms predominantly prioritize optimal solutions irrespective of search path complexity, the current formulation introduces a constraint-aware probability model, as expressed in (16), to enable more precise approximations of the global optimum. The artificial ant system proposed in this study is iteratively trained using the aforementioned probabilistic, distance-based, and neighbourhood-following mechanisms, enabling adaptive and intelligent path planning under biologically inspired constraints.
| 16 |
The transition probability, as described in (16), is positively correlated with the concentration of food pheromone. The heuristic information, denoted by
, is defined as the inverse of the distance between nodes i and j; thus, shorter distances correspond to higher heuristic values. In each generation, the quality and informativeness of candidate paths are evaluated based on the maximum pheromone concentration, with the additional constraint that ants traverse all nodes. Using a continuous probability selection mechanism, the leading ants progressively acquire enhanced decision-making capabilities, guiding the optimization process effectively. The proposed NAF algorithm is depicted in Algorithm 2.
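To make the roles of the pheromone term, the inverse-distance heuristic, and the exponents e and f concrete, the following minimal sketch computes transition probabilities under the classical ACO proportional rule. It is an illustration only: it does not include the constraint-aware terms of (16), and the function name, the dictionary-based graph representation, and the default exponent values are assumptions introduced here rather than details taken from the NAF algorithm.

```python
from typing import Dict, List, Tuple

Node = int
Edge = Tuple[Node, Node]


def transition_probabilities(
    current: Node,
    candidates: List[Node],
    pheromone: Dict[Edge, float],
    distance: Dict[Edge, float],
    e: float = 1.0,  # exponent on pheromone intensity (value assumed)
    f: float = 2.0,  # exponent on heuristic information (value assumed)
) -> Dict[Node, float]:
    """Return the probability of moving from `current` to each candidate node."""
    weights = {}
    for j in candidates:
        edge = (current, j)
        heuristic = 1.0 / distance[edge]  # shorter edge -> larger heuristic value
        weights[j] = (pheromone[edge] ** e) * (heuristic ** f)
    total = sum(weights.values())
    if total == 0.0:
        # No pheromone on any candidate edge: fall back to a uniform choice.
        return {j: 1.0 / len(candidates) for j in candidates}
    return {j: w / total for j, w in weights.items()}


# Toy example: equal pheromone on both edges, but node 1 is twice as close as node 2.
probs = transition_probabilities(
    current=0,
    candidates=[1, 2],
    pheromone={(0, 1): 1.0, (0, 2): 1.0},
    distance={(0, 1): 1.0, (0, 2): 2.0},
)
print(probs)
```

With equal pheromone on both edges, the closer node receives the larger probability (0.8 against 0.2 in this toy case), which is the behaviour the inverse-distance heuristic is meant to produce.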
Algorithm 2.
NAF algorithm
Local pheromone update
Pheromone updating is performed after the entire ant population has completed its exploratory traversal from the source node to the designated target node. The update models the decay of pheromone intensity over time, emulating natural volatilization and environmental degradation. It integrates two components: the residual pheromone concentration persisting along established paths from previous iterations, and the incremental pheromone deposited along the latest ant trajectories. Together, these components govern how the pheromone distribution evolves in space and time, thereby modulating the search dynamics. The update mechanism is defined by equations (17), (18), and (19).
| 17 |
| 18 |
| 19 |
In (17)
denotes the total amount of pheromone at time t, and
is the pheromone decay rate. In (18),
is the change in pheromone concentration. In (19),
is the amount of pheromone deposited by ant p on the edge from i to j, Q is a constant, and
is the length of the edge from node i to node j.
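The two-part update described above (evaporation of the residual trail followed by a deposit from each ant) can be sketched as follows. This is a hedged illustration in the spirit of the classical Ant System: the multiplicative (1 - decay) evaporation form, the function name, and the numerical values are assumptions, while the deposit Q divided by the edge length follows the verbal definition given for (19).

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def update_pheromone(
    pheromone: Dict[Edge, float],
    tours: List[List[int]],
    distance: Dict[Edge, float],
    decay: float = 0.5,  # pheromone decay rate (value assumed)
    q: float = 1.0,      # deposit constant Q (value assumed)
) -> Dict[Edge, float]:
    """Evaporate every trail, then add each ant's deposit on the edges it used."""
    # Residual pheromone carried over from the previous iteration.
    updated = {edge: (1.0 - decay) * level for edge, level in pheromone.items()}
    # Incremental deposit: each ant adds Q / (edge length) on every edge it crossed.
    for tour in tours:
        for edge in zip(tour, tour[1:]):
            updated[edge] = updated.get(edge, 0.0) + q / distance[edge]
    return updated


pheromone = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
distance = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 4.0}
tours = [[0, 1, 2], [0, 2]]  # paths of two hypothetical ants
new_pheromone = update_pheromone(pheromone, tours, distance)
print(new_pheromone)
```

In this toy graph the short edge (0, 1) ends with more pheromone than the long direct edge (0, 2), so later ants are biased toward the shorter segments.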
Refined pheromone update strategies in deterministic path selection
The biomechanical constraints inherent to ant locomotion, coupled with the ants' stochastic exploratory behaviour and obstacle avoidance mechanisms, necessitate strict pitch angle limits. Excessive pitch deviations elevate the risk of mechanical instability and overturning, thereby jeopardizing navigational integrity and operational safety. Consequently, enforcing strict angular boundaries is critical to maintaining dynamic stability during ant movement. Within this operational framework, kinematic constraints are systematically defined to regulate trajectory progression, effectively mitigating redundant traversal of non-essential nodes. These movement constraints are formalized in (20).
| 20 |
be the minimum concentration of pheromone
be the minimum concentration of pheromone and
be the concentration of pheromone dropped at path i. In this study, the sigmoid function is employed to modulate the locomotion dynamics of the ant agents. Biologically inspired, the sigmoid function characterizes the temporal rate of change in system activation, effectively capturing nonlinear response behaviours. The pheromone update rate is derived by applying the sigmoid transformation to the maximum transition probability associated with the respective feasible node. This updated pheromone intensity quantifies the node's frequency and propensity in constituting the optimal path, as determined by the iterative recursion process. Consequently, it provides a probabilistic estimate of the node's contribution toward the formation of the globally optimal route.
| 21 |
In (21)
denotes the reachability of the subsequent node, assuming a value of 1 if the node is accessible and 0 otherwise. While exploring for the next viable node, the heuristic values for potential nodes are calculated using (21). The ant avoids moving to crowded neighbouring locations or paths; if it finds no valid location, it remains in its current position until the next step. Conventional ACO often becomes stuck at suboptimal solutions because of the way pheromone concentration changes: high pheromone levels on a path can restrict exploration of other paths, potentially causing the algorithm to stagnate early. To mitigate this, we have generalized the pheromone update strategy.
| 22 |
| 23 |
| 24 |
Now, updates occur only on the path taken by the optimal ant in the current iteration, with limits imposed on pheromone values to prevent dominance or neglect, as shown in equations (22), (23), and (24). Here,
depicts the alteration in pheromone concentration between two nodes along the current optimal path,
is the pheromone decay rate, governing the rate at which pheromones diminish at the current location following each update, Q is the constant pheromone value,
and
are the maximum and minimum pheromone concentrations at time
and
is the desired level of pheromone laid by ant p at the ongoing recursion step
. Evaporation is a crucial mechanism that complements the pheromone update process in ant colony optimization. It prevents premature convergence, promotes adaptability to dynamic environments, and balances exploration against exploitation. The evaporation factor controls the decay of pheromones over time, dynamically adjusting the algorithm's behaviour. By preventing the entire colony from converging to a single solution, evaporation maintains diversity in the search and enhances the algorithm's ability to discover a variety of solutions, fostering robustness and adaptability. Computational efficiency also benefits, because evaporation prevents the unnecessary accumulation of pheromones on paths that are no longer relevant. Mathematically, the evaporation factor is embedded in the pheromone update formula, where it governs the decay of pheromones over time.
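The refined strategy described in this subsection (evaporation on every edge, reinforcement only along the best path of the current iteration, and clamping between a minimum and a maximum pheromone level) resembles the Max-Min Ant System. The sketch below illustrates that general idea under stated assumptions: the deposit Q divided by the best path length, the parameter values, and all names are hypothetical, and the code is not claimed to reproduce equations (22), (23), and (24).

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def bounded_best_path_update(
    pheromone: Dict[Edge, float],
    best_tour: List[int],
    best_length: float,
    decay: float = 0.1,    # pheromone decay rate (value assumed)
    q: float = 2.0,        # constant pheromone value Q (value assumed)
    tau_min: float = 0.1,  # lower pheromone bound (value assumed)
    tau_max: float = 5.0,  # upper pheromone bound (value assumed)
) -> Dict[Edge, float]:
    """Evaporate all edges, reinforce only the best path, then clamp to the bounds."""
    best_edges = set(zip(best_tour, best_tour[1:]))
    updated = {}
    for edge, level in pheromone.items():
        level = (1.0 - decay) * level  # evaporation on every edge
        if edge in best_edges:
            level += q / best_length   # deposit only along the best path
        # Clamping keeps any edge from dominating or being neglected.
        updated[edge] = min(tau_max, max(tau_min, level))
    return updated


pheromone = {(0, 1): 5.0, (1, 2): 1.0, (0, 2): 0.05}
bounded = bounded_best_path_update(pheromone, best_tour=[0, 1, 2], best_length=2.0)
print(bounded)
```

In this toy run the already saturated edge (0, 1) is held at the upper bound, while the unused edge (0, 2) is lifted to the lower bound instead of decaying toward zero, so it remains selectable in later iterations.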
Results and discussion
This section presents a detailed evaluation of the proposed NAF algorithm and compares its performance with prior methods to assess model stability and convergence rate.
Experimental setup
The experiments were conducted using MATLAB R2023a on a system with an Intel Core i5 (x64-based) processor running at 2.10 GHz, 8 GB of RAM, and 237 GB of external storage.
As summarized in Table 2, the dataset was partitioned using PSM as the principal data sampling strategy. Rather than splitting the data at random, PSM matches training and testing samples on statistical similarity across key covariates. This gives both subsets a balanced representation of the input feature space, reducing bias and improving the generalizability of the model. PSM is therefore not merely a preprocessing step but an integral design component of this study. It enabled a deliberate, balanced stratification of the fertility dataset, allocating 71% of the data to training and 29% to testing, so that model evaluation benefits from both diverse training data and a reliable generalization analysis. In contrast, prior studies such as [58] typically used a 70:15:15 split for training, validation, and testing, respectively, reporting 97% accuracy on the training set. The use of PSM in our study preserves class representation and sample-level similarity, which are critical for model learning and fair performance assessment.
Table 2.
Data sampling.
| Attribute | Value |
|---|---|
| Samples | 100 |
| Features | 11 |
| Classes | 2 |
| Training samples | 71 |
| Testing samples | 29 |
| Class I | 10 |
| Class II | 61 |
Evaluation criteria
The performance of the proposed model is assessed using several standard classification metrics: accuracy (25), precision (26), sensitivity (27), specificity (28), and F1-score (29). These metrics rely on four fundamental terms: True Positive (TP) for correctly labelled altered instances, True Negative (TN) for correctly labelled normal instances, False Positive (FP) for normal instances mislabelled as altered, and False Negative (FN) for altered instances mislabelled as normal.
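$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{25}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{26}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{27}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{28}$$
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{29}$$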
The error rate serves as a fundamental metric quantifying the proportion of misclassifications within a predictive model. It is computed as the ratio of incorrectly classified instances comprising both false positives and false negatives to the total number of evaluated samples. Mathematically, this is formalized by equation (30). Complementing this, the Matthews Correlation Coefficient (MCC), as expressed in equation (31), provides a robust measure of binary classification performance. MCC is especially valuable in contexts involving imbalanced datasets, as it encapsulates the balance between true positives, true negatives, false positives, and false negatives into a single correlation coefficient, thereby offering a more comprehensive assessment of predictive quality beyond conventional accuracy metrics.
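$$\text{Error rate} = \frac{FP + FN}{TP + TN + FP + FN} \tag{30}$$
$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{31}$$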
The Receiver Operating Characteristic (ROC) curve is a graphical representation of the performance of a binary classification model at various thresholds. It plots the true positive rate (32) (sensitivity) against the false positive rate (1 - specificity) (33) for different threshold values. The Area Under the Curve (AUC) (34) is a single scalar value that quantifies the overall performance of a classification model using the ROC curve. A higher AUC indicates a better performing model.
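$$\text{TPR} = \frac{TP}{TP + FN} \tag{32}$$
$$\text{FPR} = \frac{FP}{FP + TN} \tag{33}$$
$$\text{AUC} = \int_{0}^{1} \text{TPR} \, d(\text{FPR}) \tag{34}$$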
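As a minimal sketch of how an ROC curve and its AUC can be obtained from classifier scores, the following code sweeps the decision threshold over the distinct score values and integrates the resulting curve with the trapezoidal rule. The function names and the toy labels and scores are assumptions for illustration; they are not outputs of the proposed model.

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) pairs, one per distinct threshold, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present to draw an ROC curve")
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step admits more samples as "positive".
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data: label 1 = positive class; scores are hypothetical model outputs.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6]
curve = roc_points(labels, scores)
print(curve, auc(curve))
```

For this toy input the AUC is 0.75, which matches the fraction of positive and negative pairs that the scores rank correctly (three of four).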
Table 3 compares key performance metrics, including accuracy, sensitivity, and specificity, against existing models and highlights the diagnostic capability of the proposed model. The evaluation indicates that the model differentiates fertile from infertile cases accurately. With a classification accuracy of 99%, the model outperforms existing benchmark methods and establishes itself as a robust, scalable, and clinically relevant solution for fertility prediction. The significance of this table lies in showing that the model performs well across multiple essential metrics: it is accurate, sufficiently sensitive to positive cases, and specific enough to avoid false alarms. This balance is crucial in medical diagnostics, where both false positives and false negatives can have serious implications. Complementing these results, the confusion matrix in Table 4 provides a detailed breakdown of classification outcomes for the fertility dataset. The model correctly identifies 88 fertile and 11 infertile instances, with only one misclassification: an infertile case erroneously labelled as fertile. Notably, there are no false positives, indicating 100% specificity and underscoring the model's reliability in minimizing diagnostic errors. This high classification fidelity reinforces the model's applicability in real-time healthcare diagnostics, where precision and trustworthiness are paramount.
Table 3.
Comparative analysis of Fertility dataset with existing models.
Table 4.
CM of Fertility dataset.
| Actual / Predicted | Infertile | Fertile |
|---|---|---|
| Infertile | 11 | 1 |
| Fertile | 0 | 88 |
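As a worked check, the metrics in (25) to (31) can be evaluated from the classification counts described in the text: 11 infertile and 88 fertile cases correctly identified, one infertile case labelled as fertile, and no false positives. The sketch below assumes that "infertile" is treated as the positive class; that choice of convention is an assumption made here.

```python
import math

# Counts as described in the text, taking "infertile" as the positive class (assumed).
TP, FN = 11, 1  # infertile cases: correctly identified / labelled as fertile
FP, TN = 0, 88  # fertile cases: labelled as infertile / correctly identified

total = TP + TN + FP + FN
accuracy = (TP + TN) / total
precision = TP / (TP + FP)
sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
f1 = 2 * precision * sensitivity / (precision + sensitivity)
error_rate = (FP + FN) / total
mcc = (TP * TN - FP * FN) / math.sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)
)

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("F1", f1),
    ("error rate", error_rate),
    ("MCC", mcc),
]:
    print(f"{name}: {value:.4f}")
```

Under this convention the accuracy is 0.99, the error rate 0.01, the F1 score about 0.9565, and the MCC about 0.9520, which agree with the values in Table 5 when expressed as percentages; the specificity is 1.0 and the sensitivity is 11/12, or about 0.9167.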
Table 5 reports the MCC to evaluate model stability, particularly in the context of imbalanced datasets. The F1 score is also presented, offering a single metric that accounts for both precision and recall. Prior research, such as the artificial neural network optimized via a genetic algorithm detailed in [60], has demonstrated enhanced predictive performance. Furthermore, the study in [61] introduces a supervised ensemble framework, clustering-based decision forests (CBDF), explicitly tailored to mitigate the challenges of unbalanced class distributions in seminal quality prediction. In terms of computational efficiency, previous analyses of the fertility dataset comprising 100 instances recorded a processing duration exceeding two minutes [62]. In stark contrast, the computational time reported in Table 5 highlights one of the most significant strengths of the proposed model, a direct outcome of its efficient convergence strategy and integrated optimization mechanisms. The model stabilizes by the 462nd epoch, indicating rapid convergence facilitated by the ACO-based framework. Unlike traditional approaches that require numerous epochs and computationally expensive iterations, our model navigates the search space through pheromone-guided feedback, allowing it to converge quickly to optimal or near-optimal solutions. A key factor in this efficiency is the adaptive learning-rate tuning governed by the ACO mechanism, which dynamically fine-tunes the learning process by reinforcing paths that lead to better performance. This eliminates the need for prolonged trial and error in parameter updates. As a result, the model achieves a high accuracy of 99% and a minimal error rate of 0.01 in a significantly shorter time frame, with each training sample processed in only 0.000178 seconds and the entire training completed in just 0.0006 seconds.
This substantial computational gain demonstrates the effectiveness of combining ACO with neural learning, where the ACO feedback mechanism complements the gradient descent updates by guiding weight adjustments towards globally optimal directions. Thus, the proposed method not only delivers robust performance metrics but also exhibits superior real-time feasibility, making it highly suitable for clinical and resource-constrained environments.
Table 5.
Stability Analysis Metrics.
| F1 score (%) | MCC (%) | Error rate | Time (s) | Epochs |
|---|---|---|---|---|
| 95.65 | 95.20 | 0.01 | 0.00006 | 462 |
Figures 2 and 3 illustrate a comparative assessment of actual versus predicted outputs during the training and testing phases of the proposed NAF algorithm. These plots show two key elements: the ground truth responses derived from the dataset and the corresponding values predicted by the NAF model. The alignment between these two sets of responses reflects the model's predictive fidelity. A close correspondence indicates the model's ability to learn accurately from the training data and generalize effectively to unseen instances. This visualization not only confirms the robustness of the model's learning capabilities but also highlights its potential for precise and reliable prediction in real-world applications, particularly within the domain of reproductive healthcare diagnostics.
Fig. 2.
Comparison of the actual response with the predicted response for the training sample.
Fig. 3.
Comparison of the actual response with the predicted response for the testing sample.
Figures 4 and 5 present a detailed visualization of actual versus predicted outcomes through swarm charts for the training and testing datasets, respectively. These charts map the distribution of data points by plotting actual and predicted values, where the intensity of each point visually encodes the magnitude of the residual, indicating the level of prediction error. Higher intensity reflects larger discrepancies between true and predicted responses. Outliers are distinctly marked to highlight instances of significant deviation, allowing for easy identification of anomalous predictions. This visualization provides a comprehensive view of the model's predictive alignment, enabling the detection of tightly clustered accurate predictions as well as areas with greater variance. By simultaneously showing prediction accuracy, residual spread, and the presence of outliers, these charts offer valuable insights into model performance, helping to assess its reliability, generalization capability, and areas that may benefit from further optimization.
Fig. 4.
Actual vs. predicted values with outliers in the training sample.
Fig. 5.
Actual vs. predicted values with outliers in the testing sample.
Figures 6 and 7 present the residuals and RMSE for the model's performance on the training and testing datasets, respectively. Residuals reflect the difference between actual and predicted values, offering insight into individual prediction errors and potential model bias. RMSE serves as an aggregate measure of error magnitude, indicating how accurately the model predicts overall outcomes. The significance of these plots lies in their ability to assess the consistency and reliability of the model across both known and unseen data. Examining the spread and behaviour of residuals alongside RMSE values helps determine whether the model generalizes effectively, avoids overfitting, and maintains stable performance, all essential traits for dependable use in healthcare diagnostics.
Fig. 6.
Performance of the fitness function during the training phase.
Fig. 7.
Performance of the fitness function during the testing phase.
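For reference, the residuals and RMSE discussed for Figs. 6 and 7 can be computed as in the short sketch below. The function names and the toy values are assumptions chosen for illustration and are not taken from the fertility dataset.

```python
import math
from typing import List, Sequence


def residuals(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Per-sample prediction error: actual minus predicted."""
    return [a - p for a, p in zip(actual, predicted)]


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: an aggregate measure of error magnitude."""
    errors = residuals(actual, predicted)
    return math.sqrt(sum(err * err for err in errors) / len(errors))


# Hypothetical responses for four samples.
actual = [1.0, 0.0, 1.0, 1.0]
predicted = [0.9, 0.2, 0.8, 1.0]
print(residuals(actual, predicted))
print(rmse(actual, predicted))
```

Here the squared residuals sum to 0.09 over four samples, giving an RMSE of about 0.15. Comparing this quantity between the training and testing sets is one simple way to check for overfitting.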
The ROC curves depicted in Fig. 8 provide a direct comparison of the model's performance on the training and testing datasets. The training ROC illustrates the model's ability to learn patterns from the dataset, with a high AUC indicating strong discriminative capability. The testing ROC demonstrates generalization performance on unseen data. The close alignment between the training and testing ROC curves indicates minimal overfitting, confirming the robustness and reliability of the proposed model in predicting male fertility outcomes. Previous research has highlighted the potential of various machine learning models in accurately predicting male fertility. For instance, the fusion of Enhanced Synthetic Minority Oversampling Technique (ESLSMOTE) with AdaBoost, as presented in [63], achieved an Area Under the Curve (AUC) of 97.2%. Similarly, the application of the Extreme Gradient Boosting (XGBoost) algorithm yielded an AUC of 98%, underscoring its discriminatory power [64]. These results validate the efficacy of ensemble learning methods in addressing imbalanced and complex clinical datasets. This graphical evaluation provides a robust foundation for interpreting model performance, particularly in identifying subtle variations between fertile and infertile cases. Ultimately, ROC analysis remains a cornerstone in validating the model's predictive reliability and clinical applicability, ensuring that its deployment in real-world healthcare scenarios is both informed and impactful.
Fig. 8.
ROC Curve.
Conclusion
In this study, we proposed a novel hybrid optimization framework for the accurate prediction of male infertility. The model demonstrated high performance, achieving robust classification metrics across multiple evaluation criteria. Beyond numerical results, the study emphasizes the clinical relevance of the approach: it enables improved early detection of male infertility, reduces diagnostic burden for clinicians, and has the potential to be integrated into decision-support systems in healthcare settings. Importantly, male infertility remains an underexplored area in computational diagnostics, and very few studies have applied advanced bio-inspired hybrid optimization methods to this domain. The proposed approach thus addresses a significant gap, providing an innovative framework for predictive modelling that combines biologically inspired intelligence with neural computation. Moreover, this study establishes a foundation for future research, highlighting avenues such as validation on larger, multi-center datasets, real-world clinical testing, and exploration of other hybrid bio-inspired optimization techniques to further enhance model performance and generalizability. By combining the global search capability of ACO with the learning power of neural networks, the proposed framework not only advances predictive modelling for male infertility but also offers a methodological blueprint for similar applications in broader medical diagnostics.
Future scope
This study opens several avenues for further research. Expanding the dataset with diverse clinical and demographic profiles could enhance model generalizability. Alternative bio-inspired optimization strategies and hybrid learning frameworks may be explored to further improve predictive accuracy. Incorporating advanced feature selection, dimensionality reduction, and multimodal data sources can strengthen model robustness. Finally, integrating the proposed approach into real-world clinical workflows and decision-support systems may facilitate practical applications in male fertility assessment, providing a foundation for future translational research.
Author contributions
Priyanka R: Writing – original draft, Validation, Software, Methodology, Investigation, Conceptualization. G. Gajendran: Writing – original draft, Visualization, Validation, Methodology, Investigation, Conceptualization. Salah Boulaaras: Writing – review & editing, Visualization, Validation, Supervision, Methodology, Investigation, Conceptualization. Farid Selatnia: Writing – review & editing, Validation, Methodology, Investigation, Funding acquisition.
Funding
There is no applicable fund.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Gajendran Ganesan, Email: gajendrg@srmist.edu.in.
Salah Boulaaras, Email: s.boularas@qu.edu.sa.
References
- 1. Agarwal, A. et al. Male infertility. Lancet 397(10271), 319–333. 10.1016/S0140-6736(20)32667-2 (2021).
- 2. Biggs, S. N. et al. Lifestyle and environmental risk factors for unexplained male infertility: Study protocol for Australian Male Infertility Exposure (AMIE), a case–control study. Reprod. Health 20(1), 32 (2023).
- 3. Abilash, D. & Sridharan, T. B. Impact of air pollution and heavy metal exposure on sperm quality: A clinical prospective research study. Toxicol. Rep. 13, 101708. 10.1016/j.toxrep.2024.101708 (2024).
- 4. Mansour, H.A.E.-H. & Sadek, A.-S.M. Impacts of environmental pollutants and environmentally transmitted parasites on male fertility and sperm quality. Deleted J. 10.1007/s42452-025-07400-8 (2025).
- 5. Kaltsas, A. et al. Male infertility and reduced life expectancy: Epidemiology, mechanisms, and clinical implications. J. Clin. Med. 14(11), 3930 (2024).
- 6. Wei, J., Huang, H. & Fan, L. Global burden of female infertility attributable to sexually transmitted infections and maternal sepsis: 1990–2021 and projections to 2050. Sci. Rep. 15, 15189. 10.1038/s41598-025-94259-9 (2025).
- 7. Ghosh, A., Tripathy, A. & Ghosh, D. Impact of endocrine disrupting chemicals (EDCs) on reproductive health of human. Proc. Zool. Soc. 75, 16–30 (2022).
- 8. Tzouma, Z. et al. Associations between endocrine-disrupting chemical exposure and fertility outcomes: a decade of human epidemiological evidence. Life 15(7), 993 (2025).
- 9. Randell, Z. et al. Sperm telomere length in male-factor infertility and reproduction. Fertil. Steril. 121(1), 12–25 (2024).
- 10. Brannigan, R. E. et al. Updates to male infertility: AUA/ASRM guideline (2024). J. Urol. 212(6), 789–99 (2024).
- 11. Choudhary, P., Dogra, P. & Sharma, K. Infertility and lifestyle factors: How habits shape reproductive health. Middle East Fertil. Soc. J. 30, 14. 10.1186/s43043-025-00228-7 (2025).
- 12. Cox, C. M. et al. 2022 Infertility prevalence and the methods of estimation from 1990 to 2021: a systematic review and meta-analysis. Hum Reprod Open. 4, hoac051 (2022).
- 13. UNFPA. (2023). World population report: Demographic shifts and fertility. https://www.unfpa.org
- 14. Qaderi, K. et al. Artificial intelligence (AI) approaches to male infertility in IVF: A mapping review. Eur. J. Med. Res. 30, 246 (2025).
- 15. Diyasa, I. G. S. M., Prasetya, D. A., Kuswardhani, H. A. C. & Halim, C. Detection of abnormal human sperm morphology using support vector machine classification. Inform. Technol. Int. J. 2, 57–63 (2024).
- 16. W. Chen et al., Automated sperm morphology Analysis based on Instance-Aware part segmentation. 2024, pp. 17743–17749. 10.1109/icra57147.2024.10611339.
- 17. Zou, S. et al. TOD-CNN: An effective convolutional neural network for tiny object detection in sperm videos. Comput. Biol. Med. 146, 105543. 10.1016/j.compbiomed.2022.105543 (2022).
- 18. M. Dorigo and T. Stützle, “Ant Colony Optimization: Overview and recent advances,” Handbook of Metaheuristics. International Series in Operations Research & Management Science, (2019), pp. 311–351. 10.1007/978-3-319-91086-4-10.
- 19. Agarwal, S. et al. HDL-ACO hybrid deep learning and ant colony optimization for ocular optical coherence tomography image classification. Sci. Rep. 15(1), 5888 (2025).
- 20. Fahad, L. G. et al. Ant colony optimization-based streaming feature selection: An application to the medical image diagnosis. Sci. Program. 2020(1), 1064934 (2019).
- 21. Fahad, L. G. et al. Ant colony optimization-based streaming feature selection: An application to the medical image diagnosis. Sci. Program. 2020, 1–10. 10.1155/2020/1064934 (2020).
- 22. R. N. Ravikumar, S. Aarthi, S. Kurbanova, S. Polvanov, B. Matchanova, and K. Sathya, “Hybrid metaheuristic optimization for neural networks in biomedical imaging,” in Metaheuristic Algorithms and Optimizing Neural Networks for Biomedical Image Processing, 2025, pp. 197–234. 10.4018/979-8-3373-0523-3.ch008.
- 23. R. N. Ravikumar, S. Aarthi, S. Kurbanova, S. Polvanov, B. Matchanova, and K. Sathya, “Hybrid metaheuristic optimization for neural networks in biomedical imaging,” in Metaheuristic Algorithms and Optimizing Neural Networks for Biomedical Image Processing, 2025, pp. 197–234. 10.4018/979-8-3373-0523-3.ch008.
- 24. Louis, C. M. et al. Genetic algorithm–assisted machine learning for clinical pregnancy prediction in in vitro fertilization. AJOG Global Rep. 3(1), 100133. 10.1016/j.xagr.2022.100133 (2022).
- 25. Sadeghi, Z. et al. A review of explainable artificial intelligence in healthcare. Comput. Electr. Eng. 1(118), 109370 (2024).
- 26. Sapkota N, Zhang Y, Li S, Liang P, Zhao Z, Zhang J, Zha X, Zhou Y, Cao Y, Chen DZ. Shmc-net: a mask-guided feature fusion network for sperm head morphology classification. In: 2024 IEEE International Symposium on Biomedical Imaging (ISBI) 2024 May 27 (pp. 1-5). IEEE.
- 27. GhoshRoy, D., Alvi, P. A. & Santosh, K. C. AI tools for assessing human fertility using risk factors: A state-of-the-art review. J. Med. Syst. 47(1), 91 (2023).
- 28. GhoshRoy, D., Alvi, P. A. & Santosh, K. C. Leveraging sampling schemes on skewed class distribution to enhance male fertility detection with ensemble AI learners. Int. J. Patt. Recogn. Artif. Intell. 38(02), 2451003 (2024).
- 29. Roy DG, Alvi PA. Detection of male fertility Using AI-driven tools. In: International Conference on Recent Trends in Image Processing and Pattern Recognition 2021 Dec 8 (pp. 14-25). Cham: Springer International Publishing.
- 30. GhoshRoy, D., Alvi, P. A. & Santosh, K. C. Improved statistical approach to analyze multivariate women’s fertility dataset for better prediction. Proc. Comput. Sci. 260, 91–100. 10.1016/j.procs.2025.03.181 (2025).
- 31. Dhanka, S., Sharma, A., Kumar, A., Maini, S. & Vundavilli, H. Advancements in hybrid machine learning models for biomedical disease classification using integration of hyperparameter-tuning and feature selection methodologies: A comprehensive review. Arch. Comput. Methods Eng. 30, 1–36 (2025).
- 32. Kumar, A. et al. Comprehensive framework for thyroid disorder diagnosis: Integrating advanced feature selection, genetic algorithms, and machine learning for enhanced accuracy and other performance matrices. PLoS One. 20(6), e0325900 (2025).
- 33. Kumar, A., Dhanka, S., Singh, J., Ali Khan, A. & Maini, S. Hybrid machine learning techniques based on genetic algorithm for heart disease detection. Innov. Emerg. Technol. 14(11), 2450008 (2024).
- 34. Sharma, A. et al. A systematic review on machine learning intelligent systems for heart disease diagnosis. Arch. Comput. Methods Eng. 15, 1–27 (2025).
- 35. Kumar A, Singh J, Khan AA. Arrhythmia Detection Using Machine Learning: A Study with UCI Arrhythmia Dataset. In: International Conference on Frontiers of Intelligent Computing: Theory and Applications 2024 Jun 6 (pp. 217-226). Singapore: Springer Nature Singapore.
- 36. Kumar, A., Singh, J. & Khan, A. A. A comprehensive machine learning framework with particle swarm optimization for improved polycystic ovary syndrome (PCOS) diagnosis. Eng. Res. Express. 6(3), 035233 (2024).
- 37. Singh, J., Pandey, B., Karna, S., Arora, A. S. & Kumar, A. Enhancing the thermographic diagnosis of maxillary sinusitis using deep learning approach. Quant. InfraRed Thermogr. J. 22(3), 195–209 (2025).
- 38. Kumar, A., Khan, A. A. & Singh, J. Enhancing the diagnosis of cardiovascular disease: A comparative examination of support vector machine and artificial neural network models utilizing extensive data preprocessing techniques. WSEAS Trans. Comput. 23, 318–27 (2024).
- 39. Dhanka, S. et al. Enhancing the diagnosis of cardiovascular disease: A comparative examination of support vector machine and artificial neural network models utilizing extensive data preprocessing techniques. Arch. Comput. Methods Eng. 4, 1–45 (2025).
- 40. Li, Lixiang et al. Chaos-order transition in foraging behavior of ants. Proc. Nat. Acad. Sci. 111(23), 8392–8397 (2014).
- 41. Beckers, Ralph et al. Colony size, communication and ant foraging strategy. Psyche J. Entomol. 96, 1239–56 (1989).
- 42. Richardson, Thomas O. et al. Teaching with evaluation in ants. Curr. Biol. 17(17), 1520–1526 (2007).
- 43. Ali, M. F. & David Morgan, E. Chemical communication in insect communities: A guide to insect pheromones with special emphasis on social insects. Biol. Rev. 65, 227–247 (1990).
- 44. Morgan, E. D., Insect trail pheromones: a perspective of progress, Chromatography and isolation of insect hormones and pheromones, (1990), 259-70.
- 45. Möglich, M. & Hölldobler, B. Social carrying behavior and division of labor during nest moving in ants. Psyche J. Entomol. 81(2), 219–236 (1974).
- 46.Kallioras, Nikolaos Ath, Kepaptsoglou, Konstantinos & Lagaros, Nikos D. Transit stop inspection and maintenance scheduling: A GPU accelerated metaheuristics approach. Transp. Res. Part C: Emerg. Technol.55, 246–260 (2015). [Google Scholar]
- 47. Dorigo, M. & Blum, C. Ant colony optimization theory: A survey. Theor. Comput. Sci. 344, 243–278 (2005).
- 48. Du, S., Lee, J., Li, H., Wang, L. & Zhai, X. Gradient descent finds global minima of deep neural networks. In International Conference on Machine Learning, 1675–1685 (PMLR, 2019).
- 49. Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2(5), 359–366. 10.1016/0893-6080(89)90020-8 (1989).
- 50. Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017).
- 51. Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L. & Muller, P. A. Deep learning for time series classification: A review. Data Min. Knowl. Discov. 33(4), 917–63 (2019).
- 52. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323(6088), 533–536. 10.1038/323533a0 (1986).
- 53. Ramdass, P. & Ganesan, G. Leveraging neighbourhood component analysis for optimizing multilayer feed-forward neural networks in heart disease prediction. Math. Model. Eng. Probl. 10(4), 1317–1323. 10.18280/mmep.100425 (2023).
- 54. Ramdass, P., Ganesan, G., Boulaaras, S. & Tantawy, SSh. Enhancing efficacy in breast cancer screening with Nesterov momentum optimization techniques. Mathematics 12(21), 3354. 10.3390/math12213354 (2024).
- 55. Dorigo, M. Optimization, learning and natural algorithms. Ph.D. thesis, Politecnico di Milano (1992).
- 56. Dorigo, M. & Di Caro, G. Ant colony optimization: A new meta-heuristic. In Proceedings of the 1999 Congress on Evolutionary Computation (CEC99), 1470–1477. 10.1109/cec.1999.782657 (1999).
- 57. Gil, D. & Girela, J. Fertility. UCI Machine Learning Repository. 10.24432/C5Z01Z (2013).
- 58. Simfukwe, M., Kunda, D. & Chembe, C. Comparing naive bayes method and artificial neural network for semen quality categorization. Int. J. Innov. Sci. Eng. Technol. 2, 689–694 (2015).
- 59. Yibre, A. M. & Koçer, B. Semen quality predictive model using feed forwarded neural network trained by learning-based artificial algae algorithm. Eng. Sci. Technol. Int. J. 24(2), 310–318 (2021).
- 60. El-Shafeiy, E., El-Desouky, A. & El-Ghamrawy, S. An optimized artificial neural network approach based on sperm whale optimization algorithm for predicting fertility quality. Stud. Inform. Control 27, 349–358 (2018).
- 61. Wang, H., Xu, Q. & Zhou, L. Seminal quality prediction using clustering-based decision forests. Algorithms 7(3), 405–417. 10.3390/a7030405 (2014).
- 62. Sheth, P. D., Patil, S. T. & Dhore, M. L. Evolutionary computing for clinical dataset classification using a novel feature selection algorithm. J. King Saud Univ. Comput. Inform. Sci. 34, 5075–5082 (2022).
- 63. Ma, J., Afolabi, D. O., Ren, J. & Zhen, A. Predicting seminal quality via imbalanced learning with evolutionary safe-level synthetic minority over-sampling technique. Cogn. Comput. 13(4), 833–844 (2021).
- 64. GhoshRoy, D., Alvi, P. A. & Santosh, K. C. Explainable AI to predict male fertility using extreme gradient boosting algorithm with SMOTE. Electronics 12(1), 15 (2022).
- 65. GhoshRoy, D., Alvi, P. A. & Santosh, K. C. Unboxing industry-standard AI models for male fertility prediction with SHAP. Healthcare 11, 929 (2023).
Data Availability Statement
The data sets used and/or analyzed during the current study are available from the corresponding author on reasonable request.




















