Asian Journal of Andrology. 2025 Aug 12;28(2):143–150. doi: 10.4103/aja202544

Unraveling sperm kinematic heterogeneity with machine learning

Andrés Aragón-Martínez
PMCID: PMC13065317  PMID: 40791002

Abstract

The management of data from computer-aided sperm analysis (CASA) systems is crucial for understanding sperm motility. CASA systems generate motility parameters derived from tracking individual sperm cells, producing raw data as spermatozoa coordinates, which form the basis for sperm trajectory construction. These parameters and trajectories allow statistical descriptions of motility and identification of sperm heterogeneity. The substantial information provided by CASA enables the application of artificial intelligence (AI) techniques to interpret their biological significance. However, the type and format of CASA data, whether raw or condensed, pose challenges for analysis using conventional statistical methods. Advances in machine learning and deep learning have addressed these limitations by leveraging motility parameters and trajectory representations for automated classification and clustering of motility patterns. These methods, including supervised and unsupervised learning, have been employed to identify kinematic subpopulations within sperm samples, offering deeper insights into sperm dynamics. Open-source tools and CASA systems have facilitated this progress by providing accessible platforms for AI applications in sperm motility analysis. Although the use of machine learning in this field remains limited, integrating CASA-derived data with AI techniques shows potential for automating sperm classification and identifying motility patterns, advancing reproductive biology and fertility assessments. This work reviews the traditional use of CASA data, the analytical constraints, and the promising role of machine learning in enhancing the understanding of the heterogeneity of sperm kinematics.

Keywords: artificial intelligence, CASA, kinematic subpopulations, motility parameters, sperm heterogeneity, sperm trajectories

INTRODUCTION

The management of data obtained from computer-aided sperm analysis (CASA) is critical for gaining insights into sperm motility. The motility parameters of each sperm cell, typically derived from CASA systems, represent condensed values computed from the raw coordinates tracked during spermatozoa movement. These coordinates are used to calculate motility parameters and, in their raw form, are also employed to construct sperm trajectories. As a result, the trajectories are associated with specific motility parameters, which are ultimately used to describe sperm motility statistically.

Machine learning and deep learning techniques can utilize either motility parameters or trajectory images to enable the automatic classification and clustering of motility patterns. The enrichment of data plays a significant role in enhancing the learning process of artificial intelligence (AI) algorithms. However, there are currently only a limited number of examples of supervised and unsupervised machine learning approaches that leverage both raw and condensed data obtained from CASA systems for clustering and classification. The application of these methods depends largely on the availability and type of data that can be extracted from CASA systems.

The sperm in an ejaculate form a heterogeneous population.1,2,3 Sperm heterogeneity is easily observable at the morphological level,4 as well as at the kinematic level. If the ejaculate is heterogeneous, this heterogeneity must have significant implications for the practical use of spermatozoa in assisted reproductive techniques, both in human medicine and livestock breeding.2 There is heterogeneity in motility patterns in fresh semen samples5 and under different processing conditions.6 This variability in motility patterns may be related to the way spermatozoa respond to substances present in their environment.1,2 Interest in identifying kinematic subpopulations has existed for more than 20 years, from recognizing the need for such identification1 to the development of distinct statistical strategies.3,7 These advances were made possible thanks to the availability of individual-level data provided by CASA systems.

This review presents a conceptual analysis of data acquisition and management from CASA systems, the statistical and conventional use of condensed data for identifying sperm heterogeneity, and the alternative applications of machine learning and deep learning techniques that use condensed data, as well as traditional and novel trajectory representations, as inputs.

DATA GENERATION AND TRACKING MECHANISMS IN CASA SYSTEMS

The evaluation of sperm motility is fundamental as a first step in assessing fertility potential in vivo. Other parameters, such as sperm concentration, viability, and morphology, are commonly evaluated in the ejaculate; however, the presence of motile sperm indicates the potential to reach the ovum. Although sperm motility is commonly assessed subjectively by a technician, the need for an objective evaluation has long been recognized. CASA systems provide such an objective evaluation; nevertheless, manual assessment is still the standard for human sperm evaluation.8

A CASA system utilizes computer vision (object illumination, construction of a digital image through pixel activation on a digital camera chip, image capture, storage of the digital image, and image processing) to detect sperm cells. During the image processing stage, different algorithms can track the sperm observed in a microscopic field over a period of time. In a calibrated camera–microscope system, the pixel-to-micron ratio is known. Therefore, if the observation time is also known, it is possible to calculate the distance-to-time ratio, i.e., the velocity of each detected sperm cell, along with other kinematic parameters.9,10 The underlying idea in the image processing stage is that since sperm cells can be detected within a fixed visual field, their movements can be tracked across multiple consecutive images (frames) of the same visual field.

The digital tracking of sperm is performed in a two-dimensional space (x, y), constrained by the microscope stage. Thus, each sperm cell identified by the software is represented by a set of pixels, which is assigned an (x, y) coordinate (e.g., its centroid). These coordinates allow the localization of the sperm cell in each frame throughout the observation period. The CASA system software uses the coordinate values as inputs for algorithms that calculate various motility parameters or kinematic parameters (Table 1 and Figure 1). In this way, it is possible to associate a set of motility parameter values with the coordinates.

Table 1.

The common motility parameters calculated by computer-aided sperm analysis systems

| Name | Abbreviation | Unit | Description |
| --- | --- | --- | --- |
| Curvilinear velocity | VCL | µm s−1 | The total distance traveled by the sperm head along its actual trajectory divided by the observation period |
| Average path velocity | VAP | µm s−1 | The average velocity of the sperm head along its smoothed (average) trajectory |
| Straight-line velocity | VSL | µm s−1 | The velocity calculated from the straight-line distance between the initial and final positions of the sperm head, divided by the total observation time |
| Linearity | LIN = VSL/VCL | UD | The linearity of the curvilinear path |
| Straightness | STR = VSL/VAP | UD | The linearity of the average path |
| Wobble | WOB = VAP/VCL | UD | A measure of the oscillation of the curvilinear path relative to the average path |
| Amplitude of lateral head displacement | ALH | µm | The amplitude of the lateral displacement of the head relative to the average path |
| Beat-cross frequency | BCF | Hz | The average frequency at which the sperm head crosses its smoothed (average) path |

UD: derived unit
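
As an illustration of how the velocity parameters and ratios in Table 1 follow from the tracked coordinates, the minimal Python sketch below computes VCL, VAP, VSL, LIN, STR, and WOB for a single track. It is not the algorithm of any particular CASA system. The function name, the running-mean smoothing with a three-frame window used for the average path, and the example calibration values are assumptions made for illustration; ALH and BCF are omitted because their definitions vary more between systems.

```python
import math


def kinematic_parameters(coords_px, um_per_px, frame_rate_hz, window=3):
    """Velocity parameters for one track (hypothetical helper, illustrative only).

    coords_px      list of (x, y) head positions in pixels, one per frame
    um_per_px      calibration factor, micrometers per pixel
    frame_rate_hz  acquisition rate in frames per second
    window         odd number of frames in the running mean (an assumption)
    """
    if len(coords_px) < 2:
        raise ValueError("a track needs at least two detections")

    # Convert pixels to micrometers with the calibration factor.
    pts = [(x * um_per_px, y * um_per_px) for x, y in coords_px]
    # Time elapsed between the first and the last frame.
    duration_s = (len(pts) - 1) / frame_rate_hz

    def path_length(path):
        return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

    # Average path: centered running mean whose window shrinks symmetrically
    # near the ends, so the first and last points are left unchanged.
    half = window // 2
    smooth = []
    for i in range(len(pts)):
        h = min(half, i, len(pts) - 1 - i)
        seg = pts[i - h:i + h + 1]
        smooth.append((sum(p[0] for p in seg) / len(seg),
                       sum(p[1] for p in seg) / len(seg)))

    vcl = path_length(pts) / duration_s            # along the actual path
    vap = path_length(smooth) / duration_s         # along the smoothed path
    vsl = math.dist(pts[0], pts[-1]) / duration_s  # first to last position

    def ratio(a, b):
        return a / b if b > 0 else float("nan")   # immotile cells give NaN

    return {"VCL": vcl, "VAP": vap, "VSL": vsl,
            "LIN": ratio(vsl, vcl), "STR": ratio(vsl, vap),
            "WOB": ratio(vap, vcl)}


# Hypothetical zigzag track: 5 frames at 4 frames per second, 1 µm per pixel.
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
params = kinematic_parameters(zigzag, um_per_px=1.0, frame_rate_hz=4.0)
```

For a zigzag track such as this one, the sketch returns VSL ≤ VAP ≤ VCL, which is the ordering expected from the definitions in Table 1.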

Figure 1.

Obtaining motility parameters from a CASA system. This example shows the steps to calculate the straight-line velocity (VSL) from coordinates. The tracking algorithm in CASA systems detects a group of pixels composing the sperm head in (a) the first and (b) consecutive frames, and then (c) records their coordinates. To determine (d) the sperm’s velocity from the first to the last frame, (e) the distance it travels is divided by the time taken. The distance is derived from the relationship between pixels and length in micrometers (the conversion factor), while the time represents the duration in seconds during which the system captures frames. CASA: computer-aided sperm analysis.

If tracking is defined as the ability of an algorithm to detect a sperm cell without bias, and this detection allows the assignment of coordinates in successive frames, a trajectory is the graphical representation of consecutive x and y coordinates where a sperm cell was detected (Figure 2). The software of some CASA systems allows visualization of the trajectories followed by the detected sperm cells in a video sequence.11,12 This is very useful for quickly recognizing the behavior of sperm in a sample. However, some CASA systems have the limitation of not allowing the retrieval or storage of the detected sperm coordinates. As a result, researchers cannot reconstruct sperm trajectories or analyze them later. There are at least two open-source CASA software systems that allow for the storage of coordinates.13,14,15,16

Figure 2.

Simple representation of trajectories, motility parameters, and the idea behind alternative representations of data-enriched trajectories. Schematic representation of sperm detection in a sequence of images. The black sperm in the images on the left side of the panel represents the first and last detections, while the gray sperm indicates intermediate and consecutive detections of the same sperm in distinct frames. The red line in a represents the straight trajectory between the first and last detections of the tracked sperm. (b) A view of the straight trajectory without the sperm is shown. The red line in c illustrates the zigzag trajectory of the sperm by connecting the coordinates where the sperm was detected in each frame (arrow). (d) The curvilinear trajectory without the sperm is shown. The sinuous red line in e represents the smoothed trajectory, constructed by averaging consecutive coordinates. (f) The average trajectory without the sperm is shown. (g) The overlay of trajectories from b, d, and f, along with the ALH parameter is shown; the dotted line delineates the hand-drawn area occupied by the sperm over time. (h) The overlay of trajectories from b, d, and f is shown. (i) The overlay of trajectories demonstrates how new information can be generated, such as the blue discrete areas, which roughly correspond to ALH. The region delimited by the dotted line in j illustrates the use of the (x, y) planar space. Note how the construct in j resembles a KDE plot. VSL: straight-line velocity; VCL: curvilinear velocity; VAP: average path velocity; ALH: amplitude of lateral head displacement; KDE: kernel density estimation.

Thus, the coordinates make it possible both to obtain motility parameters and to construct sperm trajectories. The trajectory is the most basic expression of image processing, because the coordinate data are plotted without further manipulation. In contrast, motility parameters are obtained through a set of mathematical operations that take the coordinates as input. Motility parameters are therefore condensed numerical representations of a set of coordinates, which allows us to state that each trajectory is associated with a corresponding set of motility parameters.

UNDERSTANDING AND MANAGING SPERM KINEMATIC HETEROGENEITY

CASA systems enable the evaluation of hundreds of individual sperm cells per sample and calculate between 8 and 12 kinematic parameters for each detected sperm cell. This capability allows for the construction of large datasets. These large volumes of biological data can be analyzed by what has recently been referred to as biomedical data science.17

Once it became possible to obtain motility parameters for each sperm cell, it also became feasible to build sizable datasets. For example, in an experiment in which eight motility parameters were evaluated for 500 sperm cells per sample from five stallions across four treatments with two replicates (500 × 5 × 4 × 2), a dataset of 20 000 evaluated sperm cells and a total of 160 000 data points was generated.16
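
The size of the dataset in this example follows directly from the experimental design, as the short Python sketch below shows. The variable names are illustrative; the numbers are those given in the text.

```python
# Design of the stallion experiment described above.
sperm_per_sample = 500
stallions = 5
treatments = 4
replicates = 2
motility_parameters = 8

# One row per evaluated sperm cell.
evaluated_cells = sperm_per_sample * stallions * treatments * replicates
# One value per cell and per motility parameter.
data_points = evaluated_cells * motility_parameters

print(evaluated_cells, data_points)  # 20000 160000
```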

Deriving meaningful insights from these datasets through statistical analysis presents a challenge for researchers.3,7 The initial analysis strategy involved using each motility parameter to obtain measures of central tendency and dispersion. In other words, univariate statistics were applied to multivariate data without accounting for the heterogeneity of the evaluated samples. Despite its limitations, this approach remains in use. However, since all sperm motility parameters in CASA systems are calculated for each detected sperm cell, univariate interpretations are of limited value and hinder a comprehensive understanding of the biological phenomenon. This issue has been discussed in other works.4,18
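
A toy example may help to show why a pooled mean can misrepresent a heterogeneous sample. In the Python sketch below, the VCL values are synthetic numbers invented purely for illustration (they are not measurements): they mimic a sample containing one slow and one fast subpopulation.

```python
import statistics

# Synthetic VCL values (µm/s), invented for illustration only.
slow = [38.0, 41.0, 40.0, 43.0, 39.0]
fast = [158.0, 162.0, 160.0, 157.0, 163.0]
pooled = slow + fast

pooled_mean = statistics.mean(pooled)  # about 100 µm/s
pooled_sd = statistics.stdev(pooled)   # large, because the sample is bimodal

# Distance between the pooled mean and the closest observed value.
nearest_gap = min(abs(v - pooled_mean) for v in pooled)

print(round(pooled_mean, 1), round(nearest_gap, 1))
```

In this toy sample no cell swims at a speed close to the pooled mean, so the mean describes neither subpopulation, whereas the group means (about 40 and 160 µm/s) describe both.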

Sperm cells within an ejaculate form a heterogeneous population of cells, which can be grouped into subpopulations that share common characteristics. These subpopulations have distinct cellular or molecular features.2 When the distinguishing characteristic of these subpopulations is motility, they are referred to as kinematic subpopulations.

To identify kinematic subpopulations, various strategies and statistical tools have been applied to the motility parameters of sperm cells evaluated using a CASA system.3,17,19 All motility parameters are measurements that help describe a spatially restricted kinematic phenomenon. Although CASA systems are used to evaluate sperm motility in many reports, the identification of kinematic subpopulations is not always reported. Instead, only means and measures of dispersion for each motility parameter under each treatment are presented.20,21,22 This handling of the dataset may introduce biases in the biological interpretation of the results. Notably, the least favorable aspect of this type of analysis is that it obscures the underlying heterogeneity in the dataset.

One of the earliest limitations to identifying kinematic subpopulations using individual sperm data was that CASA system software did not provide data for individual cells, only offering access to central tendency measures for each motility parameter. In some cases, the values for each sperm cell’s motility parameters were not provided, and individual values were presented only in intervals (e.g., CASA systems from Hamilton Thorne Biosciences, Beverly, MA, USA).19,23,24

The rationale for using data obtained from CASA systems is that the information they provide could optimize semen dose production and improve efficiency in animal and human reproduction processes, whether by using average values of sperm motility parameters25,26 or kinematic subpopulations.3,27,28 This argument has recently been revisited, explicitly considering sperm heterogeneity: “if the ejaculate is heterogeneous, this heterogeneity must have significant effects on the practical use of sperm in artificial reproductive techniques, both in human medicine and livestock reproduction.”2

The origin and significance of sperm heterogeneity may arise during spermatogenesis and sperm maturation in the epididymis. In the testis, an unequal distribution of different molecules along the cytoplasmic bridges may occur. This uneven distribution likely impacts how sperm exhibit their motility potential and how they interact with the female reproductive tract.2 Therefore, if heterogeneity is a natural condition of sperm within an ejaculate or sample, a proper evaluation of sperm properties, such as motility, requires the identification and characterization of subpopulations.

The concept of kinematic heterogeneity in datasets obtained through CASA analysis was addressed early on using bivariate graphical analysis of motility parameters.1,29,30 In these studies, researchers analyzing bivariate plots of average path velocity (VAP) and linearity (LIN) identified a distinct cluster of sperm that exhibited high VAP and LIN values when exposed to either a toxic agent (mercury chloride) or a physiological regulator (bicarbonate). These findings underscore the presence of kinematic heterogeneity in sperm populations and illustrate how variations in VAP and LIN can reveal distinct subpopulations and their responsiveness to different stimuli. Consequently, the statistical analysis of these motility parameters is essential for understanding and quantifying sperm kinematic heterogeneity.

CLUSTERING OF MOTILITY PARAMETERS: A WAY TO DEAL WITH KINEMATIC HETEROGENEITY

In the standard approach to identifying subpopulations within these multivariate datasets, dimensionality reduction strategies (e.g., principal component analysis) are adopted first. The principal components generated are then used as input for clustering algorithms.3,7,31 These algorithms group the input data into subsets based on mathematical criteria of similarity. Clustering is the process of partitioning a set of objects into subsets (called clusters) such that each subset contains similar objects and objects in different subsets are dissimilar. Hierarchical clustering can be broadly divided into divisive and agglomerative strategies.32,33 Agglomerative methods work in a bottom-up manner. At the beginning, every input object forms its own cluster. In each subsequent step, the two closest clusters are merged until only one cluster remains.32 Divisive methods work in a top-down manner. They start at the root, with all items in a single cluster. During every iteration, the most diverse cluster is divided into two groups, called the left cluster and the right cluster. This process continues until all the objects are singletons.34 Agglomerative hierarchical clustering allows the formation of data hierarchies based on similarity. This similarity can be understood as the distance between data points.35 Euclidean distance is one such measure, although other similarity metrics exist.36
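
The bottom-up procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the cited studies: the function name is hypothetical, average linkage is an assumed choice among several possible linkage criteria, the dimensionality reduction step is skipped, and merging stops at a requested number of clusters rather than continuing to a single one. In practice, motility parameters with different units would usually be standardized before distances are computed.

```python
import math


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering with Euclidean distance.

    points      list of equal-length numeric tuples (one per sperm cell)
    n_clusters  number of clusters at which merging stops
    Returns a list of clusters, each a list of indices into `points`.
    """
    # At the beginning, every input object forms its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the two closest clusters ...
        _, p, q = min(
            ((linkage(clusters[p], clusters[q]), p, q)
             for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda item: item[0],
        )
        # ... and merge them.
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical two-parameter example (e.g., two principal components).
cells = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
groups = agglomerative(cells, n_clusters=2)
```

This naive version recomputes all pairwise linkages at every step, which is acceptable for a demonstration but slow for datasets with thousands of cells.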

Once subpopulations within the dataset have been identified, the next step is to compute descriptive statistics (typically mean ± standard error of the mean) and inferential statistics (commonly one- or two-way analysis of variance [ANOVA] and Chi-squared matrices) for the data across different treatments (if any) within each subpopulation.16 This allows for the interpretation of treatment effects within each subpopulation.
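
As a sketch of the descriptive step, the following Python function computes the mean and the standard error of the mean (SEM) of one motility parameter within each subpopulation. The function name and the example values are hypothetical; the inferential tests mentioned above are not included.

```python
import math
import statistics
from collections import defaultdict


def mean_sem_by_group(values, labels):
    """Return {label: (mean, SEM)} for one motility parameter.

    values  parameter value of each sperm cell
    labels  subpopulation label assigned to each cell by the clustering
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    summary = {}
    for label, vals in groups.items():
        mean = statistics.mean(vals)
        # SEM = sample standard deviation / sqrt(n); undefined for n = 1.
        if len(vals) > 1:
            sem = statistics.stdev(vals) / math.sqrt(len(vals))
        else:
            sem = float("nan")
        summary[label] = (mean, sem)
    return summary


# Hypothetical values for six cells assigned to two subpopulations.
summary = mean_sem_by_group([1.0, 2.0, 3.0, 10.0, 10.0, 10.0],
                            [0, 0, 0, 1, 1, 1])
```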

Clustering of motility parameters, whether or not dimensionality reduction is used, is a better approach than univariate analysis because it incorporates all motility parameters into the analysis. However, because motility parameters are condensed measures of the coordinates/trajectories, significant information loss could occur during the clustering process. This is supported by the fact that trajectories cannot be reconstructed from motility parameter values alone. In this sense, motility parameters are indicators of the shape of the trajectories. Therefore, one way to obtain comprehensive information on sperm movement would be to use the trajectories as a classification criterion. The studies of Goodson et al.37 and Gacem et al.12 present examples of sperm trajectories in different clusters, along with the kinematic parameter values associated with those trajectories. These studies make clear that if only kinematic parameter values were available, it would be impossible to construct the graphical representation of the trajectories.

OPEN-SOURCE INNOVATIONS IN CASA TECHNOLOGY

The global trend toward reproducible science,38 along with the open-source and open-hardware movements, has enabled researchers to overcome the high cost associated with proprietary scientific equipment.39,40 This has allowed various research groups to build their own CASA systems by integrating pre-existing hardware with electronic controllers and open-source analysis software. As a result, research groups worldwide have been able to obtain kinematic data from individual cells. Notable examples of such developments include the CASA software13 and OpenCASA,14 both running on the ImageJ platform.41 The CASA software,13 initially developed for use with fish sperm, has since been adapted for use with horse sperm15 and pig sperm.16 The same open-source platform has also enabled the development of software to analyze sperm motility from humans, bulls, and chickens in microfluidic chips.42 This software uses libraries that are not available in ImageJ, and it is only available for Windows. Meanwhile, for analyzing kinematic data from ram and deer sperm, the software Motility Tracker was reported to provide good results,43 although it is no longer available.

The open-source software used in CASA systems allows for the storage of motility parameters and the coordinates of each evaluated sperm cell.13,15 With these coordinates, the individual trajectories of the evaluated sperm cells can be reconstructed as images. These images can serve as input for clustering algorithms using machine learning.44 In this way, in addition to clustering, it is also possible to associate motility parameters with their trajectories, allowing for the statistical description and comparison of subpopulations.

AI IN SPERM MOTILITY ANALYSIS

Machine learning is a subset of AI that enables the recognition of numerical patterns,45 in which an algorithm predicts the outcome of an unobserved event based on numerical patterns and makes identification inferences from the input data without explicit instructions. Most machine learning methods are classified into supervised and unsupervised learning. In supervised learning, the algorithm is trained with a labeled dataset (in which each data point is paired with an outcome) to identify the input–output correlation and then uses this correlation to predict the outcome of new cases. Supervised machine learning algorithms make predictions based on the training dataset, which must be labeled by human experts. Binary classification is a typical example of supervised learning, where input data (e.g., sperm morphology) are classified into one of two possible categories (normal sperm vs abnormal sperm). In unsupervised learning, the algorithm identifies previously unknown underlying patterns in an unlabeled input dataset. A key example of unsupervised learning is clustering, in which the algorithm determines shared characteristics of the input dataset and groups them accordingly, then extrapolates patterns for future classifications of similar datasets.46 Supervised and unsupervised models could serve to deal with sperm heterogeneity.

The idea of applying machine learning methods to kinematic data obtained from a CASA system was first published by Martínez-Pastor et al.7 in 2011 and later developed by Ramón and Martínez-Pastor3 in 2018. This latter work emphasized the power of supervised machine learning methods. Support vector machine (SVM), a supervised machine learning technique, was implemented to receive motility parameter data as input and was used in the automatic classification of different sperm motility patterns in mice,37 gazelles,31 and humans.11 Since SVM is a supervised learning technique, it required human expert labeling of the dataset instances (rows), which is a time-consuming and error-prone process. In these studies, different sperm classes were identified (motile, weakly motile, progressive, intermediate, and hyperactivated), and the representative trajectories of sperm assigned to each class were shown, revealing noticeable differences in the trajectory shapes within each class. SVM is a robust approach for identifying kinematic heterogeneity when working with human or mouse sperm, particularly when only motility parameter values derived from the CASA system are available.

If the CASA system allows for the storage of each sperm’s coordinates, other strategies can be applied to identify sperm heterogeneity, such as simple trajectory representation or alternative strategies (Figure 2). Recently, we used the coordinates of activated sperm to reconstruct images of their trajectories and employed those images as input for an unsupervised machine learning algorithm, which allowed us to identify kinematic subpopulations.44 This strategy uses standard experimental methods along with a Python-based workflow (Figure 3). Using this approach, we demonstrated that chemical inhibition of the serotonin 2A receptor (5HT-2A) altered activated sperm motility, inducing curved trajectories,44 while the addition of tryptophan to the medium induced linear trajectories.47

Figure 3.

Acquisition of data and computational model for clustering images of trajectories. Sequences of images are acquired by a CASA system (pictures enclosed by blue lines). From these image sequences, both the coordinates and motility parameters of each sperm are obtained (in a results sheet). A Python script splits the result sheet into two separate sheets, containing the motility parameters and coordinates, respectively. Additional Python scripts are written for other relevant steps in the computational model. Each sheet of results is iteratively added to a new dataframe. The coordinates from each analyzed sperm are then used to construct alternative individual trajectory images, which serve as input for machine learning (clustering algorithms) or deep learning algorithms (predictive tasks). The clustered trajectories (subpopulations) are linked with their associated motility parameters to create the statistical description. CASA: computer-aided sperm analysis; KDE: kernel density estimation.
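
The splitting step of this workflow can be sketched as follows. The sketch is not the script used in the cited work: it uses only the Python standard library instead of dataframes, and the sheet layout is an assumption. Here a hypothetical results sheet has one row per detection, with columns named track_id, frame, x, and y, and the motility parameters of the track repeated on every row.

```python
import csv
import io


def split_results(sheet_text, param_columns=("VCL", "VAP", "VSL")):
    """Split one results sheet into motility parameters and coordinates.

    The column names (track_id, frame, x, y and the parameter columns)
    describe a hypothetical layout, not the output of a specific system.
    Returns (params, coords), both keyed by track identifier.
    """
    params = {}  # track_id -> {parameter name: value}
    points = {}  # track_id -> [(frame, x, y), ...]
    for row in csv.DictReader(io.StringIO(sheet_text)):
        track = row["track_id"]
        # Parameters are repeated on each row; keep the first occurrence.
        params.setdefault(
            track, {name: float(row[name]) for name in param_columns})
        points.setdefault(track, []).append(
            (int(row["frame"]), float(row["x"]), float(row["y"])))

    # Order detections by frame and keep only the (x, y) pairs.
    coords = {track: [(x, y) for _, x, y in sorted(pts)]
              for track, pts in points.items()}
    return params, coords


# Hypothetical sheet with two tracks.
sheet = (
    "track_id,frame,x,y,VCL,VAP,VSL\n"
    "1,0,10,10,50,40,30\n"
    "1,1,12,11,50,40,30\n"
    "2,0,5,5,20,15,10\n"
)
params, coords = split_results(sheet)
```

Because both outputs share the same track identifier, clustered trajectories can later be linked back to their motility parameters.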

Other approaches that do not use CASA systems have also been reported for describing sperm motility. Recently, sperm flagellum images were analyzed to describe the three-dimensional (3D) swimming patterns of individual sperm, using a specialized non-CASA hardware system.48 Currently, 3D movement is not analyzed by CASA systems. In that study, the data served as input for a hierarchical agglomerative machine learning algorithm, used as a clustering tool in a manner similar to our approach.44 The method for capturing sperm motility described by Hernández et al.48 in 2024 is out of reach for most researchers. However, this work demonstrates that data from non-CASA systems, which were designed to analyze sperm motility, can also serve as inputs for machine learning algorithms to unravel sperm motility heterogeneity.

INNOVATIVE TRAJECTORY VISUALIZATIONS AND DATA ENHANCEMENT TECHNIQUES

To date, the graphic representation of trajectories has primarily been through line plots.12,37,49,50 Line plots connect the bivariate points defined by the coordinates without explicitly plotting the points corresponding to those coordinates. Techniques of data enrichment, such as feature engineering, data augmentation (Figure 4), imputation, oversampling, undersampling, and external data integration, could enhance the learning process of algorithms, as researchers have demonstrated in other fields.51 Thus, alternative graphical representations of trajectories, such as dot plots and heat maps, could serve as data enrichment tools for machine learning clustering algorithms. This approach could be particularly valuable for the automatic discrimination of hyperactivated motility.
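
One simple way to build a heat-map-like representation is to count how many detections fall into each cell of a grid laid over the (x, y) plane. The Python sketch below illustrates the idea under stated assumptions: the function name and the grid size are hypothetical, and the grid is fitted to the bounding box of each track, which discards absolute position and scale.

```python
def occupancy_grid(coords, bins=8):
    """Count detections per grid cell (a coarse heat map of one track).

    coords  list of (x, y) positions of one sperm cell
    bins    number of cells along each axis (an assumed value)
    Returns a list of rows; grid[row][col] is the number of detections.
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, y_min = min(xs), min(ys)
    # Avoid division by zero for cells that did not move along one axis.
    span_x = (max(xs) - x_min) or 1.0
    span_y = (max(ys) - y_min) or 1.0

    grid = [[0] * bins for _ in range(bins)]
    for x, y in coords:
        # Points on the upper edge are assigned to the last cell.
        col = min(int((x - x_min) / span_x * bins), bins - 1)
        row = min(int((y - y_min) / span_y * bins), bins - 1)
        grid[row][col] += 1
    return grid


# Hypothetical track with two detections at the same position.
grid = occupancy_grid([(0.0, 0.0), (0.0, 0.0), (1.0, 1.0)], bins=2)
```

Unlike a line plot, this representation records where the cell spent more time, because repeated detections in the same region increase the count.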

Figure 4.

Example of alternative representation of a trajectory with data augmentation. Transformations (random flip, random rotation, random zoom, and random translation) were applied to the original image plot. To offer diverse learning scenarios, the diverse versions of the same image can be used as input for machine learning and deep learning algorithms (i.e., allowing the model to see more variations during training and learn more robust features).
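
The transformations named in the caption can be sketched on a small image stored as a list of rows. This is a simplified Python illustration with hypothetical function names: rotation is restricted to multiples of 90 degrees, translation is limited to one pixel, and random zoom is omitted, whereas deep learning libraries typically provide arbitrary angles and zoom factors.

```python
import random


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(col) for col in zip(*img[::-1])]


def translate(img, dx, dy, fill=0):
    """Shift the image by (dx, dy) pixels; uncovered pixels get `fill`."""
    height, width = len(img), len(img[0])
    out = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                out[nr][nc] = img[r][c]
    return out


def augment(img, rng):
    """Return one randomly transformed copy of a trajectory image."""
    if rng.random() < 0.5:
        img = flip_horizontal(img)
    for _ in range(rng.randrange(4)):  # 0, 90, 180, or 270 degrees
        img = rotate_90(img)
    return translate(img, rng.randint(-1, 1), rng.randint(-1, 1))


# Several variants of the same hypothetical 3 x 3 trajectory image.
image = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1]]
rng = random.Random(0)  # fixed seed so the variants are reproducible
variants = [augment(image, rng) for _ in range(4)]
```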

The objective identification of hyperactivated sperm using data obtained from CASA systems, at least in humans, is typically achieved by adjusting threshold values for motility parameters such as curvilinear velocity and amplitude of lateral head displacement (VCL and ALH, respectively).52,53 The use of motility parameters in clustering strategies allows for the automatic identification of a group of sperm cells whose maximum and minimum motility parameter values are very close to the threshold values recommended by the CASA system manufacturer for detecting hyperactivation (unpublished results from Dr. Chirinos M, Department of Reproductive Biology, National Institute of Medical Sciences and Nutrition Salvador Zubirán, Mexico City, Mexico). Like many other phenomena in biology, hyperactivation is a unique event but may vary in scale. For example, it has been observed that bull and pig sperm can exhibit either partial or complete hyperactivated movement.54 Analytical approaches that use motility data other than condensed motility measures, as well as enriched alternative graphical representations, could help identify sperm with varying degrees of hyperactivated motility objectively and automatically.
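
The threshold-based strategy amounts to a simple rule applied to each cell, as the Python sketch below illustrates. The function name is hypothetical and the threshold values in the example are placeholders chosen for illustration; they are not the values recommended by any manufacturer and should be replaced by thresholds validated for the system and species in use.

```python
def is_hyperactivated(vcl, alh, vcl_min, alh_min):
    """Binary rule: a cell is flagged when both thresholds are reached.

    vcl      curvilinear velocity of the cell (µm/s)
    alh      amplitude of lateral head displacement of the cell (µm)
    vcl_min  VCL threshold supplied by the user
    alh_min  ALH threshold supplied by the user
    """
    return vcl >= vcl_min and alh >= alh_min


# Placeholder thresholds and hypothetical cells (VCL, ALH).
cells = [(200.0, 9.0), (200.0, 3.0), (80.0, 9.0)]
flags = [is_hyperactivated(vcl, alh, vcl_min=150.0, alh_min=7.0)
         for vcl, alh in cells]
```

Such a rule gives a yes or no answer for each cell, so it cannot by itself express partial or intermediate degrees of hyperactivation.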

The identification of hyperactivated movement is related to the depth of the chamber containing the sperm cells.55 It is possible that changes in the focal point (reflected in pixel value changes after image processing) could serve as indicators of 3D movement. This 3D movement could be represented as pixels with different values along the trajectory and in surrounding pixels. Therefore, exploring the graphic representation of hyperactivated movement using data from conventional CASA systems could aid in identifying hyperactivated swimming patterns. Since hyperactivated sperm exhibit a variety of swimming patterns distinct from those of activated sperm,11,54,56 it can be speculated that enriched graphical representations, different from traditional line plots, such as kernel density estimation (KDE) plots or heat maps, would make better use of the (x, y) plane and serve as more effective inputs for machine learning algorithms that classify and cluster hyperactivated sperm cells (Figure 5).

Figure 5.

Conventional representation of trajectories versus alternative data-enriched representation. Coordinates were used to construct two representations of distinct trajectories of hamster and boar sperm. Trajectories are shown as either conventional line plots or alternative data-enriched KDE plots. The columns of plots on the left side display trajectories of hamster sperm, which were classified as hyperactivated or non-hyperactivated by experts based on video sequences. Each trajectory is represented as (a) a line plot or (b) a KDE plot. The columns of plots on the right side display trajectories of boar sperm, represented as (c) line plots or (d) KDE plots. The data used to construct the images of hamster sperm were generously provided by Dr. Masakatsu Fujinoki (Laboratory for Reproductive Medicine, Research Center for Advanced Medical Science, School of Medicine, Dokkyo Medical University, Mibu, Tochigi, Japan; unpublished results). KDE: kernel density estimation.
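
To make the idea of a KDE representation concrete, the Python sketch below evaluates a bivariate Gaussian kernel density estimate of one trajectory on a regular grid. It is a minimal illustration and not the procedure used to produce Figure 5: the function name, the fixed isotropic bandwidth, and the grid size are assumptions, and plotting libraries normally select the bandwidth automatically.

```python
import math


def kde_grid(coords, bandwidth, size=16):
    """Gaussian KDE of one trajectory evaluated on a size x size grid.

    coords     list of (x, y) positions of one sperm cell
    bandwidth  standard deviation of the isotropic Gaussian kernel,
               in the same units as the coordinates (an assumed input)
    Returns a list of rows; grid[row][col] is the estimated density.
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    # Pad the bounding box so the density can decay around the track.
    x0, x1 = min(xs) - 2 * bandwidth, max(xs) + 2 * bandwidth
    y0, y1 = min(ys) - 2 * bandwidth, max(ys) + 2 * bandwidth
    # Normalization of the mean of n bivariate Gaussian kernels.
    norm = 1.0 / (len(coords) * 2.0 * math.pi * bandwidth ** 2)

    grid = []
    for r in range(size):
        gy = y0 + (r + 0.5) * (y1 - y0) / size  # center of the grid row
        row = []
        for c in range(size):
            gx = x0 + (c + 0.5) * (x1 - x0) / size
            total = sum(
                math.exp(-((gx - x) ** 2 + (gy - y) ** 2)
                         / (2.0 * bandwidth ** 2))
                for x, y in coords)
            row.append(norm * total)
        grid.append(row)
    return grid


# Hypothetical track: three nearby detections and one distant detection.
track = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
density = kde_grid(track, bandwidth=0.5, size=14)
```

In this toy track the density peaks where the detections are concentrated, which is the information that a line plot of the same coordinates does not display.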

CONCLUSIONS

In the pursuit of relevant biological information, the handling of data obtained from CASA systems has evolved over time: from univariate evaluations of sperm motility parameters to the use of statistical clustering techniques, and more recently, to the application of machine learning and deep learning algorithms that use motility parameter data as input. Recently, the use of coordinates has been employed to reconstruct conventional images of sperm trajectories, which have served as input for unsupervised clustering algorithms. A promising alternative for automatically identifying, classifying, and grouping sperm trajectories is the use of enriched representations of sperm trajectories, as well as data augmentation through image transformations. This latter approach could be particularly relevant for the automatic identification of hyperactivated swimming patterns and for addressing the heterogeneity in sperm kinematics.

AUTHOR CONTRIBUTIONS

AAM prepared the manuscript, and read and approved the final manuscript.

COMPETING INTERESTS

The author declares no competing interests.

ACKNOWLEDGMENTS

This work was supported by the UNAM–DGAPA–PAPIIT (Grant No. IT201021 and No. IN224925).

REFERENCES

  • 1. Holt WV, Van Look KJ. Concepts in sperm heterogeneity, sperm selection and sperm competition as biological foundations for laboratory tests of semen quality. Reprod Camb Engl. 2004;127:527–35. doi: 10.1530/rep.1.00134.
  • 2. Martínez-Pastor F. What is the importance of sperm subpopulations? Anim Reprod Sci. 2022;246:106844. doi: 10.1016/j.anireprosci.2021.106844.
  • 3. Ramón M, Martínez-Pastor F. Implementation of novel statistical procedures and other advanced approaches to improve analysis of CASA data. Reprod Fertil Dev. 2018;30:860–6. doi: 10.1071/RD17479.
  • 4. Ramón M, Jiménez-Rabadán P, García-Álvarez O, Maroto-Morales A, Soler AJ, et al. Understanding sperm heterogeneity: biological and practical implications. Reprod Domest Anim. 2014;49(Suppl 4):30–6. doi: 10.1111/rda.12404.
  • 5. Ramón M, Soler AJ, Ortiz JA, García-Alvarez O, Maroto-Morales A, et al. Sperm population structure and male fertility: an intraspecific study of sperm design and velocity in red deer. Biol Reprod. 2013;89:110. doi: 10.1095/biolreprod.113.112110.
  • 6. Ramón M, Pérez-Guzmán MD, Jiménez-Rabadán P, Esteso MC, García-Álvarez O, et al. Sperm cell population dynamics in ram semen during the cryopreservation process. PLoS One. 2013;8:e59189. doi: 10.1371/journal.pone.0059189.
  • 7. Martínez-Pastor F, Tizado EJ, Garde JJ, Anel L, de Paz P. Statistical series: opportunities and challenges of sperm motility subpopulation analysis. Theriogenology. 2011;75:783–95. doi: 10.1016/j.theriogenology.2010.11.034.
  • 8. World Health Organization. WHO Laboratory Manual for the Examination and Processing of Human Semen. 5th ed. Geneva: World Health Organization; 2010.
  • 9. Cooper T. Semen analysis. In: Nieschlag E, Behre HM, Nieschlag S, editors. Andrology: Male Reproductive Health and Dysfunction. 3rd ed. Berlin: Springer; 2010. pp. 125–38.
  • 10. Okumus F, Kocamaz AF, Özgür ME. Using polynomial modeling for calculation of quality parameters in computer assisted sperm analysis. J Comput Sci. 2021;6:152–65.
  • 11. Goodson SG, White S, Stevans AM, Bhat S, Kao CY, et al. CASAnova: a multiclass support vector machine model for the classification of human sperm motility patterns. Biol Reprod. 2017;97:698–708. doi: 10.1093/biolre/iox120.
  • 12. Gacem S, Valverde A, Catalán J, Yánez Ortiz I, Soler C, et al. A new approach of sperm motility subpopulation structure in donkey and horse. Front Vet Sci. 2021;8:651477. doi: 10.3389/fvets.2021.651477.
  • 13. Wilson-Leedy JG, Ingermann R. Development of a novel CASA system based on open source software for characterization of zebrafish sperm motility parameters. Theriogenology. 2007;67:661–72. doi: 10.1016/j.theriogenology.2006.10.003.
  • 14. Alquézar-Baeta C, Gimeno-Martos S, Miguel-Jiménez S, Santolaria P, Yániz J, et al. OpenCASA: a new open-source and scalable tool for sperm quality analysis. PLoS Comput Biol. 2019;15:e1006691. doi: 10.1371/journal.pcbi.1006691.
  • 15. Giaretta E, Munerato M, Yeste M, Galeati G, Spinaci M, et al. Implementing an open-access CASA software for the assessment of stallion sperm motility: relationship with other sperm quality parameters. Anim Reprod Sci. 2017;176:11–9. doi: 10.1016/j.anireprosci.2016.11.003.
  • 16. Rivas C, Ayala M, Aragon A. Effect of various pH levels on the sperm kinematic parameters of boars. South Afr J Anim Sci. 2022;52:693–704.
  • 17. Bourne PE. Is ‘bioinformatics’ dead? PLoS Biol. 2021;19:e3001165. doi: 10.1371/journal.pbio.3001165.
  • 18. Altschuler SJ, Wu LF. Cellular heterogeneity: do differences make a difference? Cell. 2010;141:559–63. doi: 10.1016/j.cell.2010.04.033.
  • 19. Ayala M, Aragón M. Effect of sexual steroids on boar kinematic sperm subpopulations. Cytometry A. 2017;91:1096–103. doi: 10.1002/cyto.a.23246.
  • 20. Jimenez-Trejo F, Tapia-Rodriguez M, Cerbon M, Kuhn DM, Manjarrez-Gutierrez G, et al. Evidence of 5-HT components in human sperm: implications for protein tyrosine phosphorylation and the physiology of motility. Reproduction. 2012;144:677–85. doi: 10.1530/REP-12-0145.
  • 21. Contri A, Gloria A, Robbe D, Valorz C, Wegher L, et al. Kinematic study on the effect of pH on bull sperm function. Anim Reprod Sci. 2013;136:252–9. doi: 10.1016/j.anireprosci.2012.11.008.
  • 22. Sakamoto C, Fujinoki M, Kitazawa M, Obayashi S. Serotonergic signals enhanced hamster sperm hyperactivation. J Reprod Dev. 2021;67:241–50. doi: 10.1262/jrd.2020-108.
  • 23. Becerril M, Huerta C, Méndez M, Hernández G, Aragón M. Use of multivariate statistics to identify unreliable data obtained using CASA. Syst Biol Reprod Med. 2013;59:164–71. doi: 10.3109/19396368.2013.766281.
  • 24. Vázquez AJ, Cedillo MJ, Quezada VJ, Rivas AC, Morales EC, et al. Effects of repeated electroejaculations on kinematic sperm subpopulations and quality markers of Mexican creole goats. Anim Reprod Sci. 2015;154:29–38. doi: 10.1016/j.anireprosci.2014.12.009.
  • 25. Broekhuijse ML, Šoštarić E, Feitsma H, Gadella BM. Application of computer-assisted semen analysis to explain variations in pig fertility. J Anim Sci. 2012;90:779–89. doi: 10.2527/jas.2011-4311.
  • 26. García-Molina A, Navarro N, Valverde A, Sadeghi S, Garrido N, et al. Optimization of human semen analysis using CASA-Mot technology. Syst Biol Reprod Med. 2023;69:166–74. doi: 10.1080/19396368.2023.2170297.
  • 27. García-Molina A, Navarro N, Valverde A, Bompart D, Caldeira C, et al. Human kinematic and morphometric sperm subpopulation analysis using CASA technology: a new approach to spermatozoa classification. Rev Int Androl. 2022;20:257–65. doi: 10.1016/j.androl.2021.05.003.
  • 28. Santolaria P, Soler C, Recreo P, Carretero T, Bono A, et al. Morphometric and kinematic sperm subpopulations in split ejaculates of normozoospermic men. Asian J Androl. 2016;18:831–34. doi: 10.4103/1008-682X.186874.
  • 29. Abaigar T, Holt WV, Harrison RA, del Barrio G. Sperm subpopulations in boar (Sus scrofa) and gazelle (Gazella dama mhorr) semen as revealed by pattern analysis of computer-assisted motility assessments. Biol Reprod. 1999;60:32–41. doi: 10.1095/biolreprod60.1.32.
  • 30. Satake N, Elliott RM, Watson PF, Holt WV. Sperm selection and competition in pigs may be mediated by the differential motility activation and suppression of sperm subpopulations within the oviduct. J Exp Biol. 2006;209:1560–72. doi: 10.1242/jeb.02136.
  • 31. Ramón M, Martínez-Pastor F, García-Álvarez O, Maroto-Morales A, Soler AJ, et al. Taking advantage of the use of supervised learning methods for characterization of sperm population structure related with freezability in the Iberian red deer. Theriogenology. 2012;77:1661–72. doi: 10.1016/j.theriogenology.2011.12.011.
  • 32. Ackermann MR, Blömer J, Kuntze D, Sohler C. Analysis of agglomerative clustering. Algorithmica. 2014;69:184–215.
  • 33. Sinaga KP, Yang MS. Unsupervised K-means clustering algorithm. IEEE Access. 2020;8:80716–27.
  • 34. Lalitha YS. An advanced agglomerative hierarchical clustering methods on aquatic data set. Int J Aquat Sci. 2014;5:215–23.
  • 35. Husson F, Josse J, Pagès J. Principal components methods-hierarchical clustering-partitional clustering: why would we need to choose for visualizing data. Tech Rep Agrocampus. 2010:1–17.
  • 36. Singh A, Yadav A, Rana A. K-means with three different distance metrics. Int J Comput Appl. 2013;67:13–7.
  • 37. Goodson SG, Zhang Z, Tsuruta JK, Wang W, O’Brien DA. Classification of mouse sperm motility patterns using an automated multiclass support vector machines model. Biol Reprod. 2011;84:1207–15. doi: 10.1095/biolreprod.110.088989.
  • 38. Goodman SN, Fanelli D, Ioannidis JP. What does research reproducibility mean? Sci Transl Med. 2016;8:341ps12. doi: 10.1126/scitranslmed.aaf5027.
  • 39. Pearce JM. Materials science. Building research equipment with free, open-source hardware. Science. 2012;337:1303–4. doi: 10.1126/science.1228183.
  • 40. Drack M, Hartmann F, Bauer S, Kaltenbrunner M. The importance of open and frugal labware. Nat Electron. 2018;1:482–6.
  • 41. Rasband WS. ImageJ, US National Institutes of Health. 1997. [Last accessed on 2023 Sep 14]. Available from: https://imagej.nih.gov/ij/
  • 42. Elsayed M, El-Sherry TM, Abdelgawad M. Development of computer-assisted sperm analysis plugin for analyzing sperm motion in microfluidic environments using Image-J. Theriogenology. 2015;84:1367–77. doi: 10.1016/j.theriogenology.2015.07.021.
  • 43. Buchelly Imbachí F, Zalazar L, Pastore JI, Greco MB, Iniesta-Cuerda M, et al. Objective evaluation of ram and buck sperm motility by using a novel sperm tracker software. Reprod Camb Engl. 2018;156:11–21. doi: 10.1530/REP-17-0755.
  • 44. Rodríguez-Martínez EA, Rivas CU, Ayala ME, Blanco-Rodríguez R, Juarez N, et al. A new computational approach, based on images trajectories, to identify the subjacent heterogeneity of sperm to the effects of ketanserin. Cytometry A. 2023;103:655–63. doi: 10.1002/cyto.a.24732.
  • 45. Greener JG, Kandathil SM, Moffat L, Jones DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol. 2022;23:40–55. doi: 10.1038/s41580-021-00407-0.
  • 46. You JB, McCallum C, Wang Y, Riordon J, Nosrati R. Machine learning for sperm selection. Nat Rev Urol. 2021;18:387–403. doi: 10.1038/s41585-021-00465-1.
  • 47. Yael Aquetzalli CM. [Identification of Kinematic Heterogeneity in Spermatozoa Exposed to Tryptophan Using Machine Learning]. Tlalnepantla de Baz: National University of Mexico; 2024. [Thesis in Spanish]
  • 48. Hernández HO, Montoya F, Hernández-Herrera P, Díaz-Guerrero DS, Olveres J, et al. Feature-based 3D+t descriptors of hyperactivated human sperm beat patterns. Heliyon. 2024;10:e26645. doi: 10.1016/j.heliyon.2024.e26645.
  • 49. Henning H, Petrunkina AM, Harrison RA, Waberski D. Cluster analysis reveals a binary effect of storage on boar sperm motility function. Reprod Fertil Dev. 2014;26:623–32. doi: 10.1071/RD13113.
  • 50. Katoh Y, Takebayashi K, Kikuchi A, Iki A, Kikuchi K, et al. Porcine sperm capacitation involves tyrosine phosphorylation and activation of aldose reductase. Reprod Camb Engl. 2014;148:389–401. doi: 10.1530/REP-14-0199.
  • 51. Nguyen H, Bui XN, Drebenstedt C. Blast-induced ground vibration data enrichment sustainable and responsible mining machine learning open-pit mining performance improvement. Inż Miner. 2023;2:79–88.
  • 52. Suarez SS. Control of hyperactivation in sperm. Hum Reprod Update. 2008;14:647–57. doi: 10.1093/humupd/dmn029.
  • 53. Hernández-Silva G, Fabián López-Araiza JE, López-Torres AS, Larrea F, Torres-Flores V, et al. Proteomic characterization of human sperm plasma membrane-associated proteins and their role in capacitation. Andrology. 2020;8:171–80. doi: 10.1111/andr.12627.
  • 54. Harayama H. Flagellar hyperactivation of bull and boar spermatozoa. Reprod Med Biol. 2018;17:442–8. doi: 10.1002/rmb2.12227.
  • 55. Le Lannou D, Griveau JF, Le Pichon JP, Quero JC. Effects of chamber depth on the motion pattern of human spermatozoa in semen or in capacitating medium. Hum Reprod Oxf Engl. 1992;7:1417–21. doi: 10.1093/oxfordjournals.humrep.a137585.
  • 56. Waberski D, Suarez SS, Henning H. Assessment of sperm motility in livestock: perspectives based on sperm swimming conditions in vivo. Anim Reprod Sci. 2022;246:106849. doi: 10.1016/j.anireprosci.2021.106849.

Articles from Asian Journal of Andrology are provided here courtesy of Editorial Office of AJA.
