PLOS Computational Biology. 2025 Apr 17;21(4):e1012865. doi: 10.1371/journal.pcbi.1012865

Modeling diffusive search by non-adaptive sperm: Empirical and computational insights

Benjamin M Brisard 1, Kylie D Cashwell 1, Stephanie M Stewart 1, Logan M Harrison 1, Aidan C Charles 1, Chelsea V Dennis 1, Ivie R Henslee 1, Ethan L Carrow 1, Heather A Belcher 2, Debajit Bhowmick 3, Paul W Vos 4, Maciej Majka 5, Martin Bier 6, David M Hart 7, Cameron A Schmidt 1,*
Editor: Jing Chen 8
PMCID: PMC12005489  PMID: 40244975

Abstract

During fertilization, mammalian sperm undergo a winnowing selection process that reduces the candidate pool of potential fertilizers from ~10⁶–10¹¹ cells to 10¹–10² cells (depending on the species). Classical sperm competition theory addresses the positive or ‘stabilizing’ selection acting on sperm phenotypes within populations of organisms but does not strictly address the developmental consequences of sperm traits among individual organisms that are under purifying selection during fertilization. It is the latter that is of utmost concern for improving assisted reproductive technologies (ART) because low-fitness sperm may be inadvertently used for fertilization during interventions that rely heavily on artificial sperm selection, such as intracytoplasmic sperm injection (ICSI). Importantly, some form of sperm selection is used in nearly all forms of ART (e.g., differential centrifugation, swim-up, or hyaluronan binding assays). To date, there is no unifying quantitative framework (i.e., theory of sperm selection) that synthesizes causal mechanisms of selection with observed natural variation in individual sperm traits. In this report, we reframe the physiological function of sperm as a collective diffusive search process and develop multi-scale computational models to explore the causal dynamics that constrain sperm fitness during fertilization. Several experimentally useful concepts are developed, including a probabilistic measure of sperm fitness as well as an information theoretic measure of the magnitude of sperm selection, each of which is assessed under systematic increases in microenvironmental selective pressure acting on sperm motility patterns.

Author summary

During mammalian reproduction, sperm outnumber eggs by many orders of magnitude. This study models the statistical properties of sperm movement as a diffusive search process, combining experiments and simulations to explore how heterogeneity in motility patterns and microenvironmental complexity shape successful fertilization. We introduce simple metrics to quantify sperm fitness and the magnitude of selection pressure imposed by the microenvironment, revealing that sperm phenotype distributions interact with environmental constraints to determine the range of sperm traits that ultimately support successful egg contact. These insights improve the understanding of sperm subpopulation dynamics and offer practical tools for optimizing assisted reproductive technologies in clinical and agricultural settings.

Introduction

Assisted reproductive technologies (ARTs) are widely used in medicine and agriculture and include a variety of strategies such as in vitro fertilization, intra-uterine insemination, and embryo transplantation. Efficiency of ART is of utmost importance because of the implications for parental and offspring well-being, and significant time and cost investments. Though there are a multitude of factors that influence ART efficiency, one particularly salient challenge has been the pre-selection of sperm that have the potential to maximize the paternal contribution to the number and quality of viable embryos [1–3]. Identifying and isolating sperm with high fertility and developmental potential presents a significant challenge due to their structural and phenotypic heterogeneity, dynamic post-ejaculatory maturation processes, and the large quantity of cells in an ejaculate (i.e., on the order of 10⁶–10¹¹ depending upon species) [4–10].

Phenotypic variation in sperm has generally been explained by a game-theoretic competition model in which males adopt evolutionarily stable strategies that maximize fitness payoffs under sexual selection [11]. For example, mammalian sperm exhibit relatively high swimming velocity and/or greater sperm number per ejaculate in socio-ecological scenarios where there is strong between-male competition for mates [12]. Largely inspired by those observations, swimming velocity and sperm count have been regarded as heuristic guides for clinical sperm selection under the straightforward assumption that the ‘highest quality’ sperm can be identified from an idealized set of competitive traits [13].

However, heuristic approaches may be misleading because the predictions of sperm competition theory apply only to between-male variation, while within-male variation in sperm phenotype is the primary concern of assisted reproduction [11,14]. Importantly, male gamete function not only co-evolves with the competitive traits of other males, but also with the corresponding micro/macro-scale anatomy of the female reproductive tract. This effect, known as ‘cryptic female choice’, facilitates sperm selection in the reproductive microenvironment through various physical and chemical barriers (e.g., epithelial folds and cervical mucus) [15]. For example, mammalian sperm have evolved time-dependent changes in motility pattern (e.g., progressive to hyperactivated transition) that assist navigation of the labyrinth-like epithelial surfaces of the oviducts [16]. Within-male sperm selection may be an important component of mammalian reproduction and is a powerful candidate for the improvement of ART outcomes. However, our understanding of sperm selection at the cell population scale remains limited, and there is currently no underlying theory that enables precise description of the key aspects of sperm selection, including a quantitative definition of sperm ‘fitness’ or a measure of the magnitude of selective pressure acting on sperm traits under a given set of conditions.

In this report, we investigate sperm selection as a consequence of the interaction between phenotypic variation among sperm populations and the constraints imposed on sperm fitness by the reproductive microenvironment. We use empirical data to inform the development of agent-based computational models (ABMs) and simulate ‘bottom up’ sperm population dynamics. We then extend concepts from probability and information theory to define a quantitative measure of sperm fitness (i.e., the posterior probability distribution of ‘successful’ traits obtained using Bayes' theorem), as well as a measure of the magnitude of selection imposed on a sperm population during fertilization (i.e., the relative information gain). The results from this work lay a foundation for high-precision male fertility diagnostics to improve sperm classification and/or selection in conjunction with existing semen analysis and laboratory pre-selection procedures.

Results

Model aim and context

The models are meant to simulate physiologically relevant aspects of sperm motility that contribute to sperm selection under microenvironmental constraints. The core models were designed using simple self-propelled particle physics, similar to the methods used in Computer Aided Sperm Motility Analysis (CASA), in which motility measures are obtained by digitally locating, annotating, tracking, and summarizing the trajectories of sperm nuclei in microscopy videos [17]. The models were fit to empirical data to improve accuracy and physiological relevance.

System boundaries

The physical environment simulated by the models approximates a 10X field-of-view under a light microscope with 680 × 680 μm side lengths and an area of approximately 4.62 × 10⁵ μm² (Fig 1A). Sperm motility imaging is typically performed using ~20 μm depth chambered slides that restrict the axial mobility of the cells for the study of ‘planar’ flagellar beating [18]. Although the models are 2D, they can be considered to have a 3D quality because the sperm are allowed to freely cross paths, as would occur in depth-chambered slides. The environment is composed of a regular grid-space arranged in four quadrants. The grid squares were assigned a length scale value chosen to facilitate accurate approximation of empirically derived microscopy data (40 μm for the dimensions described above) [19]. The length scale factor can be adjusted to facilitate modeling any spatially defined environment.

Fig 1. Simulated random walkers explore space in a manner that depends on their movement properties.

(A) Diagram of the model environment, which is designed to emulate an isolated portion of the field of view of a light microscope with a 10X objective and a 20 µm deep chambered glass microscope slide. (B) Diagram of the core movement functions employed by the agent-based model. Θ(t) is the angular rate of change, ν(t) is the radial rate of change, and σ denotes the respective amplitudes of the zero-average Gaussian noise added to each parameter. (C) Example images of simulations highlighting extremes of model behavior based on the choice of parameters. Ballistic-like motion results from no Gaussian noise being added to the radial and angular velocities; combined motion results from combinations of the radial and angular velocities as well as the amplitude of noise added to each term; diffusion-like motion results from relatively large values of noise amplitude. (D) Root mean square displacement of simulations with 50 agents as a measure of the relative distance traveled by the particles on average from their point of origin after 50 steps. Colors are randomly assigned to the agents and serve only to facilitate distinguishing the trajectories.

Core movement functions

The agents in the simulations represent the spermatozoon nucleus, in the same manner that sperm nuclei are typically imaged, filtered, and tracked using phase contrast microscopy for 2D path reconstruction in computer aided sperm motility analysis (CASA) [17].

The agents in the simulations execute a self-propelled random walk (Fig 1B). That is, each agent $i$ is characterized by its velocity vector $\mathbf{v}_i(t) = (\dot{x}_i(t), \dot{y}_i(t))$ and the angle $\theta_i(t)$ at which this velocity vector is directed, both subject to random fluctuations. The components of $\mathbf{v}_i(t)$ and $\theta_i(t)$ obey the following equations of motion:

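$$
\begin{aligned}
\dot{x}_i(t) &= \cos\theta_i(t)\,\bigl(v + \sigma_r\,\eta_{r,i}(t)\bigr) \\
\dot{y}_i(t) &= \sin\theta_i(t)\,\bigl(v + \sigma_r\,\eta_{r,i}(t)\bigr) \\
\dot{\theta}_i(t) &= \operatorname{sgn}\bigl(\sin(2\pi t/\tau)\bigr)\,\omega + \sigma_\theta\,\eta_{\theta,i}(t)
\end{aligned}
\tag{1}
$$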

Here, $i$ enumerates the agents, while $\eta_{r,i}(t)$ and $\eta_{\theta,i}(t)$ are zero-average, uncorrelated Gaussian noises and sgn is the signum function. When random fluctuations are not present ($\sigma_r = 0$ and $\sigma_\theta = 0$), these equations describe a trajectory consisting of alternating arcs, oscillating around a preset direction. In this case, $v$ describes the magnitude of self-propulsion, $\tau$ is the period of oscillations in time, and the particle turns its velocity vector by the angle $\omega\tau/2$ as it travels along a single arc of its trajectory. When fluctuations are present ($\sigma_r \neq 0$ and $\sigma_\theta \neq 0$), $v$ can be interpreted as the average velocity $\langle v(t) \rangle$ and $\operatorname{sgn}(\sin(2\pi t/\tau))\,\omega$ as the average angular velocity $\langle \dot{\theta}_i(t) \rangle$. The presence of fluctuations introduces random direction changes into the motion of the agents. Depending on the magnitude of fluctuations (given by $\sigma_r$ and $\sigma_\theta$) and the specific choice of $v$ and $\omega$, a range of widely different movement patterns can be described by the above model.

Two extreme examples of simulated movement patterns, spanning the conceptual spectrum of possible behaviors (Fig 1C), are ballistic-like motion and free diffusion. Choosing $\sigma_r \ll v$ and $\sigma_\theta \ll \omega$ while keeping non-zero $v$ and $\omega$ such that $0 < \omega\tau/2 < \pi$ results in trajectories resembling the deterministic alternating-arc pattern, or ballistic-like motion. In this case, the agents effectively travel long distances, but do not extensively explore the surroundings of their current position. In the case of $\omega\tau/2 > \pi$ the paths become circular, which can be seen as a quasi-deterministic strategy for more local search. However, the ballistic-like trajectories are generally ‘stiff’, that is, they are characterized by long-persisting correlations in the changes of position. Conversely, assuming $v = 0$ and $\omega = 0$ results in motion akin to passive diffusion. In this case, the agents thoroughly explore their immediate surroundings, but on average do not change position. The correlations in position change are also very short-ranged. The difference between the two archetypes of movement pattern is conveniently characterized by the root mean squared displacement:

$$
\mathrm{RMSD}_i(t) = \sqrt{\bigl(x_i(t) - x_i(0)\bigr)^2 + \bigl(y_i(t) - y_i(0)\bigr)^2}
$$

For ballistic-like motion (with $0 < \omega\tau/2 < \pi$) $\mathrm{RMSD} \propto t$, while for free diffusion $\mathrm{RMSD} \propto \sqrt{t}$. In the former case RMSD is a measure of travelled distance, while in the latter case it is the radius of the explored area (Fig 1D). In general, the model (1) can describe a continuum of possible motility patterns. By varying the parameters of the model (especially by allowing $\sigma_\theta$ to be comparable with $\omega$, while keeping $v \gg \sigma_r$) it is possible to combine the aspects of ballistic-like motion and free diffusion in almost arbitrary proportions. A significant variety of movement patterns is observed in mammalian sperm, making the proposed movement functions particularly suitable as a model of sperm motility [19,20].
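
The two archetypes can be illustrated with a minimal Python sketch of the movement functions. It is not the authors' NetLogo implementation: the Euler-Maruyama discretization, the step size `dt`, the midpoint evaluation of the signum term, and the function names (`simulate_agent`, `ensemble_rmsd`, `run`) are all assumptions made for illustration, and the parameter values used with it are arbitrary rather than taken from Table 1. The sketch also assumes that the radial noise is added to the speed before projection onto the heading.

```python
import math
import random

def simulate_agent(v, omega, sigma_r, sigma_theta, tau, dt, n_steps, rng):
    """Euler-Maruyama integration of the movement functions for one agent.

    Returns the list of (x, y) positions, starting at the origin.
    """
    x, y = 0.0, 0.0
    theta = rng.uniform(0.0, 2.0 * math.pi)      # random initial heading
    path = [(x, y)]
    for k in range(n_steps):
        t_mid = (k + 0.5) * dt                   # midpoint avoids sgn(0) at sign flips
        turn = math.copysign(omega, math.sin(2.0 * math.pi * t_mid / tau))
        # Assumed grouping: Gaussian noise perturbs the radial speed itself.
        step = v * dt + sigma_r * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += math.cos(theta) * step
        y += math.sin(theta) * step
        theta += turn * dt + sigma_theta * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append((x, y))
    return path

def ensemble_rmsd(paths):
    """Root mean squared displacement from the origin at the final step."""
    msd = sum(p[-1][0] ** 2 + p[-1][1] ** 2 for p in paths) / len(paths)
    return math.sqrt(msd)

def run(n_agents, seed, **params):
    """Simulate an ensemble of independent agents with a shared seeded generator."""
    rng = random.Random(seed)
    return [simulate_agent(rng=rng, **params) for _ in range(n_agents)]
```

With `sigma_r = sigma_theta = 0` and `omega = 0`, every agent moves in a straight line and the final RMSD equals `v * dt * n_steps`. With `v = 0` and `omega = 0`, the mean squared displacement in this discretization grows as `sigma_r**2 * t`, so the RMSD scales with the square root of elapsed time.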

Timescale of the models

Time in the models advances in discrete steps during which agent states are independently updated in random sequence (i.e., asynchronously). Model-time was scaled to real-time by setting the timestep of an advancement to previously reported beat-cross frequencies for isolated mouse sperm [19]. For example, if the reported mean beat-cross frequency is 25.4 Hz and each sperm crosses its average straight-line path once per model-time advancement, then one advancement is 1/25.4 seconds, or approximately 40 milliseconds. The model length and time scales can be readily adjusted to simulate behavior of sperm from species other than mouse, but mouse parameters are used throughout this report for consistency with available data.
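
The time scaling is simple arithmetic, sketched below in Python. The helper names are hypothetical, and the one-beat-cross-per-advancement rule is the assumption stated in the paragraph above.

```python
BEAT_CROSS_FREQUENCY_HZ = 25.4   # mean value quoted above for isolated mouse sperm

def timestep_seconds(bcf_hz=BEAT_CROSS_FREQUENCY_HZ):
    """Duration of one model-time advancement, assuming one beat-cross per step."""
    return 1.0 / bcf_hz

def steps_in(duration_s, bcf_hz=BEAT_CROSS_FREQUENCY_HZ):
    """Number of whole advancements that fit in a real-time window."""
    return int(duration_s * bcf_hz)

def displacement_per_step_um(speed_um_per_s, bcf_hz=BEAT_CROSS_FREQUENCY_HZ):
    """Nominal distance covered in one advancement at a given speed."""
    return speed_um_per_s / bcf_hz
```

Under these assumptions one advancement lasts about 39.4 ms, a two-second window contains 50 whole advancements, and a nominal speed of 279.4 μm/s corresponds to about 11 μm per advancement.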

Parameter estimation for the sperm motility patterns

Isolated mouse epididymal sperm have been studied extensively to define the molecular mechanisms and phenotypic characteristics of fertilization competent sperm. Though there is a large set of possible movement characteristics that a given sperm may occupy at a given time, some common features of motility can be classed into specific patterns. For example, ‘progressive’ motility consists of a symmetric movement with low lateral head amplitude about the averaged central path and rapid straight-line movement [17]. Similarly, ‘intermediate’ motility follows a similar movement pattern, but with greater magnitude of lateral head displacement. Here, we define five categorical motility patterns, based on previous work [19].

Typical CASA motility parameters consist of: VAP (average path velocity), VSL (straight line velocity), VCL (curvilinear velocity), ALH (amplitude of lateral head displacement), BCF (beat-cross frequency), STR (straightness), and LIN (linearity) [17]. VSL, VCL, and VAP were used to compare the parameter fitting of the sperm movement functions in our models to empirical sperm motility data. Means and standard deviations of the movement parameters for each motility type were obtained from a previous report, and Gaussian distributions were simulated using GraphPad Prism [19]. Temporally coded images of the simulated sperm-agents are qualitatively similar to temporally coded phase contrast images of representative cauda epididymal mouse sperm movement patterns (shown at approximately 10X magnification at the objective; Fig 2A). Movement function parameters for the agents of each prescribed motility class were adjusted so that the resulting VSL, VCL, and VAP approximated the medians of the distributions of sperm movement parameters (Fig 2B–D). The movement function parameter values are detailed in Table 1. The distributions did not match exactly for every parameter or motility pattern, indicating that true sperm motility is more complicated than the simple agent rules in this report. Nevertheless, the simulated motility patterns are qualitatively similar to mouse sperm and could be readily updated in future model iterations to accommodate alternative agent rulesets, or movement parameter distributions from other species if labeled CASA data are available. For instances where labeled data are not available, meaning that categorical motility types are not classified, the models could still be fit directly to measured CASA parameter distributions (e.g., VCL, VSL, or ALH).
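
For readers unfamiliar with the velocity measures, the sketch below computes VCL, VSL, VAP, LIN, and STR from a tracked 2D path using their conventional definitions. It is an illustration only: the moving-average smoothing and its `window` length are assumptions (CASA systems differ in how the average path is constructed), and the function names are hypothetical.

```python
import math

def path_length(points):
    """Sum of point-to-point distances along a track."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def moving_average(points, window):
    """Average path: unweighted moving average of the (x, y) positions."""
    out = []
    for i in range(len(points) - window + 1):
        chunk = points[i:i + window]
        out.append((sum(p[0] for p in chunk) / window,
                    sum(p[1] for p in chunk) / window))
    return out

def casa_kinematics(points, dt, window=5):
    """Return VCL, VSL, VAP (distance per unit time) plus LIN and STR ratios."""
    duration = dt * (len(points) - 1)
    smooth = moving_average(points, window)
    vcl = path_length(points) / duration                  # curvilinear velocity
    vsl = math.dist(points[0], points[-1]) / duration     # straight-line velocity
    vap = path_length(smooth) / (dt * (len(smooth) - 1))  # average-path velocity
    return {"VCL": vcl, "VSL": vsl, "VAP": vap,
            "LIN": vsl / vcl, "STR": vsl / vap}
```

For a zigzag track that advances steadily while the head swings from side to side, this gives VCL > VAP > VSL, which is the ordering the three velocities are designed to capture.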

Fig 2. Parameter Estimation for the Sperm Motility Patterns.

(A) Temporally coded phase contrast images of mouse sperm motility patterns (top row) and matching temporally coded images of simulated sperm-agents for each sperm motility type (bottom row). Color scale is blue (early) to white (late) frames in the video. (B) Consensus data from published studies were used to generate normal distributions of curvilinear velocity (VCL) values (indicated by the subscript ‘Data’). Parameters (i.e., ν(t), θ(t), σr, and σθ) of each motility type in the agent-based models (indicated by the subscript ‘ABM’) were adjusted so that the mean VCL values approximated those identified in the data distributions. N = 250 data points for all groups.

Table 1. Movement function parameters for sperm simulations.

| Motility pattern | Nominal radial velocity, v (μm/s) | Std. deviation of radial Gaussian noise, σr (μm/s^1/2) | Nominal angular velocity, ω (°/τ) | Std. deviation of angular Gaussian noise, σθ (°/τ^1/2) |
|---|---|---|---|---|
| Progressive | 279.4 | 0.58 | 45 | 17.3 |
| Intermediate | 419.1 | 0.29 | 120 | 11.5 |
| Hyperactive | 419.1 | 0.29 | 180 | 51.0 |
| Slow | 152.4 | 1.15 | 135 | 26.0 |
| Weak | 101.6 | 0.29 | 180 | 52.0 |

Parameter to control the oscillation period of the signum function: τ = 1/25.4 s.

Sperm-agent search is a function of ensemble motility pattern

We performed simulations and sensitivity analysis to investigate the relationships between motility pattern and the amount of discrete space searched in a bounded environment. Each simulation was performed either with agents of a single motility type (progressive, intermediate, hyperactive, slow, or weak) or with all five types mixed in equal proportions. Sperm began in a randomized position in the environment with the same total number in each simulation (N = 250). The environment consisted of an underlying grid of 40 × 40 µm unit squares. Each grid square was considered distinctly searched if at least one sperm passed through it. Simulations ended after two seconds, a typical time window for computer aided sperm motility analysis (CASA). The time color-coded movement paths of a representative simulation with ‘mixed’ sperm motility types are shown (Fig 3A). Populations of sperm with mixed motility types exhibited average movement patterns and an ability to search space that was reflective of the proportions of different motility types in the population (Fig 3B,C). These results demonstrate that the average search capability of a sperm population reflects the underlying distribution of sperm motility phenotypes. Notably, there are important mathematical properties exhibited by correlated random walks that may be useful for improving sperm motility analysis and modeling the process of fertilization more broadly (see the discussion section for more details).
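
The search-progress measure can be sketched as a count of distinct grid squares visited by any agent. The Python below is a simplified illustration with hypothetical function names. It marks only the squares that contain a recorded position, which approximates "passed through" when the per-step displacement is small relative to the 40 µm square, and it assumes all positions lie inside the 680 × 680 µm environment.

```python
def searched_squares(paths, cell_um=40.0):
    """Set of (column, row) grid squares containing at least one recorded position."""
    visited = set()
    for path in paths:
        for x, y in path:
            visited.add((int(x // cell_um), int(y // cell_um)))
    return visited

def search_progress(paths, side_um=680.0, cell_um=40.0):
    """Percentage of grid squares searched by the whole ensemble."""
    n_squares = round(side_um / cell_um) ** 2      # 17 x 17 = 289 squares
    return 100.0 * len(searched_squares(paths, cell_um)) / n_squares
```

A square visited by several sperm is counted once, so the measure reflects the area covered by the ensemble rather than the total distance travelled.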

Fig 3. Sperm-Agent Search is a Function of Ensemble Motility Pattern.

(A) Representative model simulation of 250 sperm with equal proportions of each motility type searching a closed space. Color scale is blue (early) to white (late) frames in the video. (B) Root mean squared displacement (µm) for simulations involving the indicated composition of motility types. Mixed populations consisted of 50 sperm of each motility type. (C) Search progress (%) for the simulations described in subpanel (B).

Adding spatial complexity to the microenvironment

The microanatomy of mammalian female reproductive tracts imposes spatial limitations on sperm movement that act as physical barriers to eventual contact with the egg(s). In the uterus, the luminal volume is large relative to the size of a single sperm, and convective flow predominates in the dispersion of cells [21]. However, the luminal volumes in the cervix and oviducts are much smaller relative to the size of a sperm, and they are constrained by the presence of laminar epithelial folds which form a tight labyrinth-like environment [22].

To model the relationship between microenvironmental complexity and sperm selection, three simulation environments were developed. Mazes (more specifically labyrinths, or ‘acyclic’ mazes) were chosen as a simple model of microenvironmental spatial complexity because they can be compared quantitatively using foundational concepts from graph theory. These simple structures are not necessarily intended to serve as accurate models of oviducts, which are much more complex and involve adaptive physiological variables including hormones, metabolites, hydrodynamic forces, and thermal gradients. Rather, the simple mazes enable quantitative exploration of the fundamental constraints imposed on sperm fitness by a selective pressure (in this case, spatial complexity), and can be more readily mimicked in vitro, making them more tractable for experimental validation using channel-based microfluidics.

The mazes in this report were defined by internal (white) barriers that the sperm could not cross. A separate agent-based model was designed to facilitate drawing and saving the maze environments (available at https://github.com/cas-mitolab/Fertilization_ABM). When a sperm encountered a barrier, it would reorient within a range of possible new directions determined by its motility state and corresponding movement function. A single ‘egg’ was also included in the environment as a designated grid square, and egg-contact occurred when a sperm moved over the square. The mazes consisted of ‘dead-ends’ and ‘intersections’ as vertices of an undirected graph $G(v, e)$, where $v$ is a set of vertices $\{v_i\}$ and $e$ is a set of edges in which $e_{ij}$ is the unordered vertex pair $\{v_i, v_j\}$. We define a path as the set of edges that connects two specified vertices. To quantify the complexity of the mazes, the total weighted complexity ($TC_w$) was calculated as:

$$
TC_w = \sum_{i=1}^{k} d_i
$$

where $d_i$ is the degree of the $i$th vertex (i.e., the total number of paths that lead to and from the vertex) and the vertex indices $\{1, 2, \ldots, k\}$ form a proper subset of $v$. An open space in which sperm start at one position and an egg is located at another position within the space has a total weighted complexity of 1 (Fig 4A). Mazes with more vertices, or with more edges connecting vertices, are more complex and have a larger $TC_w$ (Fig 4B,C). As an additional measure of spatial complexity, we considered the probability of a sperm traversing a direct path from the start (S) position to the egg (E) position, which can be calculated as:

Fig 4. Modeling Microenvironmental Complexity.

(A) The simplest simulation microenvironment consisting of an open space with an egg located in the bottom right corner. Sperm begin at position 1 (red) and end at the egg position 2 (red). TCw = total weighted complexity, a measure of the graph complexity of the maze. P(S→E) = the probability of a sperm taking the shortest direct path to the egg. (B) A more complex maze with increased TCw relative to maze A. (C) The most complex maze used in the simulations. Mazes were constructed using a separate agent-based model in NetLogo. Vertex numbers are indicated on the maze diagrams. Graph networks with numbered vertices connected by edges are shown on the right.

$$
P(S \to E) = \prod_{i=1}^{k} \frac{1}{d_i}
$$

where $k$ is the total number of vertices along the shortest path from the sperm to the egg. For example, the probability of a sperm taking the direct path in the most complex maze (Fig 4C) is extremely low (i.e., 1 in 34,992 possible paths).
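
Both complexity measures follow directly from the vertex degrees of the maze graph, as the Python sketch below illustrates. The toy labyrinth is hypothetical and is not one of the mazes in Fig 4. Which vertices enter each sum or product is passed in explicitly, because the choice of subset is an assumption here: counting every vertex except the egg reproduces the stated value of 1 for an open space with a single start-to-egg edge.

```python
from math import prod

def degrees(edges):
    """Vertex degree for every vertex that appears in an undirected edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def total_weighted_complexity(edges, vertices):
    """Sum of degrees over the chosen subset of vertices."""
    deg = degrees(edges)
    return sum(deg[v] for v in vertices)

def direct_path_probability(edges, path_vertices):
    """Product of 1/degree over the decision vertices on the direct path."""
    deg = degrees(edges)
    return prod(1.0 / deg[v] for v in path_vertices)

# Open space: start "S" joined to egg "E" by one edge.
open_space = [("S", "E")]

# Hypothetical toy labyrinth: S - A - B - E, with dead ends hanging off A and B.
toy_maze = [("S", "A"), ("A", "dead1"), ("A", "B"), ("B", "dead2"), ("B", "E")]
```

In the toy labyrinth, the start and the two dead ends have degree 1 and the two intersections have degree 3, so the non-egg vertices sum to 9 and the direct path S, A, B has probability 1/9. As a plausibility check on the quoted figure for maze C, 34,992 factors as 2⁴ × 3⁷, the form expected from a product of small vertex degrees, although the actual degree sequence along that path is not given here.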

Sperm number and search properties

To explore the relationship between sperm-agent number (or density within the simulation space) and search for an egg in complex spatial microenvironments, simulations were performed for increasing numbers of sperm (1–10⁴ progressively motile sperm; 100 simulations each). The simulations ended when the egg was contacted for the first time by one of the sperm (Fig 5A; top: TCw = 1; middle: TCw = 22; bottom: TCw = 54). Notably, the time to first contact was not symmetrically distributed at each sperm-agent density, but became symmetric when plotted on a logarithmic scale. Simulations involving 10³ and 10⁴ sperm were much more likely to be normally distributed rather than lognormally distributed in Mazes B and C, but not A. However, tests for lognormality were inconclusive and it is not clear what the underlying distributions were in either case, though it was clear that the distributions were skewed in simulations with fewer sperm (Fig 5A).

Fig 5. Sperm Number and Search Properties.

(A) Time to first contact with an egg in microenvironments with TCw = 1 (maze A), TCw = 22 (maze B), and TCw = 54 (maze C). (B) Area (µm²) searched at first contact with the egg for microenvironments with increasing weighted complexity (top to bottom as in subpanel A). Lines indicate the median. N = 100 simulations for each condition.

Next, we investigated the relative impact of environmental complexity on the efficiency of the search process by examining what quantity of the searchable area was accessed during each simulation (Fig 5B; top: TCw = 1; middle: TCw = 22; bottom: TCw = 54). We hypothesized that sperm ensembles would perform something like a depth-first search in which all preceding branches of the maze were likely to be searched prior to finding the egg [23]. Indeed, the sperm searched more space prior to finding the egg as the number of agents in the simulations increased (Fig 5B). Interestingly, at low sperm number (< ~10³), the role of chance was large compared to simulations with larger numbers of sperm, and in many cases the sperm were able to contact the egg without searching a large proportion of the space. This effect was particularly relevant in the open environment of maze A, but was diminished with increasing environmental complexity in mazes B and C.

Taken together, these results predict that diffusive search for an egg by non-adaptive sperm will exhibit a non-linear relationship with sperm density and that the role of chance in finding the shortest path to the egg is modified by the spatial complexity of the microenvironment. As the microenvironment becomes more complex, more sperm are required to minimize the time to egg contact, but the benefit gained by increasing sperm number above a critical threshold also diminishes nonlinearly due to convergence on the most direct path to the egg. These insights may provide a basis for optimal prediction of sperm number for ART procedures such as IVF, though these models do not explicitly account for the risk of polyspermy, which would likely impose an upper bound on sperm density because zygote fitness decreases as the risk of polyspermy increases. Additionally, there are several useful mathematical properties that describe the asymptotic behaviors of persistent random walks in mazes that may inform sperm motility analysis and the physiology of fertilization more broadly (see the ‘random walks’ subsection of the discussion for more details).

A time-homogeneous Markov model of sperm phenotype heterogeneity

Individual sperm undergo dynamic changes that are conditioned on nutrients and signaling factors in the microenvironment. These factors ultimately influence sperm behavior, lifespan, and ability to recognize and bind to an egg [5,24–27]. In mammalian sperm, intracellular calcium is a key second messenger that mediates capacitive changes, and heterogeneity in calcium transients is a key source of individual sperm variation within cell populations [9,28]. To aid in choosing parameter distributions for the agent-based models, we explored the effect of natural variation in isolated mouse cauda epididymal sperm in response to well-defined capacitating signaling inputs. To account for the controlling effect of exogenous free calcium, we performed pseudo-titrations of total calcium using an ethylene glycol-bis(β-aminoethyl ether)-N,N,N′,N′-tetraacetic acid (EGTA) chemical ‘clamp’ system to buffer the free calcium at defined concentrations (Fig 6A). We then combined this method with pseudo-titrations of sodium bicarbonate (HCO3-), a key signaling factor that stimulates capacitation via activation of soluble adenylyl cyclase. Intracellular calcium was monitored using an acetoxy-methyl ester Indo-1 (ratiometric) dye. HCO3- stimulated an intracellular calcium increase concomitant with exogenous free calcium concentration over a period of two hours (Fig 6B). These measurements highlight the average responses that sperm populations make to signaling inputs during in vitro capacitation.

Fig 6. A Time-Homogeneous Markov Model of Sperm Phenotype Heterogeneity.

(A) Linear regression curves for different calcium ion selective electrode filling solutions used to calculate the free Ca2+ concentrations in HEPES buffered assay media in the presence of 1 mM EGTA. (B) Representative heat map showing Indo-1 fluorescence ratios for sperm under the indicated Ca2+ and HCO3- pseudo-titration conditions. Iono = ionomycin. Free calcium concentrations (bottom) are in micromolar units. T = time since the beginning of the assay in minutes. (C) Probability density estimate from spectral flow cytometry for approximately 10⁵ live cells per indicated condition. Dead cells were excluded from analysis based on ToPro3 fluorescence intensity. (D) Representative intracellular calcium oscillations derived from a squared sine function assigned to each sperm in the model simulations. Teal bar at the top of the graph indicates the upper 5% of the concentration range during which the cells were allowed to transition motility states according to a Markov probability transition table. (E) Relative proportion of sperm in each indicated motility state over time (in model-timestep units). In the long run, sperm in the models absorbed into a weak motility state.

Next, we sought to determine how intracellular calcium was distributed among individual sperm at the 60-minute timepoint using spectral flow cytometry, with a similar multi-dimensional culture array scheme (Fig 6C). Examination of the qualitative distributions of intracellular calcium ([Ca2+]i) with increasing concentrations of HCO3- revealed that [Ca2+]i exhibited a positively skewed distribution. We interpreted the skewed shape of this distribution as an indication that high [Ca2+]i cells are a ‘rare’ phenotype relative to the mean, a pattern which was invariant to the magnitude of the HCO3- signal (Fig 6C).

To incorporate this information into updated models with physiological changes in motility phenotype over time, the sperm-agents were updated to include a ‘calcium oscillator’ function that influences the behavior of the sperm in proportion to the ‘frequency’ of intracellular calcium transients (Fig 6D). We associated the following equation with each agent in the simulation:

$$
[\mathrm{Ca}^{2+}]_i = \frac{1}{2}\bigl(1 - \cos(\Omega_i t)\bigr)
$$

Each cell was assigned a randomly generated oscillation frequency $\Omega_i$, drawn from a Poisson distribution. Use of the Poisson distribution was motivated by the skewed Indo-1 fluorescence ratio distributions observed in the flow cytometry measurements of intracellular calcium distributions (Fig 6C). It describes the probability of observing a given $\Omega_i$ for an interval of discrete oscillation frequencies with mean $\lambda$ and has the property that the sequence of inter-frequency intervals between sperm subgroups will be independent and exponentially distributed with mean $1/\lambda$. To model changing population states over time, all sperm began with a progressive motility state and were allowed to change motility states when the intracellular calcium exceeded a threshold ($[\mathrm{Ca}^{2+}]_i > 0.97$, i.e., 97% of the maximum). The changes were stochastic, following a time-homogeneous Markov model, and motility patterns (i.e., progressive, intermediate, hyperactive, slow, and weak) were assigned using a probability transition table (Table 2). Since the weak motility state was final and could not be left once reached by a given sperm, the population experienced a net flow towards this state over a long run period, similar to overall motility degradation observed in live sperm samples (Fig 6E). Overall, this logic models the empirically measured effects of intracellular calcium oscillations on the motility state distributions of sperm [9,28,29], and simulates natural (stochastic) variation in the rates at which individual sperm undergo motility changes during in vitro capacitation.
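
A minimal Python sketch of the oscillator and its gating rule follows. Several details are assumptions made for illustration: the function names are hypothetical, $\Omega_i$ is treated as an angular frequency in radians per unit of $t$ (how the integer Poisson draw is scaled in the authors' model is not specified here), the threshold is read as 0.97 of the normalized maximum, and the Poisson sampler uses Knuth's multiplication method because the Python standard library does not provide one.

```python
import math
import random

def poisson_draw(lam, rng):
    """Poisson variate via Knuth's multiplication method (adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def calcium(omega_i, t):
    """Normalized oscillator in [0, 1]; identical to sin^2(omega_i * t / 2)."""
    return 0.5 * (1.0 - math.cos(omega_i * t))

def gate_open(omega_i, t, threshold=0.97):
    """True while the oscillator is above the threshold for state transitions."""
    return calcium(omega_i, t) > threshold

def assign_frequencies(n_agents, lam, seed=0):
    """One Poisson-distributed oscillation frequency per agent."""
    rng = random.Random(seed)
    return [poisson_draw(lam, rng) for _ in range(n_agents)]
```

An agent that draws a frequency of zero never opens its gate in this sketch, so it would remain in its initial motility state indefinitely.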

Table 2. Table of Markov transition probabilities.

Current State   To: Progressive   To: Intermediate   To: Hyperactive   To: Slow   To: Weak
Progressive     0.95              0.02               0.01              0.01       0.01
Intermediate    0.02              0.94               0.02              0.01       0.01
Hyperactive     0.01              0.02               0.94              0.02       0.01
Slow            0.00              0.01               0.02              0.95       0.02
Weak            0.00              0.00               0.00              0.01       0.99

Table: Markov probability transition matrix for changing motility state distributions over time.
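As a minimal sketch of how such a transition table can drive population-level state changes, the following Python code steps a population of 100 agents through the Table 2 probabilities. It is an illustration under simplifying assumptions: transitions are applied at every step (the calcium threshold gating is ignored), and the function name and step count are hypothetical. Note that, as tabulated, the weak row retains a 0.01 probability of returning to the slow state, so this sketch follows the table values and does not treat the weak state as strictly absorbing.

```python
import numpy as np

STATES = ["progressive", "intermediate", "hyperactive", "slow", "weak"]

# Rows: current state; columns: next state (values from Table 2).
P = np.array([
    [0.95, 0.02, 0.01, 0.01, 0.01],
    [0.02, 0.94, 0.02, 0.01, 0.01],
    [0.01, 0.02, 0.94, 0.02, 0.01],
    [0.00, 0.01, 0.02, 0.95, 0.02],
    [0.00, 0.00, 0.00, 0.01, 0.99],
])

def simulate_states(n_sperm=100, n_steps=500, seed=0):
    """Return an (n_steps + 1, 5) array of state counts over time.

    ASSUMPTION: every agent attempts a transition at every step.
    """
    rng = np.random.default_rng(seed)
    state = np.zeros(n_sperm, dtype=int)          # all agents start progressive
    counts = np.zeros((n_steps + 1, len(STATES)), dtype=int)
    counts[0] = np.bincount(state, minlength=len(STATES))
    for step in range(1, n_steps + 1):
        # Inverse-CDF sampling of the next state for all agents at once.
        u = rng.random(n_sperm)
        cdf = np.cumsum(P[state], axis=1)
        state = (u[:, None] >= cdf).sum(axis=1)
        # Guard against floating-point rounding in the last CDF entry.
        state = np.minimum(state, len(STATES) - 1)
        counts[step] = np.bincount(state, minlength=len(STATES))
    return counts

counts = simulate_states()
print(dict(zip(STATES, counts[-1])))
```

Plotting the columns of `counts` against step number gives the kind of state-occupancy curves that show the net flow towards the weak state over a long run.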

Impact of phenotype heterogeneity on sperm search

Next, we simulated search for an egg in microenvironments with increasing spatial complexity (Fig 4A-C). Simulations consisted of 100 sperm, which was chosen as a reasonable (minimal) number that supported consistent random phenotype distributions across simulation runs, determined by increasing the size of sampled distributions until the observed mean approximately matched the ideal mean of the distribution from which the sample was drawn. A two-factor design was implemented to compare the relative effects of sperm population heterogeneity (mean calcium oscillation frequency; λ) and microenvironmental complexity (TCw) on search time. Intracellular calcium oscillation frequencies assigned to each sperm were drawn from one of three Poisson distributions characterized by different means/variances (λ) (Fig 7A). A low λ value indicates that most of the sperm had low calcium oscillation frequencies and thus would absorb into a weak motility state slowly, giving them more time to actively search for the egg. Conversely, sperm with a high oscillation frequency might absorb into a weakly motile state quickly, making them comparatively less likely to find the egg.
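The sample-size reasoning above can be sketched as follows. This is an illustrative check, not the procedure used in this report: the function name, the candidate population sizes, and the number of repeated draws are hypothetical choices.

```python
import numpy as np

def mean_gap(n_sperm, lam, rng, repeats=200):
    """Average absolute gap between the sample mean and the ideal mean.

    Averaging over repeated draws avoids judging a population size
    from a single lucky (or unlucky) sample.
    """
    gaps = []
    for _ in range(repeats):
        sample = rng.poisson(lam, size=n_sperm)
        gaps.append(abs(sample.mean() - lam))
    return float(np.mean(gaps))

rng = np.random.default_rng(seed=1)
lam = 10   # ideal mean of the Poisson distribution

for n_sperm in (10, 50, 100, 500):
    gap = mean_gap(n_sperm, lam, rng)
    print(f"N = {n_sperm:4d}: mean |sample mean - lambda| = {gap:.3f}")
```

The gap shrinks roughly in proportion to the square root of λ/N, so the choice of N is a tradeoff between simulation cost and how closely each run reproduces the intended phenotype distribution.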

Fig 7. Impact of Sperm Phenotype Heterogeneity on Diffusive Search.

(A) Histogram of the sperm intracellular calcium oscillation frequencies randomly drawn from a Poisson distribution with the indicated means (λ). (B) Search time for sperm populations with different phenotype distributions in microenvironments with increasing total weighted complexity (TCw). (C) Logarithmically transformed search times from subplot B used for statistical analysis to satisfy 2-way ANOVA assumptions. Ns = not significant, *p < 0.05, ****p < 0.0001. Lines indicate medians. Simulations consisted of N = 100 agents.

A plot of search time vs. TCw for each value of λ qualitatively indicated that both microenvironmental complexity and phenotypic heterogeneity increased the search time (Fig 7B). Due to the asymmetry of the search time distributions, the assumptions of a two-way ANOVA were not met. To address this issue, we performed a two-way ANOVA on logarithmically transformed search time (Fig 7C), and a statistically significant interaction was detected between TCw and λ (F (4, 891) = 22.41; P < 0.0001); however, the effect only accounted for 0.29% of total variation. Simple main effects analysis revealed that λ accounted for only 0.82% of variation (F (2, 891) = 127.6; P < 0.0001), while TCw accounted for most of the variation (95.99%; F (2, 891) = 14808; P < 0.0001). Post hoc analysis using Tukey’s multiple comparison test indicated statistically significant differences between the λ levels in all three microenvironments, with mean search time differences as large as ~627 seconds in the most extreme case (TCw = 54; λ = 1 Hz vs. λ = 20 Hz). Together, these simulation outcomes predict that phenotypic heterogeneity and microenvironmental constraints interact to impose selective pressure on fertilizing sperm. An important remaining question is how to quantify both the impact and magnitude of selection on sperm fitness.

Environmental complexity narrows the posterior distribution of fit sperm

Further exploration of the interaction between sperm-agent population dynamics, spatial constraint, and the effect of selection requires measures of how the distribution of sperm phenotypes changes under selection as well as the magnitude of effect caused by the selective pressure. We considered two measures that have been used in similar biological contexts to model changing phenotype distributions among populations: (1) Bayesian inference, specifically the posterior probability of contact with an egg given that a sperm has a particular phenotype (calcium oscillation frequency Ωi in this case), and (2) Kullback-Leibler divergence (a.k.a. relative information gain) [30], an information theoretic measure of the magnitude of effect of selection on the distribution of calcium oscillation frequencies following diffusive search in microenvironments with differing degrees of complexity. These approaches are especially useful in this context because they are non-parametric and are easily interpreted even when the sample size is small, which is relevant given that only a few sperm may ultimately gain access to the egg during fertilization.

The posterior distribution provides a quantitative measure of sperm fitness

For these simulations, the agent-based models were updated to facilitate tracking the number, assignment, and duration of contact time with the egg by each of the sperm. A total contact-time threshold of five seconds was defined as a condition to end the simulations. The underlying assumption was that after a threshold value for contacts by one or more sperm, the fertilization process is likely to have occurred if it will occur at all. One hundred simulations involving one hundred sperm each were carried out for each factor and corresponding level (i.e., {λ: λ = 1, 10, 20 Hz}, and {TCw: TCw = 1, 22, 54}). The relative proportion of the i-th calcium oscillation frequency is denoted qi. To visualize the initial distributions across the λ and TCw levels, cumulative probability distributions were calculated, demonstrating approximately identical initial distributions across each of the one hundred simulations (Fig 8A).

Fig 8. Measures to Infer Sperm Fitness as well as Quantify the Magnitude of Sperm Selection.

(A) Cumulative distributions of total probability P(qi) for each oscillation frequency in the initial sperm population for each simulation condition. (B) The Bayesian likelihood (frequency of sperm for each oscillation frequency that contacted the egg). (C) Cumulative posterior probability of egg contact for each oscillation frequency. Note, a prior distribution of 1/N, where N is the total number of sperm in the simulation, was used in the calculation. This can be interpreted to mean that each sperm had an assumed equal chance of contacting the egg. (D) Relative information gain (a.k.a. Kullback-Leibler divergence) calculated for each simulation condition. ****p < 0.0001. TCw = total weighted complexity. N = 100 sperm in each simulation. Points in A-C represent the median of 100 simulations. Points in D represent relative information gain for each of 100 simulations.

The distribution of each discrete calcium oscillation frequency among the sperm that contacted the egg is also known as the likelihood function P(qi | contact). Plotting the likelihood function vs. each distinct calcium oscillation frequency indicated that increasing environmental complexity narrowed the range of frequencies among sperm that successfully made contact (Fig 8B). It also reduced the absolute number of unique sperm that made contact. Though the likelihood function is the typical empirical measure used in laboratory experiments related to sperm fertility competence, it lacks information about prior assumptions regarding the fitness of sperm traits as well as the base rate distribution of those traits within the initial sperm population (before fertilization outcomes are known). In other words, the likelihood function is a sampling distribution, but what we are most interested in for the purposes of predicting sperm fitness is the posterior distribution, which can be obtained using Bayes’ theorem:

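$$P(\mathrm{contact} \mid q_i) = \frac{\mathrm{Prior} \times \mathrm{Likelihood}}{\mathrm{Total\ Probability}} = \frac{P(\mathrm{contact})\, P(q_i \mid \mathrm{contact})}{P(q_i)}$$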

where P(contact) is the hypothesized prior distribution; in this report P(contact) = 1/N, where N is the total number of sperm in the simulation. P(contact) can be interpreted to mean that all sperm were assumed to have an equal chance of contacting the egg (prior to observing the outcome). Calculating the posterior distribution in this way addresses the question: “what is the probability that a given sperm will be ‘successful’ given that it has a particular trait value?” Cumulative posterior probabilities were calculated for each level of TCw and λ (Fig 8C). Interestingly, the range of successful oscillation frequency values was narrowed by increasing microenvironmental complexity, enabling identification of a subset of sperm that could be considered to have high ‘fitness’ within each microenvironment.
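The posterior calculation can be sketched with NumPy as follows. This is a minimal illustration rather than the analysis code from the repository: the function name and the toy frequency values are hypothetical, and the prior is fixed at 1/N as described above.

```python
import numpy as np

def posterior_contact(initial_freqs, contact_freqs):
    """Per-trait posterior P(contact | q_i) with a uniform prior of 1/N.

    initial_freqs: oscillation frequency of every sperm in the population.
    contact_freqs: oscillation frequency of each sperm that contacted the egg.
    """
    initial_freqs = np.asarray(initial_freqs)
    contact_freqs = np.asarray(contact_freqs)
    if contact_freqs.size == 0:
        raise ValueError("at least one contacting sperm is required")
    n = initial_freqs.size
    prior = 1.0 / n                                    # P(contact)
    values, counts = np.unique(initial_freqs, return_counts=True)
    p_q = counts / n                                   # P(q_i), total probability
    contact_counts = np.array([(contact_freqs == v).sum() for v in values])
    likelihood = contact_counts / contact_freqs.size   # P(q_i | contact)
    return values, prior * likelihood / p_q

# Hypothetical toy population of 10 sperm, 3 of which contacted the egg.
initial = [1, 1, 2, 2, 2, 3, 3, 3, 3, 4]
contact = [1, 1, 2]
values, posterior = posterior_contact(initial, contact)
for v, p in zip(values, posterior):
    print(f"frequency {v}: P(contact | q_i) = {p:.4f}")
```

In the toy example, frequency 1 is less common than frequency 2 in the initial population but contributes more contacts, so its posterior is three times higher. The likelihood alone would give only a two-fold difference, which illustrates why the base rate matters.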

Quantifying the magnitude of selection imposed by the microenvironment

Sperm selection is often used in ART applications, but the magnitude of selective effect is generally not considered, despite the importance of such a measure for comparing the effectiveness of different selection strategies [2,31,32]. Here we describe use of relative information gain as a measure of the magnitude of selection. As in the previous section, qi is the initial distribution of sperm-agent calcium oscillation frequencies. Let q'i denote trait probabilities among the proper subset of sperm that successfully made contact with the egg. If qi and q'i are the same, then there was no selection for calcium oscillation frequency during the simulation. However, if qi and q'i are not the same distribution, then some subset of calcium oscillation frequencies did not contact the egg, implying they were selected against by the conditions of the simulation. The normalized distance between the two distributions can be quantified with the following expression:

$$D(q' \,\|\, q) = \sum_{i=1}^{n} q'_i \log_2 \frac{q'_i}{q_i}$$

This expression is known as the relative information gain (or Kullback-Leibler divergence), and its units are binary digits (bits). A relative information gain of 0 indicates the distributions are the same, and a positive number indicates the magnitude of the difference between the two distributions. The relative information gain cannot be less than 0, and the measure is not symmetric, meaning that D(q' || q) is not, in general, equal to D(q || q'). As anticipated, increasing TCw or λ increased the relative information gain (Fig 8D). Two-way analysis of variance (ANOVA) revealed a statistically significant interaction effect between TCw and λ (F (4, 891) = 19.47; P < 0.0001), which only accounted for ~2% of the total variation. Simple main effects analysis indicated that TCw accounted for about 37% of the variation (F (2, 891) = 542.9; P < 0.0001) and λ accounted for about 29% (F (2, 891) = 425.9; P < 0.0001). The results of a Tukey’s post hoc test comparing each λ level to λ = 1 are indicated in Fig 8D.
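A minimal sketch of the relative information gain calculation is given below. The function name and the toy distributions are hypothetical. Terms with q'i = 0 are taken to contribute zero, following the usual convention, and the sketch raises an error if a trait appears among the selected sperm but not in the initial population.

```python
import numpy as np

def relative_information_gain(q_initial, q_selected):
    """D(q' || q) = sum_i q'_i * log2(q'_i / q_i), in bits."""
    q = np.asarray(q_initial, dtype=float)
    q_prime = np.asarray(q_selected, dtype=float)
    if q.shape != q_prime.shape:
        raise ValueError("distributions must cover the same trait values")
    if np.any((q_prime > 0) & (q == 0)):
        raise ValueError("selected trait absent from the initial population")
    mask = q_prime > 0                 # terms with q'_i = 0 contribute 0
    return float(np.sum(q_prime[mask] * np.log2(q_prime[mask] / q[mask])))

# Hypothetical example: four equally common traits, two of which succeed.
q = [0.25, 0.25, 0.25, 0.25]
q_prime = [0.5, 0.5, 0.0, 0.0]
print(relative_information_gain(q, q_prime))   # 1.0 bit
print(relative_information_gain(q, q))         # 0.0 bits (no selection)
```

Halving the set of successful traits in this example corresponds to exactly one bit of selection, which gives an intuitive scale for comparing conditions.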

Taken together, the results from these simulations reinforce the conclusion that simple spatial hindrance by the latent structure of the microenvironment combined with variation in individual sperm phenotypes exerts quantifiable selective pressure on sperm during fertilization. The proposed measures enable quantitative description of sperm fitness and the magnitude of selection in relatively simple terms.

Discussion

Random walks

The asymptotic properties of diffusive and correlated random walks have been well studied and described previously [33]. The goal of this report is not necessarily to derive new findings on random walk behavior, but rather to frame the physiological context of the sperm search for an egg using tools from this field, with the hope of illuminating improved fertility analysis and/or prediction. General random walk models, similar to those presented in this report, have several notable properties: (1) first passage probabilities, akin to the probability that a sperm will find an egg, are heavily influenced by dimensionality [34]; (2) correlation among successive steps (i.e., persistence of the walkers), akin to progressive sperm motility patterns, produces scaling properties that are distinct from uncorrelated random walks [34]; (3) the number of distinct sites visited by a given number of random walkers, akin to how sperm explore space, has been shown to exhibit distinct time regimes that depend on the system size (i.e., the area of the environment and the number of random walkers) [35]; and finally, (4) random walks on undirected graphs, akin to sperm movement in the epithelial folds of the uterus and oviducts, exhibit polynomial traversal times bounded above by T ≤ 2e(n - 1), where e is the number of edges and n is the number of nodes [23]. Together, these properties may be extended to inform the behavior of diverse sperm populations searching in spatially complex microenvironments.
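The graph traversal bound in property (4) can be illustrated with a small simulation. The sketch below assumes the bound applies to the expected number of steps needed to visit every node (the cover time), and uses a hypothetical 10-node cycle graph as a stand-in for a simple undirected environment; neither choice is taken from the models in this report.

```python
import random

def cover_time(adjacency, start, rng):
    """Number of steps a simple random walk takes to visit every node."""
    visited = {start}
    node = start
    steps = 0
    while len(visited) < len(adjacency):
        node = rng.choice(adjacency[node])   # move to a uniformly chosen neighbor
        visited.add(node)
        steps += 1
    return steps

# Hypothetical environment: an undirected cycle with n nodes and e = n edges.
n = 10
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
e = n
bound = 2 * e * (n - 1)                      # T <= 2e(n - 1)

rng = random.Random(0)
trials = 500
mean_cover = sum(cover_time(cycle, 0, rng) for _ in range(trials)) / trials

print(f"mean cover time over {trials} walks: {mean_cover:.1f}")
print(f"upper bound 2e(n - 1): {bound}")
```

For a cycle the bound is loose, since the walk covers the graph considerably faster than the worst case allows. Graphs with bottlenecks approach the bound more closely, which is the regime relevant to spatially constrained search.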

A systems perspective on fertilization

The molecular mechanisms that underpin the regulation of mammalian sperm post-ejaculatory maturation (a.k.a., capacitation) have been thoroughly studied, and consist of signaling pathways [26,36], metabolic processes [24,25,37], and complementary binding of cell surface molecules [38,39]. Though there are excellent physical models of individual sperm motility function and regulation [16,18,40-43], few models account for the stochasticity of hundreds of millions of sperm searching for an egg, which ultimately determines fertility outcomes; this gap in knowledge persists despite consistent observations of significant phenotypic heterogeneity within sperm populations (e.g., the localized expression of ion channels on the plasma membrane) [4,6,9,44].

Previous work modeling sperm search times in both 2D and 3D environments detailed several potential scaling laws for the diffusive search process relating search time to sperm number [45]. Similar to our observations in this report, a non-linear relationship between sperm number and search time was described, and the scaling relationships depended on the dimensions of the search space. However, those simulations used sperm with constant velocity, rectilinear motion, and did not explicitly account for phenotypic heterogeneity or motility pattern changes over time. The agent-based models (ABMs) developed in this report are informed by empirical data and provide a structured framework to explore the complex collective dynamics of phenotypically heterogeneous sperm populations under various environmental conditions. The models facilitate a deeper understanding of the interactions between microenvironmental complexity and sperm phenotypic heterogeneity, emphasizing the stochastic nature of the variables that shape sperm fitness.

Diffusive search under microenvironmental constraint

The spatial scale of motility is important when analyzing the consequences of motility pattern distributions on sperm selection. In vivo, peristaltic fluid flow moves sperm suspensions over relatively large distances within the female reproductive tract independent of their motility status [21]. Though this phenomenon will distribute the cells on a macroscopic scale, at the microscopic scale, individual sperm must still ‘search’ local space using flagellar movement in a manner that increases probability of contact with the egg. This important property implies that critical cellular density thresholds, cell intrinsic motility characteristics, and microenvironmental factors such as physical/chemical barriers play critical roles in influencing which sperm from a given cell population will have an opportunity to fertilize. The degree to which this is due to chance alone is an important consideration, and a theoretical framework for sperm selection should account for the probabilistic dependencies of sperm fitness. The simulations in this report predict that increased microenvironmental complexity requires greater sperm density to maintain effective diffusive search and timely egg contact, highlighting potential tradeoffs between the collective diffusive search capability of a sperm population and the number of required sperm. The models also predict that critical thresholds exist, above which sperm number plays a diminishing role in diffusive search capability. Importantly, this threshold is not a fixed value, but rather, depends on the complexity of the microenvironment and the phenotypic heterogeneity of the sperm population.

Impact of sperm phenotypic heterogeneity

Motility is the most fundamental physiological function of mammalian sperm and is a common distinguishing feature used in clinical sperm selection [31,32]. Our results demonstrate how intrinsic phenotypes of sperm, such as intracellular ion transients coupled with the regulation of motility pattern, may critically influence selection outcomes. Sperm phenotypic variation is complex and caused by many different factors. Variation may be an important driver of optimal sperm number among (or within) species, based on the observation that sperm number is positively correlated with potentially deleterious effects of genomic recombination during meiosis [46]. Variation may also be undergirded by an evolutionarily stable strategy that optimizes the number of capacitated sperm during a post-copulatory fertilization window, a process facilitated by periodic synchronous capacitation among sperm subpopulations [47]. Regardless of the underlying causes of variation, the simulations in this report suggest that the reproductive microenvironment is a critical factor in sperm selection because it ultimately determines which sperm phenotypes will have access to the egg. This suggests that sperm selection protocols for ART should consider both the statistical distribution of biological variability among the sperm and the physical/chemical structure of the microenvironment in which fertilization will occur.

Quantifying sperm fitness and selection pressures

Sperm pre-selection is almost ubiquitous in routine clinical diagnostics and ART procedures (e.g., gradient centrifugation, swim-up assays, hyaluronan binding assays, etc.). Though semen parameters such as motility and sperm count are known to influence ART outcomes [48], current methods of selection largely depend on simple correlation and qualitative assumptions about the effects of selection [49]. Additionally, use of ICSI has increased substantially in recent decades, a procedure which relies on direct selection of a single sperm for injection into the egg [50]. Currently, there is no quantitative framework that facilitates high-precision sperm selection from within-male samples and sperm ‘fitness’ remains nebulously defined.

Though fitness could be quantitatively framed in many ways, Bayesian inference is a particularly useful approach because it is insensitive to the base rate representation of sperm phenotypes and has no minimum sample size. The approach taken here is drawn from the theory of natural selection, in which fitness is defined as a probability measure of success [51], which has classically been defined by survival or reproduction of organisms within a given population. For the purposes of modeling fertilization, success can be defined flexibly depending on the scenario without altering the underlying mathematical representation (e.g., egg contact, fertilization, passage through a selective barrier, etc.). Most approaches to assessing sperm fitness are based on regression of sperm traits (or interventions) with fertility or developmental outcomes [52,53]. However, it is important to consider that the probability that a sperm has a particular trait given that it fertilized an egg is not necessarily equivalent to the probability that a sperm will fertilize an egg given that it has a particular trait, and it is the latter condition (inference) that is of prime interest for sperm selection in ART.

Bayes’ theorem incorporates useful information beyond simple sampling frequencies. For example, it accounts for the relative proportion of sperm with ‘successful’ sperm traits in the initial population and prior information about the traits’ contributions to fertilizing potential. Bayesian inference has been used recently in conjunction with dimensionality reduction to make fertility predictions from motility stereotypes in boar semen [54]. Quantitatively defining sperm fitness is becoming more important as machine learning and computer vision technologies advance, allowing for high-dimensional data collection from semen samples, and necessitating methods that analyze distributions directly rather than relying on summary statistics [55]. One advantage of the approach taken in this report is its computational simplicity, which may be useful for developing classification or selection strategies that rely on interactive microscopy video manipulation in real time.

Another major limitation to improving current male fertility diagnostics and high-precision selection is that fertilization is an open-ended process, making it very difficult to predict which sperm will have a selective advantage from semen analysis alone. As mentioned previously, the microenvironment plays a substantial role in constraining which sperm will have access to the egg. For this reason, it is critical to have some measure that can compare the selective effects introduced by different reproductive microenvironments in vivo or in vitro. To address this limitation, we propose another measure, relative information gain, for quantification of the magnitude of selection imposed by the reproductive microenvironment [51]. This measure, also known as Kullback-Leibler divergence, provides a useful way to compare selection strategies quantitatively [30]. It is a numerical representation of the ‘distance’ between the trait distribution of an initial (pre-fertilization) population of sperm and the ‘successful’ (post-fertilization) population of sperm. Notably, it is independent of the actual features of the microenvironment and is only dependent on the effect of microenvironmental constraint on sperm fitness. Additionally, it lays the groundwork for a new biological context for reproduction by reframing the process of fertilization in information theoretic terms as a form of learning process [56].

Model limitations

There are several notable limitations of the models developed in this report. Most of the limitations stem from the simplifying assumptions made about the physiological phenomena that underlie mammalian reproduction. First, the sperm movement functions are relatively simple, but real sperm exhibit more complicated patterns such as helical progression [57]. Second, the regulatory systems in the model that control the timing and trajectory of capacitation are limited only to calcium transients with an assumed correlation between intracellular calcium concentrations and motility pattern transitions. This simplification ignores a much more complicated reality involving time-inhomogeneous plasma membrane potassium hyperpolarization, metabolic energy balance, protein tyrosine phosphorylation, and other key biochemical reactions. Though the results should be interpreted with caution, the models were designed to capture key elements of cell population scale dynamics and were constrained by empirical data to enhance their physiological relevance. These models may be extended to incorporate updated movement parameters and regulatory subsystems, for example through use of coarse-grained approaches such as Boolean networks or more involved systems of differential equations [9,58]. Finally, to approach this problem in a general way, we modeled only non-adaptive sperm, meaning sperm that do not change their motility pattern in response to environmental inputs. However, there are many ways by which mammalian sperm modulate behavior in response to their environment, including chemotaxis, rheotaxis, and thermotaxis. Incorporating these behaviors will likely affect the statistical predictions about sperm fitness and should be pursued in future studies.

Materials and methods

Ethics statement

All animal related work adhered to the guidelines outlined in the National Research Council Guide for the Care and Use of Laboratory Animals and was approved by the Institutional Animal Care and Use Committee of East Carolina University (approval A3469-01).

Model implementation

Agent-based models were developed and implemented using the NetLogo modeling environment (V6.2.2) [59]. NetLogo BehaviorSpace was used for repeated simulations with parameter scaling. The models and other supporting information are available at https://github.com/cas-mitolab/Fertilization_ABM. Simulations were run on a standard laptop computer with 16 GB of RAM and an Intel Core i7 1.7 GHz processor. Markov state transition simulations and calculations related to Bayesian inference and relative information gain (Kullback-Leibler divergence) were performed using the Python (V3.9) programming language and the NumPy library (https://github.com/cas-mitolab/Fertilization_ABM) [60].

Animals

Adult male outbred CD-1 retired breeder mice were obtained from Charles River Laboratories (Raleigh, NC, USA). Mice had free access to water and food, were maintained on a 12-hour light/dark cycle and were humanely euthanized by CO2 asphyxiation followed by thoracotomy.

Isolation of mouse epididymal sperm

Testes with epididymides were isolated in phosphate buffered saline (PBS) at 37 °C. Cauda epididymides were transferred to isolation media, where they were gently dissected. Following a brief swim-out period (~15 minutes at 37 °C), sperm were isolated from epididymal tissue by centrifugation at 100 x g for 2 minutes. Cell counts were determined using a hemocytometer after dilution in water. Cells were then incubated at 37 °C for 30 minutes with 10 µM Indo-1-AM cell permeable free-calcium dye in HEPES buffered, bicarbonate-free media containing glucose and lactate (2.88 mM and 21 mM, respectively). Cells were then washed by centrifugation at 800 x g for 5 minutes.

Calcium clamp and microtiter plate assay

1 mM EGTA (ethylene glycol-bis(β-aminoethyl ether)-N,N,N′,N′-tetraacetic acid) was used to clamp the ‘free’ calcium ion concentration in assay preparations. Calcium concentrations were measured using ion selective electrodes (Kwik Tip electrodes; World Precision Instruments, Sarasota, FL, USA). Two concentration ranges were determined, requiring two separate electrode filling solutions (low range: 150-300 µM; high range: 1.2-2.0 mM) with different concentrations of CaCl2 to obtain an appropriate working range. Once determined, the buffer conditions that clamped the calcium concentrations were included in the assay in conjunction with sodium bicarbonate pseudo-titrations. pH of the media during the assays did not change and was confirmed using glass tipped pH microelectrodes (World Precision Instruments, Sarasota, FL, USA). Indo-1 stained cells were added into the microtiter plate at uniform cell density across the plate, and fluorescence (340/400:475 nm) was obtained for ratiometric analysis at 37 °C with sampling every 5 minutes for two hours in a microtiter plate reader (Molecular Devices ID3, San Jose, CA, USA). The calcium ionophore ionomycin (10 µM) was used as a positive control.

Spectral flow cytometry

The qualitative distribution of intracellular calcium in live sperm populations was assessed using a 5-laser Aurora spectral analyzer with SpectroFlo acquisition software (V2.2; Cytek, Fremont, CA, USA) [61]. Flow cytometry measurements were performed in capacitating media with corresponding pseudo-titrations of calcium chloride and sodium bicarbonate. Scatter gating was used to identify intact single cells. Live-cell impermeable ToPro3 dye (Thermo Fisher; Waltham, MA, USA) was used to monitor cell viability during the assays, and mild detergent titrations (digitonin, 24-240 µM) were used to prepare a single-stain reference for dead cells. Indo-1-AM was excited using a 405 nm laser and emission was collected at 400 and 475 nm. Conditions were optimized prior to flow cytometry by spectral scanning using a Horiba Duetta fluorometer (Kyoto, Kyoto, Japan). Ionomycin was included as a positive control condition. Scatter plots of fluorescence intensity were manually gated and exported using FlowLogic (V8.7, Inivai Technologies; Victoria, AUS). Kernel density estimates of fluorescence ratios for various pseudo-titration conditions were plotted using Python (V3.9) with the Matplotlib, NumPy, and Pandas libraries.

Data analysis and statistics

Data were analyzed and visualized using GraphPad Prism (V9.1.2), or NumPy, Pandas, and Matplotlib [60,62,63]. Statistical analyses were performed using GraphPad Prism (V9.1.2, San Diego, CA, USA). A two-tailed Student’s t-test was used for comparison of group means. Normal quantile-quantile plots were used to assess whether normal-based inference procedures should be replaced with nonparametric methods. The presence of outliers, both their magnitude and number, was also used to check the assumptions of inference procedures. One- or two-way ANOVA was performed for one- or two-factor designs, with Dunnett or Sidak post hoc tests for multiple comparisons, respectively. All data are presented as raw values with the median represented by a bar. An α value of 0.05 was used as the threshold of statistical significance.

Conclusion

In this report we developed agent-based models (ABMs) and explored aspects of collective behavior of non-adaptive sperm (i.e., sperm that change motility pattern over time in a manner that depends on intrinsic control, rather than exogenous responses to signals). Our results highlight the intertwined influences of microenvironmental complexity and sperm phenotypic heterogeneity in shaping sperm fitness, defined here as the probability of egg contact given that a sperm has a particular trait value. Results from this study provide key insights and useful definitions for further exploration of a theory of sperm selection in the context of assisted reproductive technologies. The insights provided by the models hold promise for optimizing real-time sperm diagnostics and selection strategies with broad applications in both clinical and agricultural settings.

Data Availability

All data and code used for running experiments, model fitting, and plotting are available in a GitHub repository at https://github.com/cas-mitolab/Fertilization_ABM

Funding Statement

This work was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (R01HD110170 to CAS), as well as laboratory startup funding from the Thomas Harriot College of Arts and Sciences at East Carolina University and the East Carolina University Research and Economic Development Office (CAS). M.M. gratefully acknowledges the support for this research by Fulbright Scholar-In-Residence Program, sponsored by the U.S. Department of State. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Kandel ME, Rubessa M, He YR, Schreiber S, Meyers S, Matter Naves L, et al. Reproductive outcomes predicted by phase imaging with computational specificity of spermatozoon ultrastructure. Proc Natl Acad Sci U S A. 2020;117(31):18302–9. doi: 10.1073/pnas.2001754117
  • 2. You JB, McCallum C, Wang Y, Riordon J, Nosrati R, Sinton D. Machine learning for sperm selection. Nat Rev Urol. 2021;18(7):387–403. doi: 10.1038/s41585-021-00465-1
  • 3. Oehninger S, Franken DR, Ombelet W. Sperm functional tests. Fertil Steril. 2014;102(6):1528–33. doi: 10.1016/j.fertnstert.2014.09.044
  • 4. Buffone MG, Doncel GF, Marín Briggiler CI, Vazquez-Levin MH, Calamera JC. Human sperm subpopulations: relationship between functional quality and protein tyrosine phosphorylation. Hum Reprod. 2004;19(1):139–46. doi: 10.1093/humrep/deh040
  • 5. Puga Molina LC, Luque GM, Balestrini PA, Marín-Briggiler CI, Romarowski A, Buffone MG. Molecular basis of human sperm capacitation. Front Cell Dev Biol. 2018;6:72. doi: 10.3389/fcell.2018.00072
  • 6. Luque GM, Dalotto-Moreno T, Martín-Hidalgo D, Ritagliati C, Puga Molina LC, Romarowski A, et al. Only a subpopulation of mouse sperm displays a rapid increase in intracellular calcium during capacitation. J Cell Physiol. 2018;233(12):9685–700. doi: 10.1002/jcp.26883
  • 7. Navarrete FA, Aguila L, Martin-Hidalgo D, Tourzani DA, Luque GM, Ardestani G, et al. Transient sperm starvation improves the outcome of assisted reproductive technologies. Front Cell Dev Biol. 2019;7:262. doi: 10.3389/fcell.2019.00262
  • 8. Darszon A, Nishigaki T, López-González I, Visconti PE, Treviño CL. Differences and similarities: the richness of comparative sperm physiology. Physiology (Bethesda). 2020;35(3):196–208. doi: 10.1152/physiol.00033.2019
  • 9. Aguado-García A, Priego-Espinosa DA, Aldana A, Darszon A, Martínez-Mekler G. Mathematical model reveals that heterogeneity in the number of ion transporters regulates the fraction of mouse sperm capacitation. PLoS One. 2021;16(11):e0245816. doi: 10.1371/journal.pone.0245816
  • 10. Schmidt CA, Hale BJ, Bhowmick D, Miller WJ, Neufer PD, Geyer CB. Pyruvate modulation of redox potential controls mouse sperm motility. Dev Cell. 2024;59(1):79-90.e6. doi: 10.1016/j.devcel.2023.11.011
  • 11. Parker G. Sperm competition and its evolutionary consequences in the insects. Biol Rev. 1970;45:525–67.
  • 12. Tourmente M, Gomendio M, Roldan ERS. Sperm competition and the evolution of sperm design in mammals. BMC Evol Biol. 2011;11:12. doi: 10.1186/1471-2148-11-12
  • 13. Simmons LW, Fitzpatrick JL. Sperm wars and the evolution of male fertility. Reproduction. 2012;144(5):519–34. doi: 10.1530/REP-12-0285
  • 14. Sutter A, Immler S. Within-ejaculate sperm competition. Philos Trans R Soc Lond B Biol Sci. 2020;375(1813):20200066. doi: 10.1098/rstb.2020.0066
  • 15. Firman RC, Gasparini C, Manier MK, Pizzari T. Postmating female control: 20 years of cryptic female choice. Trends Ecol Evol. 2017;32(5):368–82. doi: 10.1016/j.tree.2017.02.010
  • 16. Suarez SS. Control of hyperactivation in sperm. Hum Reprod Update. 2008;14(6):647–57. doi: 10.1093/humupd/dmn029
  • 17. Hansen JN, Rassmann S, Jikeli JF, Wachten D. SpermQ - a simple analysis software to comprehensively study flagellar beating and sperm steering. Cells. 2018;1. doi: 10.1101/449173
  • 18. Gallagher MT, Cupples G, Ooi EH, Kirkman-Brown JC, Smith DJ. Rapid sperm capture: high-throughput flagellar waveform analysis. Hum Reprod. 2019;34(7):1173–85. doi: 10.1093/humrep/dez056
  • 19. Goodson SG, Zhang Z, Tsuruta JK, Wang W, O’Brien DA. Classification of mouse sperm motility patterns using an automated multiclass support vector machines model. Biol Reprod. 2011;84(6):1207–15. doi: 10.1095/biolreprod.110.088989
  • 20. Goodson SG, White S, Stevans AM, Bhat S, Kao C-Y, Jaworski S, et al. CASAnova: a multiclass support vector machine model for the classification of human sperm motility patterns. Biol Reprod. 2017;97(5):698–708. doi: 10.1093/biolre/iox120
  • 21. Suarez SS, Pacey AA. Sperm transport in the female reproductive tract. Hum Reprod Update. 2006;12(1):23–37. doi: 10.1093/humupd/dmi047
  • 22. Giojalas LC, Guidobaldi HA. Getting to and away from the egg, an interplay between several sperm transport mechanisms and a complex oviduct physiology. Mol Cell Endocrinol. 2020;518:110954. doi: 10.1016/j.mce.2020.110954
  • 23. Aleliunas R, Arp RMK, Lipton RJ, Lovasz L, Rackoff C. Random walks, universal traversal sequences, and the complexity of maze problems. Annual Symposium on Foundations of Computer Science; 1979.
  • 24. Travis AJ, Jorgez CJ, Merdiushev T, Jones BH, Dess DM, Diaz-Cueto L, et al. Functional relationships between capacitation-dependent cell signaling and compartmentalized metabolic pathways in murine spermatozoa. J Biol Chem. 2001;276(10):7630–6. doi: 10.1074/jbc.M006217200
  • 25. Balbach M, Gervasi MG, Hidalgo DM, Visconti PE, Levin LR, Buck J. Metabolic changes in mouse sperm during capacitation†. Biol Reprod. 2020;103(4):791–801. doi: 10.1093/biolre/ioaa114
  • 26. Visconti PE, Moore GD, Bailey JL, Leclerc P, Connors SA, Pan D, et al. Capacitation of mouse spermatozoa. II. Protein tyrosine phosphorylation and capacitation are regulated by a cAMP-dependent pathway. Development. 1995;121(4):1139–50. doi: 10.1242/dev.121.4.1139
  • 27. Hereng TH, Elgstøen KBP, Cederkvist FH, Eide L, Jahnsen T, Skålhegg BS, et al. Exogenous pyruvate accelerates glycolysis and promotes capacitation in human spermatozoa. Hum Reprod. 2011;26(12):3249–63. doi: 10.1093/humrep/der317
  • 28. Orta G, de la Vega-Beltran JL, Martín-Hidalgo D, Santi CM, Visconti PE, Darszon A. CatSper channels are regulated by protein kinase A. J Biol Chem. 2018;293(43):16830–41. doi: 10.1074/jbc.RA117.001566
  • 29. Miller MR, Kenny SJ, Mannowetz N, Mansell SA, Wojcik M, Mendoza S, et al. Asymmetrically positioned flagellar control units regulate human sperm rotation. Cell Rep. 2019;26(10):2847. doi: 10.1016/j.celrep.2019.02.075
  • 30. Baez J, Pollard B. Relative entropy in biological systems. Entropy. 2016;18(2):46. doi: 10.3390/e18020046
  • 31. Nosrati R, Graham PJ, Zhang B, Riordon J, Lagunov A, Hannam TG, et al. Microfluidics for sperm analysis and selection. Nat Rev Urol. 2017;14(12):707–30. doi: 10.1038/nrurol.2017.175
  • 32. Baldini D, Ferri D, Baldini GM, Lot D, Catino A, Vizziello D, et al. Sperm selection for ICSI: do we have a winner? Cells. 2021;10(12):3566. doi: 10.3390/cells10123566
  • 33. Codling EA, Plank MJ, Benhamou S. Random walk models in biology. J R Soc Interface. 2008;5(25):813–34. doi: 10.1098/rsif.2008.0014
  • 34. Larralde H. First-passage probabilities and mean number of sites visited by a persistent random walker in one- and two-dimensional lattices. Phys Rev E. 2020;102(6–1):062129. doi: 10.1103/PhysRevE.102.062129
  • 35. Larralde H, Trunfio P, Havlin S, Stanley H, Weiss G. Number of distinct sites visited by N random walkers. Phys Rev A (Coll Park). 1992;45.
  • 36. Baker MA, Reeves G, Hetherington L, Aitken RJ. Analysis of proteomic changes associated with sperm capacitation through the combined use of IPG-strip pre-fractionation followed by RP chromatography LC-MS/MS analysis. Proteomics. 2010;10(3):482–95. doi: 10.1002/pmic.200900574
  • 37. Ferramosca A, Zara V. Bioenergetics of mammalian sperm capacitation. Biomed Res Int. 2014;2014:902953. doi: 10.1155/2014/902953
  • 38. Inoue N, Hamada D, Kamikubo H, Hirata K, Kataoka M, Yamamoto M, et al. Molecular dissection of IZUMO1, a sperm protein essential for sperm-egg fusion. Development. 2013;140(15):3221–9. doi: 10.1242/dev.094854
  • 39. Inoue N, Ikawa M, Isotani A, Okabe M. The immunoglobulin superfamily protein Izumo is required for sperm to fuse with eggs. Nature. 2005;434:229–34. doi: 10.1038/nature03318
  • 40. Cardullo RA, Baltz JM. Metabolic regulation in mammalian sperm: mitochondrial volume determines sperm length and flagellar beat frequency. Cell Motil Cytoskeleton. 1991;19(3):180–8. doi: 10.1002/cm.970190306
  • 41. Lighthill JT. Flagellar hydrodynamics. SIAM Review. 1976;18:161–226. Available from: https://about.jstor.org/terms
  • 42. Miller MR, Kenny SJ, Mannowetz N, Mansell SA, Wojcik M, Mendoza S, et al. Asymmetrically positioned flagellar control units regulate human sperm rotation. Cell Rep. 2018;24(10):2606–13. doi: 10.1016/j.celrep.2018.08.016
  • 43. Ooi EH, Smith DJ, Gadêlha H, Gaffney EA, Kirkman-Brown J. The mechanics of hyperactivation in adhered human sperm. R Soc Open Sci. 2014;1(2):140230. doi: 10.1098/rsos.140230
  • 44. Martínez-Pastor F. What is the importance of sperm subpopulations? Anim Reprod Sci. 2022;246:106844. doi: 10.1016/j.anireprosci.2021.106844
  • 45. Yang J, Kupka I, Schuss Z, Holcman D. Search for a small egg by spermatozoa in restricted geometries. J Math Biol. 2016;73(2):423–46. doi: 10.1007/s00285-015-0955-3
  • 46. Cohen J. Cross-overs, sperm redundancy and their close association. Heredity (Edinb). 1973;31(3):408–13. doi: 10.1038/hdy.1973.96
  • 47. Roldan ERS. Sperm competition and the evolution of sperm form and function in mammals. Reprod Domest Anim. 2019;54 Suppl 4:14–21. doi: 10.1111/rda.13552
  • 48. Villani MT, Morini D, Spaggiari G, Falbo AI, Melli B, La Sala GB, et al. Are sperm parameters able to predict the success of assisted reproductive technology? A retrospective analysis of over 22,000 assisted reproductive technology cycles. Andrology. 2022;10(2):310–21. doi: 10.1111/andr.13123
  • 49. Donnelly ET, Lewis SE, McNally JA, Thompson W. In vitro fertilization and pregnancy rates: the influence of sperm motility and morphology on IVF outcome. Fertil Steril. 1998;70(2):305–14. doi: 10.1016/s0015-0282(98)00146-0
  • 50. Shalom-Paz E, Anabusi S, Michaeli M, Karchovsky-Shoshan E, Rothfarb N, Shavit T, et al. Can intra cytoplasmatic morphologically selected sperm injection (IMSI) technique improve outcome in patients with repeated IVF-ICSI failure? a comparative study. Gynecol Endocrinol. 2015;31(3):247–51. doi: 10.3109/09513590.2014.982085
  • 51. Frank SA. Natural selection. V. How to read the fundamental equations of evolutionary change in terms of information theory. J Evol Biol. 2012;25(12):2377–96. doi: 10.1111/jeb.12010
  • 52. Gómez Montoto L, Magaña C, Tourmente M, Martín-Coello J, Crespo C, Luque-Larena JJ, et al. Sperm competition, sperm numbers and sperm quality in muroid rodents. PLoS One. 2011;6(3):e18173. doi: 10.1371/journal.pone.0018173
  • 53. Petrunkina AM, Waberski D, Günzel-Apel AR, Töpfer-Petersen E. Determinants of sperm quality and fertility in domestic species. Reproduction. 2007;134(1):3–17. doi: 10.1530/REP-07-0046
  • 54. Fernández-López P, Garriga J, Casas I, Yeste M, Bartumeus F. Predicting fertility from sperm motility landscapes. Commun Biol. 2022;5(1):1027. doi: 10.1038/s42003-022-03954-0
  • 55. Riordon J, McCallum C, Sinton D. Deep learning for the classification of human sperm. Comput Biol Med. 2019;111:103342. doi: 10.1016/j.compbiomed.2019.103342
  • 56. Frank SA. Natural selection maximizes Fisher information. J Evol Biol. 2009;22(2):231–44. doi: 10.1111/j.1420-9101.2008.01647.x
  • 57. Kromer JA, Märcker S, Lange S, Baier C, Friedrich BM. Decision making improves sperm chemotaxis in the presence of noise. PLoS Comput Biol. 2018;14(4):e1006109. doi: 10.1371/journal.pcbi.1006109
  • 58. de Prelle B, Lybaert P, Gall D. A Minimal model shows that a positive feedback loop between sNHE and SLO3 can control mouse sperm capacitation. Front Cell Dev Biol. 2022;10:835594. doi: 10.3389/fcell.2022.835594
  • 59. Tisue S, Wilensky U. NetLogo: A Simple Environment for Modeling Complexity. International conference on complex systems. Boston, MA; 2004. p. 16–21.
  • 60. Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, et al. Array programming with NumPy. Nature. 2020;585(7825):357–62. doi: 10.1038/s41586-020-2649-2
  • 61. Park LM, Lannigan J, Jaimes MC. OMIP-069: forty-color full spectrum flow cytometry panel for deep immunophenotyping of major cell subsets in human peripheral blood. Cytometry A. 2020;97(10):1044–51. doi: 10.1002/cyto.a.24213
  • 62. Hunter JD. Matplotlib: A 2D graphics environment. Comput Sci Eng. 2007;9(3):90–5. doi: 10.1109/mcse.2007.55
  • 63. Mckinney W. Data structures for statistical computing in Python. SciPy. 2010;445.
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1012865.r002

Decision Letter 0

Jing Chen

7 Oct 2024

Dear Dr Schmidt,

Thank you very much for submitting your manuscript "Modeling Diffusive Search by Non-Adaptive Sperm: Empirical and Computational Insights" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Jing Chen

Academic Editor

PLOS Computational Biology

Pedro Mendes

Section Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: uploaded as an attachment

Reviewer #2: In this manuscript, the authors suggest using a simple modelling approach to study the selective pressure and variability in sperm fitness during the fertilisation process. I actually do like the general idea and fully support the use of simple enough models to make the point. Indeed, with the help of tractable models it is possible to bring across very powerful mathematical predictions that can be the key to understanding even such complex phenomena as those discussed in this manuscript. So while in general strongly supporting the simple idea, I have multiple technical comments/problems with the implementation which prevent me from being fully positive about this work. Below I provide the comments in chronological order. I note the most critical points with a star * sign. Also, there is a tendency to over-interpret the results from the models.

- “Diffusive search is an intrinsic property of agents performing a random walk” - this is a strange statement, not all random walking agents are searching for something

- Fig 1 A is confusing, not clear what the blue, red and white parts are, also what is the zoom in square means. Maybe showing a photo of a real chip would help better than this schematics

- Angle \theta in Fig. 1B is just showing the 90 degrees and not linked to the trajectory

- "increasing the mean \mu .." - mean of what?

-"Minor adjustments to the mean or standard deviation of the distribution can have significant effects on the agents' diffusive search behavior" - this is very vague and unsubstantiated statement. What is minor, what is significant? Also search behaviour is not quantified in any way.

-"Agents with movement parameters that allow them to search space more quickly than others will exhibit a more rapidly increasing RMS” - this is also very vague statement. A walker moving along the straight line with constant speed will have higher asymptotic MSD - but will it search space more quickly? It is very imprecise formulation.

-How random \delta is selected? can it be negative (if it comes from Gaussian distribution)?

-“..search patterns to broader exploration of the surrounding environment ultimately resulting in altered diffusive search outcomes” - this is either unsubstantiated statement or trivial, should be removed in both cases.

(*)-“Those with smaller ranges had fewer directions to go at each timestep and exhibited behavior that resembled crude progressive motility patterns (Figure 1E)” - how do we know? no reference, no figure

-" Taken together, the results of these simplified models highlight the diffusive search functionality that emerges” - unsubstantiated - nothing definite has been said about search till this point.

-"match the simulated Gaussian distributions from data” - not clear what it is, why to simulate Gaussian from data?

-Fig 2A: why experiment and simulations are shown at different magnification? It is impossible to see the tracks in simulations good enough to say if they are similar or not

(*)- "The resulting sperm motility quantitatively and qualitatively like mouse sperm..” how do we see that? based on which analysis? We see in Fig. 2 BCD that multiple quantifiers are off between data and simulations. Some have low values and we can’t tell how good/bad they are

- "These results demonstrate that the average diffusive search capability of a sperm population reflects the underlying distribution of sperm phenotypes. In other words, the ability to efficiently search space is an emergent property of the sperm population.” This is somehow a very generic and trivial statement. The search efficiency was quantified with an arbitrary quantity (it is not really discussed how general it is), and then of course a different microscopic random walk leads to a different diffusion constant and to different exploration. What else?

- The plots for the MSD, within the realm of the random walk model used can be fully analytically quantified, in particular by the MSD of the Ornstein-Uhlenbeck process that can be derived from the run-and-tumble-like random walk model.

-"The simulations presented thus far predict that sperm populations will function on average to search all the space that they occupy, and the kinetics of the search process depend on the distribution of sperm motility patterns within the cell population” - There is nothing to predict! The result is known (trivial) analytically as soon as the model is formulated? Rephrase!

-Figure 5. I have difficulties with the quantitative interpretation of these results, as the space/time scale of the experiment is unclear and very difficult to tease out from the methods section. Also, from that moment on, some quantities are shown seemingly at random either with units or dimensionless.

(*)-Results in Figure 5 in essence have clear, analytical explanation. For the open chamber the search time is the first passage time of the random walk to the target. And the plots show how that changes for a finite number of trajectories. The shortest possible time is the movement of the random walker on a straight path (very rare event). For an infinite number of realisations such a trajectory will always appear and thus provide a narrow cloud of points for 100 trials - this is an asymptotic value towards which the times converge. This behavior is to be expected as soon as the number of trajectories (particles) times the statistical probability of such trajectory to occur (can be calculated, like the authors did for labyrinths) will become equal to 1 - so one trajectory is enough. For smaller N of course we have to wait till whatever trajectory first reaches the target (thus a scatter and higher target hitting time).

Similarly - plots in B show the interplay of the time it takes to reach the target versus the number of distinct sites visited by the random walker. Reaching the target depends on the pdf of the particles to reaching out beyond a given distance with high enough total probability. In that there is a square root relationship of the distance and the time it takes to reach the target (if we argue in terms of time scaling). Another quantity is the number of distinct sites visited by the collection of random walkers. The scaling of this quantity for large N (number of walking particles) has been studied in 90s (see for example Larralde et al PRA 1992). The number of distinct sites visited with time (depending on the regime) scales at least linearly. So therefore it is also expected (from analytical arguments) that the number of sites visited with the increasing number of walkers will increase reaching to the 100% before the run is terminated when one of the particles hits the target.

- "Taken together, these results predict that diffusive search by non-adaptive sperm will exhibit a non-linear relationship with sperm density and that the role of chance in finding shortest path to the egg is modified by the spatial complexity of the microenvironment”. So, as before, I don’t think there is any need in the numerical simulations to predict the behaviours observed/shown in the plots. The qualitative outcome can be understood based on the known results from basic random walk properties. The only novelty (which is also probably difficult to straightforwardly derive analytically) are the results for labyrinth geometries. Those however, have no qualitative explanations, for example, with respect to the characteristic time scales, etc. So these are not predictions, but a simple numerical verification of the analytically predictable (interpretable) results.

- "with a Markov transition matrix “ - this is not defined, also in the Methods it is hard to understand how exactly it looks like

- "Simulations consisted of 100 sperm, which was chosen as a reasonable (minimal) number” - based on what?

-" with mean search time differences as large as ~627 seconds in the most extreme case” as mentioned above, the jumping between unit and unitless quantities - and no idea of the actual searching area - it is hard to give a physical meaning to the numbers provided.

(*) - "Taken together, the results from these simulations reinforce the conclusion that simple spatial hindrance by the latent structure of the microenvironment combined with variation in individual sperm phenotypes exerts quantifiable selective pressure on sperm. The fitness can be represented as an inferential probability of ‘success’, and the magnitude of selection can be represented using relative information gain.” I think this is the part which mixes the relatively straightforward predictions of the model (which are essentially what they are as soon as parameters are fixed) and the pressure/selection are the criteria imposed by the authors, and the more complex evolutionary concepts of the underlying biological problem. Instead, what authors should do - just explain their mathematical results, and then map them in the language of pressure and selection, but not mixing them (as also was done in the whole last section).

"Previous work modeling sperm search times in both 2D and 3D environments detailed several potential scaling laws for the diffusive search process relating search time to sperm number[38]” - the work cited, which is indeed based on a large body of analytical work, actually argue that the relevant movement of cells is not diffusive as considered in this work but along the straight paths which only reflect from the boundaries of the confinement. This needs to be commented.

" The simulations in this report predict that increased microenvironmental complexity requires greater sperm density to maintain effective diffusive search and timely egg contact” - as mentioned in the above several points - these are not the points that need any numerical simulations to make predictions. At best those are numerical illustrations, and maybe the way to get a feeling of the quantitative effect.

- My final point is that in the extensive discussion section there are no qualitative explanations of the model results, which, in fact, can be argued in terms of random walk theory. In terms of numbers (times/scales), there is no discussion of their relevance to the underlying biological problem. Furthermore, the connection of the model results to the problem of selection and pressure needs a dedicated explanation: why are these quantifiers/measures the ones to use? What are the alternatives, and what are the advantages/disadvantages?
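
The comment above on the MSD plots can be illustrated with a minimal sketch. It does not use the Ornstein-Uhlenbeck process the reviewer names, nor the authors' agent-based model. Instead it assumes a simpler stand-in: a two-dimensional correlated random walk with fixed step length and Gaussian turning angles, for which the mean squared displacement after n steps has a closed form. The function names, parameter values, and the choice of model are all illustrative assumptions.

```python
import numpy as np


def simulate_msd(n_walkers, n_steps, step_len, sigma, rng):
    """Monte Carlo MSD of a 2D correlated random walk (assumed toy model).

    Each walker starts with a uniformly random heading; at every later step
    the heading changes by a Gaussian turning angle with std `sigma` (radians).
    """
    theta0 = rng.uniform(0.0, 2.0 * np.pi, size=(n_walkers, 1))
    turns = rng.normal(0.0, sigma, size=(n_walkers, n_steps - 1))
    headings = np.concatenate([theta0, theta0 + np.cumsum(turns, axis=1)], axis=1)
    x = np.cumsum(step_len * np.cos(headings), axis=1)
    y = np.cumsum(step_len * np.sin(headings), axis=1)
    return np.mean(x**2 + y**2, axis=0)  # MSD after 1..n_steps steps


def analytic_msd(n_steps, step_len, sigma):
    """Closed-form MSD for the same walk.

    With c = E[cos(turn)] = exp(-sigma^2 / 2), step directions i and j are
    correlated as c**|i - j|, which sums to the expression below.
    """
    c = np.exp(-sigma**2 / 2.0)
    n = np.arange(1, n_steps + 1)
    return step_len**2 * (n * (1 + c) / (1 - c) - 2 * c * (1 - c**n) / (1 - c) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sim = simulate_msd(n_walkers=2000, n_steps=100, step_len=1.0, sigma=0.5, rng=rng)
    ana = analytic_msd(n_steps=100, step_len=1.0, sigma=0.5)
    for n in (1, 10, 100):
        print(f"n={n:3d}  simulated={sim[n - 1]:8.2f}  analytic={ana[n - 1]:8.2f}")
```

Under these assumptions the simulated curve should track the closed form to within sampling error, which is the sense in which the MSD is fixed once the walk's parameters are chosen.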

So taken together, as you see, there is an extensive list of criticism on the technical implementation and its interpretation which need to be resolved before reconsidering this manuscript for publication.
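
The starred comment on Figure 5 argues that the search time is the first-passage time of the fastest of N walkers, bounded below by the straight-path time, and that the number of distinct sites visited can be tracked alongside it. The sketch below illustrates that argument under loud assumptions: unbiased nearest-neighbour walkers on a small square lattice, walls that simply block a move, and a single target site. None of these choices, names, or parameter values come from the manuscript.

```python
import numpy as np

# The four nearest-neighbour moves on a square lattice.
MOVES = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])


def first_hit_search(n_walkers, size, start, target, rng, max_steps=100_000):
    """Run n_walkers independent lattice walkers until one reaches `target`.

    `start` and `target` are (row, col) tuples on a size x size grid.
    A move that would leave the grid is blocked (the walker stays put).
    Returns (steps until first hit, fraction of sites visited by any walker).
    """
    pos = np.tile(np.array(start), (n_walkers, 1))
    visited = np.zeros((size, size), dtype=bool)
    visited[start] = True
    for t in range(1, max_steps + 1):
        pos += MOVES[rng.integers(0, 4, size=n_walkers)]
        np.clip(pos, 0, size - 1, out=pos)
        visited[pos[:, 0], pos[:, 1]] = True
        if np.any((pos[:, 0] == target[0]) & (pos[:, 1] == target[1])):
            return t, float(visited.mean())
    return max_steps, float(visited.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    size, start, target = 15, (0, 0), (14, 14)
    shortest = abs(target[0] - start[0]) + abs(target[1] - start[1])
    print(f"shortest possible path: {shortest} steps")
    for n in (1, 10, 100, 1000):
        runs = [first_hit_search(n, size, start, target, rng) for _ in range(10)]
        times = [t for t, _ in runs]
        cover = [f for _, f in runs]
        print(f"N={n:5d}  median time={np.median(times):8.1f}  "
              f"median coverage={np.median(cover):.2f}")
```

In this toy setting the first-hit time can never fall below the lattice distance between start and target, and larger N is expected to push the typical search time toward that bound, in line with the reviewer's reasoning.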

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No:  No access to code, some descriptions of the model setup are still cryptic.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology, see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: ReviewF.docx

pcbi.1012865.s001.docx (13.6KB, docx)
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1012865.r004

Decision Letter 1

Jing Chen

9 Jan 2025

PCOMPBIOL-D-24-01402R1

Modeling Diffusive Search by Non-Adaptive Sperm: Empirical and Computational Insights

PLOS Computational Biology

Dear Dr. Schmidt,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 30 days Mar 11 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Jing Chen

Academic Editor

PLOS Computational Biology

Pedro Mendes

Section Editor

PLOS Computational Biology

Additional Editor Comments:

Please address the reviewers' new comments regarding the clarity of the revised manuscript. An accurate description of the contribution of this work and justification of the methods will make the work more significant and impactful.

Journal Requirements:

1) We have noticed that you have uploaded Supporting Information files, but you have not included a complete list of legends. Please add a full list of legends for your Supporting Information files after the references list.

2) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

- State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

- State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

If you did not receive any funding for this study, please simply state: "The authors received no specific funding for this work."

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The paper presents computational results that support empirical methods for assessing sperm fitness within assisted reproductive technologies. The authors conceptualize the physiological function of sperm as a diffusive search process and develop computational tools to explore the underlying causal dynamics of sperm fitness. They introduce both a probabilistic measure of sperm fitness and an information-theoretic measure of sperm selection, each situated within a relevant theoretical framework. The figures, methods, and explanations collectively support the idea that the developed agent-based models present compelling arguments, offering valuable insights toward a theory of sperm selection. Overall, the references and detailed results are consistent and well-explained. The responses to reviewer comments have strengthened the paper's main mathematical results and established stronger connections with relevant empirical work.

The comments raised by reviewers were adequately addressed, and the main issues have been resolved. Below are additional comments and questions regarding the responses to the original reviewer feedback:

Comments on the GitHub Repository Updates

From Document:

"Response: We have updated the movement function description in the paper to make the model implementation more clear as well as to address potential issues with compounding random noise in the initial models. The movement functions are described in the materials and methods. We have modified Figure 1 and no longer include the simulations with simple Gaussian walks in lieu of simulations with the updated movement functions. We have updated the code in the GitHub repository accordingly."

Comment:

Although the responses indicate that the source code at https://github.com/CAS-ReproLab/Fertilization_ABM has been updated, the GitHub repository itself has not received any updates in the past six months. It is possible that updates have been made to a nonpublic version of the code but have not yet been pushed to the public repository. Please clarify whether there is a new link, or specify which code files have been updated.

Comments on Future Work and Current Paper Content

From Document:

"Response: Not sure I fully understand this critique, so apologies in advance if the response is off base. In sperm analysis, summary statistics are often used to describe sperm phenotypes (e.g., motility measures, etc.) without paying much attention to the underlying distributions, which in our experience are often not symmetric. We feel that this perspective on sperm physiology misses the point that the underlying distributions matter greatly, and ignoring this likely contributes to diagnostic shortcomings and misconceptions about the physiological tasks that sperm perform. The approach to quantifying fitness and the magnitude of selection may be an important step toward improving sperm diagnostics and understanding the constraints that influence which sperm will ultimately fertilize an egg. We realize that the idea is mostly descriptive and preliminary, but we are working on a 'more detailed analysis of the underlying statistical considerations as well as the information-theoretic implications of selection during fertilization.'"

Comment:

I assume that the more detailed analysis being worked on refers to future work. For this comment, it would be useful to address the current information-theoretic sections of the paper to make clear the distinction between what is present in the paper and the proposed future work. This distinction would both enhance the reader’s understanding of the scope and implications of the current research and ensure that the reviewers adequately understand the points you are making.

Reviewer #2: While the authors did a good job in removing ambiguous passages and overstatements, the presentation has now suffered in two ways. First, when reading the results, it is not clear what the setup and setting are, what the mazes ABC are, where the egg is, or what the sperm are doing while moving in the model. I really do not understand why this information is now hidden in S1 figure (which still doesn’t contain mazes). The second negative effect is that, with the overstatements removed, there is no longer any kind of qualitative explanation of the results (at least in the way I was trying to make sense of those results in my previous comment). So now we are only presented with some results (dependencies) but no attempt to rationalize them.

Taken together, while the overinterpretations were removed, the clarity of the results has suffered considerably. I do think this needs to be fixed before the paper can be published.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1012865.r006

Decision Letter 2

Jing Chen

10 Feb 2025

Dear Dr Schmidt,

We are pleased to inform you that your manuscript 'Modeling Diffusive Search by Non-Adaptive Sperm: Empirical and Computational Insights' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Jing Chen

Academic Editor

PLOS Computational Biology

Pedro Mendes

Section Editor

PLOS Computational Biology

***********************************************************

Please fix one small typo on line 412: of of.

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Overall, the references and updated sections are consistent and well-explained. The responses to comments from reviewers have further strengthened the main mathematical results of the paper along with building a stronger connection between relevant empirical works. Overall, the comments were fully addressed and the main issues brought up were resolved.

Reviewer #2: I appreciate that the authors put the details of the model back to the main text. I reread the whole manuscript again and it now reads much more consistently. I’m happy to recommend the paper for publication.

I only noticed one small typo on line 412: of of

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1012865.r007

Acceptance letter

Jing Chen

PCOMPBIOL-D-24-01402R2

Modeling Diffusive Search by Non-Adaptive Sperm: Empirical and Computational Insights

Dear Dr Schmidt,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofia Freund

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: ReviewF.docx

    pcbi.1012865.s001.docx (13.6KB, docx)
    Attachment

    Submitted filename: Response to Reviewers.docx

    pcbi.1012865.s003.docx (30.7KB, docx)
    Attachment

    Submitted filename: Response_to_Reviewers_auresp_2.docx

    pcbi.1012865.s004.docx (24.4KB, docx)

    Data Availability Statement

    All data and code used for running experiments, model fitting, and plotting are available in a GitHub repository at https://github.com/cas-mitolab/Fertilization_ABM


    Articles from PLOS Computational Biology are provided here courtesy of PLOS
