PLOS Biology. 2026 Mar 31;24(3):e3003699. doi: 10.1371/journal.pbio.3003699

Evidence for genetically-based sperm discrimination in the vaginal tract of a primate species

Rachel M Petersen 1,¤a,*, Lee (Emily) M Nonnamaker 2,¤b, Jaclyn A Anderson 3, Christina M Bergey 4, Christian Roos 5,6, Amanda D Melin 3,7,8, James P Higham 1
Editor: Masahito Ikawa 9
PMCID: PMC13038021  PMID: 41915630

Abstract

Females influence offspring paternity through diverse pre- and post-copulatory mechanisms. Sperm discrimination—the differential physiological response to ejaculates based on male or sperm characteristics—can bias fertilization outcomes, but in vivo evidence of this process in large-bodied mammals is lacking. Here, in a study of nine females and four males, we tested whether two aspects of female physiology that affect sperm survival—vaginal immune response and pH—are modulated by male genetic makeup in a non-human primate, the olive baboon (Papio anubis). Our findings suggest post-copulatory differences in vaginal gene expression and pH, with the strongest immune responses and largest pH decreases, harmful to sperm, exhibited by females mating with genetically similar males. These findings are consistent with genetically-based post-copulatory mate discrimination, offering new insights into how interactions between male gametes and the female reproductive tract may shape conception probability in primates.


Female sperm discrimination that can bias fertilisation outcomes has been observed in animals, but in vivo evidence is lacking in large mammals. This study shows that the female olive baboon vaginal tract can exert post-copulatory mate selection by altering sperm survival via vaginal immune responses and pH according to male genetic makeup, with the strongest responses to genetically similar males.

Introduction

Characterizing the mechanisms and outcomes of sexual selection, and specifically mate choice, has been a major goal of evolutionary biologists [1–4]. Female mate choice can occur either prior to copulation in the form of behavioral mating biases, or after copulation in the form of fertilization biases, a process termed cryptic female choice (CFC) [5–8]. To date, empirical evidence demonstrating in vivo CFC in mammals is concentrated in rodent taxa [9,10]. However, the heightened maternal investment and prolonged offspring care common to large-bodied mammals, as well as discrepancies between mating observations and genetic paternity, suggest that CFC may be widespread [11–13]. Moreover, investigating these processes in species which share aspects of their reproductive physiology with humans, such as other primates, is likely critical for improving our understanding of human infertility.

Studies indicate that the female reproductive tract can discriminate between sperm cells based on their genetic material [14–16], providing a potential mechanism for genetically-based CFC. In mammals, in vitro experiments in mice show higher fertilization success for sperm from more distantly related males [17], and artificial insemination experiments in pigs reveal dramatic shifts in oviductal gene expression in response to sex-sorted X- versus Y-chromosome-bearing sperm [18]. In humans, in vitro experiments have shown both differential sperm responsiveness to follicular fluid and differential gene expression in vaginal epithelial cells in response to seminal fluid; however, how these responses relate to the genetic make-up of the egg and sperm remains uncertain [19,20]. The major histocompatibility complex (MHC) is a highly polymorphic genomic region involved in pathogen identification and immune response regulation. It is also an attractive candidate target of CFC due to its previously implicated role in mate choice and important contribution to reproductive success [21–23]. While pre-copulatory MHC-based mate preferences are well documented across taxa, including non-human primates [24–29], the role of the MHC in post-copulatory sexual selection remains largely unexplored. Evidence for MHC-driven sperm selection is limited to a handful of studies in rodents, fish, and birds [30–33], with no documented evidence in primates, despite its potential relevance to human fertility.

In this study, we aimed to explore potential mechanisms of CFC in a non-human primate, the olive baboon (Papio anubis). Olive baboon females mate with multiple males across their ovarian cycle; however, males often attempt to monopolize access to fertile females through mate guarding. These consortships, in which a male closely associates with and guards a female, can persist for several hours to multiple days, during which time the ejaculate from only a single male may be present in the female’s reproductive tract [34]. Furthermore, females invest substantial energy in each offspring and have a relatively slow reproductive rate, providing conditions that are likely to promote selection for CFC [35]. We focused on vaginal pH and gene expression, as these may contribute to sperm survival [36,37] and can be characterized following mating in unanesthetized individuals using positive reinforcement training. We conducted both genome-wide reduced representation DNA sequencing and MHC genotyping on four intact males and nine parous females and strategically paired each male with 2–3 females to encompass a broad range of genetic diversity and complementarity (i.e., similarity) values across mating dyads. We first characterized vaginal pH and gene expression across the cycle in the absence of mating and used these samples as baseline comparisons for post-copulatory responses. We asked how vaginal gene expression and pH change: (1) across female ovarian cycle phases; (2) in response to mating; and (3) in relation to the genetic diversity and complementarity of the mating male. We hypothesized that females will exhibit a stronger immune response and lower vaginal pH, both potentially harmful to sperm survival, after mating with males who are less genetically diverse and complementary.
We predicted this pattern based on the selective pressures favoring offspring with greater genetic diversity, particularly at the MHC, while reducing the risks associated with inbreeding.

Results

Vaginal gene expression varies across the ovarian cycle

We determined the timing of ovulation based on vaginal cytology (Fig 1A; see Materials and methods). We identified a 5-day fertile phase, a 5-day pre-fertile phase, a 5-day post-fertile phase, and classified the remainder of the cycle as the non-fertile phase [38–41]. We analyzed 32 non-copulatory vaginal RNA samples from eight females (8 per cycle phase) and 275 non-copulatory pH measurements from nine females (68.8 ± 21.8 s.d. per cycle phase; Fig 1A; additional details on dataset composition provided in Table A in S2 Appendix).

Fig 1. Differential gene expression measured across ovarian cycle phases.


(A) We analyzed 32 RNA-seq samples and 275 pH measurements taken across the four cycle phases as determined by vaginal cytology; (B) The number of differentially expressed (DE) genes across each phase comparison. The largest differences in gene expression were observed when comparing the non-fertile phase to the pre-fertile, fertile, and post-fertile phases; (C) Numerous DE genes were unique to particular cycle phases (i.e., 143 DE genes were unique to the fertile phase), while others were shared across two or more phases (i.e., 1,974 DE genes were shared across the pre-fertile, fertile, and post-fertile phases); (D) Enrichment distributions showing the ranked distribution of genes in the top 5 over/underrepresented gene sets in the fertile phase; (E) Normalized read counts of TLR2, a gene involved in the positive regulation of the inflammatory response and suppressed in the fertile phase; (F) Normalized read counts of SLC4A8, a gene involved in ion transmembrane transport and activated in the fertile phase. The data underlying this figure are provided in S1 Data. Artwork by LMN.

To understand baseline vaginal gene expression, we performed differential gene expression analyses with robust empirical Bayes moderation followed by adaptive shrinkage [42,43]. We included cycle phase as the predictor variable, female ID as a blocking factor (i.e., a random effect), and RNA quality (RIN) as a covariate. As some samples were taken prior to male introduction into the enclosure, we also included “male presence” as a covariate (a binary yes or no variable). We found 2,480 differentially expressed (DE) genes between the non-fertile versus pre-fertile phase (2,330 at LFSR < 0.05, 2,050 at LFSR < 0.01), 2,537 between the non-fertile versus fertile phase (2,403 at LFSR < 0.05, 2,174 at LFSR < 0.01), and 2,277 between the non-fertile versus post-fertile phase (2,080 at LFSR < 0.05, 1,675 at LFSR < 0.01; Fig 1B), with many of these shared across the pre-fertile, fertile and post-fertile phases (n = 1,974; Fig 1C and Table B in S2 Appendix). We performed gene set enrichment analysis (GSEA) to describe the biological functions of DE genes and found that the most strongly upregulated pathways in the fertile phase involve G protein-coupled receptor activity and ion transmembrane transport, and the most strongly downregulated pathways include positive regulation of the inflammatory response, phagocytic vesicles, and cell adhesion (Fig 1D and Fig A in S1 Appendix and Table C in S2 Appendix). For example, TLR2, a gene involved in the positive regulation of the inflammatory response, was suppressed in the fertile phase (Fig 1E) and SLC4A8, a gene involved in ion transmembrane transport, was activated in the fertile phase (Fig 1F).

To assess changes in vaginal pH across the cycle, we used robust linear mixed models [44] including female ID as a random effect and average temperature, which is known to impact pH readings, as a covariate [45,46]. We did not find statistically significant differences in vaginal pH across cycle phases (Fig B in S1 Appendix and Table D in S2 Appendix).

Vaginal gene expression and pH indicate responses to copulation

We analyzed 25 post-copulatory RNA samples from six females and 15 post-copulatory pH measurements from five females (described in Table A in S2 Appendix), both collected four hours after an observed copulation with ejaculation to maximize potential changes in gene expression [19]. All post-copulatory samples were collected during the pre-fertile, fertile, or post-fertile phases, and compared to non-copulatory samples taken from the same females during those same three phases when copulation had not been observed that day and there was no evidence of a sperm plug (non-copulatory RNA samples: n = 30; non-copulatory pH measurements: n = 47; Fig 2A). To account for male-derived RNA present in the vagina, we performed RNA-seq on two semen samples from each male collected opportunistically following masturbation (N = 8 samples total). From these samples, we identified 1,442 genes highly expressed in semen (average expression of >20 cpm, Table E in S2 Appendix), and removed these genes from all subsequent post-copulatory gene expression analyses.
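The semen-gene filter described above can be illustrated with a minimal Python sketch. It assumes a plain library-size CPM (counts per million) without the normalization factors an RNA-seq pipeline would normally apply, and the gene names, read counts, and function names are hypothetical placeholders rather than values or code from the study.

```python
def counts_per_million(counts):
    """Convert one sample's raw read counts {gene: count} to counts per million."""
    total = sum(counts.values())
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def highly_expressed_genes(samples, threshold_cpm=20.0):
    """Return the genes whose mean CPM across all samples exceeds the threshold."""
    cpm_per_sample = [counts_per_million(s) for s in samples]
    genes = set().union(*(s.keys() for s in samples))
    flagged = set()
    for gene in genes:
        mean_cpm = sum(c.get(gene, 0.0) for c in cpm_per_sample) / len(cpm_per_sample)
        if mean_cpm > threshold_cpm:
            flagged.add(gene)
    return flagged


def drop_genes(expression, genes_to_drop):
    """Remove flagged genes from a {gene: value} expression table."""
    return {g: v for g, v in expression.items() if g not in genes_to_drop}


# Hypothetical semen samples (raw read counts per gene).
semen_samples = [
    {"GENE_A": 900_000, "GENE_B": 99_990, "GENE_C": 10},
    {"GENE_A": 800_000, "GENE_B": 199_995, "GENE_C": 5},
]
semen_genes = highly_expressed_genes(semen_samples)

# Hypothetical vaginal sample: semen-associated genes are removed before analysis.
vaginal_sample = {"GENE_A": 120, "GENE_B": 40, "GENE_C": 310}
filtered = drop_genes(vaginal_sample, semen_genes)
```

In this toy input only GENE_C has a mean below 20 cpm in semen, so it is the only gene retained for the post-copulatory comparison.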

Fig 2. Differential gene expression and pH in response to copulation.


(A) We analyzed 30 RNA-seq samples and 47 pH measurements taken when there was no evidence of recent copulation, and 25 RNA-seq samples and 15 pH measurements taken 4 hours following copulation; (B) Genes within two immune-related GO pathways upregulated in post-copulatory samples. Gene set nodes are sized based on the number of genes within them, and gene nodes are colored by their log fold change (logFC) in expression; (C) Normalized counts of TLR2 in non-copulatory vs. post-copulatory samples; (D) Predicted expression of immune system related genes that are differentially expressed in post-copulatory vs. non-copulatory samples. Columns represent samples (left = non-copulatory, right = post-copulatory), rows represent genes, and cell color represents the predicted increase (blue) vs. decrease (purple) in expression, scaled across each row; (E) Vaginal pH was significantly lower in post-copulatory compared to non-copulatory samples. Colored points and error bars represent model predictions ± one standard error and black points represent raw data; (F) Females show non-uniform patterns in the direction and magnitude of pH change between non-copulatory and post-copulatory samples. The data underlying this figure are provided in S2 Data. Artwork by LMN.

We performed differential expression analyses using robust estimation followed by adaptive shrinkage, with post-copulatory status (yes or no) as the predictor variable, dyad ID as a blocking factor, and RIN, cycle phase, and male presence as covariates. We identified 941 DE genes in post-copulatory versus non-copulatory samples (715 at LFSR < 0.05, 383 at LFSR < 0.01; Table F in S2 Appendix). DE genes were enriched for two ontology pathways, both of which are involved in immune system processes (Fig 2B and Table G in S2 Appendix). These enriched pathways include well-described genes that regulate chemokine signaling, such as TLR2 (Fig 2C), and are generally predicted to have higher expression in post-copulatory versus non-copulatory contexts (Fig 2D).

To assess alterations in vaginal pH following copulation, we used robust linear mixed models including dyad ID as a random effect and cycle phase and average temperature as covariates. Although we did not find an association between cycle phase and pH in our dataset, we included cycle phase as a covariate due to previous work observing lower vaginal pHs around the time of ovulation in baboons and humans [47,48]. We found that post-copulatory pH measurements were significantly lower than non-copulatory measurements (estimate = −0.39, SE = 0.12, p = 0.001; Table H in S2 Appendix and Fig 2E), with substantial individual variation in the magnitude of pH change following copulation (Fig 2F). Although variance tests can be sensitive to small sample sizes, we nonetheless detect significantly greater variance in pH among post-copulatory samples compared to non-copulatory ones (Breusch-Pagan test: p = 0.05), which aligns with our initial hypothesis and supports the use of an interaction model to test for the role of male genetic diversity and complementarity in moderating post-copulatory vaginal pH.

Male genetic diversity and complementarity modulate post-copulatory vaginal gene expression and pH

To explore whether post-copulatory vaginal gene expression and pH are modulated by male genetic makeup, we estimated both genome-wide and MHC diversity and complementarity. We used double digest restriction-site associated DNA sequencing (ddRAD-seq) to estimate standardized multi-locus heterozygosity (stMLH) and kinship to approximate genome-wide diversity and complementarity, respectively. We used amplicon sequencing of the antigen-binding cleft of four MHC loci (2 class I: A and B, and 2 class II: DQA and DRB) to calculate MHC diversity as the number of class I and class II alleles and complementarity as the proportion of shared alleles between dyads. We paired males and females based on their relative genetic compatibility to produce mating dyads with kinships ranging from −0.18 to 0.24 and MHC complementarity ranging from 10% to 40% (class I loci) and 0% to 40% (class II loci). We also characterized biologically relevant MHC “supertypes” based on amino acid polarity at positively selected sites within the antigen-binding cleft (detailed methods in [49]) to determine supertype-based diversity and complementarity. In total, we tested five measures of male diversity and five measures of complementarity between each mating dyad, summarized in Table I in S2 Appendix.

We performed 10 separate differential gene expression analyses, one for each measure of male genetic diversity or complementarity, applying robust estimation and adaptive shrinkage. Each model was constructed with dyad ID as a blocking factor, RIN, cycle phase, and male presence as covariates, and an interactive effect between post-copulatory status (yes or no) and male genotype as the predictor variable. Measures of male MHC diversity and complementarity were associated with an excess of low p-values relative to the null expectation, suggesting that male MHC diversity and complementarity broadly influence gene expression (Fig 3A and 3B). Complementarity as measured using alleles was associated with stronger deviations from the null expectation compared to supertypes (Fig 3B). Genome-wide diversity (stMLH), in contrast, was not as strongly associated with gene expression changes (Fig 3A). Post-copulatory expression of 456 genes was associated with male MHC allele or supertype diversity, meeting both an LFSR < 0.1 and a family-wise error rate (FWER)-adjusted p < 0.05 criterion (Table J in S2 Appendix). Although the different MHC metrics did not share any of the same significant genes, GSEA revealed that male MHC I allelic diversity and MHC II supertype diversity were both positively associated with the expression of genes involved in RNA polymerase activity and intercellular signaling pathways (Table L in S2 Appendix). Likewise, we identified 590 genes whose post-copulatory expression was associated with either MHC or genome-wide complementarity at an LFSR < 0.1 and FWER-adjusted p < 0.05, representing 25 and 17 GSEA pathways, respectively (Tables K and L in S2 Appendix).
Once again, each metric was associated with a unique set of significant genes; however, GSEA revealed that MHC I and MHC II allelic complementarity were both positively associated with pathways involved in immune response and cellular signaling (Fig 3C and Figs C and D in S1 Appendix and Table L in S2 Appendix). For example, MAP3K2, which functions in the MAP kinase signaling pathway and has been implicated in the activation of NF-κB and downstream cytokine production, is expressed more in females who mated with males with whom they share a greater number of MHC I alleles (i.e., low complementarity) in comparison to females who mated with males with whom they share fewer MHC I alleles (i.e., high complementarity; Fig 3D). All genotype-dependent differential expression and GSEA results are summarized in Table M in S2 Appendix.

Fig 3. Post-copulatory vaginal gene expression in relation to male diversity and complementarity in five mating dyads.


(A) and (B) Quantile–quantile (Q–Q) plots comparing observed to expected p-values for gene expression association with measures of male diversity (A) and complementarity (B). Low p-values are highly enriched in our observed data compared to the null expectation (black line on x = y) when assessing the effect of male MHC diversity (but not genome-wide diversity) and MHC I and II allelic complementarity; (C) Enrichment distributions showing the ranked distribution of genes in the top 8 overrepresented gene sets that are upregulated in expression with low MHC I allelic complementarity; (D) Expression of MAP3K2, a gene which plays a key role in phagocytosis, in a non-copulatory context (left panel) and in a post-copulatory context subset by the degree of MHC I allelic complementarity (high complementarity: sharing < 30% of alleles, low complementarity: sharing > 30% of alleles). The data underlying this figure are provided in S3 Data.

To evaluate the robustness of our findings, we conducted a leave-one-out sensitivity analysis in which we iteratively excluded a single mating dyad from our dataset and re-ran the differential gene expression analysis. For each iteration, we tested the interaction between post-copulatory status and MHC I or MHC II allelic complementarity—two genotype features that yielded the strongest initial associations. Across iterations, we consistently observed a substantial number of DE genes associated with MHC I allelic complementarity (range: 203−1,191, mean = 837.2, SD = 343.1, LFSR < 0.1) indicating that this association is not driven by any single dyad. Differential expression linked to MHC class II allelic complementarity was more variable, yet still consistently present across dyads (range: 86–1,006, mean = 456.3, SD = 317.3, LFSR < 0.1), indicating that our MHC class II results may be more sensitive to the individual dyads included in the analysis. Future studies with larger sample sizes will be necessary to validate these results.
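The leave-one-out logic can be sketched in a few lines of Python. This is only an illustration of the resampling scheme: the statistic used here (a raw difference in mean pH between post-copulatory and non-copulatory samples) is a simple stand-in for the differential expression and mixed-model analyses in the study, and the dyad labels and pH values are hypothetical.

```python
def leave_one_out(records, group_key, statistic):
    """Recompute a statistic once per group, each time excluding that group."""
    groups = sorted({r[group_key] for r in records})
    return {g: statistic([r for r in records if r[group_key] != g]) for g in groups}


def mean_ph_shift(records):
    """Stand-in statistic: mean post-copulatory pH minus mean non-copulatory pH."""
    post = [r["ph"] for r in records if r["post_cop"]]
    non = [r["ph"] for r in records if not r["post_cop"]]
    return sum(post) / len(post) - sum(non) / len(non)


# Hypothetical measurements for three mating dyads.
records = [
    {"dyad": "D1", "post_cop": False, "ph": 5.0},
    {"dyad": "D1", "post_cop": True, "ph": 4.0},
    {"dyad": "D2", "post_cop": False, "ph": 5.2},
    {"dyad": "D2", "post_cop": True, "ph": 5.0},
    {"dyad": "D3", "post_cop": False, "ph": 4.8},
    {"dyad": "D3", "post_cop": True, "ph": 4.9},
]

shifts = leave_one_out(records, "dyad", mean_ph_shift)
# If every entry of `shifts` has the same sign, no single dyad drives the direction
# of the effect, although its magnitude can still vary between iterations.
```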

To assess how male genetic diversity and complementarity modulate post-copulatory vaginal pH, we again fit 10 separate models, one for each measure of male genetic diversity or complementarity. We included dyad ID as a random effect, cycle phase and average temperature as covariates, and an interaction between post-copulatory status and male genotype as the predictor variable. We found a significant interaction for three measures of genetic complementarity: kinship, class II allelic complementarity, and class II supertype complementarity (Table N in S2 Appendix). For all three metrics, the largest drops in post-copulatory vaginal pH are observed among females mating with genetically similar males (i.e., low complementarity), and the smallest drops (or potential increases) in vaginal pH are observed among females mating with genetically dissimilar males (i.e., high complementarity; Fig 4). Model fit was evaluated with Akaike’s Information Criterion (AIC). For all three significant models, inclusion of the interaction term improved model fit compared to simplified models that did not include male genotype (dAIC > 2). To assess whether our model estimates were driven by particular mating dyads, we refit models testing for the effect of kinship, MHC II allelic complementarity, and MHC II supertype complementarity using the leave-one-out method (see Materials and methods). Our results were generally recapitulated across iterations (Fig E in S1 Appendix). Although smaller sample sizes generate larger standard errors, all model estimates trended in the same direction as the model which included all mating dyads.
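For readers unfamiliar with the dAIC criterion, the comparison can be sketched as follows, using the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood. The log-likelihoods and parameter counts below are hypothetical numbers chosen for illustration, not values from the fitted models.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: the full model adds the male-genotype main effect and its
# interaction with post-copulatory status (two extra parameters).
aic_reduced = aic(log_likelihood=-14.5, n_params=5)
aic_full = aic(log_likelihood=-10.0, n_params=7)

delta_aic = aic_reduced - aic_full
interaction_supported = delta_aic > 2  # threshold used in the text
```

A positive difference larger than 2 indicates that the gain in likelihood outweighs the penalty for the additional parameters.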

Fig 4. Post-copulatory vaginal pH in relation to male genetic complementarity.


Model predictions illustrating the interaction between genetic complementarity and post-copulatory status in predicting post-copulatory vaginal pH, with the lowest post-copulatory pH observed among females mating with males with high degrees of kinship (A) and low degrees of MHC class II allelic (B) and supertype (C) complementarity. Filled points and error bars represent model predictions ± one standard error and open points represent raw data. The data underlying this figure are provided in S4 Data.

Discussion

Together, our findings suggest that aspects of female reproductive physiology can respond differentially to male inseminations, providing preliminary support for genetically-based sperm discrimination, a potential mechanism by which post-copulatory mate choice may occur. Vaginal immune responses can protect females from infection, but these processes may need to be carefully regulated mid-cycle to accommodate exposure to paternal-derived molecules [50–52]. Our dataset supports this hypothesis, revealing a mid-cycle suppression of immune-related genes. In contrast, these same immune pathways show heightened expression post-copulation, with the magnitude of this response linked to male genetic characteristics. The striking convergence on similar pathways influenced by genotype at MHC class I and class II loci presents a particularly compelling case that vaginal responses may contribute to CFC, especially given the absence of a strong correlation between genetic diversity at these loci in this population [49]. The female immune system poses a potential detriment to sperm survival through processes enriched following mating with genetically similar males [53]; however, a strong immune response may also prime the female reproductive tract for implantation [54–56], and future studies will be needed to distinguish between vaginal immune responses promoting and antagonizing successful conception [57–59]. Lastly, we find that post-copulatory vaginal pH is strongly associated with male genetic complementarity, with the largest drops, which are detrimental to sperm survival, occurring after mating with genetically similar males, suggesting that vaginal pH dynamics may also serve as a mechanism of CFC alongside changes in vaginal immune response.

This study furthers our understanding of how the mammalian vaginal environment, experienced as a first point of contact between male gametes and the female reproductive tract, may mechanistically contribute to sperm success and potential offspring genotypes. While these findings are based on a limited dataset and should be interpreted as such, they provide intriguing support for a potential mechanism of non-directional sexual selection driven by genetic complementarity (i.e., non-additive mate choice). In this context, genotype-by-genotype interactions would drive CFC, dampening consistent directional shifts in allele frequencies over time across the population. Future work with larger sample sizes will be needed to confirm these patterns, as well as to explore male-driven sexually antagonistic strategies that circumvent female-mediated processes. Because non-human primates are our close evolutionary relatives, we are excited by the potential of future research on them to clarify the molecular underpinnings and evolutionary origins of variation in conception probability in humans as well as other mammals.

Materials and methods

Study subjects and experimental design

We worked with a population of captive olive baboons housed at le Centre National de La Recherche Scientifique Station de Primatologie (CNRS SdP), in Rousset, France. Study subjects consisted of 13 individuals, 4 intact males and 9 parous females. We created 4 small study groups composed of 1 male and either 2 or 3 females (3 groups contained 2 females, 1 group contained 3 females). Females were not on any form of contraception. Prior to the start of this study, each group of 2–3 females was housed with a vasectomized male and none of the females were pregnant. To create our study groups, CNRS SdP staff relocated the resident vasectomized male and allowed each group of females to live without a male for one month (the length of one ovarian cycle), during which time females underwent positive reinforcement clicker training to present their hindquarters for vaginal swabbing and pH measurement. After one month, the resident ethologist managed the introduction of an intact male by introducing males to females first from an adjacent enclosure, allowing visual and olfactory interaction for 2–3 days prior to physical introduction. After the intact male was physically introduced, we collected data on each group for two months (the duration of two ovarian cycles). During the course of the study, two females became pregnant, and all data collected from these females following the ovulation window in which they conceived was discarded. All manipulations and treatments received ethical approval from the Ministry of Higher Education, Research and Innovation in France (APAFIS#15021-2018051115066627) and the NYU University Animal Welfare Committee (18-1504), and were compliant with the European Science Foundation animal-handling guidelines to minimize pain and distress.

Genome-wide and MHC genotyping

We used genotyping data generated as part of a previous study assessing the concordance between genome-wide and MHC diversity and complementarity in olive baboons living at CNRS SdP [49]. Library preparation, sequencing, and bioinformatic methods are described in detail in [49] and in brief below.

We extracted DNA from whole blood using the Qiagen QIAamp DNA mini kit (N = 4; 3 females, 1 male) or the GEN-IAL First-DNA All tissue kit (N = 9; 6 females, 3 males) following manufacturer’s instructions. To assess genome-wide diversity and complementarity between dyads, we performed double digest restriction-site associated DNA sequencing (ddRAD-seq). We prepared ddRAD-seq libraries following [60]. We digested 1 µg of DNA using restriction enzymes (SphI and MluCI), and size selected for 185 (±19) bp fragments using the Blue Pippin System. We ligated Illumina platform adapters, indexed samples using NEBNext Multiplex Oligos for Illumina sequencing, and sequenced on the Illumina HiSeq 2500 platform using one lane and 150 bp PE reads. We excluded low-quality reads and reads not containing both enzyme cut sites, mapped reads to the olive baboon reference genome (Panu v3) using the bwa mem aligner with default parameters [61], and performed shared SNP calling using the STACKS v2 reference pipeline [62]. We required that a locus be sequenced in at least 80% of individuals to be included in the final SNP set, and excluded SNPs in strong linkage disequilibrium (r² > 0.5) using PLINK [63]. Our final SNP set consisted of 35,509 SNPs to be used in the calculation of stMLH [64] for each individual, and for genome-wide complementarity between each dyad (i.e., kinship). We calculated stMLH by dividing the proportion of genotyped loci at which an individual was heterozygous by the population mean heterozygosity at all genotyped loci, using the “Rhh” package in R [65]. We calculated kinship between each dyad using the relationship inference algorithm in the software package KING v2.2.4 [66].
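The stMLH calculation can be illustrated with a small Python sketch. It is not the Rhh implementation: it assumes that the population mean heterozygosity is averaged over the loci at which the focal individual was successfully genotyped, and the individuals, loci, and genotypes are hypothetical.

```python
def is_heterozygous(genotype):
    """A genotype is a pair of alleles; it is heterozygous if the two differ."""
    return genotype[0] != genotype[1]


def locus_heterozygosity(genotypes, locus):
    """Fraction of genotyped individuals that are heterozygous at one locus."""
    typed = [calls[locus] for calls in genotypes.values() if calls[locus] is not None]
    return sum(is_heterozygous(g) for g in typed) / len(typed)


def st_mlh(genotypes, individual):
    """Individual heterozygosity divided by mean population heterozygosity.

    Both quantities are computed over the loci typed in this individual
    (an assumption of this sketch); None marks a missing genotype.
    """
    calls = genotypes[individual]
    typed_loci = [i for i, g in enumerate(calls) if g is not None]
    own = sum(is_heterozygous(calls[i]) for i in typed_loci) / len(typed_loci)
    pop = sum(locus_heterozygosity(genotypes, i) for i in typed_loci) / len(typed_loci)
    return own / pop


# Hypothetical genotypes at four SNPs for three individuals.
genotypes = {
    "ind1": [("A", "G"), ("C", "C"), ("T", "T"), ("A", "A")],
    "ind2": [("A", "A"), ("C", "T"), ("T", "T"), None],
    "ind3": [("A", "G"), ("C", "T"), ("T", "G"), ("A", "C")],
}
scores = {name: st_mlh(genotypes, name) for name in genotypes}
```

Values above 1 indicate an individual that is more heterozygous than the population average at the loci where it was typed, and values below 1 indicate the opposite.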

To assess MHC diversity and complementarity between dyads, we performed PCR amplification of the functionally important antigen-binding regions of two class I MHC loci (A and B) and two class II MHC loci (DQA and DRB). We chose to assess both class I and class II loci because they encode for molecules that are present on different cell types and perform unique functions: class I molecules are present on the surface of nearly all nucleated cells and bind to intracellular pathogens such as viruses, and class II molecules are found on the surface of antigen-presenting cells and bind to extracellular pathogens such as bacteria [67]. Moreover, previous results suggest that in this population, genetic diversity at MHC class I loci is not strongly associated with diversity at class II loci, meaning that cryptic choice mechanisms may favor diversity and/or complementarity at one locus and not the other [49]. We targeted a 195 bp segment within the α1 domain of the class I receptor types (MHC-A and -B), a 188 bp segment within the α1 domain of the DQA receptor, and a 252 bp segment within the β1 domain of the DRB receptor. These sequences make up part of the antigen-binding cleft of each receptor type, and amino acid variation within these regions can result in variable pathogen recognition and binding [68]. We amplified the desired sequences using the MilliporeSigma FastStart High Fidelity PCR System and primers described in Table O in S2 Appendix. Following amplification, we selected the amplicon of the appropriate length using gel electrophoresis and band excision, performed an indexing PCR using Hot Start Pfu DNA Polymerase, and sequenced on the Illumina MiSeq platform with v2 chemistry and 200 bp PE reads. Following sequencing, we trimmed and mapped sequences to MHC-A, -B, -DQA, and -DRB sequences taken from the IPD-MHC database and for each individual retained unique MHC sequences that had over 1,000 reads and were also present at >5% copy number in another individual.
We calculated MHC diversity and complementarity for class I and class II loci separately. We calculated an individual’s MHC allelic diversity as the number of unique MHC alleles and calculated MHC allelic complementarity as the number of MHC alleles shared between two individuals divided by the total number of unique MHC alleles possessed by the two individuals in total.
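As a concrete illustration of these two definitions, the following Python sketch computes allelic diversity and the shared-allele proportion for one dyad. The allele labels and function names are hypothetical. Note that, as the metric is used in the Results, a higher shared proportion corresponds to a genetically more similar (less complementary) pair.

```python
def allelic_diversity(alleles):
    """Number of unique MHC alleles carried by one individual."""
    return len(set(alleles))


def shared_allele_proportion(alleles_a, alleles_b):
    """Alleles shared by two individuals divided by all unique alleles they carry."""
    a, b = set(alleles_a), set(alleles_b)
    return len(a & b) / len(a | b)


# Hypothetical class I allele sets for one female and one male.
female = ["A*01", "A*02", "B*01", "B*03"]
male = ["A*02", "A*05", "B*03", "B*07"]

male_diversity = allelic_diversity(male)
dyad_sharing = shared_allele_proportion(female, male)
```

Here the pair shares two of six unique alleles, a proportion of about 0.33.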

Identification of MHC supertypes

To support the potential biological relevance of our measures of MHC diversity and complementarity, we additionally identified MHC supertypes based on the physiochemical properties of the amino acids involved in antigen binding and calculated MHC diversity and complementarity for each dyad at the supertype level. To do so, we followed methods from [69], which are described briefly below and in detail for this specific dataset in [49]. First, we identified positively selected sites (PSS) within the antigen-binding region of each MHC locus by comparing rates of synonymous (dS) to non-synonymous (dN) nucleotide substitutions in protein-coding regions using methods described by [70]. To do so, we determined sequence reading frames by performing an alignment to published sequences in the IPD-MHC database using NCBI’s basic local alignment search tool (BLAST). We translated aligned sequences in R using the package ‘seqinr’ [71] and performed multiple protein sequence alignment in MAFFT v.7 [72]. We converted protein alignments into codon alignments using PAL2NAL v.14 [73], and constructed a phylogenetic tree of the alignments using randomized axelerated maximum likelihood (RAxML) [74] and a generalized time reversible (GTR) GAMMA substitution model, with the best-scoring tree selected using 100 bootstrap iterations. We then computed substitution rate ratios (dN/dS) by inputting the PAL2NAL codon alignment and RAxML tree into the CODEML program within the Phylogenetic Analysis by Maximum Likelihood (PAML) package [75]. This software identifies statistically significant PSS using the Bayes Empirical Bayes (BEB) analysis computed under NSsite model 8 [76]. Next, we aligned the amino acids associated with each PSS and described the physiochemical properties of each site in the form of five z-descriptors: z1 (hydrophobicity), z2 (steric bulk), z3 (polarity), z4, and z5 (electronic effects) [77].
We compiled a matrix containing the five z-scores of each PSS of each allele and performed an agglomerative hierarchical clustering analysis using Euclidean distance and the average linkage method with the R function ‘hclust’ in the ‘stats’ package [78]. We used the R package ‘dynamicTreeCut’ [79] to identify significant clusters, while specifying a minimum cluster size of 2 [80]. These methods for determining MHC supertypes have been shown to identify biologically relevant variation in MHC allele functionality in both human and non-human primate studies [69,81–84]. We calculated an individual’s MHC supertype diversity as the number of unique MHC supertypes and calculated MHC supertype complementarity as the number of MHC supertypes shared between two individuals divided by the total number of unique supertypes possessed by the two individuals. Using our genome-wide metrics, as well as our allele-based and supertype-based MHC descriptors, we calculated in total 5 metrics of diversity and 5 metrics of genetic complementarity for each individual, summarized in Table I in S2 Appendix.
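The clustering step can be sketched in plain Python as follows. This is an illustration only, not the authors' R workflow: the z-score vectors are invented, and a fixed number of clusters stands in for the adaptive cut performed by ‘dynamicTreeCut’, which is not reproduced here.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length z-score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def average_linkage(vectors, n_clusters):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until n_clusters remain. The fixed n_clusters is an assumption
    replacing the dynamic tree cut used in the study.
    """
    clusters = [[name] for name in vectors]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
            d = sum(euclidean(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


# Hypothetical z1..z5 values for a single PSS in four alleles (not study data).
z_scores = {
    "allele_1": [0.1, 0.2, 0.0, 0.1, 0.3],
    "allele_2": [0.2, 0.1, 0.1, 0.0, 0.3],
    "allele_3": [2.0, 1.9, 2.1, 2.0, 1.8],
    "allele_4": [2.1, 2.0, 2.0, 1.9, 1.9],
}

supertypes = average_linkage(z_scores, n_clusters=2)
print(supertypes)  # [['allele_1', 'allele_2'], ['allele_3', 'allele_4']]
```

With several PSS per allele, each vector would simply be longer (five z-descriptors per site), and the distance and linkage calculations would be unchanged.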

Vaginal RNA sample collection

We collected vaginal RNA samples (N = 307 samples, 34.1 ± 2 samples per female) every other day throughout sexual skin tumescence and detumescence, and every 3 days throughout the rest of the cycle. To collect RNA samples, we inserted a sterile cotton swab into the vaginal opening (~2 to 3 inches) and rotated for 10 seconds. Once removed, we immediately placed the swab into a 1.5 mL DNA lo-bind collection tube containing 500 µl of Qiagen RNA Protect cell reagent and placed it into a cooler for transport back to the lab within one hour. When taking a post-copulatory sample, we removed any visible sperm plug from the vaginal opening using autoclaved forceps before inserting the cotton swab. We performed a piggyback centrifugation to transfer the vaginal cells in solution from the sample collection tube (containing the swab) into a new cryotube and froze at −80°C.

Vaginal pH measurement

We measured the vaginal pH of each female daily (N = 359 measurements, 39.9 ± 6.4 measurements per female), using an ISFET probe and portable SI400 pH meter (Sentron). Prior to each sampling, we calibrated the probe using pH 4 and pH 7 buffers. The linear relationship between the raw voltage reading and the pH values of the known calibration solutions always fell between 95% and 105%, indicating proper function of the probe. We collected three sequential pH measures to determine an average pH reading for each female each day. In the case that a female was not cooperative in taking three separate readings, we instead took only two (n = 7) or one (n = 5). To do so, we inserted the probe into the vaginal opening (~2 to 3 inches) and waited for the reading to stabilize (~6 to 10 s) before recording the value. The ISFET probe simultaneously measures temperature and performs an automatic temperature compensation correction to account for differences in temperature between readings. We refrained from taking pH readings within 30 min following urination, and took post-copulatory measurements (npost-cop = 15) immediately following post-copulatory RNA sampling, approximately 4 hours after an observed copulation with ejaculation. Between sequential readings from the same individual, we cleaned the probe using deionized water. Between individuals, we cleaned the probe with 70% ethanol and deionized water. The probe was stored overnight in a pH 7 buffer, as per the manufacturer’s instructions.

Vaginal cytology

We predicted the timing of ovulation using vaginal cytology. We collected vaginal swabs for cytological slides by inserting a sterile cotton swab into the posterior vagina and rotating it for 10 s before removal. We prepared slides by rolling the swab across a glass microscope slide, applying a spray fixative (CytoRAL), and staining slides with a commercially available simplified Harris-Schorr staining kit (Diagnoestrus; RAL Diagnostics). Vaginal epithelial cells undergo characteristic cyclical changes throughout the ovarian cycle, allowing cycle phase to be determined by the proportion of cell types present on each slide (Fig F in S1 Appendix) [85,86]. Approaching ovulation, white blood cells (WBCs) and mucus are present, and the proportion of large, geometric superficial cells gradually increases. Ovulation is detected by a sharp drop in the proportion of red-staining superficial cells, quantified by assessing the stained color of 100 cells and calculating the eosinophilic index (EI) as the number of red cells plus 0.5 times the number of red/blue (polychromatophilic) cells (Fig G in S1 Appendix) [87]. In addition to this quantitative measure, ovulation is also qualitatively associated with the disappearance of WBCs and mucus. The postovulatory phase is characterized by the return of WBCs and mucus, cellular clumping, and a return of basal and intermediate cell types.
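As a worked example of the EI calculation, the sketch below assumes that exactly 100 cells are scored per slide, as described above. The function name and the example counts are hypothetical.

```python
def eosinophilic_index(n_red, n_polychromatophilic, n_scored=100):
    """EI = red cells + 0.5 * red/blue (polychromatophilic) cells.

    Counts are assumed to come from a fixed tally of n_scored cells.
    """
    if n_red < 0 or n_polychromatophilic < 0:
        raise ValueError("cell counts cannot be negative")
    if n_red + n_polychromatophilic > n_scored:
        raise ValueError("red and red/blue cells exceed the cells scored")
    return n_red + 0.5 * n_polychromatophilic


# Hypothetical slide: 40 red cells and 20 red/blue cells among 100 scored.
print(eosinophilic_index(40, 20))  # 50.0
```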

From these patterns, we identified a 2-day ovulation window as the day of ovulation and the previous day. We then defined a 5-day fertile phase as the two days prior to and one day following the 2-day ovulation window [40]. We classified the 5 days preceding the fertile phase as the pre-fertile phase and the 5 days following the fertile phase as the post-fertile phase. This method for pre-fertile, fertile, and post-fertile phase classification is well established in the primatological literature, and has been used in numerous studies with respect to non-human primate sexual swellings and behavior [38–40]. The evaluation of cytological slides to determine ovarian cycle phase has been used with great success in this study population [88,89].
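The phase definitions above translate into a simple lookup on the day offset from ovulation. In this sketch, day 0 is the inferred day of ovulation, so the 2-day ovulation window covers days -1 and 0. Labeling every remaining day as "non-fertile" is an assumption based on the four phase names used in the later analyses.

```python
def classify_cycle_day(offset):
    """Assign a cycle phase from the day offset relative to ovulation.

    offset = 0 is the inferred day of ovulation (an indexing assumption).
    Ovulation window: days -1 and 0.
    Fertile phase: two days before the window through one day after it.
    """
    if -3 <= offset <= 1:
        return "fertile"
    if -8 <= offset <= -4:
        return "pre-fertile"   # the 5 days preceding the fertile phase
    if 2 <= offset <= 6:
        return "post-fertile"  # the 5 days following the fertile phase
    return "non-fertile"       # assumed label for all remaining days


for day in (-9, -8, -4, -3, 0, 1, 2, 6, 7):
    print(day, classify_cycle_day(day))
```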

Semen sample collection

To account for male-derived RNA present in post-copulatory vaginal RNA samples, we collected two masturbatory semen samples from each male (N = 8 samples) and conducted RNA sequencing. To do so, we collected coagulated semen left on the enclosure substrate immediately following an observed masturbation. We used autoclaved tweezers to place the sample into a 5 mL lo-bind Eppendorf tube and immediately transported it back to the lab. Under a sterile fume hood, we removed the solid portion of the ejaculate, measured the volume of the remaining liquid portion using a pipette, added RNAprotect Cell Reagent in a volume 5 times the liquid sample volume, and froze at −80°C. All samples were frozen within 20 min following the time of ejaculation. We used semen samples collected after masturbation to minimize potential contamination from female-derived RNA. Furthermore, collection of semen from the female vaginal tract post-mating would have required disruption of the sperm plug, which was incompatible with vaginal RNA sampling 4 hours later. Although the composition of ejaculates produced via masturbation may differ from those produced in a mating context, this approach represented the most feasible and controlled option for characterizing male-derived RNA in ejaculates.

RNA extraction and sequencing

We extracted RNA from vaginal and semen samples using the Qiagen RNeasy Mini kit, according to the manufacturer’s recommended protocol. We incorporated a preliminary PBS wash of the cells, used a QIAshredder for sample homogenization, performed an on-column DNase digestion to improve quality and concentrations, and measured RNA concentration and integrity using an Agilent TapeStation. Across all collected samples, vaginal sample concentrations ranged from 0.07 to 499.5 ng/µl (mean = 15.4 ± 2.7 ng/µl) and semen sample concentrations ranged from 0.005 to 0.5 ng/µl (mean = 0.18 ± 0.03 ng/µl). Library preparation and sequencing were performed at the University of Calgary’s Centre for Health Genomics and Informatics sequencing core. We sequenced 106 vaginal samples and 8 semen samples with RIN values varying from 1.6 to 9.2 (mean = 5.3 ± 1.7 s.d.). We performed strand-specific library preparation using the NEBNext Ultra II RNA kit with rRNA depletion following the manufacturer’s instructions and performed whole transcriptome sequencing on one NovaSeq6000 S2 100 cycle v1.5 run, generating 50 bp PE reads.

Data processing of RNA-seq libraries

RNA sequencing generated an average of 38.1M (±21.4 s.d.) reads per sample. We trimmed and filtered reads for quality using the program Trimmomatic [90], with the following parameters: -phred33 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36. We used the splice-aware alignment tool STAR [91] to align sequences to the olive baboon reference genome (NCBI: GCA_008728515.1). Due to the degraded nature of our samples, we used the following parameters to allow for shorter alignments, as has been done with success in other studies: --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outFilterMatchNmin 0 [92–94]. Due to issues with paired-end sequence alignment, substantially more reads mapped uniquely in single-end than in paired-end mapping mode (paired-end mode: 3.8 ± 0.56 million uniquely mapped reads per sample, single-end mode: 9.9 ± 0.52 million uniquely mapped reads per sample). To maximize read counts mapped to genomic features, we used only R1 data for all analyses presented here. Studies have demonstrated false positive and false negative discovery rates of approximately 5% each for DE genes when using single-end as opposed to paired-end reads [95]. These small discrepancies can exacerbate differences in identified gene ontology terms, with overlap between single- and paired-end data falling into the range of 40% [95]. To mitigate potential mapping errors due to the large bacterial cell populations present in the vagina, we additionally filtered mapped reads based on their taxonomic classification using Kraken 2 [96]. We built a custom database containing the olive baboon reference genome, as well as all bacterial, archaeal, fungal, protozoal, and viral genomes available through NCBI. We classified sequences using default Kraken 2 parameters, and filtered the STAR mapping results to include only reads that were either confidently classified as olive baboon or not classified as any other type of microorganism.
Due to a high number of reads mapping to bacterial, archaeal, fungal, protozoal, and viral genomes, a mean of 27.8% of reads per sample (9.9M ± 5.6 s.d.) passed our Kraken 2 classification filter and were mapped uniquely to the baboon genome. From the filtered alignment files, we generated read counts of genomic features using the program Rsubread [97] and the Panubis1.0 genome annotation release 104.
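The logic of the taxonomic filter can be sketched as follows. This is not the authors' script. It assumes the standard tab-separated Kraken 2 per-read output (classification status, read ID, taxonomy ID, and further columns) with numeric taxonomy IDs, and it assumes that host reads carry the NCBI taxonomy ID 9555 for Papio anubis; both the function name and the example lines are hypothetical. Any confidence threshold is ignored.

```python
def reads_to_keep(kraken_lines, host_taxid="9555"):
    """Return IDs of reads that are unclassified or assigned to the host.

    host_taxid is an assumption (9555 is used here for Papio anubis);
    reads assigned to any other taxon are dropped.
    """
    keep = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "U" or taxid == host_taxid:
            keep.add(read_id)
    return keep


# Hypothetical Kraken 2 output lines (not study data).
example = [
    "C\tread1\t9555\t50\t9555:16",  # classified as host: keep
    "C\tread2\t562\t50\t562:16",    # classified as a bacterium: drop
    "U\tread3\t0\t50\t0:16",        # unclassified: keep
]

print(sorted(reads_to_keep(example)))  # ['read1', 'read3']
```

The resulting set of read IDs could then be used to subset the alignment file before feature counting.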

Modeling vaginal physiology across cycle phases

To examine how vaginal gene expression differs across cycle phases, we performed differential expression analysis using the edgeR package in R [98]. To do so, we first subset samples to include the 8 from each cycle phase (pre-fertile, fertile, post-fertile, and non-fertile) that had the highest number of uniquely mapped reads (N = 32 samples total). Eight of the 9 study females are represented in the final set of 32 samples, with a mean of 6.25 females included within each cycle phase. By analyzing only a subset of all sequenced samples, we ensured an equal sample number across cycle phases and increased the mean number of uniquely mapped reads across samples from 9.9M to 13.7M. We filtered the list of genes included in the analysis by removing ribosomal protein genes, genes without human orthologs, and genes in which more than half of the samples had less than 10 counts per million, resulting in a mean library size of 4.9M read counts across 3,154 analyzable genes. We normalized library sizes based on the filtered gene list using the edgeR function ‘calcNormFactors’ and transformed count data for linear modeling using the ‘voomWithQualityWeights’ function in the package ‘limma’ [42]. To model our data, we controlled for female ID using the limma function ‘duplicateCorrelation’ with female ID as a blocking variable and included sample RIN and male presence (whether or not the sample was taken during the month prior to male introduction or after the male had been introduced) as covariates. We fit a linear model for each gene using the function ‘lmFit’ and stabilized the variance estimates across genes by applying a robust empirical Bayes moderation to the standard errors of the fitted coefficients using the function ‘eBayes’ and argument ‘robust=TRUE’. 
We then applied empirical Bayes adaptive shrinkage using the ‘ash’ function from the ‘ashr’ package in R, which borrows information across genes to shrink effect size and uncertainty estimates towards zero, generate more robust posterior estimates, and calculate local false sign rate (LFSR) [43]. LFSR is a measure which integrates both effect size and certainty to give the posterior probability that the estimated direction (positive or negative) of an effect is incorrect. This approach is particularly advantageous in small-sample settings when variance estimates are unstable because it downweights imprecise measurements and provides a more reliable alternative to standard FDR corrections [99]. We identified DE genes as those falling below a 10% LFSR, a cut-off which is standard in the field of genomics [100–102], and also report the number of DE genes at more stringent 5% and 1% cutoffs. We performed GSEA using the ‘fgsea’ function in the R package ‘clusterProfiler’ [103] and the Papio anubis Ensembl genome annotations available through biomaRt [104], using a p-value cutoff of 0.05.
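To make the LFSR concrete, the sketch below computes it for a deliberately simplified case in which the posterior for an effect is a single normal distribution with no probability mass at exactly zero. The posterior produced by ‘ash’ is a mixture and is not reproduced here; the function names and numbers are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lfsr_normal(post_mean, post_sd):
    """Local false sign rate under a single-normal posterior (an assumption).

    Returns the posterior probability that the effect has the opposite sign
    to the more probable one, i.e. min(P(effect < 0), P(effect > 0)).
    """
    p_negative = normal_cdf(-post_mean / post_sd)
    return min(p_negative, 1.0 - p_negative)


# Hypothetical posterior summaries for two genes (not study data).
strong = lfsr_normal(1.0, 0.5)  # about 0.023: sign is well determined
weak = lfsr_normal(0.1, 0.5)    # about 0.42: sign is uncertain

for name, value in (("strong", strong), ("weak", weak)):
    print(name, round(value, 4), "DE at 10% LFSR" if value < 0.10 else "not DE")
```

A large effect with a wide posterior and a small effect with a narrow posterior can therefore receive similar LFSR values, which is the sense in which the measure combines effect size and certainty.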

To examine how vaginal pH changes across cycle phases, we conducted robust linear mixed modeling using the R package ‘robustlmm’ [44]. We included cycle phase as a categorical predictor variable with the non-fertile phase as the reference category, vaginal pH (averaged across the 3 measurements for that day) as the response variable, vaginal temperature (averaged across the 3 measurements for that day) as a covariate, and female ID as a random effect. For this analysis, we included only pH measurements in which there was no observed mating or obvious signs of previous mating (i.e., sperm plug present) that day (N = 275 measurements, 30.6 ± 5.41 s.d. per female, 68.8 ± 21.8 s.d. per cycle phase). Visual inspection of quantile–quantile and residual variance plots confirmed homoscedastic residual variance structure and variance inflation factor (VIF) <2 confirmed no issues of collinearity.

Modeling vaginal physiology in response to mating

To examine how vaginal gene expression changes in response to mating, we again performed differential expression analysis using the edgeR package in R. We analyzed 25 post-copulatory and 30 non-copulatory RNA samples, all of which were taken from the pre-fertile, fertile, or post-fertile phases and had greater than 5M uniquely mapped reads. We were able to collect post-copulatory samples from six of the nine females, and thus limited our non-copulatory samples to those six females as well. This resulted in a mean of 4.3 post-copulatory samples per female and 5 non-copulatory samples per female (Table A in S2 Appendix), with 12.3M (± 7 s.d.) uniquely mapped reads for post-copulatory samples and 8.5M (± 4 s.d.) for non-copulatory samples. To account for male-derived RNA present in the vagina, we first removed genes found to be highly expressed in semen samples. From these samples, we identified 1,442 genes highly expressed in semen (average expression of >20 cpm, Table E in S2 Appendix), and removed these genes from all subsequent post-copulatory gene expression analyses. We then filtered the remaining genes by removing ribosomal protein genes, genes without human orthologs, and genes in which more than half of the samples had less than 10 counts per million, resulting in a mean library size of 1.9M read counts across 2,716 analyzable genes. We normalized library sizes based on the filtered gene list, controlled for dyad ID using a blocking variable, and fit a linear model applying a robust empirical Bayes moderation, including sample RIN, cycle phase, and male presence as covariates. The limma function ‘duplicateCorrelation’ supports the inclusion of only a single blocking factor, and thus we included dyad ID as the blocking variable as this uniquely identifies each male-female pair, capturing the repeated measures associated with both individual IDs. We identified DE genes as those falling below a 10% LFSR and performed GSEA as described above.
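The two gene filters described above (removal of genes with mean expression above 20 cpm in semen, then removal of genes with fewer than 10 cpm in more than half of the vaginal samples) can be sketched as below. The counts are invented, the function names are hypothetical, and computing cpm from the full count table before any gene is removed is an assumption of this sketch.

```python
def cpm_table(counts):
    """Convert raw counts (gene -> per-sample list) to counts per million."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(c[s] for c in counts.values()) for s in range(n_samples)]
    return {g: [c[s] * 1_000_000 / lib_sizes[s] for s in range(n_samples)]
            for g, c in counts.items()}


def semen_expressed(semen_counts, threshold=20):
    """Genes whose mean cpm across semen samples exceeds the threshold."""
    return {g for g, v in cpm_table(semen_counts).items()
            if sum(v) / len(v) > threshold}


def filter_genes(vaginal_counts, exclude, min_cpm=10):
    """Drop excluded genes and genes below min_cpm in more than half of samples."""
    kept = []
    for gene, values in cpm_table(vaginal_counts).items():
        if gene in exclude:
            continue
        n_low = sum(v < min_cpm for v in values)
        if n_low <= len(values) / 2:
            kept.append(gene)
    return sorted(kept)


# Hypothetical counts; each sample sums to one million reads for readability.
semen = {"G1": [999_970, 999_960], "G2": [25, 35], "G3": [5, 5]}
vaginal = {
    "G2": [100, 100, 100, 100],                  # semen-expressed: removed
    "G3": [50, 5, 5, 5],                         # low in 3 of 4 samples: removed
    "G4": [20, 20, 5, 5],                        # low in exactly half: kept
    "G5": [999_830, 999_875, 999_890, 999_890],  # highly expressed: kept
}

male_genes = semen_expressed(semen)
print(sorted(male_genes))                  # ['G1', 'G2']
print(filter_genes(vaginal, male_genes))   # ['G4', 'G5']
```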

To examine how vaginal pH differs following mating, we conducted robust linear mixed effects modeling. We used a binary predictor variable (yes or no) indicating whether the pH measurement for that day was a post-copulatory or non-copulatory measurement (N = 62; npost-cop = 15, nnon-cop = 47). Post-copulatory pH measurements were obtained from five out of nine females during their pre-fertile or fertile phase; we therefore limited non-copulatory measurements to those same females and phases (Table A in S2 Appendix). We used pH as the response variable, phase and temperature as covariates, and dyad ID as a random effect. Visual inspection of a quantile-quantile plot revealed greater residual variance in post-copulatory versus non-copulatory samples, which we statistically confirmed with a Breusch-Pagan test using the ‘bptest’ function in the ‘lmtest’ package in R [105]. We used VIF to confirm no issues of collinearity (VIF < 2).

Modeling vaginal physiology in relation to genetic diversity and complementarity

To test how post-copulatory gene expression is related to male genetic diversity and complementarity, we used the same subset of RNA-seq samples described above for our post-copulatory analyses. We fit 10 separate models, each testing for the interactive effect between post-copulatory status (yes or no) and one genotype metric (listed in Table I in S2 Appendix). We controlled for dyad ID using a blocking variable and ran a linear model with robust empirical Bayes moderation on the transformed counts including sample RIN, cycle phase, and male presence as covariates. We identified DE genes as those falling below a 10% LFSR and performed GSEA as described above. Because we tested genotype × post-copulatory status effects across 10 separate models, we applied an additional FWER correction using the Holm method to adjust the p-values for each gene across all models [106]. In the Results, we report how many genes with LFSR < 10% also meet a FWER-adjusted p < 0.05 after this correction.
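The Holm step-down adjustment can be written in a few lines. The sketch below is a generic implementation with invented p-values; the authors presumably used an existing R routine, so the function name and inputs here are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order.

    The i-th smallest p-value (0-based rank i) is multiplied by (m - i),
    capped at 1, and forced to be no smaller than the previous adjusted value.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for one gene across four models (not study data).
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
print([round(p, 3) for p in adj])  # [0.03, 0.06, 0.06, 0.02]
```

In this example, two of the four tests remain below 0.05 after adjustment, whereas all four raw p-values were below 0.05.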

To test whether post-copulatory vaginal pH is related to male genetic diversity and complementarity, we conducted robust linear mixed effects modeling using the same pH measurements as described above for our post-copulatory analyses. We ran 10 separate models, each testing for the interactive effect between post-copulatory status (yes or no) and one genotype metric (listed in Table I in S2 Appendix). We included cycle phase and temperature as covariates and dyad ID as a random effect. We confirmed homoscedastic residual variance structure by visually inspecting quantile–quantile and residual variance plots, and adjusted p-values for multiple hypothesis testing using the Holm FWER correction [106].

To evaluate the robustness of our findings to the influence of individual mating pairs, we conducted a leave-one-out sensitivity analysis. In this approach, we iteratively removed a single mating dyad from our dataset and reran our analyses testing for the effect of male genotype in modulating post-copulatory vaginal gene expression and pH. For each iteration, we recorded the number of DE genes and the interaction effect size estimate and associated confidence intervals. This method allowed us to identify potentially influential observations and quantify the overall stability of our results.
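The leave-one-out procedure can be sketched generically as a loop that drops one dyad at a time and re-estimates the quantity of interest. In this illustration, a simple difference in mean pH stands in for the robust mixed model and the differential expression pipeline, which are not reproduced; the records, dyad labels, and function names are hypothetical.

```python
def leave_one_dyad_out(records, estimator):
    """Re-run an estimator once per dyad, each time excluding that dyad."""
    dyads = sorted({r["dyad"] for r in records})
    return {held_out: estimator([r for r in records if r["dyad"] != held_out])
            for held_out in dyads}


def mean_ph_shift(records):
    """Stand-in estimator: mean post-copulatory pH minus mean non-copulatory pH."""
    post = [r["ph"] for r in records if r["post_cop"]]
    non = [r["ph"] for r in records if not r["post_cop"]]
    return sum(post) / len(post) - sum(non) / len(non)


# Hypothetical measurements from three dyads (not study data).
data = [
    {"dyad": "d1", "post_cop": False, "ph": 6.0},
    {"dyad": "d1", "post_cop": True, "ph": 5.0},
    {"dyad": "d2", "post_cop": False, "ph": 6.0},
    {"dyad": "d2", "post_cop": True, "ph": 5.5},
    {"dyad": "d3", "post_cop": False, "ph": 6.0},
    {"dyad": "d3", "post_cop": True, "ph": 5.8},
]

loo = leave_one_dyad_out(data, mean_ph_shift)
for dyad, estimate in loo.items():
    print(dyad, round(estimate, 2))  # d1 -0.35, d2 -0.6, d3 -0.75
```

An estimate that changes sign or shrinks sharply when one dyad is removed would flag that dyad as influential.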

Supporting information

S1 Appendix

Fig A. GSEA pathways enriched for differential expression in the fertile phase. The “activated” panel represents pathways enriched for genes with heightened expression in the fertile compared to non-fertile phase and the “suppressed” panel represents pathways enriched for genes with lower expression in the fertile compared to non-fertile phase. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data. Fig B. Vaginal pH did not vary significantly between cycle phases. Error bars represent model predictions ± one standard error and points represent the raw data. The data underlying this figure are provided in S5 Data. Fig C. GSEA pathways enriched for differential expression post-copulation. Three gene set pathways are enriched for genes with heightened expression in post-copulatory versus non-copulatory contexts. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data. Fig D. GSEA pathways enriched for differential expression in relation to MHC I allelic complementarity. Pathways enriched for genes with heightened expression after mating with males with low complementarity. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data. Fig E. Leave-one-out sensitivity analysis of vaginal pH model estimates. Shown are estimated interaction effects between post-copulatory status and male genotype on vaginal pH across leave-one-out iterations. Points represent the estimated interaction term (post-copulatory status × male genotype), and error bars denote 95% confidence intervals. The genotype included in each model is indicated in the facet title. The data underlying this figure are provided in S5 Data. Fig F. 
Vaginal epithelial cells stained using a modified Harris-Schorr technique. Epithelial cell types include basal cells (A), intermediate cells (B), polychromatophilic superficial cells (C), and eosinophilic superficial cells (D). The preovulatory phase (E) exhibits a gradual increase in the proportion of red to blue staining cells, and the post-ovulatory phase (F) is characterized by an increase in cellular clumping, mucus, and WBCs. Fig G. Composite profile demonstrating fluctuations in EI over the course of the ovarian cycle (N = 22 cycles). Points represent mean EI values on each day in relation to ovulation, and error bars represent the standard error of the mean. The two-day ovulation window is designated by consecutive 0’s on the x axis and is shaded in red. The days leading up to ovulation are designated by negative numbers and the days following ovulation designated by positive numbers. The data underlying this figure are provided in S5 Data.

(DOCX)

pbio.3003699.s001.docx (1.4MB, docx)
S2 Appendix

Table A. Description of dataset composition for vaginal pH and gene expression analyses. Total number of samples, as well as number of samples per cycle phase or copulatory status for cycle phase and post-copulatory analyses, respectively. Measures of male genetic heterozygosity and complementarity are described in Table I in S2 Appendix. Table B. Differentially expressed genes- between phases. Genes with significant (FDR < 10%) differential expression in pairwise comparisons between cycle phases. Negative coefficients correspond to lower expression in the phase listed first in the comparison column, and positive coefficients correspond to higher expression in the phase listed first. Table C. Gene set enrichment analysis- between phases. GO pathways that are overrepresented among genes that are differentially expressed between the fertile and non-fertile phase. Table D. Vaginal pH linear model results- cycle phase. Table E. Genes highly expressed in semen. Genes with an average of >20 cpm in semen samples. Table F. Differentially expressed genes- post-copulatory versus non-copulatory. Genes with significant (LFSR < 10%) differential expression in post-copulatory versus non-copulatory samples. Negative coefficients correspond to lower expression post-copulation and positive coefficients correspond to higher expression post-copulation. Table G. Gene set enrichment analysis- post-copulatory versus non-copulatory. GO pathways that are overrepresented among genes that are differentially expressed between post-copulatory versus non-copulatory samples. Table H. Vaginal pH linear model results- post-copulatory status. Table I. Measures of male genetic diversity and dyadic complementarity. Table J. Differentially expressed genes- male genetic diversity. Genes with a significant (FDR < 10%) interactive effect between post-copulatory status and one metric of male genetic diversity (listed in the “genotype metric” column). Table K.
Differentially expressed genes- genetic complementarity. Genes with a significant (FDR < 10%) interactive effect between post-copulatory status and one metric of genetic complementarity (listed in the “genotype metric” column). Table L. Gene set enrichment analysis- male genetic diversity and complementarity. Pathways with significant enrichment (FDR < 10%) for an interactive effect between post-copulatory status and one metric of male genetic diversity or complementarity (listed in the “genotype metric” column). Table M. Number of genes/pathways whose post-copulatory expression is significantly modified by an aspect of male genetic makeup across five mating dyads. Number of differentially expressed (DE) genes passing an LFSR < 0.1, LFSR < 0.05, and LFSR < 0.01 threshold, number of LFSR < 0.1 genes which also pass a family-wise error rate (FWER)-adjusted p < 0.05 threshold, and number of gene set enrichment analysis (GSEA) pathways at a p < 0.05 threshold. Table N. Vaginal pH linear model results- male genetic diversity and complementarity. Interactive effect of post-copulatory status and each measure of male genetic diversity or complementarity in predicting vaginal pH. Each genetic metric was tested in a separate model. Table O. Primer sequences used to amplify MHC A, B, DQA, and DRB loci.

(XLSX)

pbio.3003699.s002.xlsx (888.3KB, xlsx)
S1 Data. Data underlying main figure Fig 1B–1F.

(XLSX)

pbio.3003699.s003.xlsx (1.9MB, xlsx)
S2 Data. Data underlying main figure Fig 2B–2F.

(XLSX)

pbio.3003699.s004.xlsx (4.9MB, xlsx)
S3 Data. Data underlying main figure Fig 3A–3D.

(XLSX)

pbio.3003699.s005.xlsx (2.8MB, xlsx)
S4 Data. Data underlying main figure Fig 4A–4C.

(XLSX)

pbio.3003699.s006.xlsx (53.9KB, xlsx)
S5 Data. Data underlying Figs A, B, C, D, E and G in S1 Appendix.

(XLSX)

pbio.3003699.s007.xlsx (103.6KB, xlsx)

Acknowledgments

We would like to thank members of the Primate Hormones and Behavior lab at NYU, the Primate Genetics Lab at DPZ, and the Melin lab at the University of Calgary for their support in completing this work. We extend a huge thank you to all of the staff at the CNRS Station de Primatologie for their assistance in executing this project, specifically Romain Lacoste, Slaveia Garbit, Magali Ghirart, Pascaline Boitelle, and Pau Molina. Thank you to Beth Archie and Cliff Jolly for their valuable feedback throughout the formulation and execution of this project. Thank you to Kristi Holt for collecting data and Stefano Vaglio for his insight into baboon training. Thank you to Patrícia Ströher, Gwen Duytschaever, and the University of Calgary’s Centre for Health Genomics and Informatics sequencing core for facilitating RNA preparation and sequencing. This work was supported in part through the NYU IT High Performance Computing resources, services, and staff expertise.

Abbreviations

BEB

Bayes Empirical Bayes

BLAST

basic local alignment search tool

CFC

cryptic female choice

ddRAD-seq

double digest restriction-site associated DNA sequencing

DE

differentially expressed

EI

eosinophilic index

GSEA

gene set enrichment analysis

LFSR

local false sign rate

MHC

major histocompatibility complex

PAML

Phylogenetic Analysis by Maximum Likelihood

PSS

positively selected sites

RAxML

randomized axelerated maximum likelihood

stMLH

standardized multi-locus heterozygosity

VIF

variance inflation factor

WBCs

white blood cells

Data Availability

The sequencing reads generated in this study have been submitted to the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) under accession numbers PRJNA875430 (ddRAD sequences) and PRJNA1232174 (RNA sequences), and in GenBank (https://www.ncbi.nlm.nih.gov/genbank/) under accession numbers OP375715-OP375798 (MHC sequences). Counts matrices, pH measurements, and metadata needed to rerun all code are available at Zenodo (https://zenodo.org/records/14976902). Code is available at Zenodo (https://zenodo.org/records/18705035) and at RMP’s personal GitHub page: www.github.com/rachpetersen/cryptic_choice_anubis.git. Data used to generate figures are available in S1–S5 Data.

Funding Statement

This study was funded by: National Science Foundation Doctoral Dissertation Research Improvement Grant (grant no. 1826804) to RMP and JPH, Wenner-Gren Foundation Dissertation Fieldwork Grant (grant #9921) to RMP, Leakey Foundation Research Grant to RMP, Animal Behavior Society Student Research Grant to RMP, Primate Society of Great Britain Primate Research Grant to RMP, International Primatological Society Research Grant to RMP, American Society of Mammalogists Grant-in-Aid to RMP, Society for Integrative and Comparative Biology Research Grant to RMP, Sigma Xi Grant-in-Aid of Research to RMP, New York University Intramural Funds to JPH, and the Canada Research Chairs program (grant no. 950-231257) to ADM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Darwin C. The descent of man, and selection in relation to sex. London: John Murray. 1871.
  • 2. Berglund A, Bisazza A, Pilastro A. Armaments and ornaments: an evolutionary explanation of traits of dual utility. Biol J Linn Soc. 1996;58(4):385–99. doi: 10.1111/j.1095-8312.1996.tb01442.x
  • 3. Coleman SW, Patricelli GL, Borgia G. Variable female preferences drive complex male displays. Nature. 2004;428(6984):742–5. doi: 10.1038/nature02419
  • 4. Prum RO. Aesthetic evolution by mate choice: Darwin’s really dangerous idea. Philos Trans R Soc Lond B Biol Sci. 2012;367(1600):2253–65. doi: 10.1098/rstb.2011.0285
  • 5. Eberhard WG. Female control: sexual selection by cryptic female choice. Princeton University Press; 1996.
  • 6. Firman RC, Gasparini C, Manier MK, Pizzari T. Postmating female control: 20 years of cryptic female choice. Trends Ecol Evol. 2017;32(5):368–82. doi: 10.1016/j.tree.2017.02.010
  • 7. Marie-Orleach L, Vellnow N, Schärer L. The repeatable opportunity for selection differs between pre- and postcopulatory fitness components. Evol Lett. 2020;5(1):101–14. doi: 10.1002/evl3.210
  • 8. Rosenthal GG, Ryan MJ. Sexual selection and the ascent of women: mate choice research since Darwin. Science. 2022;375(6578):eabi6308. doi: 10.1126/science.abi6308
  • 9. Martín-Coello J, Benavent-Corai J, Roldan ERS, Gomendio M. Sperm competition promotes asymmetries in reproductive barriers between closely related species. Evolution. 2009;63(3):613–23. doi: 10.1111/j.1558-5646.2008.00585.x
  • 10. Sutter A, Lindholm AK. No evidence for female discrimination against male house mice carrying a selfish genetic element. Curr Zool. 2016;62(6):675–85. doi: 10.1093/cz/zow063
  • 11. Coltman DW, Bancroft DR, Robertson A, Smith JA, Clutton-Brock TH, Pemberton JM. Male reproductive success in a promiscuous mammal: behavioural estimates compared with genetic paternity. Mol Ecol. 1999;8(7):1199–209. doi: 10.1046/j.1365-294x.1999.00683.x
  • 12. Curie-Cohen M, Yoshihara D, Luttrell L, Benforado K, MacCluer JW, Stone WH. The effects of dominance on mating behavior and paternity in a captive troop of rhesus monkeys (Macaca mulatta). Am J Primatol. 1983;5(2):127–38. doi: 10.1002/ajp.1350050204
  • 13. Stern BR, Smith DG. Sexual behaviour and paternity in three captive groups of rhesus monkeys (Macaca mulatta). Anim Behav. 1984;32:23–32.
  • 14. Hardy MP, Dent JN. Transport of sperm within the cloaca of the female red-spotted newt. J Morphol. 1986;190(3):259–70. doi: 10.1002/jmor.1051900303
  • 15. Roldan ER, Vitullo AD, Merani MS, Von Lawzewitsch I. Cross fertilization in vivo and in vitro between three species of vesper mice, Calomys (Rodentia, Cricetidae). J Exp Zool. 1985;233(3):433–42. doi: 10.1002/jez.1402330312
  • 16. Yeates SE, Diamond SE, Einum S, Emerson BC, Holt WV, Gage MJG. Cryptic choice of conspecific sperm controlled by the impact of ovarian fluid on sperm swimming behavior. Evolution. 2013;67(12):3523–36. doi: 10.1111/evo.12208
  • 17. Firman RC, Simmons LW. Gametic interactions promote inbreeding avoidance in house mice. Ecol Lett. 2015;18(9):937–43. doi: 10.1111/ele.12471
  • 18. Almiñana C, Caballero I, Heath PR, Maleki-Dizaji S, Parrilla I, Cuello C, et al. The battle of the sexes starts in the oviduct: modulation of oviductal transcriptome by X and Y-bearing spermatozoa. BMC Genomics. 2014;15(1):293. doi: 10.1186/1471-2164-15-293
  • 19. Sharkey DJ, Macpherson AM, Tremellen KP, Robertson SA. Seminal plasma differentially regulates inflammatory cytokine gene expression in human cervical and vaginal epithelial cells. Mol Hum Reprod. 2007;13(7):491–501. doi: 10.1093/molehr/gam028
  • 20. Fitzpatrick JL, Willis C, Devigili A, Young A, Carroll M, Hunter HR, et al. Chemical signals from eggs facilitate cryptic female choice in humans. Proc Biol Sci. 2020;287(1928):20200805. doi: 10.1098/rspb.2020.0805
  • 21.Forsberg LA, Dannewitz J, Petersson E, Grahn M. Influence of genetic dissimilarity in the reproductive success and mate choice of brown trout – females fishing for optimal MHC dissimilarity. J Evol Biol. 2007;20(5):1859–69. doi: 10.1111/j.1420-9101.2007.01380.x [DOI] [PubMed] [Google Scholar]
  • 22.Thoss M, Ilmonen P, Musolf K, Penn DJ. Major histocompatibility complex heterozygosity enhances reproductive success. Mol Ecol. 2011;20(7):1546–57. doi: 10.1111/j.1365-294X.2011.05009.x [DOI] [PubMed] [Google Scholar]
  • 23.Kalbe M, Eizaguirre C, Dankert I, Reusch TBH, Sommerfeld RD, Wegner KM, et al. Lifetime reproductive success is maximized with optimal major histocompatibility complex diversity. Proc Biol Sci. 2009;276(1658):925–34. doi: 10.1098/rspb.2008.1466 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Sauermann U, Nürnberg P, Bercovitch FB, Berard JD, Trefilov A, Widdig A, et al. Increased reproductive success of MHC class II heterozygous males among free-ranging rhesus macaques. Hum Genet. 2001;108(3):249–54. doi: 10.1007/s004390100485 [DOI] [PubMed] [Google Scholar]
  • 25.Schwensow N, Fietz J, Dausmann K, Sommer S. MHC-associated mating strategies and the importance of overall genetic diversity in an obligate pair-living primate. Evol Ecol. 2008;22:617–36. [Google Scholar]
  • 26.Setchell JM, Charpentier MJE, Abbott KM, Wickings EJ, Knapp LA. Opposites attract: MHC-associated mate choice in a polygynous primate. J Evol Biol. 2010;23(1):136–48. doi: 10.1111/j.1420-9101.2009.01880.x [DOI] [PubMed] [Google Scholar]
  • 27.Huchard E, Baniel A, Schliehe-Diecks S, Kappeler PM. MHC-disassortative mate choice and inbreeding avoidance in a solitary primate. Mol Ecol. 2013;22(15):4071–86. doi: 10.1111/mec.12349 [DOI] [PubMed] [Google Scholar]
  • 28.Yang B, Ren B, Xiang Z, Yang J, Yao H, Garber PA, et al. Major histocompatibility complex and mate choice in the polygynous primate: the Sichuan snub-nosed monkey (Rhinopithecus roxellana). Integr Zool. 2014;9(5):598–612. doi: 10.1111/1749-4877.12084 [DOI] [PubMed] [Google Scholar]
  • 29.Chaves PB, Strier KB, Di Fiore A. Paternity data reveal high MHC diversity among sires in a polygynandrous, egalitarian primate. Proc Biol Sci. 2023;290(2004):20231035. doi: 10.1098/rspb.2023.1035 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Gasparini C, Congiu L, Pilastro A. Major histocompatibility complex similarity and sexual selection: different does not always mean attractive. Mol Ecol. 2015;24(16):4286–95. doi: 10.1111/mec.13222 [DOI] [PubMed] [Google Scholar]
  • 31.Løvlie H, Gillingham MAF, Worley K, Pizzari T, Richardson DS. Cryptic female choice favours sperm from major histocompatibility complex-dissimilar males. Proc Biol Sci. 2013;280(1769):20131296. doi: 10.1098/rspb.2013.1296 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Rülicke T, Chapuisat M, Homberger FR, Macas E, Wedekind C. MHC-genotype of progeny influenced by parental infection. Proc Biol Sci. 1998;265(1397):711–6. doi: 10.1098/rspb.1998.0351 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Yeates SE, Einum S, Fleming IA, Megens H-J, Stet RJM, Hindar K, et al. Atlantic salmon eggs favour sperm in competition that have similar major histocompatibility alleles. Proc Biol Sci. 2009;276(1656):559–66. doi: 10.1098/rspb.2008.1257 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Higham JP, Semple S, MacLarnon A, Heistermann M, Ross C. Female reproductive signaling, and male mating behavior, in the olive baboon. Horm Behav. 2009;55(1):60–7. doi: 10.1016/j.yhbeh.2008.08.007 [DOI] [PubMed] [Google Scholar]
  • 35.Jolly CJ, Phillips-Conroy JE. Testicular size, mating system, and maturation schedules in wild Anubis and Hamadryas baboons. Int J Primatol. 2003;24:125–42. [Google Scholar]
  • 36.Olmsted SS, Dubin NH, Cone RA, Moench TR. The rate at which human sperm are immobilized and killed by mild acidity. Fertil Steril. 2000;73(4):687–93. doi: 10.1016/s0015-0282(99)00640-8 [DOI] [PubMed] [Google Scholar]
  • 37.Schjenken JE, Robertson SA. The female response to seminal fluid. Physiol Rev. 2020;100(3):1077–117. doi: 10.1152/physrev.00013.2018 [DOI] [PubMed] [Google Scholar]
  • 38.Deschner T, Heistermann M, Hodges K, Boesch C. Female sexual swelling size, timing of ovulation, and male behavior in wild West African chimpanzees. Horm Behav. 2004;46(2):204–15. doi: 10.1016/j.yhbeh.2004.03.013 [DOI] [PubMed] [Google Scholar]
  • 39.Higham JP, Heistermann M, Ross C, Semple S, Maclarnon A. The timing of ovulation with respect to sexual swelling detumescence in wild olive baboons. Primates. 2008;49(4):295–9. doi: 10.1007/s10329-008-0099-9 [DOI] [PubMed] [Google Scholar]
  • 40.Young C, Majolo B, Heistermann M, Schülke O, Ostner J. Male mating behaviour in relation to female sexual swellings, socio-sexual behaviour and hormonal changes in wild Barbary macaques. Horm Behav. 2013;63(1):32–9. doi: 10.1016/j.yhbeh.2012.11.004 [DOI] [PubMed] [Google Scholar]
  • 41.Wilcox AJ, Dunson D, Baird DD. The timing of the “fertile window” in the menstrual cycle: day specific estimates from a prospective study. BMJ. 2000;321(7271):1259–62. doi: 10.1136/bmj.321.7271.1259 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47. doi: 10.1093/nar/gkv007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Stephens M, Carbonetto P, Dai C, Gerard D, Lu M. ashr: methods for adaptive shrinkage, using Empirical Bayes. 2020.
  • 44.Koller M. Robustlmm: an R package for robust estimation of linear mixed-effects models. J Stat Softw. 2016;75:1–24. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Cook JD, Strauss KA, Caplan YH, Lodico CP, Bush DM. Urine pH: the effects of time and temperature after collection. J Anal Toxicol. 2007;31(8):486–96. doi: 10.1093/jat/31.8.486 [DOI] [PubMed] [Google Scholar]
  • 46.Karlsson AH, Rosenvold K. The calibration temperature of pH-glass electrodes: significance for meat quality classification. Meat Sci. 2002;62(4):497–501. doi: 10.1016/s0309-1740(02)00037-2 [DOI] [PubMed] [Google Scholar]
  • 47.Miller EA, Beasley DE, Dunn RR, Archie EA. Lactobacilli dominance and vaginal pH: why is the human vaginal microbiome unique? Front Microbiol. 2016;7:1936. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Miller EA, Livermore JA, Alberts SC, Tung J, Archie EA. Ovarian cycling and reproductive state shape the vaginal microbiota in wild baboons. Microbiome. 2017;5(1):8. doi: 10.1186/s40168-017-0228-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Petersen RM, Bergey CM, Roos C, Higham JP. Relationship between genome-wide and MHC class I and II genetic diversity and complementarity in a nonhuman primate. Ecol Evol. 2022;12(10):e9346. doi: 10.1002/ece3.9346 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Munoz-Suano A, Hamilton AB, Betz AG. Gimme shelter: the immune system during pregnancy: the immunology of pregnancy. Immunol Rev. 2011;241:20–38. [DOI] [PubMed] [Google Scholar]
  • 51.Wira CR, Rodriguez-Garcia M, Patel MV, Biswas N, Fahey JV. Endocrine regulation of the mucosal immune system in the female reproductive tract. Mucosal Immunology. 2015. p. 2141–56. doi: 10.1016/b978-0-12-415847-4.00110-5 [DOI] [Google Scholar]
  • 52.Wagner RD, Johnson SJ. Probiotic lactobacillus and estrogen effects on vaginal epithelial gene expression responses to Candida albicans. J Biomed Sci. 2012;19(1):58. doi: 10.1186/1423-0127-19-58 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Wigby S, Sirot LK, Linklater JR, Buehner N, Calboli FCF, Bretman A, et al. Seminal fluid protein allocation and male reproductive success. Curr Biol. 2009;19(9):751–7. doi: 10.1016/j.cub.2009.03.036 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Robertson SA. Seminal plasma and male factor signalling in the female reproductive tract. Cell Tissue Res. 2005;322(1):43–52. doi: 10.1007/s00441-005-1127-3 [DOI] [PubMed] [Google Scholar]
  • 55.Scherjon S, Lashley L, van der Hoorn M-L, Claas F. Fetus specific T cell modulation during fertilization, implantation and pregnancy. Placenta. 2011;32 Suppl 4:S291-7. doi: 10.1016/j.placenta.2011.03.014 [DOI] [PubMed] [Google Scholar]
  • 56.Schjenken JE, Robertson SA. Seminal fluid and immune adaptation for pregnancy – comparative biology in mammalian species. Reprod Domest Anim. 2014;49 Suppl 3:27–36. doi: 10.1111/rda.12383 [DOI] [PubMed] [Google Scholar]
  • 57.Lee JY, Lee M, Lee SK. Role of endometrial immune cells in implantation. Clin Exp Reprod Med. 2011;38(3):119–25. doi: 10.5653/cerm.2011.38.3.119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Robertson SA, Care AS, Moldenhauer LM. Regulatory T cells in embryo implantation and the immune response to pregnancy. J Clin Invest. 2018;128(10):4224–35. doi: 10.1172/JCI122182 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Robertson SA, Prins JR, Sharkey DJ, Moldenhauer LM. Seminal fluid and the generation of regulatory T cells for embryo implantation. Am J Reprod Immunol. 2013;69(4):315–30. doi: 10.1111/aji.12107 [DOI] [PubMed] [Google Scholar]
  • 60.Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE. Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS One. 2012;7(5):e37135. doi: 10.1371/journal.pone.0037135 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv. 2013. http://arxiv.org/abs/1303.3997 [Google Scholar]
  • 62.Rochette NC, Rivera-Colón AG, Catchen JM. Stacks 2: analytical methods for paired-end sequencing improve RADseq-based population genomics. Mol Ecol. 2019;28(21):4737–54. doi: 10.1111/mec.15253 [DOI] [PubMed] [Google Scholar]
  • 63.Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75. doi: 10.1086/519795 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Coltman DW, Pilkington JG, Smith JA, Pemberton JM. Parasite-mediated selection against inbred Soay sheep in a free-living island population. Evolution. 1999;53(4):1259–67. doi: 10.1111/j.1558-5646.1999.tb04538.x [DOI] [PubMed] [Google Scholar]
  • 65.Alho JS, Välimäki K, Merilä J. Rhh: an R extension for estimating multilocus heterozygosity and heterozygosity-heterozygosity correlation. Mol Ecol Resour. 2010;10(4):720–2. doi: 10.1111/j.1755-0998.2010.02830.x [DOI] [PubMed] [Google Scholar]
  • 66.Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen W-M. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867–73. doi: 10.1093/bioinformatics/btq559 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Piertney SB, Oliver MK. The evolutionary ecology of the major histocompatibility complex. Heredity. 2006;96(1):7–21. doi: 10.1038/sj.hdy.6800724 [DOI] [PubMed] [Google Scholar]
  • 68.Hughes AL, Yeager M. Natural selection and the evolutionary history of major histocompatibility complex loci. Front Biosci. 1998;3:d509-16. doi: 10.2741/a298 [DOI] [PubMed] [Google Scholar]
  • 69.Schwensow N, Fietz J, Dausmann KH, Sommer S. Neutral versus adaptive genetic variation in parasite resistance: importance of major histocompatibility complex supertypes in a free-ranging primate. Heredity. 2007;99(3):265–77. doi: 10.1038/sj.hdy.6800993 [DOI] [PubMed] [Google Scholar]
  • 70.Goodswen SJ, Kennedy PJ, Ellis JT. A gene-based positive selection detection approach to identify vaccine candidates using Toxoplasma gondii as a test case protozoan pathogen. Front Genet. 2018;9:332. doi: 10.3389/fgene.2018.00332 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Charif D, Lobry JR. SeqinR 1.0-2: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. Biological and Medical Physics, Biomedical Engineering. Springer Berlin Heidelberg. 2007. p. 207–32. doi: 10.1007/978-3-540-35306-5_10 [DOI] [Google Scholar]
  • 72.Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80. doi: 10.1093/molbev/mst010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34 Suppl 2:W609-12. doi: 10.1093/nar/gkl315 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. doi: 10.1093/bioinformatics/btu033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91. doi: 10.1093/molbev/msm088 [DOI] [PubMed] [Google Scholar]
  • 76.Yang Z, Wong WSW, Nielsen R. Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22(4):1107–18. doi: 10.1093/molbev/msi097 [DOI] [PubMed] [Google Scholar]
  • 77.Sandberg M, Eriksson L, Jonsson J, Sjöström M, Wold S. New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem. 1998;41(14):2481–91. doi: 10.1021/jm9700575 [DOI] [PubMed] [Google Scholar]
  • 78.R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021. [Google Scholar]
  • 79.Langfelder P, Zhang B, Horvath S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics. 2008;24(5):719–20. doi: 10.1093/bioinformatics/btm563 [DOI] [PubMed] [Google Scholar]
  • 80.Greenbaum J, Sidney J, Chung J, Brander C, Peters B, Sette A. Functional classification of class II human leukocyte antigen (HLA) molecules reveals seven different supertypes and a surprising degree of repertoire sharing across supertypes. Immunogenetics. 2011;63(6):325–35. doi: 10.1007/s00251-011-0513-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Lund O, Nielsen M, Kesmir C, Petersen AG, Lundegaard C, Worning P, et al. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics. 2004;55(12):797–810. doi: 10.1007/s00251-004-0647-4 [DOI] [PubMed] [Google Scholar]
  • 82.Sette A, Sidney J. Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism. Immunogenetics. 1999;50(3–4):201–12. doi: 10.1007/s002510050594 [DOI] [PubMed] [Google Scholar]
  • 83.Southwood S, Sidney J, Kondo A, del Guercio MF, Appella E, Hoffman S, et al. Several common HLA-DR types share largely overlapping peptide binding repertoires. J Immunol. 1998;160(7):3363–73. doi: 10.4049/jimmunol.160.7.3363 [DOI] [PubMed] [Google Scholar]
  • 84.Trachtenberg E, Korber B, Sollars C, Kepler TB, Hraber PT, Hayes E, et al. Advantage of rare HLA supertype in HIV disease progression. Nat Med. 2003;9(7):928–35. doi: 10.1038/nm893 [DOI] [PubMed] [Google Scholar]
  • 85.Wildt DE, Doyle LL, Stone SC, Harrison RM. Correlation of perineal swelling with serum ovarian hormone levels, vaginal cytology, and ovarian follicular development during the baboon reproductive cycle. Primates. 1977;18(2):261–70. doi: 10.1007/bf02383104 [DOI] [Google Scholar]
  • 86.Hendrickx A. Embryology of the baboon. University of Chicago Press, Chicago, IL; 1971. [Google Scholar]
  • 87.MacLennan AH, Wynn RM. Menstrual cycle of the baboon. I. Clinical features, vaginal cytology and endometrial histology. Obstet Gynecol. 1971;38(3):350–8. [PubMed] [Google Scholar]
  • 88.Vaglio S, Minicozzi P, Kessler SE, Walker D, Setchell JM. Olfactory signals and fertility in olive baboons. Sci Rep. 2021;11(1):8506. doi: 10.1038/s41598-021-87893-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Vaglio S, Ducroix L, Rodriguez Villanueva M, Consiglio R, Kim AJ, Neilands P, et al. Female copulation calls vary with male ejaculation in captive olive baboons. Behaviour. 2020;157:807–22. [Google Scholar]
  • 90.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. doi: 10.1093/bioinformatics/btu170 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21. doi: 10.1093/bioinformatics/bts635 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Janiszewska M, Tabassum DP, Castaño Z, Cristea S, Yamamoto KN, Kingston NL, et al. Subclonal cooperation drives metastasis by modulating local and systemic immune microenvironments. Nat Cell Biol. 2019;21(7):879–88. doi: 10.1038/s41556-019-0346-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Littleton ES, Childress ML, Gosting ML, Jackson AN, Kojima S. Genome-wide correlation analysis to identify amplitude regulators of circadian transcriptome output. Sci Rep. 2020;10(1):21839. doi: 10.1038/s41598-020-78851-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Rajkov J, El Taher A, Böhne A, Salzburger W, Egger B. Gene expression remodelling and immune response during adaptive divergence in an African cichlid fish. Mol Ecol. 2021;30(1):274–96. doi: 10.1111/mec.15709 [DOI] [PubMed] [Google Scholar]
  • 95.Corley SM, MacKenzie KL, Beverdam A, Roddam LF, Wilkins MR. Differentially expressed genes from RNA-Seq and functional enrichment results are affected by the choice of single-end versus paired-end reads and stranded versus non-stranded protocols. BMC Genomics. 2017;18(1):399. doi: 10.1186/s12864-017-3797-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019;20(1):257. doi: 10.1186/s13059-019-1891-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Liao Y, Smyth GK, Shi W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 2019;47(8):e47. doi: 10.1093/nar/gkz114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40. doi: 10.1093/bioinformatics/btp616 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Stephens M. False discovery rates: a new deal. Biostatistics. 2017;18(2):275–94. doi: 10.1093/biostatistics/kxw041 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Resztak JA, Wei J, Zilioli S, Sendler E, Alazizi A, Mair-Meijers HE, et al. Genetic control of the dynamic transcriptional response to immune stimuli and glucocorticoids at single-cell resolution. Genome Res. 2023;33(6):839–56. doi: 10.1101/gr.276765.122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Lea AJ, Peng J, Ayroles JF. Diverse environmental perturbations reveal the evolution and context-dependency of genetic effects on gene expression levels. Genome Res. 2022;32(10):1826–39. doi: 10.1101/gr.276430.121 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Petersen RM, Vockley CM, Lea AJ. Uncovering methylation-dependent genetic effects on regulatory element function in diverse genomes. Genome Res. 2025;35(8):1781–93. doi: 10.1101/gr.279957.124 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Xu S, Hu E, Cai Y, Xie Z, Luo X, Zhan L, et al. Using clusterProfiler to characterize multiomics data. Nat Protoc. 2024;19(11):3292–320. doi: 10.1038/s41596-024-01020-z [DOI] [PubMed] [Google Scholar]
  • 104.Durinck S, Spellman PT, Birney E, Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc. 2009;4(8):1184–91. doi: 10.1038/nprot.2009.97 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105.Zeileis A, Hothorn T. Diagnostic checking in regression relationships. R News. 2002. [Google Scholar]
  • 106.Holm S. A simple sequentially rejective multiple test procedure. Scand J Stat. 1979;6:65–70. [Google Scholar]

Decision Letter 0

Ines Alvarez-Garcia

6 Sep 2025

Dear Dr Petersen,

Thank you for submitting your manuscript entitled "Genetically-based sperm discrimination in the vaginal tract of a primate species" for consideration as a Research Article by PLOS Biology.

Your manuscript has now been evaluated by the PLOS Biology editorial staff as well as by an academic editor with relevant expertise and I am writing to let you know that we would like to send your submission out for external peer review.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please log in to Editorial Manager, where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. After your manuscript has passed the checks, it will be sent out for review. To provide the metadata for your submission, please log in to Editorial Manager (https://www.editorialmanager.com/pbiology) within two working days, i.e. by Sep 09 2025 11:59PM.

During the process of completing your manuscript submission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF.

Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Ines

--

Ines Alvarez-Garcia, PhD

Senior Editor

PLOS Biology

ialvarez-garcia@plos.org

Decision Letter 1

Ines Alvarez-Garcia

24 Oct 2025

Dear Dr Petersen,

Thank you for your patience while your manuscript entitled "Genetically-based sperm discrimination in the vaginal tract of a primate species" was peer-reviewed at PLOS Biology. The manuscript has now been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by two independent reviewers.

The reviews are attached below. As you will see, while we are mostly satisfied with how you have addressed the issues raised in the earlier reviews you shared with us, we wondered about the effect of the small sample size on your results, as this was not covered by the previous reviewers. Thus, the two new reviewers looked only at this aspect. They find the conclusions potentially interesting, but they also raise concerns about the sample size that should be addressed before we can consider the manuscript for publication. Reviewer 1 thinks that the analysis of the high-dimensional RNA-seq data is highly underpowered, which might lead to increased sensitivity to outliers. This reviewer also notes a lack of adjustment for testing across models and that interactions normally require an even larger sample size than main effects to be evaluated. In addition, the reviewer thinks you should provide effect size estimates with confidence intervals for all major findings. Reviewer 2 concludes that the sample is quite small to validate the findings, and that the use of the word ‘sample’ is confusing throughout the manuscript and should be clarified. The reviewer suggests using N to denote the whole sample in all cases and toning down the claims, clearly stating the limitations of the sample size, which should be discussed in the manuscript, including the abstract.

In light of the reviews, we would like to invite you to revise the work to thoroughly address the reviewers' reports. However, we would like to consider the manuscript as a Discovery Report, thus please select that format from the drop down when you submit the revision. Discovery Reports only allow 4 main figures, but since your manuscript has that exact number, the change of format will be only nominal and won't require any changes. Your revised manuscript is likely to be sent for further evaluation by all or a subset of the reviewers.

In addition to these revisions, you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests shortly.

We expect to receive your revised manuscript within 3 months. Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension.

At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we may withdraw it.

**IMPORTANT - SUBMITTING YOUR REVISION**

Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point-by-point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually, point by point.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Revised Article with Changes Highlighted" file type.

3. Resubmission Checklist

When you are ready to resubmit your revised manuscript, please refer to this resubmission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision and fulfil the editorial requests:

a) *PLOS Data Policy*

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). Please also indicate in each figure legend where the data can be found. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

b) *Published Peer Review*

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

c) *Blot and Gel Data Policy*

Please provide the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

d) *Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosbiology/s/submission-guidelines#loc-materials-and-methods

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Ines

--

Ines Alvarez-Garcia, PhD

Senior Editor

PLOS Biology

ialvarez-garcia@plos.org

------------------------------------

Reviewers' comments

Rev. 1:

The manuscript tackles an important and underexplored question. However, the statistical evidence is fragile due to small sample size, multiple testing, and reliance on interaction models. Results should be interpreted as preliminary, and stronger claims of genetically-based sperm discrimination in primates are not statistically justified at present.

Major Concerns

- The study sample size is very small, producing a limited number of dyads. For high-dimensional RNA-seq data, the design looks highly underpowered, which may increase sensitivity to outliers.

- Thousands of genes are tested assuming a 10% FDR: how do you justify such a high level? Given small n, variance estimates are unstable, raising the risk of false discoveries.

- Several measures of genetic diversity/complementarity are evaluated. Although FDR is controlled within each analysis, no adjustment for the family-wise testing across all these models is provided.

- Mixed models with few levels for random effects and limited sample sizes are highly unstable, and reliable variance component estimation is questionable.

- Interaction terms between post-copulatory status and genotype are the central findings. Interactions notoriously require even higher sample size to be evaluated compared to main effects, and this study is clearly underpowered. The small sample size makes effect estimates hard to believe.

- The post-copulatory pH dataset is very limited (15 measurements). Linear mixed modeling with multiple covariates on this dataset is likely over-parameterized.

- Variance tests (Breusch-Pagan) with such small samples may not be robust to distributional assumptions.

- Provide effect size estimates with confidence intervals for all major findings (gene expression contrasts, pH changes, genotype interactions).
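To illustrate the multiple-testing concern about the 10% FDR level raised in the list above, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. It is not the authors' pipeline: the function name `benjamini_hochberg` and the ten p-values are hypothetical, chosen only to show how the number of discoveries shifts between a 10% and a 5% threshold.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return booleans marking which hypotheses are rejected at the given FDR.

    Illustrative helper (hypothetical name), not taken from the study's code.
    """
    m = len(pvalues)
    # Indices of the p-values sorted in ascending order
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.061, 0.074, 0.205, 0.212, 0.36]
hits_10 = benjamini_hochberg(pvals, fdr=0.10)
hits_05 = benjamini_hochberg(pvals, fdr=0.05)
print(sum(hits_10), sum(hits_05))
```

With these made-up values, the looser threshold admits several additional tests whose individual p-values sit close to 0.04, which is the kind of sensitivity to threshold choice the comment is asking the authors to justify.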

Rev. 2:

Overall, we think the statistical methods used in the manuscript are appropriate; the major issue is the sample sizes, or so-called 'large p small n problem'. No matter whether this manuscript will be published or not, we must thank the authors for their efforts and time on the experiments, which are impressive, expansive and exciting to analyze. In the following, not all comments need a response - some are purely for communicating statistical ideas or information; in that case, they are obvious.

In the 'Genome-wide and MHC genotyping section', we see that the sample sizes (N=13, n=4 males, and n=9 females) are small; when they are divided into subset groups, n is even smaller for each (refer to line 526 for different cycles). We therefore suggest that the authors use very conservative language to communicate their conclusions for the current study and remind the readers that all the conclusions are tentative and therefore need larger-sample experiments for validation.

The word 'sample' is quite confusing throughout the manuscript. We should distinguish between the whole sample size (N as it usually is used in statistics, representing the population that was being studied) and the samples or sample sizes that were a subset of the whole, for example RNA samples (refer to line 450). So, we'd suggest that the authors add a capital N to the whole sample, such as N=13 or whatever is more appropriate in other cases (when N is the number of animals, such as N1=9 or N2=4, etc., so that N1+N2+... =13).

Since N is small, the evidence from the analyses on the results wouldn't be very strong and the validity of the whole study could be greatly reduced, therefore not truly convincing our readers. In other words, if N is smaller than a dozen, then all the statistical models could be invalid, whether it is a mixed model or a simple linear model because there is virtually no degrees of freedom that was left for variation estimation in the model, which will include huge variations, for example, a very wide 95% CI (or large width) on the mean or median value of something measured. Intuitively and statistically approved, it's hard to believe anything deduced from a small sample. Genes, cycles etc. are secondary level variables (derived) in the current study, the first level ones (root, original) are the number of animals, the IDs, the male or female etc., as we generally see from a so-called demographic table (the 1st table usually) in a paper.

On lines 86 to 88, the authors mentioned the number of RNA samples and the related number of measurements on those samples, but we'd love to see the number of animals, which is more significant in the current study. Besides, the sample size of the study should be reflected in the ABSTRACT section, where we did not see it either. In addition, on line 60 and following lines, we did not observe a description of the sample size of the current study. On lines 84 to 88, we did not observe a description either. On line 176, we should have seen a sample size number, but we did not.

Refer to lines 89 and following lines, and line 133 and following lines. Since the authors used ID as a blocking factor, it will consume degrees of freedom for each ID included and therefore reduce the effective sample size of the data. How was this issue considered in the processing of the data?

Line 104 and following lines, or line 141 and following lines - the authors used a linear mixed model. How were those variables selected? Any experimental (clinical) or statistical considerations or a purely arbitrary choice?

On line 196, we see a very tiny p value, but this comes from a very small sample size, which is a very common dilemma in a genetic study. Statistically, a very small p value should come from a large sample. A common misunderstanding about the p value is this: the statistical null hypothesis is that there is no difference (the difference is zero), so a p value only tells us that the difference is not zero; how large the difference is remains unresolved. For example, differences of 0.002, 0.02, and 0.2 are all different from zero, yet they differ from one another by factors of 10 or 100.
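The distinction between a p value and the size of a difference can be sketched numerically. The example below is an illustration under stated assumptions, not an analysis of the study's data: it uses a one-sample z-test with a known standard deviation of 1, and the helper names `two_sided_p` and `ci95`, the differences, and the sample sizes are all hypothetical.

```python
import math
from statistics import NormalDist


def two_sided_p(diff, sd, n):
    """Two-sided p-value for a z-test of a mean difference (SD assumed known)."""
    z = diff / (sd / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def ci95(diff, sd, n):
    """95% confidence interval for the mean difference under the same assumptions."""
    half_width = NormalDist().inv_cdf(0.975) * sd / math.sqrt(n)
    return (diff - half_width, diff + half_width)


# Hypothetical scenario 1: a tiny difference measured in a huge sample
p_tiny_effect = two_sided_p(diff=0.002, sd=1.0, n=10_000_000)

# Hypothetical scenario 2: a 100-fold larger difference measured in a small sample
p_large_effect = two_sided_p(diff=0.2, sd=1.0, n=16)
ci_large_effect = ci95(diff=0.2, sd=1.0, n=16)

print(p_tiny_effect, p_large_effect, ci_large_effect)
```

Under these assumptions the tiny difference yields the far smaller p value, while the larger difference has a confidence interval that spans zero. Reporting the estimate together with its interval, rather than the p value alone, makes both the magnitude and the uncertainty visible.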

On line 211, the numbers look reasonable, but on line 214, we see a different scenario: the range is huge, because this is small sample size, and anything derived from a small sample would be an unstable sample, does not follow a normal distribution (when median is close to the mean value), which is why people have to rely upon a median value (like median income); however, a median is farther away from the truth (the mean is better, at least a trend, the characteristic or the majority of something). So, we need a large sample to validate a conclusion generally.

In Table 1 and in Figure 3, again, we don't know the sample sizes, so we are not convinced that the results can be generalized to other cases or external data; that is, we are not certain whether we should accept the conclusions.

In Figure S5, we are not sure what that "p < 0.05" is, please add an explanation at the footnote.

Decision Letter 2

Ines Alvarez-Garcia

18 Feb 2026

Dear Dr Petersen,

Thank you for your patience while we considered your revised manuscript entitled "Evidence for genetically-based sperm discrimination in the vaginal tract of a primate species" for publication as a Discovery Report at PLOS Biology. This revised version of your manuscript has been evaluated by the PLOS Biology editors, the Academic Editor and one of the original reviewers.

Based on the review, we are likely to accept this manuscript for publication, provided you satisfactorily address the data and other policy-related requests stated below my signature.

In addition, we would like you to consider a suggestion to improve the title:

"A female primate can exert post-copulatory selection via genetically-based sperm discrimination in the vaginal tract"

As you address these items, please take this last chance to review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the cover letter that accompanies your revised manuscript.

In addition to these revisions, you may need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests shortly. If you do not receive a separate email within a few days, please assume that checks have been completed, and no additional changes are required.

We expect to receive your revised manuscript within two weeks.

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include the following:

- a cover letter that should detail your responses to any editorial requests, if applicable, and whether changes have been made to the reference list

- a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable; if not applicable, please do not delete your existing 'Response to Reviewers' file)

- a track-changes file indicating any changes that you have made to the manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://plos.org/published-peer-review-history/

*Press*

Should you, your institution's press office or the journal office choose to press release your paper, please ensure you have opted out of Early Article Posting on the submission form. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please do not hesitate to contact me should you have any questions.

Sincerely,

Ines

--

Ines Alvarez-Garcia, PhD

Senior Editor

PLOS Biology

ialvarez-garcia@plos.org

------------------------------------------------------------------------

DATA POLICY:

You may be aware of the PLOS Data Policy, which requires that all data be made available without restriction: http://journals.plos.org/plosbiology/s/data-availability. For more information, please also see this editorial: http://dx.doi.org/10.1371/journal.pbio.1001797

Note that we do not require all raw data. Rather, we ask that all individual quantitative observations that underlie the data summarized in the figures and results of your paper be made available in one of the following forms:

1) Supplementary files (e.g., excel). Please ensure that all data files are uploaded as 'Supporting Information' and are invariably referred to (in the manuscript, figure legends, and the Description field when uploading your files) using the following format verbatim: S1 Data, S2 Data, etc. Multiple panels of a single or even several figures can be included as multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).

2) Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.

Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it:

Fig. 1D-F; Fig. 2C-F; Fig. 3A-D; Fig. 4A-C; Fig. S1; Fig. S2; Fig. S3; Fig. S4; Fig. S5 and Fig. S7

NOTE: the numerical data provided should include all replicates AND the way in which the plotted mean and errors were derived (it should not present only the mean/average values).

Please also ensure that figure legends in your manuscript include information on WHERE THE UNDERLYING DATA CAN BE FOUND, and ensure your supplemental data file/s has a legend.

Please ensure that your Data Statement in the submission system accurately describes where your data can be found.

------------------------------------------------------------------------

CODE POLICY

Per journal policy, if you have generated any custom code during the course of this investigation, please make it available without restrictions. Please ensure that the code is sufficiently well documented and reusable, and that your Data Statement in the Editorial Manager submission system accurately describes where your code can be found. More information on our Code Policy, what and how to share can be found here: https://journals.plos.org/plosbiology/s/code-availability

Please note that we cannot accept sole deposition of code in GitHub, as this could be changed after publication. However, you can archive this version of your publicly available GitHub code to Zenodo. Once you do this, it will generate a DOI number, which you will need to provide in the Data Accessibility Statement (you are welcome to also provide the GitHub access information). See the process for doing this here: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content

------------------------------------------------------------------------

SPECIES INDICATED IN THE ABSTRACT?

- Please note that per journal policy, the model system/species studied should be clearly stated in the abstract of your manuscript.

------------------------------------------------------------------------

Reviewers' comments

Rev. 2:

We are happy to read the revised manuscript. Thank you very much for your efforts and time in revising the manuscript, carefully following the previous comments and answering each question with new evidence and, in some cases, new methods.

Decision Letter 3

Ines Alvarez-Garcia

24 Feb 2026

Dear Dr Petersen,

Thank you for the submission of your revised Discovery Report entitled "Evidence for genetically-based sperm discrimination in the vaginal tract of a primate species" for publication in PLOS Biology. On behalf of my colleagues and the Academic Editor, Masahito Ikawa, I am delighted to let you know that we can in principle accept your manuscript for publication, provided you address any remaining formatting and reporting issues. These will be detailed in an email you should receive within 2-3 business days from our colleagues in the journal operations team; no action is required from you until then. Please note that we will not be able to formally accept your manuscript and schedule it for publication until you have completed any requested changes.

Please take a minute to log into Editorial Manager at http://www.editorialmanager.com/pbiology/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have previously opted in to the early version process, we ask that you notify us immediately of any press plans so that we may opt out on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Many congratulations and thanks again for choosing PLOS Biology for publication and supporting Open Access publishing. We look forward to publishing your study.

Sincerely,

Ines

--

Ines Alvarez-Garcia, PhD

Senior Editor

PLOS Biology

ialvarez-garcia@plos.org

Associated Data


    Supplementary Materials

    S1 Appendix

    Fig A. GSEA pathways enriched for differential expression in the fertile phase. The “activated” panel represents pathways enriched for genes with heightened expression in the fertile compared to non-fertile phase and the “suppressed” panel represents pathways enriched for genes with lower expression in the fertile compared to non-fertile phase. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data.

    Fig B. Vaginal pH did not vary significantly between cycle phases. Error bars represent model predictions ± one standard error and points represent the raw data. The data underlying this figure are provided in S5 Data.

    Fig C. GSEA pathways enriched for differential expression post-copulation. Three gene set pathways are enriched for genes with heightened expression in post-copulatory versus non-copulatory contexts. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data.

    Fig D. GSEA pathways enriched for differential expression in relation to MHC I allelic complementarity. Pathways enriched for genes with heightened expression after mating with males with low complementarity. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data.

    Fig E. Leave-one-out sensitivity analysis of vaginal pH model estimates. Shown are estimated interaction effects between post-copulatory status and male genotype on vaginal pH across leave-one-out iterations. Points represent the estimated interaction term (post-copulatory status × male genotype), and error bars denote 95% confidence intervals. The genotype included in each model is indicated in the facet title. The data underlying this figure are provided in S5 Data.

    Fig F. Vaginal epithelial cells stained using a modified Harris-Schorr technique. Epithelial cell types include basal cells (A), intermediate cells (B), polychromatophilic superficial cells (C), and eosinophilic superficial cells (D). The preovulatory phase (E) exhibits a gradual increase in the proportion of red to blue staining cells, and the post-ovulatory phase (F) is characterized by an increase in cellular clumping, mucus, and WBCs.

    Fig G. Composite profile demonstrating fluctuations in EI over the course of the ovarian cycle (N = 22 cycles). Points represent mean EI values on each day in relation to ovulation, and error bars represent the standard error of the mean. The two-day ovulation window is designated by consecutive 0’s on the x axis and is shaded in red. The days leading up to ovulation are designated by negative numbers and the days following ovulation designated by positive numbers. The data underlying this figure are provided in S5 Data.

    (DOCX)

    pbio.3003699.s001.docx (1.4MB, docx)
    S2 Appendix

    Table A. Description of dataset composition for vaginal pH and gene expression analyses. Total number of samples, as well as number of samples per cycle phase or copulatory status for cycle phase and post-copulatory analyses, respectively. Measures of male genetic heterozygosity and complementarity are described in Table 1 in S2 Appendix.

    Table B. Differentially expressed genes- between phases. Genes with significant (FDR < 10%) differential expression in pairwise comparisons between cycle phases. Negative coefficients correspond to lower expression in the phase listed first in the comparison column, and positive coefficients correspond to higher expression in the phase listed first.

    Table C. Gene set enrichment analysis- between phases. GO pathways that are overrepresented among genes that are differentially expressed between the fertile and non-fertile phase.

    Table D. Vaginal pH linear model results- cycle phase.

    Table E. Genes highly expressed in semen. Genes with an average of >20 cpm in semen samples.

    Table F. Differentially expressed genes- post-copulatory versus non-copulatory. Genes with significant (LFSR < 10%) differential expression in post-copulatory versus non-copulatory samples. Negative coefficients correspond to lower expression post-copulation and positive coefficients correspond to higher expression post-copulation.

    Table G. Gene set enrichment analysis- post-copulatory versus non-copulatory. GO pathways that are overrepresented among genes that are differentially expressed between post-copulatory versus non-copulatory samples.

    Table H. Vaginal pH linear model results- post-copulatory status.

    Table I. Measures of male genetic diversity and dyadic complementarity.

    Table J. Differentially expressed genes- male genetic diversity. Genes with a significant (FDR < 10%) interactive effect between post-copulatory status and one metric of male genetic diversity (listed in the “genotype metric” column).

    Table K. Differentially expressed genes- genetic complementarity. Genes with a significant (FDR < 10%) interactive effect between post-copulatory status and one metric of genetic complementarity (listed in the “genotype metric” column).

    Table L. Gene set enrichment analysis- male genetic diversity and complementarity. Pathways with a significant enrichment (FDR < 10%) interactive effect between post-copulatory status and one metric of male genetic diversity or complementarity (listed in “genotype metric” column).

    Table M. Number of genes/pathways whose post-copulatory expression is significantly modified by an aspect of male genetic makeup across five mating dyads. Number of differentially expressed (DE) genes passing an LFSR < 0.1, LFSR < 0.05, and LFSR < 0.01 threshold, number of LFSR < 0.1 genes which also pass a family-wise error rate (FWER)-adjusted p < 0.05 threshold, and number of gene set enrichment analysis (GSEA) pathways at a p < 0.05 threshold.

    Table N. Vaginal pH linear model results- male genetic diversity and complementarity. Interactive effect of post-copulatory status and each measure of male genetic diversity or complementarity in predicting vaginal pH. Each genetic metric was tested in a separate model.

    Table O. Primer sequences used to amplify MHC A, B, DQA, and DRB loci.

    (XLSX)

    pbio.3003699.s002.xlsx (888.3KB, xlsx)
    S1 Data. Data underlying main figure Fig 1B–1F.

    (XLSX)

    pbio.3003699.s003.xlsx (1.9MB, xlsx)
    S2 Data. Data underlying main figure Fig 2B–2F.

    (XLSX)

    pbio.3003699.s004.xlsx (4.9MB, xlsx)
    S3 Data. Data underlying main figure Fig 3A–3D.

    (XLSX)

    pbio.3003699.s005.xlsx (2.8MB, xlsx)
    S4 Data. Data underlying main figure Fig 4A–4C.

    (XLSX)

    pbio.3003699.s006.xlsx (53.9KB, xlsx)
    S5 Data. Data underlying Figs A, B, C, D, E and G in S1 Appendix.

    (XLSX)

    pbio.3003699.s007.xlsx (103.6KB, xlsx)
    Attachment

    Submitted filename: PLOSBio_R1_R2R_final.docx

    pbio.3003699.s010.docx (3.1MB, docx)
    Attachment

    Submitted filename: PLOSBio_R2_R2R.docx

    pbio.3003699.s011.docx (70.6KB, docx)

    Data Availability Statement

    The sequencing reads generated in this study have been submitted to the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) under accession numbers PRJNA875430 (ddRAD sequences) and PRJNA1232174 (RNA sequences), and in GenBank (https://www.ncbi.nlm.nih.gov/genbank/) under accession numbers OP375715-OP375798 (MHC sequences). Counts matrices, pH measurements, and metadata needed to rerun all code are available at Zenodo (https://zenodo.org/records/14976902). Code is available at Zenodo (https://zenodo.org/records/18705035) and at RMP’s personal GitHub page: www.github.com/rachpetersen/cryptic_choice_anubis.git. Data used to generate figures are available in S1–S5 Data.


    Articles from PLOS Biology are provided here courtesy of PLOS
