eLife. 2025 Nov 10;14:RP105466. doi: 10.7554/eLife.105466

Parallel HIV-1 fitness landscapes shape viral dynamics in humans and macaques that develop broadly neutralizing antibodies

Kai S Shimagaki 1,2, Rebecca M Lynch 3, John P Barton 1,2
Editors: Anne-Florence Bitbol4, Joshua T Schiffer5
PMCID: PMC12600013  PMID: 41212071

Abstract

HIV-1 evolves within individual hosts to escape adaptive immune responses while maintaining its capacity for replication. Coevolution between HIV-1 and the immune system generates extraordinary viral genetic diversity. In some individuals, this process also results in the development of broadly neutralizing antibodies (bnAbs) that can neutralize many viral variants, a key focus of HIV-1 vaccine design. However, a general understanding of the forces that shape virus-immune coevolution within and across hosts remains incomplete. Here, we performed a quantitative study of HIV-1 evolution in humans and rhesus macaques, including individuals who developed bnAbs. We observed strong selection early in infection for mutations affecting HIV-1 envelope glycosylation and escape from autologous strain-specific antibodies, followed by weaker selection for bnAb resistance. The inferred fitness effects of HIV-1 mutations in humans and macaques were remarkably similar. Moreover, we observed a striking pattern of rapid HIV-1 fitness gains that precedes the development of bnAbs. Our work highlights strong parallels between infection in rhesus macaques and humans, and it reveals a quantitative evolutionary signature of bnAb development.

Research organism: Viruses

eLife digest

Viruses are genetic particles composed of DNA or RNA, encased by a protective protein shell called the capsid. They cannot reproduce independently and must infect a host cell to replicate. Many viruses mutate rapidly, allowing them to adapt to and evade the immune responses of their hosts.

For example, HIV-1, the virus that causes AIDS, has a high mutation rate, resulting in the emergence of many distinct variants of the virus. Therefore, an effective vaccine needs to be able to stimulate a special type of antibody known as broadly neutralizing antibody (bnAb). These large defense proteins can recognize and neutralize many different viral strains, which could make them a key focus in HIV vaccine development.

Researchers often use rhesus macaques as a model system to study how HIV-1 evolves and interacts with the immune system. Previous studies have shown that some viruses mutate in similar ways in both humans and rhesus macaques. However, the details of HIV-1 evolution and mutation patterns in these two hosts remain unclear. Gaining deeper insight into the evolutionary processes linked to bnAb development could inform vaccine design and evaluate the suitability of rhesus macaques as an animal model for HIV-1 research.

Shimagaki et al. aimed to quantify how HIV-1 evolves in different hosts and whether these evolutionary patterns differ between individuals who do or do not develop bnAbs. The researchers reanalyzed previously collected HIV-1 data from two humans who developed bnAbs and 13 rhesus macaques, using computational models to estimate how various mutations affect viral replication (i.e., viral fitness). Their analysis revealed strong quantitative similarities in viral evolution between humans and macaques: the estimated fitness effects of mutations were highly correlated across species. Rapid increases in viral fitness were observed before bnAbs were detected, suggesting that selective pressure on the virus may help drive the development of antibody breadth.

These findings suggest that vaccine strategies designed to replicate the conditions that lead to rapid viral adaptation may help stimulate broadly neutralizing antibody responses. The observed parallels in HIV-1 evolution between humans and rhesus macaques also support the continued use of macaques as a relevant model for HIV-1 research. Still, significant challenges remain. Future studies should explore the link between viral evolution and antibody development in larger cohorts. Moreover, vaccine development requires addressing many practical aspects – such as antigen selection and dosing regimens – which extend beyond the viral fitness dynamics explored in this study.

Introduction

HIV-1 rapidly mutates and proliferates in infected individuals. The immune system is a major driver of HIV-1 evolution, as the virus accumulates mutations to escape from host T cells and antibodies (Wei et al., 2003; Allen et al., 2005; Li et al., 2007). Due to the chronic nature of HIV-1 infection, coupled with high rates of mutation and replication, HIV-1 genetic diversity within and between infected individuals is incredibly high. Genetic diversity challenges vaccine development, as vaccine-elicited antibodies must be able to neutralize many strains of the virus to protect against infection (Altfeld and Allen, 2006).

However, there exist rare antibodies that are capable of neutralizing a broad range of HIV-1 viruses. These broadly neutralizing antibodies (bnAbs) have therefore been the subject of intense research (Kwong et al., 2013; Burton and Hangartner, 2016; Sok and Burton, 2018; Haynes et al., 2023). Eliciting bnAbs through vaccination remains a major goal of HIV-1 vaccine design. However, the development of exceptionally broad antibody responses is rare, and such antibodies typically develop only after several years of infection (Doria-Rose et al., 2010; Haynes et al., 2012; Haynes et al., 2016; Hraber et al., 2014).

Recent years have yielded important insights into the coevolutionary process between HIV-1 and antibodies that sometimes leads to the development of bnAbs. Clinical studies have collected serial samples of HIV-1 sequences from a few individuals who developed bnAbs and characterized the resulting antibodies, their developmental stages, and binding sites (Liao et al., 2013; Bonsignori et al., 2017; Doria-Rose et al., 2014). The contributions of HIV-1 and its coevolution to bnAb development are complex (Moore et al., 2015; Landais and Moore, 2018). High viral loads and viral diversity have been positively associated with bnAb development (Gray et al., 2011; Moore et al., 2015; Landais and Moore, 2018). However, superinfection, which can vastly increase HIV-1 diversity, is not always associated with bnAb development (Cornelissen et al., 2016), and it does not appear to broaden antibody responses in the absence of other factors (Landais and Moore, 2018).

Here, we sought to characterize the evolutionary dynamics of HIV-1 that accompany the development of bnAbs in clinical data. In particular, we inferred the landscape of selective pressures that shape the evolution of HIV-1 within hosts, reflecting the effects of the immune environment. We first analyzed data from two individuals who developed bnAbs within a few years after HIV-1 infection (Liao et al., 2013; Bonsignori et al., 2017). In both individuals, HIV-1 mutations inferred to be the most beneficial were observed early in infection. In general, mutations that provided resistance to autologous strain-specific antibodies were inferred to be more strongly selected than ones that escaped from bnAbs. We also observed clusters of beneficial mutations along the HIV-1 genome, which were associated with envelope protein (Env) structure.

To confirm the generality of these patterns in a broader sample, we studied recent data from rhesus macaques (RMs) infected with simian-human immunodeficiency viruses (SHIV) that incorporated HIV-1 Env proteins derived from the two individuals above (Roark et al., 2021). This study also compared patterns of Env evolution in HIV-1 and SHIV in response to host immunity. We observed striking parallels between the inferred fitness effects of Env mutations in RMs and humans, suggesting highly similar selective pressures on the virus despite different host species and differences in individual immune responses. Furthermore, we found that RMs that developed broad, potent antibody responses could clearly be distinguished from those with narrowly focused responses using the evolutionary dynamics of the virus. Specifically, the virus population in individuals who developed greater breadth was distinguished by larger and more rapid gains in fitness than in other individuals. Collectively, these results show high similarity between SHIV evolutionary dynamics in RMs and HIV-1 in humans, and that viral fitness gain is associated with antibody breadth.

Results

Quantifying HIV-1 evolutionary dynamics

We studied HIV-1 evolution accompanying the development of bnAbs in two donors, CH505 and CH848, enrolled in the Center for HIV/AIDS Vaccine Immunology 001 acute infection cohort (Tomaras et al., 2008). CH505 developed the CD4 binding site-targeting bnAb CH103, which was first detectable 14 weeks after HIV-1 infection (Liao et al., 2013). CH103 maturation was found to be associated with viral escape from another antibody lineage, CH235, that ultimately developed significant breadth (Gao et al., 2014; Bonsignori et al., 2016). CH848 developed a bnAb, DH270, targeting a glycosylated site near the third variable loop (V3) of Env (Bonsignori et al., 2017). Similar to the bnAb development process in CH505, escape from ‘cooperating’ DH272 and DH475 lineage antibodies was observed to contribute to the maturation of DH270 (Bonsignori et al., 2017).

To quantify HIV-1 evolutionary dynamics, we sought to infer a fitness model that best explained the changes in the genetic composition of the viral population observed in each individual over time. In recent years, a wide variety of approaches have been developed to infer the fitness effects of mutations from temporal genetic data (Bollback et al., 2008; Illingworth and Mustonen, 2011; Illingworth and Mustonen, 2012; Malaspinas et al., 2012; Mathieson and McVean, 2013; Lacerda and Seoighe, 2014; Feder et al., 2014; Steinrücken et al., 2014; Foll et al., 2014; Terhorst et al., 2015; Schraiber et al., 2016; Tataru et al., 2017; Paris et al., 2019; Mathieson and Terhorst, 2022; He et al., 2023; Sohail et al., 2022; Sohail et al., 2021; Shimagaki and Barton, 2025b; Gao and Barton, 2025; Lee et al., 2025). The vast majority of these methods focus on a single locus at a time, ignoring correlations between genetic variants at different loci. While the recombination rate of HIV-1 is high (Neher and Leitner, 2010; Romero and Feder, 2024), the virus also evolves under strong selection, which can lead to interference between clones with different beneficial mutations (Rouzine and Weinberger, 2013; Pandit and de Boer, 2014; Garcia and Regoes, 2014; Garcia et al., 2016; Williams and Pennings, 2020; Sohail et al., 2021). Thus, we applied MPL, an inference method that systematically accounts for genetic correlations (Sohail et al., 2022; Sohail et al., 2021; Shimagaki and Barton, 2025b; Gao and Barton, 2025; Lee et al., 2025), to estimate the fitness effects of HIV-1 mutations.

Model overview

Here, we provide a brief overview of the key steps in the MPL approach to inferring selection. Further details are available in Methods and in prior work (Sohail et al., 2022; Sohail et al., 2021; Shimagaki and Barton, 2025b; Gao and Barton, 2025; Lee et al., 2025). First, we assume that the effect on viral fitness of each individual mutation $a$ at each site $i$ is quantified by a selection coefficient $s_i(a)$, with positive coefficients $s_i(a)>0$ denoting mutations that are beneficial for the virus and $s_i(a)<0$ denoting deleterious ones. We further assume that the cumulative fitness effects of mutations are additive, such that the overall fitness $F_\alpha$ of a viral sequence $\alpha$ is given by the sum of the selection coefficients for all the mutations that it bears. That is,

$$F_\alpha = F(g^\alpha) = 1 + \sum_{i=1}^{L} \sum_{a=1}^{q} s_i(a)\, g^\alpha_{i,a}, \tag{1}$$

where $g^\alpha = ((g^\alpha_{i,a})_{a=1}^{q})_{i=1}^{L}$ represents the viral sequence, with $g^\alpha_{i,a}$ equal to one if genotype $\alpha$ has allele $a$ at site $i$ and zero otherwise. $L$ is the length of the genetic sequence, and $q$ represents the number of genetic states (i.e., $q=5$ for nucleotide sequences and $q=21$ for amino acids, including gaps/deletions).
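
As a minimal sketch of Equation 1, the additive fitness of a sequence can be computed by summing the selection coefficients of the alleles it carries. The function name `fitness`, the dictionary layout, and the toy coefficients below are illustrative assumptions, not values or code from this study.

```python
# Hypothetical illustration of Eq. 1: additive fitness of a sequence.
ALPHABET = "ACGT-"  # q = 5 states for nucleotide sequences (gap included)

def fitness(sequence, selection):
    """Return F = 1 + sum of s_i(a) over the alleles carried by `sequence`.

    `selection` maps (site index, allele) -> selection coefficient.
    Pairs that are absent are treated as neutral (s = 0).
    """
    total = 1.0
    for i, a in enumerate(sequence):
        if a not in ALPHABET:
            raise ValueError(f"unexpected allele {a!r} at site {i}")
        total += selection.get((i, a), 0.0)
    return total

# Toy coefficients, made up for illustration only.
toy_selection = {(0, "G"): 0.03, (2, "T"): -0.01}
founder_fitness = fitness("ACA", toy_selection)  # carries no listed mutation
mutant_fitness = fitness("GCT", toy_selection)   # carries both listed mutations
```

Storing only non-neutral coefficients keeps the lookup sparse, which matches the observation that most mutations are inferred to be close to neutral.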

We assume that viral replication is stochastic, where viruses with higher fitness are more likely to spread infection to new cells than ones with lower fitness. Let us write the number of viruses of each genotype in the population at time $t$ as $n(t) = (n_1(t), n_2(t), \ldots, n_M(t))$, where $M = q^L$ is the total number of possible genotypes. In our model, the probability of obtaining a new distribution of genotypes $n(t+1)$ in the next generation is multinomial,

$$P(n(t+1)) = N! \prod_{\alpha=1}^{M} \frac{p_\alpha(n(t))^{n_\alpha(t+1)}}{n_\alpha(t+1)!}, \tag{2}$$

with $N = \sum_\alpha n_\alpha$ the total population size. The probabilities $p_\alpha(n(t))$ are influenced by fitness, as well as mutation, recombination, and the current frequency of each genotype (see Methods). Essentially, viruses that are fitter than the current population average are more likely to increase in frequency, while ones that are less fit are likely to decline. Mutations and recombination introduce genetic variation into the population. The population size $N$ determines the relative stochasticity of the dynamics, with smaller populations having more random fluctuations than larger ones.
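
The multinomial update in Equation 2 can be illustrated with a short simulation. This is a sketch under a simplifying assumption: the probability of each genotype is taken to be proportional to its current count times its fitness, so the mutation and recombination terms described in Methods are left out. The function name `wright_fisher_step` and all numbers are hypothetical.

```python
import random

def wright_fisher_step(counts, fitness, rng):
    """Sample next-generation genotype counts from a multinomial (Eq. 2).

    Assumption of this sketch: p_alpha is proportional to n_alpha * F_alpha
    (selection only). Mutation and recombination are not modeled here.
    """
    N = sum(counts)
    weights = [n * f for n, f in zip(counts, fitness)]
    # Drawing N offspring independently with these weights is equivalent to
    # one multinomial draw with probabilities weights / sum(weights).
    offspring = rng.choices(range(len(counts)), weights=weights, k=N)
    new_counts = [0] * len(counts)
    for genotype in offspring:
        new_counts[genotype] += 1
    return new_counts

rng = random.Random(0)
counts = [900, 100]       # toy population: founder and one mutant genotype
fitness = [1.00, 1.05]    # made-up fitness values
for _ in range(50):
    counts = wright_fisher_step(counts, fitness, rng)
```

Rerunning with a smaller total population makes the trajectories noisier, which is the sense in which $N$ sets the stochasticity of the dynamics.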

Following this model, we can then quantify the probability of any evolutionary history of the viral population (i.e., the distribution of viral genotypes over time) as a function of the selection coefficients (Methods).

Assuming that the population size $N$ is large and the selection coefficients are small (such that $|s| \ll 1$), we can write an analytical expression for the selection coefficients that best fit the viral dynamics observed in data (Sohail et al., 2022; Sohail et al., 2021):

$$\hat{s} = (C_{\text{int}} + \gamma I)^{-1} [\Delta x_{\text{int}} - \Delta u_{\text{int}}]. \tag{3}$$

Here, $C_{\text{int}}$ is the covariance matrix of allele frequencies (i.e., the linkage disequilibrium matrix) integrated over time, and $\gamma$ is a regularization parameter. The terms $\Delta x_{\text{int}}$ and $\Delta u_{\text{int}}$ represent the net change in allele frequency and the total expected change in allele frequency due to spontaneous mutations alone, respectively (see Methods for details). Intuitively, this expression says that net allele frequency changes that cannot be explained by mutations are likely due to selection, either on that specific allele or the associated genetic background, which is quantified by $C_{\text{int}}$. Alleles that have large, rapid changes in frequency are more likely to be under strong selection than those with smaller, slower frequency changes. Beyond HIV-1 (Sohail et al., 2021), this approach has been successfully applied to study the evolution of SARS-CoV-2 (Lee et al., 2025) and experimental evolution in bacteria (Li and Barton, 2023; Li and Barton, 2024).
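
To make the structure of Equation 3 concrete, the sketch below evaluates it for a toy data set. It is not the authors' implementation. It assumes binary alleles (0 for the founder state, 1 for the mutant), a symmetric mutation rate `mu` so that the expected mutational drift is proportional to $1 - 2x$, and trapezoidal integration over time. The function name `mpl_selection` and the values of `mu` and `gamma` are placeholders.

```python
import numpy as np

def mpl_selection(times, samples, mu=1e-4, gamma=1.0):
    """Sketch of Eq. 3 for binary alleles (0 = founder, 1 = mutant).

    times   : sampling times in generations
    samples : list of (n_sequences, L) 0/1 arrays, one per time point
    mu, gamma are placeholder values, not the ones used in the paper.
    """
    times = np.asarray(times, dtype=float)
    x = np.array([s.mean(axis=0) for s in samples])        # (T, L) frequencies
    # Allele covariance at each time: <g_i g_j> - x_i x_j
    cov = np.array([s.T @ s / len(s) - np.outer(f, f)
                    for s, f in zip(samples, x)])           # (T, L, L)
    dt = np.diff(times)
    # Trapezoidal time integration (an assumption of this sketch).
    C_int = np.einsum("t,tij->ij", dt, 0.5 * (cov[:-1] + cov[1:]))
    # Expected frequency change from mutation alone, symmetric rate mu.
    drift = 0.5 * ((1 - 2 * x[:-1]) + (1 - 2 * x[1:]))
    du_int = mu * np.einsum("t,ti->i", dt, drift)
    dx_int = x[-1] - x[0]                                   # net frequency change
    L = x.shape[1]
    return np.linalg.solve(C_int + gamma * np.eye(L), dx_int - du_int)

# Toy data: site 0 sweeps from ~10% to ~90%, site 1 stays near 20%.
rng = np.random.default_rng(0)
times = [0, 50, 100]
samples = [np.column_stack([rng.random(200) < f,
                            rng.random(200) < 0.2]).astype(float)
           for f in (0.1, 0.5, 0.9)]
s_hat = mpl_selection(times, samples)
```

Solving the linear system rather than inverting the matrix explicitly is a numerical convenience. The off-diagonal entries of `C_int` are what allow a frequency change at one site to be attributed to selection on a linked site.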

Broad patterns of HIV-1 selection

As described above, we used MPL (Sohail et al., 2022; Sohail et al., 2021) to infer the selection coefficients that best fit the viral dynamics observed in data from CH505 and CH848. While the great majority of HIV-1 mutations were inferred to be neutral ($s_i(a) \approx 0$), a few mutations substantially increase viral fitness (Figure 1). Strongly beneficial mutations occurred in clusters along the genome and preferentially appeared in specific regions of Env (Figure 2).

Figure 1. Beneficial mutations occur in clusters along the genome.

Inferred fitness effects of HIV-1 mutations in CH505 (A) and CH848 (B). The position along the radius of each circle specifies the strength of selection: mutations plotted closer to the center are more deleterious, while those closer to the edge are more beneficial. For both individuals, clusters of beneficial mutations are observed in the variable loops of Env, some of which are associated with antibody escape. For CH848, a group of strongly beneficial mutations also appears in Nef.

Figure 1—figure supplement 1. Weak correlation between sequence variability, as measured by entropy, and inferred selection.

To quantify sequence variability, we computed “site-dependent entropy” scores $H_i = -\left\langle \sum_{a=1}^{q} x_i(a,t) \log x_i(a,t) \right\rangle_t$ for each site $i$ in HIV-1 sequences from CH505 and CH848. In the expression for $H_i$, $x_i(a,t)$ represents the frequency of allele $a$ at site $i$ at time $t$, and $\langle \cdot \rangle_t$ denotes the average over time points. For both CH505 (A) and CH848 (B), we found no clear systematic association between the entropy values and beneficial inferred selection coefficients. Pearson’s $r$ values between entropy and inferred selection coefficients are 0.32 and 0.30 for CH505 and CH848, respectively. In particular, large selection coefficients are not concentrated among sites with the highest entropy values. Among mutations with the top 5% of inferred selection coefficients, the $r$ values are −0.01 and 0.11 for CH505 and CH848, respectively.
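
A minimal sketch of the site entropy calculation, assuming the standard Shannon form with a natural logarithm and the convention that zero-frequency alleles contribute nothing. The function name `site_entropy` and the input layout are illustrative choices.

```python
import math

def site_entropy(freq_by_time):
    """Time-averaged entropy for one site.

    freq_by_time: list over time points of dicts {allele: frequency}.
    Terms with zero frequency contribute nothing (0 * log 0 -> 0).
    The natural logarithm is an assumption of this sketch.
    """
    per_time = []
    for freqs in freq_by_time:
        h = -sum(x * math.log(x) for x in freqs.values() if x > 0)
        per_time.append(h)
    return sum(per_time) / len(per_time)

# Toy site: evenly split between two alleles at one time, fixed at the next.
h_example = site_entropy([{"A": 0.5, "G": 0.5}, {"A": 1.0}])
```

A site that is polymorphic only briefly receives a low time-averaged entropy even if a strongly selected mutation swept through it, which is one way entropy and selection can decouple.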

Figure 2. Visualization of the effects of HIV-1 mutations in CH505 on the Env trimer.

(A) Side view of the Env trimer (Jardine et al., 2016), with detail views of selection for mutations in the CH103 binding site (B) CH235 binding site (C) and sites associated with escape from autologous strain-specific antibodies (D). (E) Top view of the trimer, with detail views of variable loops (F-I) and the CD4 binding site (J). Generally, beneficial mutations appear more frequently near exposed regions at the top of the Env trimer, and deleterious mutations appear in more protected regions. Mutations near the V1/V2 apex can affect the binding and neutralization of antibodies targeting the CD4 binding site (Gao et al., 2014).

Figure 2—figure supplement 1. CH848 exhibits similar spatial patterns of selection coefficients.

CH848 exhibits similar spatial patterns of selection coefficients; strongly selected mutations are concentrated at the apex, particularly on the edge of the Env protein. (A, B) Side and top views of the Env protein with inferred selection coefficient values. The apex region is enriched with moderately and strongly beneficial mutations. (C-F) Mutations that presumably confer resistance to bnAbs and autologous nAbs are highlighted. The selection values in the autologous nAb binding regions were slightly larger than those in the bnAb binding regions. (G-K) Top views highlight the individual variable regions.

To quantify patterns of selection in Env, we examined the top 2% of mutations inferred to be the most beneficial in CH505 and CH848. The fractions of nonsynonymous mutations within these subsets were 97% and 92% for CH505 and CH848, respectively. These fractions are significantly higher than chance expectations ($p=6.7\times10^{-3}$ and $5.6\times10^{-4}$, Fisher’s exact test; Methods), supporting the model’s ability to accurately infer fitness effects in this data. For CH505, we found 10.9-fold more strongly beneficial mutations in the first variable loop (V1) than expected by chance ($p=2.5\times10^{-3}$). This is consistent with the presence of V1 mutations conferring resistance to autologous strain-specific antibodies (Gao et al., 2014). Mutations in V4, a region targeted by CD8+ T cells (Gao et al., 2014), were also 10.0-fold enriched in this subset ($p=9.0\times10^{-5}$). For CH848, mutations in V1, V3, and V5 were enriched by factors of 14.2, 6.3, and 19.1 among the top 2% most beneficial mutations ($p=5.9\times10^{-10}$, $6.6\times10^{-3}$, and $4.8\times10^{-6}$). Mutations in these regions were shown to play a role in resistance to DH270 and DH475 lineage antibodies (Bonsignori et al., 2017). To test whether our results might be biased by overall sequence variability, we examined the relationship between our inferred selection coefficients and entropy, a common measure of sequence variability. Overall, we found only a modest correlation between selection and entropy, suggesting that the signs of selection that we observe are not due to increased sequence variability alone (Figure 1—figure supplement 1).
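
The fold enrichment and Fisher's exact test used above can be sketched as follows. The exact test configuration is described in Methods, so this example makes its own assumptions: a one-sided test computed from the hypergeometric tail, with made-up counts. The function names are hypothetical.

```python
from math import comb

def fold_enrichment(k_top, n_top, k_all, n_all):
    """Ratio of the category's fraction in the top set to its fraction overall."""
    return (k_top / n_top) / (k_all / n_all)

def fisher_one_sided(k_top, n_top, k_all, n_all):
    """P(X >= k_top) when n_top mutations are drawn without replacement
    from n_all mutations, k_all of which belong to the category."""
    denom = comb(n_all, n_top)
    upper = min(n_top, k_all)
    tail = sum(comb(k_all, k) * comb(n_all - k_all, n_top - k)
               for k in range(k_top, upper + 1))
    return tail / denom

# Toy counts (not from the paper): 1000 mutations in total, 50 of them in a
# region of interest; the top 20 mutations include 10 from that region.
fold = fold_enrichment(10, 20, 50, 1000)
p_value = fisher_one_sided(10, 20, 50, 1000)
```

Integer binomial coefficients keep the tail probability exact up to the final division, which matters when p-values are many orders of magnitude below one.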

Reversions and mutations affecting N-linked glycosylation motifs were also likely to be beneficial. We define reversions as mutations where the transmitted/founder (TF) amino acid changes to match the subtype consensus sequence at the same site. Among the top 2% most beneficial mutations, reversions were enriched by factors of 19.9 and 17.8 for CH505 and CH848 viruses, respectively ($p=2.1\times10^{-8}$ and $8.5\times10^{-13}$), consistent with past work finding strong selection for reversions (Zanini et al., 2015; Sohail et al., 2021). For CH848, this group also includes several strongly selected mutations observed in Nef (Figure 1B), a protein that plays multiple roles during HIV-1 infection (Das and Jameel, 2005; Barton et al., 2019). Mutations affecting N-linked glycosylation motifs (i.e., by adding, removing, or shifting a glycosylation motif) were enriched by factors of 4.6 ($p=7.0\times10^{-3}$) and 8.7 ($p=1.4\times10^{-10}$). Changes in glycosylation patterns contributed to antibody escape for both CH505 and CH848 (Gao et al., 2014; Bonsignori et al., 2017).
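
The definition of a reversion given above translates directly into a simple check. This sketch assumes the TF and consensus sequences are already aligned to the same coordinates. The function names and toy sequences are illustrative.

```python
def is_reversion(tf_aa, consensus_aa, new_aa):
    """True if the TF residue differed from consensus and the mutation restores it."""
    return tf_aa != consensus_aa and new_aa == consensus_aa

def find_reversions(tf_seq, consensus_seq, mutations):
    """Return the subset of `mutations` that are reversions.

    mutations: iterable of (site, new_aa). Sequences are assumed pre-aligned.
    """
    return [(i, aa) for i, aa in mutations
            if is_reversion(tf_seq[i], consensus_seq[i], aa)]

# Toy sequences: the TF differs from consensus at site 1 (R vs K).
tf = "MRVK"
consensus = "MKVK"
observed = [(1, "K"), (1, "T"), (3, "R")]
reversions = find_reversions(tf, consensus, observed)
```

Only the change at site 1 back to K counts. A change to T at the same site is a new mutation, and a change at site 3 moves away from the consensus.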

Selection for antibody escape

To quantify levels of selection for antibody escape, we computed selection coefficients for mutations that were observed to contribute to resistance to bnAbs as well as bnAb precursors and autologous strain-specific antibodies (Supplementary file 1 and Supplementary file 3; Gao et al., 2014; McCurley et al., 2017; Kong et al., 2019; Saunders et al., 2022; Bauer et al., 2023). We mapped the inferred selection coefficients to the Env protein structure (Liao et al., 2013), highlighting the binding sites for bnAbs and resistance mutations for strain-specific antibodies, as well as important parts of Env (Figure 2, Figure 2—figure supplement 1). We also analyzed when these resistance mutations were first observed in each individual.

Overall, we observed stronger selection for escape from autologous strain-specific antibodies and/or changes in glycosylation during the first six months of infection (Supplementary file 4 and Supplementary file 5). This was then followed by more modest selection for escape from bnAb lineages (Figure 3). For CH505, mutations that conferred resistance to the intermediate-breadth CH235 lineage (Gao et al., 2014) were less beneficial than top mutations escaping from strain-specific antibodies (Figure 3A). In turn, resistance mutations for the broader CH103 lineage were less beneficial than CH235 resistance mutations (Figure 3A). CH848 is similar, with some highly beneficial mutations affecting glycosylation observed early in infection (Figure 3B). While a few mutations affecting DH270 appear strongly selected, these mutations appeared long before the DH270 lineage was detected (around 3.5 years after infection; Bonsignori et al., 2017). Thus, these mutations may have initially been selected for other reasons.

Figure 3. Trajectories and inferred selection coefficients for immune escape mutations and mutations affecting Env glycosylation.

(A) For CH505, early, strongly selected mutations include ones that escape from CD8+ T cells and autologous strain-specific antibodies (ssAbs). More moderately selected bnAb resistance mutations tend to arise later. Note that mutations that affect bnAb resistance can appear in the viral population before bnAbs are generated. Open circles and error bars reflect the mean and standard deviation of inferred selection coefficients in each category. (B) For CH848, mutations affecting glycosylation dominate the early phase of evolution, followed later by mutations affecting bnAb resistance. For easier visualization, frequency trajectories are shown with exponential smoothing with a time scale of t=50 days. See Figure 3—figure supplement 1 for a detailed view of mutation frequency trajectories by mutation type without smoothing.

Figure 3—figure supplement 1. Frequencies of different types of HIV-1 mutations over time.

CH505 (A) and CH848 (B) mutation frequencies over time. Mutation types are the same as in Figure 3 in the main text, but with all mutations affecting resistance to bnAb lineage antibodies grouped together.

Consistent patterns of selection in SHIV evolution

Simian-human immunodeficiency viruses (SHIVs) have numerous applications in HIV/AIDS research (Hatziioannou and Evans, 2012; Li et al., 2016). Recently, Roark and collaborators studied SHIV-antibody coevolution in rhesus macaques (RMs), which they compared with patterns of HIV-1 evolution (Roark et al., 2021). Two of the SHIV constructs in this study included envelope sequences derived from CH505 and CH848 transmitted/founder (TF) viruses. There, it was found that 2 out of 10 RMs inoculated with SHIV.CH505 and 2 out of 6 RMs inoculated with SHIV.CH848 developed antibodies with substantial breadth.

To understand whether the patterns of HIV-1 selection observed in CH505 and CH848 are repeatable, and to search for viral factors that distinguish between individuals who develop bnAbs and those who do not, we analyzed SHIV.CH505 and SHIV.CH848 evolution in RMs (Roark et al., 2021). To prevent spurious inferences, we first omitted data from RMs with <3 sampling times or <4 sequences in total (Methods). After processing, we examined evolutionary data from seven RMs inoculated with SHIV.CH505 and six RMs inoculated with SHIV.CH848 (Supplementary file 6 and Supplementary file 7). We then computed selection coefficients for SHIV mutations within each RM. Reasoning that selective pressures across SHIV.CH505 and SHIV.CH848 viruses are likely to be similar, we also inferred two sets of joint selection coefficients that best describe SHIV evolution in SHIV.CH505- and SHIV.CH848-inoculated RMs, respectively (Methods).
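
The data-sufficiency filter described above (at least 3 sampling times and at least 4 sequences in total) can be expressed as a small predicate. The data layout and the host labels `RM_A`, `RM_B`, and `RM_C` are hypothetical.

```python
def keep_host(sequences_by_time, min_times=3, min_sequences=4):
    """Decide whether a host has enough data for inference.

    sequences_by_time: dict {sampling time: list of sequences}.
    Time points with no sequences are not counted as sampling times.
    """
    n_times = sum(1 for seqs in sequences_by_time.values() if seqs)
    n_seqs = sum(len(seqs) for seqs in sequences_by_time.values())
    return n_times >= min_times and n_seqs >= min_sequences

# Hypothetical hosts with toy sequences.
hosts = {
    "RM_A": {0: ["ACG", "ACG"], 4: ["ACG"], 8: ["ACT"]},   # 3 times, 4 sequences
    "RM_B": {0: ["ACG"] * 5, 4: ["ACG"] * 5},              # only 2 times
    "RM_C": {0: ["ACG"], 4: ["ACG"], 8: ["ACT"]},          # only 3 sequences
}
kept = [name for name, data in hosts.items() if keep_host(data)]
```

Requiring several time points matters because the estimator relies on frequency changes over time, which cannot be measured from one or two snapshots.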

As before, we examined the top 2% of SHIV.CH505 and SHIV.CH848 mutations that we inferred to be the most beneficial for the virus. Overall, we found consistent selection for reversions (17.9- and 14.2-fold enrichment, $p=4.3\times10^{-11}$ and $1.2\times10^{-11}$ for SHIV.CH505 and SHIV.CH848, respectively), with slightly attenuated enrichment in mutations that affect N-linked glycosylation (2.7- and 4.2-fold enrichment, $p=5.4\times10^{-2}$ and $3.4\times10^{-6}$). However, there is a small subset of mutations that shift glycosylation sites by simultaneously disrupting one N-linked glycosylation motif and completing another, where highly beneficial mutations occur far more often than expected by chance (158.3- and 191.1-fold enrichment, $p=1.7\times10^{-4}$ and $7.9\times10^{-11}$ for CH505 and SHIV.CH505, respectively; 118.7- and 90.4-fold enrichment, $p=2.0\times10^{-4}$ and $2.3\times10^{-7}$ for CH848 and SHIV.CH848).
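
To illustrate what it means for a single mutation to shift a glycosylation site, the sketch below locates N-linked glycosylation sequons (N, then any residue except P, then S or T) before and after a mutation and classifies the change. The function names and toy peptides are illustrative, and the sequences are assumed to be gap-free and of equal length.

```python
import re

# N-linked glycosylation sequon: N, then any residue except P, then S or T.
# The lookahead lets overlapping sequons be found.
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_positions(protein):
    """Return the set of start positions of sequons in `protein`."""
    return {m.start() for m in SEQUON.finditer(protein)}

def classify_glycan_change(before, after):
    """Compare sequon positions in two aligned, gap-free, equal-length sequences."""
    old, new = sequon_positions(before), sequon_positions(after)
    lost, gained = old - new, new - old
    if lost and gained:
        return "shift"
    if gained:
        return "addition"
    if lost:
        return "removal"
    return "none"

# Toy example: N -> T at position 2 destroys the sequon starting at 2 (NAS)
# and completes a new one starting at 0 (NAT).
change = classify_glycan_change("NANAS", "NATAS")
```

A shift arises when the mutated residue is shared by two candidate motifs, here the N of one sequon and the third position of the other.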

Intuitively, one may expect that strongly beneficial SHIV mutations are more likely to be observed in samples from multiple RMs. However, we found that the number of RMs in which a mutation is observed is only weakly associated with the fitness effect of the mutation (Figure 4, Figure 4—figure supplement 1). While substantially deleterious SHIV mutations are rarely observed across multiple RMs, neutral and nearly neutral mutations are common. Thus, in this data set, it is not generally true that SHIV mutations observed in multiple hosts must significantly increase viral fitness.

Figure 4. Example trajectories and selection coefficients for SHIV mutations that affect viral load, bnAb recognition, or glycosylation.

(A) In RM5695, infected with SHIV.CH505, mutations known to increase viral load (Bauer et al., 2023) and ones affecting glycosylation were rapidly selected. Mutations affecting resistance to broad antibodies (Roark et al., 2021) arose later under moderate selection. Open circles and error bars reflect the mean and standard deviation of inferred selection coefficients in each category. (B) Slower but qualitatively similar evolutionary patterns were observed in RM6163, infected with SHIV.CH848. For easier visualization, frequency trajectories are shown with exponential smoothing with a time scale of t=50 days.

Figure 4—figure supplement 1. The number of RMs in which a SHIV mutation was observed is only weakly associated with its inferred fitness effect.

Weak association between the number of RMs in which a SHIV mutation was observed and its inferred fitness effect. Inferred selection coefficients for SHIV.CH505 (A) and SHIV.CH848 (B) mutations, sorted by the number of RMs in which the mutation was observed.

We observed some differences between HIV-1 and SHIV in the precise locations of the most beneficial Env mutations. For example, mutations in V4 are highly enriched in CH505 due to a CD8+ T cell epitope in this region, but not in SHIV.CH505 (2.7-fold enrichment, $p=0.13$). For SHIV.CH848, beneficial mutations are modestly enriched in V1 and V5 (4.7- and 4.2-fold, $p=4.0\times10^{-3}$ and $6.7\times10^{-2}$), as in CH848, but not for V3 (0.9-fold enrichment, $p=0.22$).

Despite some differences in the top mutations, patterns of selection over time in SHIV were very similar to those found for HIV-1. As before, highly beneficial mutations, including ones affecting glycosylation, tended to appear earlier in infection. This was followed by modestly beneficial mutations at later times, including ones involved in resistance to bnAbs in the RMs who developed antibodies with significant breadth (examples in Figure 4).

Detection of SHIV mutations that increase viral load

A major goal of nonhuman primate studies with SHIV is to faithfully recover important aspects of HIV-1 infection in humans. However, due to the divergence of simian immunodeficiency viruses and HIV-1, SHIVs are not always well-adapted to replication in RMs (Bauer et al., 2023). To combat this problem, a recent study identified six SHIV.CH505 mutations that increase viral load (VL) in RMs (Bauer et al., 2023). These mutations result in viral kinetics that better mimic HIV-1 infection in humans.

Our analysis readily identifies the SHIV.CH505 mutations shown to increase VL. Five out of the top six SHIV.CH505 mutations with the largest average selection coefficients are associated with increased VL (Supplementary file 8). The final mutation identified by Bauer et al., N130D, is ranked tenth. We also find highly beneficial mutations in SHIV.CH848 that are distinct from those in SHIV.CH505 (Supplementary file 9). Highly ranked mutations identified here may be good experimental targets for future studies aimed at increasing SHIV.CH848 replication in vivo.

Fitness agreement between HIV-1 and SHIV

Next, we explored the similarity in the overall viral fitness landscapes inferred for HIV-1 and SHIV, beyond just the top mutations. First, we computed the fitness of each SHIV sequence using the joint SHIV.CH505 and SHIV.CH848 selection coefficients inferred from RM data. Then, we computed fitness values for SHIV sequences using selection coefficients inferred from HIV-1 evolution in CH505 and CH848.

We observed remarkable agreement between SHIV fitness values computed from these two sources (Figure 5). For both SHIV.CH505 and SHIV.CH848, viral fitness values estimated using data from humans (i.e., CH505 and CH848) and from RMs are strongly and linearly correlated (Pearson’s $r=0.96$ and $0.95$, $p<10^{-20}$). This implies that evolutionary pressures on the envelope protein in SHIV-infected RMs are highly similar to those on HIV-1 in humans with the same TF Env. In fact, this relationship holds even beyond the SHIV sequences observed during infection. Fitness estimates for sequences with randomly shuffled sets of mutations are also strongly correlated (Figure 5—figure supplement 1). In contrast, the inferred fitness landscapes of CH505 and CH848, which share few mutations in common, are poorly correlated (Figure 5—figure supplement 2). This suggests that the similarities between viral fitness values in humans and RMs are not artifacts of the model, but rather stem from similarities in underlying evolutionary drivers.
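
The comparison described above amounts to scoring the same set of sequences under two sets of selection coefficients and correlating the results. The sketch below uses toy coefficients and sequences that are not taken from the study. The names `human_coeffs` and `macaque_coeffs` are illustrative.

```python
import math

def additive_fitness(seq, coeffs):
    """Additive fitness: 1 plus the coefficients of the alleles in `seq`."""
    return 1.0 + sum(coeffs.get((i, a), 0.0) for i, a in enumerate(seq))

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy selection coefficients inferred from two hypothetical hosts.
human_coeffs = {(0, "K"): 0.04, (1, "D"): 0.01, (2, "T"): -0.02}
macaque_coeffs = {(0, "K"): 0.05, (1, "D"): 0.02, (2, "T"): -0.01}

sequences = ["EAN", "KAN", "KDN", "KDT", "EDT"]
f_human = [additive_fitness(s, human_coeffs) for s in sequences]
f_macaque = [additive_fitness(s, macaque_coeffs) for s in sequences]
r = pearson_r(f_human, f_macaque)
```

Because fitness is a sum over many mutations, the correlation reflects agreement across the whole set of coefficients rather than only the few largest ones.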

Figure 5. Inferred fitness landscapes in HIV-1 and SHIV are highly similar.

(A) Fitness of SHIV.CH505 sequences relative to the TF sequence across 7 RMs, including 2 that developed bnAbs, 1 that developed tier 2 nAbs that lacked a critical mutation for breadth, and 4 that did not develop broad antibody responses. Fitness was evaluated using mutational effects inferred from CH505 data and, separately, from RM data. The fitness values are strongly correlated, indicating the similarity of Env fitness landscapes inferred using HIV-1 or SHIV data. Values are normalized such that the fitness gain of the TF sequence is zero. (B) Fitness values for SHIV.CH848 sequences also show strong agreement between CH848 and SHIV.CH848 landscapes.

Figure 5—figure supplement 1. Randomized sequences show broad similarity between HIV-1 and SHIV fitness landscapes.

Broad similarity between HIV-1 and SHIV fitness landscapes with the same TF Env sequence. (A) Fitness estimates for a sample of artificial Env sequences, obtained by independently shuffling observed amino acids in SHIV.CH505 sequences at each residue. This random sequence ensemble conserves single-residue frequencies, but not correlations between mutations. Even on these artificial sequences, the fitness estimates using a model trained on HIV-1 data strongly agree with the SHIV model (Pearson’s r=0.84, $p<10^{-20}$). This implies that the similarity of the fitness landscapes is not confined to the specific genotypes observed in HIV-1 or SHIV evolution, but also extends to more distant sequences. (B) Similar results also hold for fitness landscapes based on CH848 and SHIV.CH848 data (Pearson’s r=0.74, $p<10^{-20}$).
Figure 5—figure supplement 2. Little correlation in fitness values estimated from evolutionarily distant sequences.

Comparison between estimated fitness values for HIV-1 sequences using fitness landscapes learned from CH505 and CH848 data. There is little correlation between the inferred landscapes. Furthermore, the CH505 landscape captures little variation in fitness for CH848 sequences (and vice versa). 199 mutations are shared between CH505 and CH848, among 868 and 1406 total Env mutations, respectively.

Evolutionary dynamics forecast antibody breadth

Given the similarity of HIV-1 and SHIV evolution, we sought to identify evolutionary features that distinguish between hosts who develop broad antibody responses and those who do not. Figure 5 shows that SHIV sequences from hosts with bnAbs often reach higher fitness values than those in hosts with only narrow-spectrum antibodies. We hypothesized that stronger selective pressures on the virus might drive viral diversification, stimulating the development of antibody breadth. Past studies have associated higher viral loads with bnAb development and observed viral diversification around the time of bnAb emergence (Liao et al., 2013; Moore et al., 2015; Landais and Moore, 2018). Computational studies and experiments have also shown that sequential exposure to diverse antigens can induce cross-reactive antibodies (Wang et al., 2015; Escolano et al., 2016; Sprenger et al., 2020).

To further quantify SHIV evolutionary dynamics, we computed the average fitness gain of viral populations in each RM over time. We observed a striking difference in SHIV fitness gains between RMs that developed broad antibody responses and those that did not (Figure 6). In particular, SHIV fitness increased rapidly before the development of antibody breadth. SHIV fitness gains in RM5695, which developed exceptionally broad and potent antibodies (Roark et al., 2021), were especially rapid and dramatic. These fitness differences were not attributable to bnAb resistance mutations, which were only moderately selected and generally appeared after bnAbs developed.

Figure 6. Rapid SHIV fitness gains precede the development of broadly neutralizing antibodies.

For both SHIV.CH505 (A) and SHIV.CH848 (B), viral fitness gains over time display distinct patterns in RM hosts that developed bnAbs versus those that did not. Notably, the differences in SHIV fitness gains between hosts with and without broad antibody responses appear before the development of antibody breadth and cannot be attributed to selection for bnAb resistance mutations. RM6072 is an unusual case, exhibiting antibody development that was highly similar to CH505. Although RM6072 developed tier 2 nAbs, they lacked key mutations critical for breadth (Roark et al., 2021). Points and error bars show the mean and standard deviation of fitness gains across SHIV samples in each RM at each time.

Figure 6—figure supplement 1. Contributions of different types of SHIV mutations to viral fitness gains over time.

SHIV.CH505 (A) and SHIV.CH848 (B) mutations were grouped into four categories to assess their contributions to SHIV fitness gains over time. For SHIV.CH505, the mutations N334S, H417R, K302N, Y330H, N279D, and N130D are classified as viral load (VL) enhancing mutations, following Bauer et al., 2023. For both SHIV.CH505 and SHIV.CH848, we then separated out the contributions of known antibody resistance mutations, including mutations that affect N-linked glycosylation motifs. We then computed the collective fitness contributions from subsets of mutations that affect N-linked glycosylation motifs that were not known to affect resistance to specific antibodies and reversions to the HIV-1 subtype consensus sequence. Each mutation appears in only one category in this figure, sorted in the order above. For example, a mutation that affects an N-linked glycosylation motif and which is a reversion to the subtype consensus sequence, but which has not been established to affect resistance to a specific antibody, would have its contribution to fitness counted in the glycan category.
Figure 6—figure supplement 2. Distribution of inferred epistatic interactions.

Most of the inferred epistatic interactions concentrate around zero, with only a small fraction exhibiting moderately larger absolute values. Distribution of inferred epistatic interactions for CH505 (A), CH848 (B), SHIV.CH505 (C), and SHIV.CH848 (D).
Figure 6—figure supplement 3. Similarity between effective selection coefficients obtained from the epistatic model and selection coefficients in the additive model.

Effective selection coefficients obtained from the epistatic model align well with the selection coefficients in the additive model. We compared selection coefficients from the additive model analyzed throughout the manuscript to the effective selection coefficients from the epistatic model for CH505 (A) and CH848 (B). Intuitively, the effective selection coefficient is defined as the average difference in fitness when a particular mutation is replaced by the TF nucleotide/amino acid. For definiteness, let $g$ represent an arbitrary sequence and let $g^{\setminus i}$ represent a sequence that is identical to $g$, except that the nucleotide/amino acid at site $i$ has been replaced by the TF one. The effective selection coefficient is then defined as $s_i^{\mathrm{eff}}(a) = \langle F(g) - F(g^{\setminus i}) \rangle_{g \in \mathrm{data} \,|\, g_i = a}$, where the average runs over all the sequences in the data set that have mutant allele $a$ at site $i$. Similarly, we compared selection from the additive and epistatic models for SHIV.CH505 (C) and SHIV.CH848 (D).
Figure 6—figure supplement 4. Comparison between fitness values in the additive and epistatic models.

Fitness values are consistent between the additive and epistatic models. Comparison of the fitness values obtained from the additive and epistatic models for CH505 (A), CH848 (B), SHIV.CH505 (C), and SHIV.CH848 (D), respectively.
Figure 6—figure supplement 5. Robustness of the inferred selection coefficients using bootstrap resampling.

Selection coefficients are robust to finite sampling noise. Selection coefficients from the full and bootstrap-resampled data are compared for CH505 (A), CH848 (B), SHIV.CH505 (C), and SHIV.CH848 (D). Each point and error bar represents the mean and confidence interval, respectively, based on 10 independently inferred selection coefficients from bootstrap samples. The bootstrap samples are obtained by uniformly resampling the same number of sequences from the sequence ensemble for each subject at each time point.

One outlier in this pattern is RM6072, infected with SHIV.CH505. Antibody development in RM6072 followed a path that was remarkably similar to CH505, including a lineage of antibodies, DH650, directed toward the CD4 binding site of Env (Roark et al., 2021). However, resistance to the DH650 lineage is conferred by a strongly selected mutation that adds a glycan at site 234 (T234N, with an inferred selection coefficient of 4.5%). Broadly neutralizing antibodies similar to DH650 are able to accommodate this glycan due to shorter and/or more flexible light chains (Zhou et al., 2013; Roark et al., 2021), but DH650 cannot. Antibody evolution in RM6072 thus proceeded along a clear pathway toward bnAb development but lacked critical mutations to achieve breadth.

Next, we quantified how different types of SHIV mutations contributed to viral fitness gains over time. We examined contributions from VL-enhancing mutations (Bauer et al., 2023), antibody escape mutations (Bauer et al., 2023; Roark et al., 2021), other mutations affecting Env glycosylation, and reversions to subtype consensus. We found increased fitness gains across all types of mutations in RMs that developed broad antibody responses, compared to those that did not (Figure 6—figure supplement 1). VL-enhancing mutations, known antibody resistance mutations, and reversions typically made the largest contributions to viral fitness.

Robustness of inferred selection to changes in the fitness model and finite sampling

In the analysis above, we used a simple model where the net fitness effect of multiple mutations is simply equal to the sum of their individual effects. Recently, methods have also been developed that can infer epistatic fitness effects from data, which include pairwise interactions between mutations (Sohail et al., 2022; Shimagaki and Barton, 2025b). We reanalyzed these data to examine how inferred fitness changes when epistasis is included in the model, using the approach of Shimagaki and Barton (2025; Methods). Overall, the inferred epistatic interactions were modest (Figure 6—figure supplement 2). In CH505, we found that the CD4 binding site, V1 (especially sites 136–146 in HXB2 numbering) and V5 regions were modestly but significantly enriched in the most beneficial (top 1%) of epistatic interactions (2.5-, 1.2-, and 1.8-fold enrichment with $p=1.0\times10^{-21}$, $6.3\times10^{-6}$, and $6.3\times10^{-5}$, respectively). Epistatic interactions between N280S/V281A and E275K/V281G, which confer resistance to CH235 (Gao et al., 2014), ranked in the top 6.5% and 13.0% of interactions. In CH848, we found 1.3-, 1.5-, and 2.3-fold enrichment in strong beneficial epistatic interactions in the CD4 binding site, V4, and V5 regions, respectively ($p=4.0\times10^{-6}$, $2.5\times10^{-14}$, and $3.2\times10^{-19}$).

To compare the typical fitness effects of individual mutations in the model with epistasis to those in the additive model, we computed effective selection coefficients for the epistatic model. For each mutant allele 𝑎 at each site 𝑖, we computed the average difference in fitness between sequences in the data set with the mutation and hypothetical sequences that are the same as those in the data, except with the mutant allele 𝑎 reverted to the TF one. In this way, the effective selection coefficient measures the typical effect of each mutation in the data set, while also accounting for epistatic interactions with the sequence background. We found that the effective selection coefficients were highly correlated with the selection coefficients from the additive model (Figure 6—figure supplement 3). We also found strong agreement between the additive and epistatic model fitness values for each sequence in both HIV-1 and SHIV data (Figure 6—figure supplement 4).
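The averaging described above can be illustrated with a minimal Python sketch. It assumes a fitness function with additive terms plus pairwise terms between non-TF alleles; this functional form, the dictionary layout of the coefficients, and the function names are assumptions made for illustration, not the implementation used in the analysis.

```python
from itertools import combinations

def epistatic_fitness(seq, tf_seq, s1, s2):
    """Fitness with additive terms s1[(i, a)] and pairwise terms
    s2[((i, a), (j, b))] for i < j (assumed form; TF alleles contribute zero)."""
    muts = [(i, a) for i, (a, t) in enumerate(zip(seq, tf_seq)) if a != t]
    f = 1.0 + sum(s1.get(m, 0.0) for m in muts)
    f += sum(s2.get((m, n), 0.0) for m, n in combinations(muts, 2))
    return f

def effective_selection(site, allele, data, tf_seq, s1, s2):
    """Average of F(g) - F(g with `site` reverted to TF) over sequences in
    `data` that carry `allele` at `site`. Returns None if no sequence does."""
    diffs = []
    for seq in data:
        if seq[site] != allele:
            continue
        reverted = seq[:site] + tf_seq[site] + seq[site + 1:]
        diffs.append(epistatic_fitness(seq, tf_seq, s1, s2)
                     - epistatic_fitness(reverted, tf_seq, s1, s2))
    return sum(diffs) / len(diffs) if diffs else None

# Toy example with made-up coefficients
tf = "AAA"
s1 = {(0, "C"): 0.02, (2, "G"): 0.01}
s2 = {((0, "C"), (2, "G")): 0.01}
data = ["CAA", "CAG", "AAG"]
s_eff = effective_selection(0, "C", data, tf, s1, s2)
```

In this toy example, the mutation at site 0 has an additive coefficient of 0.02, but its effective coefficient is larger because one of the two sequences carrying it also carries a positively interacting mutation.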

Finite sampling of sequence data could also affect our analyses. To further test the robustness of our results, we inferred selection coefficients using bootstrap resampling, where we resample sequences from the original ensemble, maintaining the same number of sequences for each time point and subject. The selection coefficients from the bootstrap samples are consistent with the original data (see Figure 6—figure supplement 5 for a typical example), with Pearson’s 𝑟 values of around 0.85 for HIV-1 data sets and 0.95 for SHIV data sets, respectively.
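The resampling scheme can be sketched as follows. This is an illustrative sketch only: the dictionary keyed by (subject, time point), the function names, and the hand-written Pearson correlation are assumptions, and the inference step that would be run on each bootstrap sample is omitted.

```python
import random

def bootstrap_resample(samples, rng):
    """samples: {(subject, time): [sequences]}.
    Resample with replacement, keeping the number of sequences
    for each subject and time point unchanged."""
    return {key: [rng.choice(seqs) for _ in seqs] for key, seqs in samples.items()}

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Toy data with hypothetical sequences
rng = random.Random(0)
samples = {("CH505", 0): ["AAA", "AAC", "AAG"], ("CH505", 10): ["CCC", "CCA"]}
boot = bootstrap_resample(samples, rng)
```

Selection coefficients inferred from each bootstrap sample could then be compared with those from the full data using `pearson_r`.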

Discussion

HIV-1 evolves under complex selective pressures within individual hosts, balancing replicative efficiency with immune evasion. Here, we quantitatively studied the evolution of HIV-1 and SHIV (featuring HIV-1-derived Env sequences) across multiple hosts, including some who developed broad antibody responses against the virus. Our study highlighted how different classes of mutations (e.g. mutations affecting T cell escape or Env glycosylation) affect fitness in vivo. In both HIV-1 and SHIV, we found strong selection for reversions to subtype consensus and some mutations that affected N-linked glycosylation motifs or resistance to autologous strain-specific antibodies. Few CD8+ T cell epitopes were identified in this data set, but the T cell escape mutations that we did observe were highly beneficial for the virus. Consistent with past work studying VRC26 escape in CAP256, we observed more modest selection for bnAb resistance mutations (Sohail et al., 2021).

Overall, we found striking similarities between Env evolution in humans and RMs. Importantly, these parallels extend beyond the observation of repetitive mutations: the number of hosts in which a mutation was observed was only weakly associated with the mutation’s fitness effect (Figure 4—figure supplement 1). Our inferred Env fitness values in humans and RMs were highly correlated, indicating that the functional and immune constraints shaping Env evolution in HIV-1 and SHIV infection are very similar. Our findings, therefore, reinforce SHIV as a model system that closely mirrors HIV-1 infection.

We discovered that the speed of SHIV fitness gains was clearly higher in RMs that developed broad antibody responses than in those with narrow-spectrum antibodies. Fitness gains in the viral population preceded the development of bnAbs, and they were not driven by bnAb resistance mutations. This suggests that rapid changes in the viral population are a cause rather than a consequence of antibody breadth. While our sample is limited to 13 RMs and two founder Env sequences, we find a clear separation between RMs that did or did not develop antibody breadth. Thus, the dynamics of viral fitness may serve as a quantitative signal associated with bnAb development.

The induction of bnAbs is a major goal of HIV-1 vaccine design (Haynes et al., 2023). Both computational (Wang et al., 2015; Shaffer et al., 2016; Sprenger et al., 2020; Nourmohammad et al., 2016) and experimental (Dosenovic et al., 2015; Escolano et al., 2016; Williams et al., 2023) studies, as well as observations from individuals who developed bnAbs (Gao et al., 2014; Liao et al., 2013; Bonsignori et al., 2017), suggest that the co-evolution of antibodies and HIV-1 is important to stimulate broad antibody responses. Our results could thus inform HIV-1 vaccine research. While precise immune responses and viral escape pathways can differ across individuals, the quantitative similarity in viral evolutionary constraints across humans and RMs suggests that SHIV data can provide a valuable source of information about Env variants that contribute to bnAb development, especially when detailed longitudinal data from humans does not exist. While the concept of sequential immunization is well-established (Pancera et al., 2010; Haynes et al., 2012; Klein et al., 2013; Wang et al., 2015; Escolano et al., 2016), our findings also suggest a possible new design principle. Immunogens could be engineered to reproduce the dynamics of viral population change that are associated with rapid fitness gains, which we found to precede the emergence of bnAbs. This emphasis on broader, population-level dynamics could complement investigations of the molecular details of virus and antibody coevolution.

As noted above, Roark and collaborators also performed a detailed comparison of HIV-1 and SHIV evolution with the same TF Env sequences (Roark et al., 2021). One of their main conclusions was that most Env mutations were selected for escape from CD8+ T cells or antibodies. We found that many antibody resistance mutations identified by Roark et al. are also positively selected in our analysis. Mutations at sites 166 and 169 were shown to confer resistance to a V2 apex bnAb, RHA1, isolated in RM5695 (Roark et al., 2021). We inferred moderately positive selection coefficients of 0.49% and 0.43% for R166K and R169K, respectively. The same mutations were found in RM6070, which also developed V2 apex bnAbs, with a selective advantage of 1.7% (Supplementary file 10). Mutations conferring resistance to autologous strain-specific nAbs were identified at multiple sites by Roark and colleagues: 130, 234, 279, 281, 302, 330, and 334 in RM6072, which developed antibody responses targeting the CD4 binding site (DH650) and V3 (DH647 and DH648) regions. Mutations Y330H and N334S, which confer resistance to V3 autologous nAbs, were detected in all RMs infected with SHIV.CH505, with selective advantages of 3.0% and 4.6% in RM6072, and 1.7% and 3.2% on average across RMs, respectively. Overall, we found that mutations conferring resistance to autologous strain-specific antibodies were common and more strongly selected than bnAb resistance mutations (Supplementary file 10 and Supplementary file 11).

We note that our conclusions about the phenotypic effects of HIV-1 mutations under selection are constrained by the available data. While we observed strong selection for strain-specific antibody resistance mutations, these results could also be affected by the effects of these mutations on viral replication independent of immune escape. In particular, many ssAb resistance mutations are also reversions to the subtype consensus sequence, which have often been observed to improve viral fitness (Zanini et al., 2015; Sohail et al., 2021). For example, N334S, K302N, and T234N are all reversions. These are among the most beneficial mutations inferred for SHIV.CH505 (Supplementary file 8). In future work, it would be interesting to attempt to fully separate the fitness effects of mutations due to antibody escape and intrinsic replication (Gao and Barton, 2025). Although we have systematically compiled information about mutations known to affect antibody resistance and glycosylation, this data is necessarily incomplete. Some of the strongly beneficial mutations with unknown functional effects that we observe could therefore reflect escape from unmapped immune responses.

There are additional methodological and technical limitations that should be considered in the interpretation of our results. Most notably, we assume that the viral fitness landscape is static in time. While we do not expect selection for effective replication (‘intrinsic’ fitness) to change substantially over time, pressure for immune escape could vary along with the immune responses that drive them. In prior work, we have found that constant selection coefficients typically reflect the average fitness effect of a mutation when its true contribution to fitness is time-varying (Gao and Barton, 2025; Lee et al., 2025). This may not adequately describe mutational effects that undergo large or rapid shifts in time. Future work should also examine temporal patterns in selection for individual mutations.

While we found a strong relationship between viral fitness dynamics and the emergence of bnAbs, it may not be true that the former stimulates the latter. For example, bnAbs may have been present within each host before they were experimentally detected. Rapid viral fitness gains within hosts that developed broad antibody responses could then have been driven by undetected bnAb lineages. However, we did not find strong selection for known bnAb resistance mutations, and in at least one case (RM5695), rapid fitness gains (roughly 2 weeks after infection) substantially preceded bnAb detection (16 weeks). Still, given the limited size of the data set that we studied, the extent to which our results will transfer to larger and broader data sets is unclear.

Among other analyses, Roark et al. used LASSIE (Hraber et al., 2015) to identify putative sites under selection (Supplementary file 12 and Supplementary file 13). This method works by identifying sites where non-TF alleles reach high frequencies. We found modest overlap between the sites under selection as identified by LASSIE and the mutations that we inferred to be the most strongly selected. For SHIV.CH505, the E640D mutation at site 640 identified by LASSIE is ranked second among 664 mutations in our analysis, and mutations at the remaining 5 sites identified by LASSIE are all within the top 20% of mutations that we infer to be the most beneficial. For SHIV.CH848, the R363Q mutation that is ranked first in our analysis appears at one of the 17 sites identified by LASSIE. Some mutations at the majority of these 17 sites fall within the top 20% most beneficial mutations in our analysis, but some are outliers. In particular, we infer both S291A/P to be somewhat deleterious, with S291P ranked 810th out of 863 mutations.

Beyond the specific context of HIV-1 and bnAb development, our study also provides insight into viral evolution across hosts and related host species. Parallels between the HIV-1 and SHIV fitness landscapes that we infer suggest that there are strong constraints on viral protein function, with few paths to significantly higher fitness. This is consistent with the ideas behind methods that use sequence statistics across multiple individuals and hosts to predict the fitness effects of mutations (Ferguson et al., 2013; Mann et al., 2014; Lässig et al., 2017; Łuksza and Lässig, 2014; Barton et al., 2016; Louie et al., 2018; Hie et al., 2021). However, the relationship between the number of individuals in which a mutation was observed and its inferred fitness effect was fairly weak. This suggests that mutational biases and/or sequence space accessibility may play significant roles in short-term viral evolution, even for highly mutable viruses such as HIV-1 and SHIV. As described above, high-frequency mutations were also not necessarily highly beneficial. While the recombination rate of HIV-1 is high, correlations between mutations persist, making it difficult to unambiguously interpret frequency changes as signs of selection (Sohail et al., 2021).

Our results also point to strong similarities in the immune environment across closely related host species, including preferential targeting of specific parts of viral surface proteins by antibodies. This is supported by the enrichment of beneficial mutations within variable loop regions and at sites that affect the glycosylation of Env. However, despite these constraints, there may still exist a large number of neutral or nearly-neutral mutational paths that remain unexplored.

Overall, our findings support the potential predictability of viral evolution, at least over short time scales. While there are contingencies in evolution – for example, disparate host immune responses or strong epistatic constraints between mutations – these are not so pervasive that they completely change the effective viral fitness landscape or paths of evolution across hosts, given the same founder virus sequence. Similar observations of parallel evolution in HIV-1 have been reported in monozygotic twins infected by the same founder virus (Draenert et al., 2006), common patterns of immune escape across hosts (Choisy et al., 2004; Barton et al., 2016) and drug resistance (Wensing et al., 2016; Feder et al., 2014; Feder et al., 2016; Feder et al., 2021), and long-term experimental evolution (Bons et al., 2020). Our results thus contribute to a growing body of research identifying predictable features in viral evolution. Understanding such features could ultimately inform practical applications such as anticipating the emergence of drug resistance or designing vaccines to limit likely pathways of escape.

Methods

Data

We retrieved HIV-1 sequences from CH505 (703010505) and CH848 (703010848) from the HIV sequence database at Los Alamos National Laboratory (LANL) (Los Alamos National Laboratory, 2023a). The rhesus macaque (RM) SHIV sequences (Roark et al., 2021) were obtained from GenBank (Benson et al., 2012). We then co-aligned SHIV.CH505 and SHIV.CH848 sequences with CH505 and CH848 HIV-1 sequences, respectively, using HIValign (Los Alamos National Laboratory, 2023b).

CH505

CH505 developed two distinct lineages of CD4 binding site (CD4bs) bnAbs, CH103 and CH235 (Gao et al., 2014; Kreer et al., 2023). CH103 antibodies were detectable by 14 weeks after infection and further developed neutralization breadth between 41–92 weeks (Liao et al., 2013). IC50 values of CH235 against the TF virus were 6.5-fold lower than those of CH103 (Gao et al., 2014). CH235 lineages could neutralize autologous viruses at week 30. However, viruses that acquired mutations at loop D from 53 to 100 weeks escaped CH235 neutralization (Gao et al., 2014). Although the neutralization breadth of CH235 was not as broad as that of CH103, this lineage played a critical role; escaping mutations from the CH235 lineage stimulated the development of another lineage with broader neutralization depth (Gao et al., 2014). Mutations in loop D enabled the virus to escape from CH235, but these sequential mutations in loop D, such as E275K, N279D, and V281S, favorably bound to the mature CH103 and continuously increased the binding affinity between mature CH103 and loop D (Gao et al., 2014). Gradually, CH103 matured, developing a broader neutralization breadth.

CH848

CH848 developed DH270, a bnAb that targets the glycosylated site adjacent to the third variable loop (V3). DH270 was detectable three and a half years after infection (Bonsignori et al., 2017). Similar to the CH505 case, the CH848 case exhibited cooperative virus and antibody coevolution. The earlier antibody lineages, DH272 and DH475, could neutralize autologous viruses until week 51 and weeks 15–39, respectively. The virus escaped from DH272 and DH475 afterward, with escape mutations including a longer V1V2 loop. DH270 then developed, with potent and broad neutralization breadth (Bonsignori et al., 2017).

Rhesus macaques

Chimeric viruses (SHIVs) were constructed bearing the transmitted/founder (TF) Env from three HIV-1 patients, including CH505 and CH848 (Roark et al., 2021). In some RMs, SHIV developed similar patterns of mutations to those observed in human donors. In our analysis, we considered RMs with SHIV sequences sampled at three or more points in time. This yielded a set of 7 RMs and 6 RMs for SHIV.CH505 and SHIV.CH848, respectively. Supplementary file 6 and Supplementary file 7 summarize the number of sequences, time points, and the development of bnAbs for each individual in the SHIV cases as well as HIV-1 cases.

Sequence data processing

Data quality control

To focus our analysis on functional sequences, we removed sequences with more than 200 gaps. To eliminate rare insertions or possible alignment errors, we also masked sites where gaps occurred in more than 95% of sequences within each individual host. To limit errors in virus frequencies, we only considered data from time points with four or more sequences.
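As a rough illustration, the three filters above could be applied as in the following Python sketch. The data layout (a list of (time, aligned sequence) pairs for a single host), the order of the steps, the use of `-` as the gap character, and the choice to drop masked columns entirely are all assumptions made for this sketch rather than details given in the text.

```python
from collections import defaultdict

def quality_control(records, max_gaps=200, gap_frac=0.95, min_seqs=4):
    """records: list of (time, aligned_sequence) pairs for ONE host.
    Returns ({time: [filtered sequences]}, [masked site indices])."""
    # 1. Remove sequences with more than `max_gaps` gaps.
    kept = [(t, s) for t, s in records if s.count("-") <= max_gaps]
    if not kept:
        return {}, []
    # 2. Mask sites where gaps occur in more than `gap_frac` of sequences.
    n, length = len(kept), len(kept[0][1])
    masked = [i for i in range(length)
              if sum(s[i] == "-" for _, s in kept) > gap_frac * n]
    masked_set = set(masked)
    by_time = defaultdict(list)
    for t, s in kept:
        by_time[t].append("".join(c for i, c in enumerate(s) if i not in masked_set))
    # 3. Keep only time points with at least `min_seqs` sequences.
    return {t: seqs for t, seqs in by_time.items() if len(seqs) >= min_seqs}, masked

# Toy example; max_gaps is lowered only because the toy sequences are short
records = [(0, "A-C"), (0, "A-C"), (0, "A-G"), (0, "A-C"), (1, "A-C"), (0, "---")]
filtered, masked_sites = quality_control(records, max_gaps=1)
```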

Identifying reversions

A mutation is classified as a reversion if the new (mutant) nucleotide matches with the nucleotide at the same site in the HIV-1 consensus sequence from the same subtype. Here, all viruses were subtype C, so we compared with the subtype C consensus sequence as defined by LANL.
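This rule can be written as a short Python sketch. It assumes that the TF sequence, the observed sequence, and the subtype consensus are already aligned to the same coordinates; the function names and toy sequences are hypothetical.

```python
def is_reversion(tf_nt, mutant_nt, consensus_nt):
    """True if the nucleotide differs from TF and matches the subtype consensus."""
    return mutant_nt != tf_nt and mutant_nt == consensus_nt

def find_reversions(tf_seq, seq, consensus_seq):
    """Indices of sites in `seq` that carry a reversion to consensus."""
    return [i for i, (t, m, c) in enumerate(zip(tf_seq, seq, consensus_seq))
            if is_reversion(t, m, c)]

# Toy example: site 1 mutates toward consensus, site 3 mutates away from it
reversion_sites = find_reversions("ACGT", "ATGA", "ATGT")
```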

Identifying mutations that affect N-linked glycosylation

To identify mutations that affect glycosylation, we searched for Env mutations that modify the N-linked glycosylation motif Asn-X-Ser/Thr, where X can be any amino acid except proline. We identified three types of mutations affecting glycosylation: ‘shields’, which complete a previously incomplete glycosylation motif, ‘holes’, which disrupt an existing glycosylation motif, and ‘shifts’, which simultaneously complete one N-linked glycosylation motif and disrupt another.
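The classification into shields, holes, and shifts can be illustrated with a minimal Python sketch. It operates on an ungapped amino acid sequence and a single substitution; the function names and toy sequences are hypothetical, and handling of gaps or codon-level changes is left out.

```python
def sequon_sites(protein):
    """Start positions of N-X-S/T motifs, where X is any residue except proline."""
    return {i for i in range(len(protein) - 2)
            if protein[i] == "N" and protein[i + 1] != "P" and protein[i + 2] in ("S", "T")}

def classify_glycan_mutation(protein, site, new_aa):
    """Return 'shield', 'hole', 'shift', or None for a single substitution."""
    mutant = protein[:site] + new_aa + protein[site + 1:]
    before, after = sequon_sites(protein), sequon_sites(mutant)
    gained, lost = after - before, before - after
    if gained and lost:
        return "shift"   # completes one motif and disrupts another
    if gained:
        return "shield"  # completes a previously incomplete motif
    if lost:
        return "hole"    # disrupts an existing motif
    return None

# Toy examples
shield = classify_glycan_mutation("AAKTA", 1, "N")  # creates N-K-T
hole = classify_glycan_mutation("ANKTA", 3, "A")    # destroys N-K-T
shift = classify_glycan_mutation("NASAT", 2, "N")   # N-A-S lost, N-A-T gained
```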

Enrichment analysis

We used fold enrichment values and Fisher’s exact test to quantify the excess or lack of mutations. For a particular subset of mutations (for example, the top $x$% beneficial mutations), we first computed the number of mutations in that subset that have a particular property (e.g. nonsynonymous mutations in the CD4 binding site), $n_{\mathrm{sel}}$, and the total number of mutations in the subset, $N_{\mathrm{sel}}$. We then computed the corresponding numbers across the entire data set: the number of mutations that have the property, $n_{\mathrm{null}}$, and the total number of mutations, $N_{\mathrm{null}}$. The fold enrichment value is then $(n_{\mathrm{sel}}/N_{\mathrm{sel}})/(n_{\mathrm{null}}/N_{\mathrm{null}})$. The numerator $n_{\mathrm{sel}}/N_{\mathrm{sel}}$ quantifies the fraction of the selected mutations that have the property, while the denominator $n_{\mathrm{null}}/N_{\mathrm{null}}$ is the fraction of all mutations that have the property. Fisher’s exact $p$ values are computed from the 2 × 2 table with $n_{\mathrm{sel}}$, $n_{\mathrm{null}}-n_{\mathrm{sel}}$ in the first row, and $N_{\mathrm{sel}}-n_{\mathrm{sel}}$, $N_{\mathrm{null}}-N_{\mathrm{sel}}-(n_{\mathrm{null}}-n_{\mathrm{sel}})$ in the second row (Ruxton and Neuhäuser, 2010).
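A minimal Python sketch of this calculation follows. It reads N_sel and N_null as the total numbers of mutations in the selected subset and in the full data set, which is the reading implied by the 2 × 2 table. The one-sided (enrichment) form of Fisher’s exact test shown here is an assumption, since the text does not state which alternative was used, and the counts in the example are made up.

```python
from math import comb

def fold_enrichment(n_sel, N_sel, n_null, N_null):
    """Fraction with the property among selected mutations, divided by
    the fraction with the property among all mutations."""
    return (n_sel / N_sel) / (n_null / N_null)

def fisher_table(n_sel, N_sel, n_null, N_null):
    """2 x 2 table: rows = has / lacks property, columns = selected / not selected."""
    return [[n_sel, n_null - n_sel],
            [N_sel - n_sel, N_null - N_sel - (n_null - n_sel)]]

def fisher_one_sided(table):
    """Probability of observing at least table[0][0] selected mutations with
    the property under the hypergeometric null (one-sided; an assumption)."""
    (a, b), (c, d) = table
    row1, col1, total = a + b, a + c, a + b + c + d
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1))
    return tail / comb(total, col1)

# Made-up counts: 5 of 10 selected mutations have the property, versus 20 of 200 overall
table = fisher_table(5, 10, 20, 200)
enrichment = fold_enrichment(5, 10, 20, 200)
p_value = fisher_one_sided(table)
```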

Inferring fitness effects of mutations

In this section, we describe the inference framework used to infer the fitness effects of mutations (selection coefficients) from temporal genetic data.

Evolutionary model

We model viral evolution with the Wright-Fisher (WF) model, a fundamental model in population genetics (Ewens, 2004). In this model, a population of $N$ individuals (viruses or infected cells, in our case) undergoes discrete rounds of selection, mutation, and replication. Each genotype $\alpha$ is represented by a sequence $g^{\alpha}=((g^{\alpha}_{i,a})_{a=1}^{q})_{i=1}^{L}$, where $g^{\alpha}_{i,a}$ is equal to one if genotype $\alpha$ has allele $a$ at locus $i$ and zero otherwise. Here, $L$ and $q$ represent the length of the genetic sequence (number of loci) and the number of states at each locus (i.e. number of nucleotides or amino acids), respectively. We use $q=5$ and $q=21$ for DNA and amino acid sequences, respectively, in real HIV-1 and SHIV data.

We define the fitness of an individual with genetic sequence 𝑔 by

$$F(g) = 1 + \sum_{i=1}^{L} \sum_{a=1}^{q} s_i(a)\, g_{i,a}. \tag{4}$$

Here 𝑠𝑖(𝑎) is a selection coefficient, quantifying the fitness effect of allele 𝑎 at locus 𝑖. If 𝑠𝑖(𝑎)>0, the allele 𝑎 is beneficial (enhancing replication), and if 𝑠𝑖(𝑎)<0 it is deleterious (impairing replication). By convention, we set the selection coefficient for TF alleles to zero. Individuals with higher fitness values are more likely to replicate than those with lower fitness.
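As a minimal sketch of Equation 4, the following Python function sums selection coefficients over the non-TF alleles in a sequence. The representation of sequences as strings aligned to the TF sequence, the dictionary of coefficients keyed by (locus, allele), and the numerical values are assumptions made for illustration.

```python
def additive_fitness(seq, tf_seq, sel):
    """F(g) = 1 + sum of s_i(a) over alleles in `seq`.
    TF alleles have selection coefficient zero by convention, so only
    sites that differ from the TF sequence contribute."""
    if len(seq) != len(tf_seq):
        raise ValueError("sequence must be aligned to the TF sequence")
    fitness = 1.0
    for i, (a, tf_a) in enumerate(zip(seq, tf_seq)):
        if a != tf_a:
            fitness += sel.get((i, a), 0.0)
    return fitness

# Toy example with made-up selection coefficients
tf = "ACGT"
sel = {(1, "T"): 0.03, (3, "A"): -0.01}
f_mutant = additive_fitness("ATGA", tf, sel)
f_tf = additive_fitness(tf, tf, sel)
```

The fitness gain relative to the TF sequence is then `f_mutant - f_tf`, which is zero for the TF sequence itself.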

Mutations introduce new genotypes and drive the evolution of the population. Let us define 𝜇𝛼𝛽 as the probability of mutation from genotype 𝛼 to genotype 𝛽 per replication cycle. Below, we will express this probability in terms of a mutation rate per site per round of replication. In the analysis of real data, we use asymmetric mutation rates estimated from intra-host HIV-1 data (Zanini et al., 2015).

Given these parameters, the WF model describes the dynamics of the frequencies of different genotypes in the population over time. We write the frequency of genotype $\alpha$ at time $t$ as $z_\alpha(t)$. Given that the frequency of genotypes in the population at time $t$ is $z(t)=(z_1(t), z_2(t), \ldots, z_M(t))$, where $M$ is the total number of genotypes, the probability distribution of the frequency of genotypes in the next generation $z(t+1)$ is

$$P(z(t+1)\,|\,z(t); s, \mu, N) = \prod_{\alpha=1}^{M} \left( \frac{p_\alpha(z(t))^{N z_\alpha(t+1)}}{[N z_\alpha(t+1)]!} \right). \tag{5}$$

Here $p_\alpha$ is

$$p_\alpha(z(t)) \propto F_\alpha z_\alpha(t) + \sum_{\beta (\neq \alpha)} \left[ \mu_{\beta\alpha} z_\beta(t) - \mu_{\alpha\beta} z_\alpha(t) \right], \tag{6}$$

where 𝐹𝛼 is the fitness value of genotype α, based on Equation 4. Across 𝐾 generations, the probability of an entire evolutionary trajectory, defined by the vector of genotype frequencies at each time, is then

$$P\big((z(t_k))_{k=0}^{K} \,\big|\, s, \mu, N\big) = \prod_{k=0}^{K-1} P\big(z(t_{k+1}) \,\big|\, z(t_k); s, \mu, N\big). \tag{7}$$
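To make the sampling step concrete, one WF generation can be simulated as in the Python sketch below. It computes the weights in Equation 6, normalizes them, and draws N individuals from the resulting distribution. The function name, the representation of mutation probabilities as a nested list, and the parameter values are assumptions made for illustration.

```python
import random
from collections import Counter

def wf_step(z, fitness, mu, N, rng):
    """Simulate one WF generation.
    z[a]: current genotype frequencies; fitness[a]: F_alpha;
    mu[a][b]: probability of mutation from genotype a to genotype b."""
    M = len(z)
    weights = []
    for a in range(M):
        # Net mutational flux into genotype a
        flux = sum(mu[b][a] * z[b] - mu[a][b] * z[a] for b in range(M) if b != a)
        weights.append(fitness[a] * z[a] + flux)
    total = sum(weights)
    p = [w / total for w in weights]
    # Multinomial sampling of N individuals with probabilities p
    counts = Counter(rng.choices(range(M), weights=p, k=N))
    return [counts[a] / N for a in range(M)]

# Toy example: two genotypes, the second with a 20% fitness advantage
rng = random.Random(1)
mu = [[0.0, 1e-3], [1e-3, 0.0]]
z_next = wf_step([0.5, 0.5], [1.0, 1.2], mu, 1000, rng)
```

Iterating `wf_step` over successive generations produces a trajectory whose probability is the product in Equation 7.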

Diffusion limit

When the population size is sufficiently large, the evolution of the population defined in Equation 5 can be reasonably well approximated by a Gaussian process, which is a solution to the Fokker-Planck (forward Kolmogorov) equation (Kimura, 1964; Crow, 2017).

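$$P(z(t+\Delta t)\,|\,z(t); s, \mu, N) \approx \mathcal{N}\big(z(t) + \Delta t\, d(t),\; C(z(t))/N\big), \tag{8}$$

with the drift vector $d(t)$ and the diffusion matrix $C(z(t))/N$ such that

$$C_{\alpha\beta}(z(t)) = \begin{cases} -z_\alpha(t)\, z_\beta(t) & \text{for } \alpha \neq \beta \\ z_\alpha(t)\,(1 - z_\alpha(t)) & \text{for } \alpha = \beta \end{cases}, \tag{9}$$

and

$$d(t) = C(z(t))\, s + u. \tag{10}$$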

Dimensional reduction

While the WF process in genotype space provides valuable insights into genotype dynamics, the mathematical expressions are sometimes challenging to interpret. To obtain more intuitive expressions, we can project the dynamics onto the space of allele frequencies,

$$x_i(a) = \sum_{\alpha=1}^{M} g^{\alpha}_{i,a}\, z_\alpha. \tag{11}$$

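One can then find the drift vector $d$ and diffusion matrix $C/N$ in allele frequency space,

$$d_i(a) = C_{ii}(a,a)\, s_i(a) + \sum_{j (\neq i)}^{L} \sum_{b=1}^{q} C_{ij}(a,b)\, s_j(b) + u_i(a), \tag{12}$$

and

$$C_{ij}(a,b) = \begin{cases} x_{ij}(a,b) - x_i(a)\, x_j(b) & \text{for } i \neq j \\ x_i(a)\,(1 - x_i(a)) & \text{for } i = j \end{cases}. \tag{13}$$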

Here, $x_{ij}(a,b)$ is the frequency of individuals with alleles $a$ and $b$ at loci $i$ and $j$, and $u_i(a)$ is the net expected change in the frequency of allele $a$ at locus $i$ due to mutations, which is given explicitly in Equation 17 below. The first term in Equation 12 gives the expected change in the frequency $x_i(a)$ due to the direct fitness effect $s_i(a)$, while the second term represents indirect contributions from genetic linkage with alleles at other loci $j$.
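To make the projection onto allele frequencies concrete, the sketch below computes the single-site frequencies of Equation 11 and the covariance matrix of Equation 13 from a small sample of sequences. The function name `allele_stats`, the one-hot layout (index `i*q + a` for allele `a` at locus `i`), and the toy alignment are assumptions for illustration only.

```python
import numpy as np


def allele_stats(seqs, q):
    """Single-site frequencies and covariance matrix (Equations 11 and 13).

    seqs : (n, L) integer array of allele indices for n sampled sequences
    q    : number of alleles per locus
    Returns x of shape (L*q,) and C of shape (L*q, L*q).
    """
    n, L = seqs.shape
    onehot = np.zeros((n, L * q))
    rows = np.repeat(np.arange(n), L)
    cols = (np.arange(L) * q + seqs).ravel()
    onehot[rows, cols] = 1.0
    x = onehot.mean(axis=0)            # x_i(a)
    xij = onehot.T @ onehot / n        # x_ij(a, b)
    C = xij - np.outer(x, x)           # x_ij(a, b) - x_i(a) x_j(b)
    return x, C


# Hypothetical alignment: 4 sequences, 2 loci, 2 alleles per locus.
seqs = np.array([[0, 0], [1, 1], [1, 0], [0, 0]])
x, C = allele_stats(seqs, q=2)
```

In this layout the diagonal entries equal $x_i(a)(1 - x_i(a))$ automatically, because the pairwise frequency of an allele with itself is its single-site frequency.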

Maximum path likelihood

Following recent work (Sohail et al., 2021), we employed Bayes’ rule to find the selection coefficients that best explain the data. These are the coefficients that maximize the posterior distribution

$$P_{\rm posterior}(s \mid (z(t_k))_k) \propto P((z(t_k))_k \mid s)\, P_{\rm prior}(s), \tag{14}$$

which is a product of the likelihood of the evolutionary trajectory observed in the data, Equation 7 (under the diffusion limit, Equation 8), and a prior distribution for the selection coefficients. We chose a Gaussian prior distribution with zero mean and covariance $I/(N\gamma)$, where $I$ is the identity matrix. This prior penalizes large inferred selection coefficients that are not well supported by the data. The maximum a posteriori selection coefficients are then given by (Sohail et al., 2021)

$$\hat{s} = \left(C_{\rm int} + \gamma I\right)^{-1} \left[\Delta x_{\rm int} - \Delta u_{\rm int}\right]. \tag{15}$$

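Here $C_{\rm int}$, $\Delta x_{\rm int}$, and $\Delta u_{\rm int}$ represent the covariance matrix, vector of frequency changes, and mutational flux integrated over the course of evolution, with $\Delta t_k = t_{k+1} - t_k$,

$$C_{\rm int} = \sum_{k=0}^{K-1} \Delta t_k\, C(x(t_k)), \qquad \Delta x_{\rm int} = x(t_K) - x(t_0), \qquad \Delta u_{\rm int} = \sum_{k=0}^{K-1} \Delta t_k\, u(t_k). \tag{16}$$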

The mutational flux $u$ is determined by the rates of mutation from nucleotide $b$ to nucleotide $a$, denoted $\mu_{ab}$, which were estimated from longitudinal HIV-1 populations in untreated patients (Zanini et al., 2015). Note that in this allele-level notation the first index is the target nucleotide. The change in the frequency of nucleotide $a$ at locus $i$ due to mutation is

$$u_i(a) = \sum_{b} \left(\mu_{ab}\, x_i(b) - \mu_{ba}\, x_i(a)\right). \tag{17}$$

Multiplying by the inverse of the integrated covariance matrix in Equation 15 separates the direct fitness effect of each allele from the indirect effects of genetic linkage.

The shift $\gamma I$ added to the diagonal of the covariance matrix in Equation 15 arises from the Gaussian prior on the selection coefficients and reflects our prior uncertainty about their values. We used $\gamma = 10$ for all data sets, but the model is robust to variation in the strength of regularization (Sohail et al., 2021). For the mutation rates, we incorporated the transition probabilities between all pairs of DNA nucleotides, estimated from whole-genome deep sequencing of multiple untreated HIV-1 patients followed for 5–8 years post-infection (Zanini et al., 2015).
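Equation 15 amounts to solving a regularized linear system. The following sketch shows this step on a toy example with two tightly linked alleles. The function name `infer_selection` and the numerical values are hypothetical, and the "naive" estimate that ignores off-diagonal covariances is included only for comparison.

```python
import numpy as np


def infer_selection(C_int, dx_int, du_int, gamma=10.0):
    """MAP selection coefficients from Equation 15."""
    A = C_int + gamma * np.eye(C_int.shape[0])
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A, dx_int - du_int)


# Hypothetical integrated statistics for two strongly linked alleles.
C_int = np.array([[50.0, 45.0], [45.0, 50.0]])
dx_int = np.array([0.9, 0.8])
du_int = np.zeros(2)

s_hat = infer_selection(C_int, dx_int, du_int)
# Estimate that ignores linkage by keeping only the diagonal of C_int.
s_naive = dx_int / (np.diag(C_int) + 10.0)
```

In this toy case the second allele rises in frequency largely because it is linked to the first, so accounting for the off-diagonal covariance assigns it a smaller selection coefficient than the diagonal-only estimate does.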

Integration of covariance

When the time interval between observations, $\Delta t$, is sufficiently short, allele frequency trajectories are expected to be continuous and, ideally, smooth (Shimagaki and Barton, 2023). To estimate the integrated covariance matrix accurately, we use piecewise linear interpolation of the frequencies. For $\tau \in [0,1]$, the linear interpolation between consecutive time points $t_k$ and $t_{k+1}$ can be expressed as

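$$x_i^{[k,k+1]}(\tau) = (1-\tau)\, x_i(t_k) + \tau\, x_i(t_{k+1}), \qquad x_{ij}^{[k,k+1]}(\tau) = (1-\tau)\, x_{ij}(t_k) + \tau\, x_{ij}(t_{k+1}), \tag{18}$$

which yields

$$x_i^{\rm int} = \sum_{k=0}^{K-1} \Delta t_k \int_0^1 d\tau\; x_i^{[k,k+1]}(\tau), \qquad C_{ij}^{\rm int} = \sum_{k=0}^{K-1} \Delta t_k \int_0^1 d\tau \left(x_{ij}^{[k,k+1]}(\tau) - x_i^{[k,k+1]}(\tau)\, x_j^{[k,k+1]}(\tau)\right).$$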

For simplicity of notation, we have omitted nucleotide indices. Explicit expressions for the integrated covariance are given in previous work (Sohail et al., 2021; Shimagaki and Barton, 2023).
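Because the interpolants are linear in $\tau$, the integrals over each interval can be evaluated in closed form: the integral of a linear interpolant is the average of its endpoints, and the integral of a product of two linear interpolants with endpoints $(a_0, a_1)$ and $(b_0, b_1)$ is $(2a_0b_0 + a_0b_1 + a_1b_0 + 2a_1b_1)/6$. The sketch below applies these two results. The function name `integrate_covariance` and the toy trajectory are assumptions for illustration and are not the published implementation.

```python
import numpy as np


def integrate_covariance(times, x, xij):
    """Integrated covariance under piecewise-linear interpolation.

    times : (K+1,) sampling times
    x     : (K+1, n) single-site frequencies at each time
    xij   : (K+1, n, n) pairwise frequencies at each time
    """
    C_int = np.zeros(xij.shape[1:])
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        x0, x1 = x[k], x[k + 1]
        # Integral over tau of the linearly interpolated pairwise term.
        pair = (xij[k] + xij[k + 1]) / 2.0
        # Integral over tau of the product of two linear interpolants.
        prod = (2 * np.outer(x0, x0) + np.outer(x0, x1)
                + np.outer(x1, x0) + 2 * np.outer(x1, x1)) / 6.0
        C_int += dt * (pair - prod)
    return C_int


# Hypothetical trajectory: two alleles sampled at three time points.
times = np.array([0.0, 10.0, 30.0])
x = np.array([[0.0, 0.0], [0.4, 0.2], [0.9, 0.7]])
x12 = np.array([0.0, 0.15, 0.65])      # pairwise frequency of both alleles
xij = np.array([[[a, c], [c, b]] for (a, b), c in zip(x, x12)])
C_int = integrate_covariance(times, x, xij)
```

The interpolated integrand is quadratic in $\tau$, so the closed form agrees exactly with Simpson's rule applied to each interval, which gives a simple numerical check.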

Joint RM model

In addition to fitness models derived from SHIV data for individual RMs, we inferred a joint model under the assumption that virus evolution within each RM infected with the same TF virus is governed by a similar fitness landscape. Joint inference of this kind has been shown to improve accuracy for data generated by the WF process and for deep mutational scanning data (Sohail et al., 2022; Shimagaki and Barton, 2025b; Hong et al., 2024). The joint path likelihood for allele frequency trajectories across RMs with the same TF virus is then

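$$p\left(\left((x_r(t_k))_{k=0}^{K_r}\right)_{r=1}^{R} \,\middle|\, s, \mu, \gamma\right) = \left(\prod_{r=1}^{R} \prod_{k=0}^{K_r-1} p_M\left(x_r(t_{k+1}) \mid x_r(t_k); s, \mu, N\right)\right) p(x(t_0))\, p(s \mid \gamma). \tag{19}$$

Here, $x_r$ is the vector of allele frequencies in the $r$-th individual, $t_{K_r}$ is the final sampling time for that individual, and $R$ is the number of replicate individuals (i.e. the number of RMs sharing the same TF virus). Because these individuals share the same TF virus, the initial state is $p(x_r(t_0)) = p(x(t_0))$ for all $r$. The selection coefficients that maximize the joint path likelihood are

$$\hat{s} = \left(\bar{C}_{\rm int} + \gamma I\right)^{-1} \left[\Delta \bar{x}_{\rm int} - \Delta \bar{u}_{\rm int}\right]. \tag{20}$$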

Here, the overbar denotes the sum over the replicate RMs.

We emphasize that the joint selection coefficients in Equation 20 are not the same as selection coefficients that are simply averaged across RMs with the same TF virus. The joint selection coefficients are more robust, as they are guided by the level of evidence within each individual rather than naive averaging.
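The difference between joint inference and naive averaging can be seen in a one-parameter toy example. In the sketch below, one replicate carries much more evidence (a larger integrated covariance) than the other. The function name `infer_joint` and all numbers are hypothetical and serve only to illustrate the summed estimator of Equation 20.

```python
import numpy as np


def infer_joint(C_ints, dx_ints, du_ints, gamma=10.0):
    """Joint MAP estimate across replicates (Equation 20).

    Integrated statistics are summed over replicates before solving.
    """
    C_bar = sum(C_ints)
    b_bar = sum(dx - du for dx, du in zip(dx_ints, du_ints))
    return np.linalg.solve(C_bar + gamma * np.eye(C_bar.shape[0]), b_bar)


# Hypothetical data: replicate 1 is informative, replicate 2 is not.
C_ints = [np.array([[100.0]]), np.array([[1.0]])]
dx_ints = [np.array([0.9]), np.array([0.05])]
du_ints = [np.zeros(1), np.zeros(1)]

s_joint = infer_joint(C_ints, dx_ints, du_ints)

# Naive alternative: infer each replicate separately, then average.
s_each = [np.linalg.solve(C + 10.0 * np.eye(1), dx - du)
          for C, dx, du in zip(C_ints, dx_ints, du_ints)]
s_avg = np.mean(s_each, axis=0)
```

Here the joint estimate stays close to the value supported by the informative replicate, whereas the simple average is pulled toward the heavily regularized estimate from the uninformative one.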

Geometrical interpretation of the fitness comparison

The Pearson correlation coefficients that we used to compare two fitness landscapes, denoted $F_s(g) = F(g|s)$ and $F_h(g) = F(g|h)$, can be expressed by the following simple relation:

$$r = \frac{\mathrm{Cov}(F_s, F_h)}{\sqrt{\mathrm{Var}(F_s)\, \mathrm{Var}(F_h)}} = \frac{s^\top C h}{\sqrt{(s^\top C s)(h^\top C h)}}. \tag{21}$$

Here, $\mathrm{Cov}(F_s, F_h)$ and $\mathrm{Var}(F_s)$ represent the covariance and variance estimated from the samples being compared, $(F_s(g_n), F_h(g_n))_n$, and $C$ is the covariance matrix defined between arbitrary loci. The last expression can be interpreted as the cosine of the angle between two vectors, $s$ and $h$, with metric matrix $C$; if $s = h$, the Pearson correlation is exactly 1. However, the 'similarity' also depends on how these vectors are projected by $C$: eigenmodes of $C$ associated with larger variance are weighted more heavily. The last expression also provides an interpretation for the case of shuffled sequences. Shuffling the sequences dilutes the covariance between loci, so that the metric matrix becomes diagonal. Removing the off-diagonal elements corresponds to lifting the constraints on the fitness landscape.
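The identity in Equation 21 can be checked numerically for additive fitness models. The sketch below draws hypothetical binary sequences with correlated loci, evaluates two arbitrary additive landscapes on them, and compares the sample Pearson correlation with the metric form. All variable names and parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, L = 2000, 6

# Hypothetical binary sequences (one mutant allele per locus) whose loci
# are correlated through a shared per-sequence mutation propensity.
base = rng.random((n, 1))
G = (rng.random((n, L)) < 0.2 + 0.5 * base).astype(float)

s = rng.normal(0.0, 0.02, L)   # selection coefficients of landscape s
h = rng.normal(0.0, 0.02, L)   # selection coefficients of landscape h

F_s = 1.0 + G @ s
F_h = 1.0 + G @ h
r_direct = np.corrcoef(F_s, F_h)[0, 1]

# Metric form of Equation 21, with C the covariance between loci.
C = np.cov(G, rowvar=False)
r_metric = (s @ C @ h) / np.sqrt((s @ C @ s) * (h @ C @ h))

# Shuffled-sequence analogue: keep only the diagonal of the metric.
C_diag = np.diag(np.diag(C))
r_diag = (s @ C_diag @ h) / np.sqrt((s @ C_diag @ s) * (h @ C_diag @ h))
```

The two ways of computing the correlation agree to numerical precision, while the diagonal-metric value generally differs because it discards the covariance between loci.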

Epistatic fitness model

We consider the following pairwise epistatic fitness function, which depends on epistatic interactions 𝑠𝑖𝑗 across all possible pairs of loci:

$$F(g) = 1 + \sum_{i=1}^{L} \sum_{a=1}^{q} s_i(a)\, g_{i,a} + \sum_{i<j}^{L} \sum_{a,b=1}^{q} s_{ij}(a,b)\, g_{i,a}\, g_{j,b}. \tag{22}$$

Our goal is to obtain the epistatic interactions 𝑠𝑖𝑗(𝑎,𝑏) as well as the selection coefficients 𝑠𝑖(𝑎) from temporal genetic sequences.

The basic logic for inferring these fitness parameters parallels the additive case. The only practical difference is that epistatic interactions can influence the dynamics of both single-site and pairwise mutation frequencies. Under the diffusion limit, we obtain the following drift terms, which parallel Equation 12 in the additive model,

$$\begin{aligned} d_i(a) &= \sum_{k=1}^{L} \sum_{c=1}^{q} C_{ik}(a,c)\, s_k(c) + \sum_{k<l}^{L} \sum_{c,d=1}^{q} C_{ikl}(a,c,d)\, s_{kl}(c,d) + u_i(a), \\ d_{ij}(a,b) &= \sum_{k=1}^{L} \sum_{c=1}^{q} C_{ijk}(a,b,c)\, s_k(c) + \sum_{k<l}^{L} \sum_{c,d=1}^{q} C_{ijkl}(a,b,c,d)\, s_{kl}(c,d) + u_{ij}(a,b) + v_{ij}(a,b), \end{aligned} \tag{23}$$

and diffusion matrices,

$$\begin{aligned} C_{ikl}(a,c,d) &= x_{ikl}(a,c,d) - x_i(a)\, x_{kl}(c,d), \\ C_{ijkl}(a,b,c,d) &= x_{ijkl}(a,b,c,d) - x_{ij}(a,b)\, x_{kl}(c,d). \end{aligned} \tag{24}$$

Here, $u_i$ and $u_{ij}$ represent the expected frequency changes due to mutations for single-site and pairwise terms, while $v_{ij}$ represents the changes in pairwise frequencies due to recombination. The single-site term $u_i$ is the same as in the additive fitness case and is given by Equation 17. The pairwise term is given as

$$u_{ij}(a,b) = \sum_{c=1}^{q} \left(\left[\mu_{bc}\, x_{ij}(a,c) + \mu_{ac}\, x_{ij}(c,b)\right] - \left[\mu_{cb} + \mu_{ca}\right] x_{ij}(a,b)\right), \tag{25}$$

and $v_{ij}$ is expressed as

$$v_{ij}(a,b) = -r\, |i-j|\, C_{ij}(a,b), \tag{26}$$

where $r$ denotes the recombination rate. In this study, we set $r = 10^{-5}$. More detailed derivations are provided in previous studies (Sohail et al., 2022; Shimagaki and Barton, 2025b).

The technical challenge of epistasis inference is that the diffusion matrix $C$ involves third- and fourth-order terms, so the number of matrix elements scales as $O((qL)^4)$, while the computational cost of inverting it scales as $O((qL)^6)$. Recently, a more efficient computational method was proposed that reduces both the memory usage and the computational time by a factor of $O((qL)^2)$ (Shimagaki and Barton, 2025b). The essential idea is to factorize the higher-order covariance matrix using a rectangular matrix $\Xi \in \mathbb{R}^{D \times d}$ such that $C = \Xi \Xi^\top$, where $D$ scales as $O((qL)^2)$ and $d \ll D$. This method solves the linear equation without constructing $C$ explicitly and avoids any computations involving more than $O((qL)^2)$ operations (Shimagaki and Barton, 2025b).
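One generic way to exploit a low-rank factorization of this kind is the Woodbury matrix identity, which replaces the large regularized system by a small one whose size is the number of columns of the factor. The sketch below illustrates that general idea only. It is not claimed to be the algorithm of Shimagaki and Barton (2025b), and the function name `solve_low_rank`, the random factor, and the dimensions are hypothetical.

```python
import numpy as np


def solve_low_rank(Xi, b, gamma):
    """Solve (Xi Xi^T + gamma I) s = b without forming the D x D matrix.

    Uses the Woodbury identity, so only a d x d system is solved.
    """
    d = Xi.shape[1]
    small = gamma * np.eye(d) + Xi.T @ Xi              # d x d matrix
    return (b - Xi @ np.linalg.solve(small, Xi.T @ b)) / gamma


# Hypothetical sizes: D parameters, rank-d factor with d much smaller than D.
rng = np.random.default_rng(3)
D, d, gamma = 200, 5, 10.0
Xi = rng.normal(size=(D, d))
b = rng.normal(size=D)

s_fast = solve_low_rank(Xi, b, gamma)

# Direct solve with the explicit D x D matrix, for checking only.
s_direct = np.linalg.solve(Xi @ Xi.T + gamma * np.eye(D), b)
```

The low-rank route stores only the $D \times d$ factor and a $d \times d$ matrix, which is the source of the memory savings when $d \ll D$.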

Gauge transformation

Since constant shifts in fitness parameters do not affect relative fitness, it is always possible to transform fitness values without changing the resulting genotype distribution. For example, in the additive model, individual selection coefficients can be shifted as

$$s_i(a) \to s_i(a) - s_i(\tilde{g}_i),$$

where $\tilde{g}_i$ is the allele of a chosen reference genotype $\tilde{g}$ (e.g. the TF sequence) at site $i$. This transformation preserves relative fitness and ensures that $s_i(\tilde{g}_i) = 0$, making inferred selection coefficients more interpretable.

In statistical physics, such transformations, where model parameters are changed without altering the underlying probability distribution, are referred to as gauge transformations (Weigt et al., 2009; Morcos et al., 2011). Similar transformations have been employed in recent studies to improve interpretability and sparsity in epistatic models (Rizzato et al., 2020). We can apply an analogous transformation to epistatic interactions:

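$$\begin{aligned} s_i(a) &\to s_i(a) - s_i(\tilde{g}_i) + \sum_{j \mid j \neq i} \left[s_{ij}(a, \tilde{g}_j) - s_{ij}(\tilde{g}_i, \tilde{g}_j)\right], \\ s_{ij}(a,b) &\to s_{ij}(a,b) - s_{ij}(\tilde{g}_i, b) - s_{ij}(a, \tilde{g}_j) + s_{ij}(\tilde{g}_i, \tilde{g}_j). \end{aligned} \tag{27}$$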

Under this transformation, any selection coefficients or epistatic terms involving TF alleles are zero by definition, while relative fitness remains unchanged.
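The claim that the gauge transformation leaves relative fitness unchanged can be verified by brute force on a small model. The sketch below applies the epistatic transformation to random parameters and compares fitness differences relative to the reference sequence before and after. The function names, the symmetric storage of pairwise terms (`J[i, j, a, b] == J[j, i, b, a]`), and the random parameters are assumptions for illustration.

```python
import itertools
import numpy as np


def fitness(g, s, J):
    """Pairwise epistatic fitness for a sequence g of allele indices."""
    L = len(g)
    F = 1.0 + sum(s[i, g[i]] for i in range(L))
    F += sum(J[i, j, g[i], g[j]] for i in range(L) for j in range(i + 1, L))
    return F


def to_reference_gauge(s, J, ref):
    """Shift parameters so all terms involving reference alleles vanish."""
    L, q = s.shape
    s_new, J_new = np.zeros_like(s), np.zeros_like(J)
    for i, a in itertools.product(range(L), range(q)):
        s_new[i, a] = s[i, a] - s[i, ref[i]]
        for j in range(L):
            if j != i:
                s_new[i, a] += J[i, j, a, ref[j]] - J[i, j, ref[i], ref[j]]
                for b in range(q):
                    J_new[i, j, a, b] = (J[i, j, a, b] - J[i, j, ref[i], b]
                                         - J[i, j, a, ref[j]]
                                         + J[i, j, ref[i], ref[j]])
    return s_new, J_new


# Hypothetical random model with L = 3 loci and q = 3 alleles.
rng = np.random.default_rng(2)
L, q = 3, 3
s = rng.normal(0.0, 0.02, (L, q))
J = np.zeros((L, L, q, q))
for i in range(L):
    for j in range(i + 1, L):
        J[i, j] = rng.normal(0.0, 0.01, (q, q))
        J[j, i] = J[i, j].T               # symmetric storage
ref = [0, 2, 1]
s_g, J_g = to_reference_gauge(s, J, ref)

# Largest change in fitness relative to the reference, over all sequences.
max_change = max(
    abs((fitness(g, s, J) - fitness(ref, s, J))
        - (fitness(g, s_g, J_g) - fitness(ref, s_g, J_g)))
    for g in itertools.product(range(q), repeat=L))
```

After the transformation the reference sequence has fitness exactly one, and every other sequence keeps the same fitness difference from it as before.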

Regularization

Regularization is used to reduce the effective number of parameters in the fitness model. In our analysis, we applied strong regularization ($\gamma = 10^{10}$) to any selection or epistatic coefficients involving TF alleles, ensuring they are effectively zero under the gauge transformation. Following prior work (Shimagaki and Barton, 2025b), we penalized epistatic interactions between loci more than 50 nucleotides apart on the reference sequence with the same strong regularization. For all other epistatic terms we used a moderate regularization strength of $\gamma = 50$, and for selection coefficients we used $\gamma = 10$, consistent with the additive model.

Acknowledgements

The work of KSS and JPB reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R35GM138233.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

John P Barton, Email: jpbarton@pitt.edu.

Anne-Florence Bitbol, Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland.

Joshua T Schiffer, Fred Hutch Cancer Center, United States.

Funding Information

This paper was supported by the following grant:

  • National Institutes of Health R35GM138233 to Kai S Shimagaki, John P Barton.

Additional information

Competing interests

No competing interests declared.

Author contributions

Kai S Shimagaki: Conceptualization, Data curation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Rebecca M Lynch: Investigation, Writing – original draft, Writing – review and editing.

John P Barton: Conceptualization, Data curation, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Additional files

Supplementary file 1. CH103 and CH235 resistance mutations.
Supplementary file 2. Strain-specific antibody resistance mutations in CH505.
Supplementary file 3. DH272, DH475, and strain-specific antibody resistance mutations in CH848.
Supplementary file 4. Biological effects of the HIV-1 mutations inferred to be the most beneficial in CH505.
Supplementary file 5. Biological effects of the HIV-1 mutations inferred to be the most beneficial in CH848.
Supplementary file 6. CH505 and SHIV.CH505 sequence statistics.
Supplementary file 7. CH848 and SHIV.CH848 sequence statistics.
Supplementary file 8. Biological effects of strongly selected SHIV.CH505 mutations.
Supplementary file 9. Biological effects of strongly selected SHIV.CH848 mutations.
Supplementary file 10. Selective advantage of mutations that confer resistance to antibodies in SHIV.CH505.
Supplementary file 11. Selective advantage of mutations that confer resistance to antibodies in SHIV.CH848.
Supplementary file 12. List of selected sites using LASSIE in SHIV.CH505.
Supplementary file 13. List of selected sites using LASSIE in SHIV.CH848.
MDAR checklist

Data availability

Data and code accompanying this manuscript are publicly available at the GitHub repository https://github.com/bartonlab/paper-HIV-coevolution (copy archived at Shimagaki and Barton, 2025a). This repository contains source files that process HIV-1 and SHIV sequences, infer selection coefficients, and identify and characterize mutations. The included Jupyter notebooks can be run to reproduce the figures presented here. The original HIV-1 sequences can be retrieved from the LANL database (https://www.hiv.lanl.gov/content/index), and SHIV sequences can be found at GenBank (https://www.ncbi.nlm.nih.gov/).

References

  1. Allen TM, Altfeld M, Geer SC, Kalife ET, Moore C, O’sullivan KM, Desouza I, Feeney ME, Eldridge RL, Maier EL, Kaufmann DE, Lahaie MP, Reyor L, Tanzi G, Johnston MN, Brander C, Draenert R, Rockstroh JK, Jessen H, Rosenberg ES, Mallal SA, Walker BD. Selective escape from CD8+ T-cell responses represents a major driving force of human immunodeficiency virus type 1 (HIV-1) sequence diversity and reveals constraints on HIV-1 evolution. Journal of Virology. 2005;79:13239–13249. doi: 10.1128/JVI.79.21.13239-13249.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Altfeld M, Allen TM. Hitting HIV where it hurts: an alternative approach to HIV vaccine design. TRENDS in Immunology. 2006;27:504–510. doi: 10.1016/j.it.2006.09.007. [DOI] [PubMed] [Google Scholar]
  3. Barton JP, Goonetilleke N, Butler TC, Walker BD, McMichael AJ, Chakraborty AK. Relative rate and location of intra-host HIV evolution to evade cellular immunity are predictable. Nature Communications. 2016;7:11660. doi: 10.1038/ncomms11660. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Barton JP, Rajkoomar E, Mann JK, Murakowski DK, Toyoda M, Mahiti M, Mwimanzi P, Ueno T, Chakraborty AK, Ndung’u T. Modelling and in vitro testing of the HIV-1 Nef fitness landscape. Virus Evolution. 2019;5:vez029. doi: 10.1093/ve/vez029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Bauer A, Lindemuth E, Marino FE, Krause R, Joy J, Docken SS, Mallick S, McCormick K, Holt C, Georgiev I, Felber B, Keele BF, Veazey R, Davenport MP, Li H, Shaw GM, Bar KJ. Adaptation of a transmitted/founder simian-human immunodeficiency virus for enhanced replication in rhesus macaques. PLOS Pathogens. 2023;19:e1011059. doi: 10.1371/journal.ppat.1011059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Benson DA, Karsch-Mizrachi I, Clark K, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Research. 2012;40:D48–D53. doi: 10.1093/nar/gkr1202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Bollback JP, York TL, Nielsen R. Estimation of 2Nes from temporal allele frequency data. Genetics. 2008;179:497–502. doi: 10.1534/genetics.107.085019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Bons E, Leemann C, Metzner KJ, Regoes RR. Long-term experimental evolution of HIV-1 reveals effects of environment and mutational history. PLOS Biology. 2020;18:e3001010. doi: 10.1371/journal.pbio.3001010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Bonsignori M, Zhou T, Sheng Z, Chen L, Gao F, Joyce MG, Ozorowski G, Chuang G-Y, Schramm CA, Wiehe K, Alam SM, Bradley T, Gladden MA, Hwang K-K, Iyengar S, Kumar A, Lu X, Luo K, Mangiapani MC, Parks RJ, Song H, Acharya P, Bailer RT, Cao A, Druz A, Georgiev IS, Kwon YD, Louder MK, Zhang B, Zheng A, Hill BJ, Kong R, Soto C, NISC Comparative Sequencing Program. Mullikin JC, Douek DC, Montefiori DC, Moody MA, Shaw GM, Hahn BH, Kelsoe G, Hraber PT, Korber BT, Boyd SD, Fire AZ, Kepler TB, Shapiro L, Ward AB, Mascola JR, Liao H-X, Kwong PD, Haynes BF. Maturation pathway from germline to broad HIV-1 neutralizer of a cd4-mimic antibody. Cell. 2016;165:449–463. doi: 10.1016/j.cell.2016.02.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bonsignori M, Kreider EF, Fera D, Meyerhoff RR, Bradley T, Wiehe K, Alam SM, Aussedat B, Walkowicz WE, Hwang K-K, Saunders KO, Zhang R, Gladden MA, Monroe A, Kumar A, Xia S-M, Cooper M, Louder MK, McKee K, Bailer RT, Pier BW, Jette CA, Kelsoe G, Williams WB, Morris L, Kappes J, Wagh K, Kamanga G, Cohen MS, Hraber PT, Montefiori DC, Trama A, Liao H-X, Kepler TB, Moody MA, Gao F, Danishefsky SJ, Mascola JR, Shaw GM, Hahn BH, Harrison SC, Korber BT, Haynes BF. Staged induction of HIV-1 glycan-dependent broadly neutralizing antibodies. Science Translational Medicine. 2017;9:eaai7514. doi: 10.1126/scitranslmed.aai7514. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Burton DR, Hangartner L. Broadly neutralizing antibodies to HIV and their role in vaccine design. Annual Review of Immunology. 2016;34:635–659. doi: 10.1146/annurev-immunol-041015-055515. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Choisy M, Woelk CH, Guégan J-F, Robertson DL. Comparative study of adaptive molecular evolution in different human immunodeficiency virus groups and subtypes. Journal of Virology. 2004;78:1962–1970. doi: 10.1128/jvi.78.4.1962-1970.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Cornelissen M, Euler Z, van den Kerkhof TLGM, van Gils MJ, Boeser-Nunnink BDM, Kootstra NA, Zorgdrager F, Schuitemaker H, Prins JM, Sanders RW, van der Kuyl AC. The neutralizing antibody response in an individual with triple HIV-1 infection remains directed at the first infecting subtype. AIDS Research and Human Retroviruses. 2016;32:1135–1142. doi: 10.1089/aid.2015.0324. [DOI] [PubMed] [Google Scholar]
  14. Crow JF. An Introduction to Population Genetics Theory. Scientific Publishers; 2017. [Google Scholar]
  15. Das SR, Jameel S. Biology of the HIV Nef protein. The Indian Journal of Medical Research. 2005;121:315–332. [PubMed] [Google Scholar]
  16. Doria-Rose NA, Klein RM, Daniels MG, O’Dell S, Nason M, Lapedes A, Bhattacharya T, Migueles SA, Wyatt RT, Korber BT, Mascola JR, Connors M. Breadth of human immunodeficiency virus-specific neutralizing activity in sera: clustering analysis and association with clinical variables. Journal of Virology. 2010;84:1631–1636. doi: 10.1128/JVI.01482-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Doria-Rose NA, Schramm CA, Gorman J, Moore PL, Bhiman JN, DeKosky BJ, Ernandes MJ, Georgiev IS, Kim HJ, Pancera M, Staupe RP, Altae-Tran HR, Bailer RT, Crooks ET, Cupo A, Druz A, Garrett NJ, Hoi KH, Kong R, Louder MK, Longo NS, McKee K, Nonyane M, O’Dell S, Roark RS, Rudicell RS, Schmidt SD, Sheward DJ, Soto C, Wibmer CK, Yang Y, Zhang Z, Mullikin JC, Binley JM, Sanders RW, Wilson IA, Moore JP, Ward AB, Georgiou G, Williamson C, Abdool Karim SS, Morris L, Kwong PD, Shapiro L, Mascola JR, NISC Comparative Sequencing Program. Developmental pathway for potent V1V2-directed HIV-neutralizing antibodies. Nature. 2014;509:55–62. doi: 10.1038/nature13036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Dosenovic P, von Boehmer L, Escolano A, Jardine J, Freund NT, Gitlin AD, McGuire AT, Kulp DW, Oliveira T, Scharf L, Pietzsch J, Gray MD, Cupo A, van Gils MJ, Yao K-H, Liu C, Gazumyan A, Seaman MS, Björkman PJ, Sanders RW, Moore JP, Stamatatos L, Schief WR, Nussenzweig MC. Immunization for HIV-1 broadly neutralizing antibodies in human ig knockin mice. Cell. 2015;161:1505–1515. doi: 10.1016/j.cell.2015.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Draenert R, Allen TM, Liu Y, Wrin T, Chappey C, Verrill CL, Sirera G, Eldridge RL, Lahaie MP, Ruiz L, Clotet B, Petropoulos CJ, Walker BD, Martinez-Picado J. Constraints on HIV-1 evolution and immunodominance revealed in monozygotic adult twins infected with the same virus. The Journal of Experimental Medicine. 2006;203:529–539. doi: 10.1084/jem.20052116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Escolano A, Steichen JM, Dosenovic P, Kulp DW, Golijanin J, Sok D, Freund NT, Gitlin AD, Oliveira T, Araki T, Lowe S, Chen ST, Heinemann J, Yao K-H, Georgeson E, Saye-Francisco KL, Gazumyan A, Adachi Y, Kubitz M, Burton DR, Schief WR, Nussenzweig MC. Sequential immunization elicits broadly neutralizing anti-HIV-1 antibodies in ig knockin mice. Cell. 2016;166:1445–1458. doi: 10.1016/j.cell.2016.07.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Ewens WJ. Mathematical Population Genetics: Theoretical Introduction. Springer; 2004. [DOI] [Google Scholar]
  22. Feder AF, Kryazhimskiy S, Plotkin JB. Identifying signatures of selection in genetic time series. Genetics. 2014;196:509–522. doi: 10.1534/genetics.113.158220. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Feder AF, Rhee SY, Holmes SP, Shafer RW, Petrov DA, Pennings PS. More effective drugs lead to harder selective sweeps in the evolution of drug resistance in HIV-1. eLife. 2016;5:e10670. doi: 10.7554/eLife.10670. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Feder AF, Harper KN, Brumme CJ, Pennings PS. Understanding patterns of HIV multi-drug resistance through models of temporal and spatial drug heterogeneity. eLife. 2021;10:e69032. doi: 10.7554/eLife.69032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Ferguson AL, Mann JK, Omarjee S, Ndung’u T, Walker BD, Chakraborty AK. Translating HIV sequences into quantitative fitness landscapes predicts viral vulnerabilities for rational immunogen design. Immunity. 2013;38:606–617. doi: 10.1016/j.immuni.2012.11.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Foll M, Poh Y-P, Renzette N, Ferrer-Admetlla A, Bank C, Shim H, Malaspinas A-S, Ewing G, Liu P, Wegmann D, Caffrey DR, Zeldovich KB, Bolon DN, Wang JP, Kowalik TF, Schiffer CA, Finberg RW, Jensen JD. Influenza virus drug resistance: a time-sampled population genetics perspective. PLOS Genetics. 2014;10:e1004185. doi: 10.1371/journal.pgen.1004185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Gao F, Bonsignori M, Liao H-X, Kumar A, Xia S-M, Lu X, Cai F, Hwang K-K, Song H, Zhou T, Lynch RM, Alam SM, Moody MA, Ferrari G, Berrong M, Kelsoe G, Shaw GM, Hahn BH, Montefiori DC, Kamanga G, Cohen MS, Hraber P, Kwong PD, Korber BT, Mascola JR, Kepler TB, Haynes BF. Cooperation of B cell lineages in induction of HIV-1-broadly neutralizing antibodies. Cell. 2014;158:481–491. doi: 10.1016/j.cell.2014.06.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Gao Y, Barton JP. A binary trait model reveals the fitness effects of HIV-1 escape from T cell responses. PNAS. 2025;122:e2405379122. doi: 10.1073/pnas.2405379122. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Garcia V, Regoes RR. The effect of interference on the CD8(+) T cell escape rates in HIV. Frontiers in Immunology. 2014;5:661. doi: 10.3389/fimmu.2014.00661. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Garcia V, Feldman MW, Regoes RR. Investigating the consequences of interference between multiple CD8+ T cell escape mutations in early HIV infection. PLOS Computational Biology. 2016;12:e1004721. doi: 10.1371/journal.pcbi.1004721. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Gray ES, Madiga MC, Hermanus T, Moore PL, Wibmer CK, Tumba NL, Werner L, Mlisana K, Sibeko S, Williamson C, Abdool Karim SS, Morris L, and the CAPRISA002 Study Team. The neutralization breadth of HIV-1 develops incrementally over four years and is associated with CD4+ T cell decline and high viral load during acute infection. Journal of Virology. 2011;85:4828–4840. doi: 10.1128/JVI.00198-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Hatziioannou T, Evans DT. Animal models for HIV/AIDS research. Nature Reviews. Microbiology. 2012;10:852–867. doi: 10.1038/nrmicro2911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Haynes BF, Kelsoe G, Harrison SC, Kepler TB. B-cell-lineage immunogen design in vaccine development with HIV-1 as a case study. Nature Biotechnology. 2012;30:423–433. doi: 10.1038/nbt.2197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Haynes BF, Shaw GM, Korber B, Kelsoe G, Sodroski J, Hahn BH, Borrow P, McMichael AJ. HIV-host interactions: implications for vaccine design. Cell Host & Microbe. 2016;19:292–303. doi: 10.1016/j.chom.2016.02.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Haynes BF, Wiehe K, Borrow P, Saunders KO, Korber B, Wagh K, McMichael AJ, Kelsoe G, Hahn BH, Alt F, Shaw GM. Strategies for HIV-1 vaccines that induce broadly neutralizing antibodies. Nature Reviews. Immunology. 2023;23:142–158. doi: 10.1038/s41577-022-00753-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. He Z, Dai X, Lyu W, Beaumont M, Yu F. Estimating temporally variable selection intensity from ancient DNA Data. Molecular Biology and Evolution. 2023;40:msad008. doi: 10.1093/molbev/msad008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Hie B, Zhong ED, Berger B, Bryson B. Learning the language of viral evolution and escape. Science. 2021;371:284–288. doi: 10.1126/science.abd7331. [DOI] [PubMed] [Google Scholar]
  38. Hong Z, Shimagaki KS, Barton JP. popDMS infers mutation effects from deep mutational scanning data. Bioinformatics. 2024;40:btae499. doi: 10.1093/bioinformatics/btae499. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Hraber P, Seaman MS, Bailer RT, Mascola JR, Montefiori DC, Korber BT. Prevalence of broadly neutralizing antibody responses during chronic HIV-1 infection. AIDS. 2014;28:163–169. doi: 10.1097/QAD.0000000000000106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Hraber P, Korber B, Wagh K, Giorgi EE, Bhattacharya T, Gnanakaran S, Lapedes AS, Learn GH, Kreider EF, Li Y, Shaw GM, Hahn BH, Montefiori DC, Alam SM, Bonsignori M, Moody MA, Liao H-X, Gao F, Haynes BF. Longitudinal antigenic sequences and sites from intra-Host Evolution (LASSIE) identifies immune-selected HIV Variants. Viruses. 2015;7:5443–5475. doi: 10.3390/v7102881. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Illingworth CJR, Mustonen V. Distinguishing driver and passenger mutations in an evolutionary history categorized by interference. Genetics. 2011;189:989–1000. doi: 10.1534/genetics.111.133975. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Illingworth CJR, Mustonen V. A method to infer positive selection from marker dynamics in an asexual population. Bioinformatics. 2012;28:831–837. doi: 10.1093/bioinformatics/btr722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Jardine JG, Sok D, Julien J-P, Briney B, Sarkar A, Liang C-H, Scherer EA, Henry Dunand CJ, Adachi Y, Diwanji D, Hsueh J, Jones M, Kalyuzhniy O, Kubitz M, Spencer S, Pauthner M, Saye-Francisco KL, Sesterhenn F, Wilson PC, Galloway DM, Stanfield RL, Wilson IA, Burton DR, Schief WR. Minimally Mutated HIV-1 broadly neutralizing antibodies to guide reductionist vaccine design. PLOS Pathogens. 2016;12:e1005815. doi: 10.1371/journal.ppat.1005815. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Kimura M. Diffusion models in population genetics. Journal of Applied Probability. 1964;1:177–232. doi: 10.2307/3211856. [DOI] [Google Scholar]
  45. Klein F, Mouquet H, Dosenovic P, Scheid JF, Scharf L, Nussenzweig MC. Antibodies in HIV-1 vaccine development and therapy. Science. 2013;341:1199–1204. doi: 10.1126/science.1241144. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Kong R, Duan H, Sheng Z, Xu K, Acharya P, Chen X, Cheng C, Dingens AS, Gorman J, Sastry M, Shen C-H, Zhang B, Zhou T, Chuang G-Y, Chao CW, Gu Y, Jafari AJ, Louder MK, O’Dell S, Rowshan AP, Viox EG, Wang Y, Choi CW, Corcoran MM, Corrigan AR, Dandey VP, Eng ET, Geng H, Foulds KE, Guo Y, Kwon YD, Lin B, Liu K, Mason RD, Nason MC, Ohr TY, Ou L, Rawi R, Sarfo EK, Schön A, Todd JP, Wang S, Wei H, Wu W, NISC Comparative Sequencing Program. Mullikin JC, Bailer RT, Doria-Rose NA, Karlsson Hedestam GB, Scorpio DG, Overbaugh J, Bloom JD, Carragher B, Potter CS, Shapiro L, Kwong PD, Mascola JR. Antibody lineages with vaccine-induced antigen-binding hotspots develop broad HIV neutralization. Cell. 2019;178:567–584. doi: 10.1016/j.cell.2019.06.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Kreer C, Lupo C, Ercanoglu MS, Gieselmann L, Spisak N, Grossbach J, Schlotz M, Schommers P, Gruell H, Dold L, Beyer A, Nourmohammad A, Mora T, Walczak AM, Klein F. Probabilities of developing HIV-1 bNAb sequence features in uninfected and chronically infected individuals. Nature Communications. 2023;14:7137. doi: 10.1038/s41467-023-42906-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Kwong PD, Mascola JR, Nabel GJ. Broadly neutralizing antibodies and the search for an HIV-1 vaccine: the end of the beginning. Nature Reviews. Immunology. 2013;13:693–701. doi: 10.1038/nri3516. [DOI] [PubMed] [Google Scholar]
  49. Lacerda M, Seoighe C. Population genetics inference for longitudinally-sampled mutants under strong selection. Genetics. 2014;198:1237–1250. doi: 10.1534/genetics.114.167957. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Landais E, Moore PL. Development of broadly neutralizing antibodies in HIV-1 infected elite neutralizers. Retrovirology. 2018;15:61. doi: 10.1186/s12977-018-0443-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Lässig M, Mustonen V, Walczak AM. Predicting evolution. Nature Ecology & Evolution. 2017;1:0077. doi: 10.1038/s41559-017-0077. [DOI] [PubMed] [Google Scholar]
  52. Lee B, Quadeer AA, Sohail MS, Finney E, Ahmed SF, McKay MR, Barton JP. Inferring effects of mutations on SARS-CoV-2 transmission from genomic surveillance data. Nature Communications. 2025;16:441. doi: 10.1038/s41467-024-55593-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Li B, Gladden AD, Altfeld M, Kaldor JM, Cooper DA, Kelleher AD, Allen TM. Rapid reversion of sequence polymorphisms dominates early human immunodeficiency virus type 1 Evolution. Journal of Virology. 2007;81:193–201. doi: 10.1128/JVI.01231-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Li H, Wang S, Kong R, Ding W, Lee F-H, Parker Z, Kim E, Learn GH, Hahn P, Policicchio B, Brocca-Cofano E, Deleage C, Hao X, Chuang G-Y, Gorman J, Gardner M, Lewis MG, Hatziioannou T, Santra S, Apetrei C, Pandrea I, Alam SM, Liao H-X, Shen X, Tomaras GD, Farzan M, Chertova E, Keele BF, Estes JD, Lifson JD, Doms RW, Montefiori DC, Haynes BF, Sodroski JG, Kwong PD, Hahn BH, Shaw GM. Envelope residue 375 substitutions in simian-human immunodeficiency viruses enhance CD4 binding and replication in rhesus macaques. PNAS. 2016;113:E3413–E3422. doi: 10.1073/pnas.1606636113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Li Y, Barton JP. Estimating linkage disequilibrium and selection from allele frequency trajectories. GENETICS. 2023;223:iyac189. doi: 10.1093/genetics/iyac189. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Li Y, Barton JP. Correlated allele frequency changes reveal clonal structure and selection in temporal genetic data. Molecular Biology and Evolution. 2024;41:msae060. doi: 10.1093/molbev/msae060. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Liao H-X, Lynch R, Zhou T, Gao F, Alam SM, Boyd SD, Fire AZ, Roskin KM, Schramm CA, Zhang Z, Zhu J, Shapiro L, NISC Comparative Sequencing Program. Mullikin JC, Gnanakaran S, Hraber P, Wiehe K, Kelsoe G, Yang G, Xia S-M, Montefiori DC, Parks R, Lloyd KE, Scearce RM, Soderberg KA, Cohen M, Kamanga G, Louder MK, Tran LM, Chen Y, Cai F, Chen S, Moquin S, Du X, Joyce MG, Srivatsan S, Zhang B, Zheng A, Shaw GM, Hahn BH, Kepler TB, Korber BTM, Kwong PD, Mascola JR, Haynes BF. Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus. Nature. 2013;496:469–476. doi: 10.1038/nature12053. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Los Alamos National Laboratory Hiv sequence database. 2023a. [March 20, 2023]. https://www.hiv.lanl.gov
  59. Los Alamos National Laboratory Hivalign: Hiv sequence alignment tool. 2023b. [February 24, 2016]. https://www.hiv.lanl.gov/content/sequence/VIRALIGN/viralign.html
  60. Louie RHY, Kaczorowski KJ, Barton JP, Chakraborty AK, McKay MR. Fitness landscape of the human immunodeficiency virus envelope protein that is targeted by antibodies. PNAS. 2018;115:E564–E573. doi: 10.1073/pnas.1717765115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Łuksza M, Lässig M. A predictive fitness model for influenza. Nature. 2014;507:57–61. doi: 10.1038/nature13087. [DOI] [PubMed] [Google Scholar]
  62. Malaspinas AS, Malaspinas O, Evans SN, Slatkin M. Estimating allele age and selection coefficient from time-serial data. Genetics. 2012;192:599–607. doi: 10.1534/genetics.112.140939. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Mann JK, Barton JP, Ferguson AL, Omarjee S, Walker BD, Chakraborty A, Ndung’u T. The fitness landscape of HIV-1 gag: advanced modeling approaches and validation of model predictions by in vitro testing. PLOS Computational Biology. 2014;10:e1003776. doi: 10.1371/journal.pcbi.1003776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Mathieson I, McVean G. Estimating selection coefficients in spatially structured populations from time series data of allele frequencies. Genetics. 2013;193:973–984. doi: 10.1534/genetics.112.147611. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Mathieson I, Terhorst J. Direct detection of natural selection in Bronze Age Britain. Genome Research. 2022;32:2057–2067. doi: 10.1101/gr.276862.122. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. McCurley NP, Domi A, Basu R, Saunders KO, LaBranche CC, Montefiori DC, Haynes BF, Robinson HL. HIV transmitted/founder vaccines elicit autologous tier 2 neutralizing antibodies for the CD4 binding site. PLOS ONE. 2017;12:e0177863. doi: 10.1371/journal.pone.0177863. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Moore PL, Williamson C, Morris L. Virological features associated with the development of broadly neutralizing antibodies to HIV-1. Trends in Microbiology. 2015;23:204–211. doi: 10.1016/j.tim.2014.12.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. PNAS. 2011;108:E1293–E1301. doi: 10.1073/pnas.1111471108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Neher RA, Leitner T. Recombination rate and selection strength in HIV intra-patient evolution. PLOS Computational Biology. 2010;6:e1000660. doi: 10.1371/journal.pcbi.1000660. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Nourmohammad A, Otwinowski J, Plotkin JB. Host-pathogen coevolution and the emergence of broadly neutralizing antibodies in chronic infections. PLOS Genetics. 2016;12:e1006171. doi: 10.1371/journal.pgen.1006171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Pancera M, McLellan JS, Wu X, Zhu J, Changela A, Schmidt SD, Yang Y, Zhou T, Phogat S, Mascola JR, Kwong PD. Crystal Structure of PG16 and chimeric dissection with somatically related pg9: structure-function analysis of two quaternary-specific antibodies that effectively neutralize HIV-1. Journal of Virology. 2010;84:8098–8110. doi: 10.1128/JVI.00966-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Pandit A, de Boer RJ. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants. Retrovirology. 2014;11:56. doi: 10.1186/1742-4690-11-56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Paris C, Servin B, Boitard S. Inference of selection from genetic time series using various parametric approximations to the wright-fisher model. G3: Genes, Genomes, Genetics. 2019;9:4073–4086. doi: 10.1534/g3.119.400778. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Rizzato F, Coucke A, de Leonardis E, Barton JP, Tubiana J, Monasson R, Cocco S. Inference of compressed Potts graphical models. Physical Review E. 2020;101:012309. doi: 10.1103/PhysRevE.101.012309. [DOI] [PubMed] [Google Scholar]
  75. Roark RS, Li H, Williams WB, Chug H, Mason RD, Gorman J, Wang S, Lee F-H, Rando J, Bonsignori M, Hwang K-K, Saunders KO, Wiehe K, Moody MA, Hraber PT, Wagh K, Giorgi EE, Russell RM, Bibollet-Ruche F, Liu W, Connell J, Smith AG, DeVoto J, Murphy AI, Smith J, Ding W, Zhao C, Chohan N, Okumura M, Rosario C, Ding Y, Lindemuth E, Bauer AM, Bar KJ, Ambrozak D, Chao CW, Chuang G-Y, Geng H, Lin BC, Louder MK, Nguyen R, Zhang B, Lewis MG, Raymond DD, Doria-Rose NA, Schramm CA, Douek DC, Roederer M, Kepler TB, Kelsoe G, Mascola JR, Kwong PD, Korber BT, Harrison SC, Haynes BF, Hahn BH, Shaw GM. Recapitulation of HIV-1 Env-antibody coevolution in macaques leading to neutralization breadth. Science. 2021;371:eabd2638. doi: 10.1126/science.abd2638. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Romero EV, Feder AF. Elevated HIV viral load is associated with higher recombination rate in vivo. Molecular Biology and Evolution. 2024;41:msad260. doi: 10.1093/molbev/msad260. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Rouzine IM, Weinberger LS. The quantitative theory of within-host viral evolution. Journal of Statistical Mechanics. 2013;2013:01009. doi: 10.1088/1742-5468/2013/01/P01009. [DOI] [Google Scholar]
  78. Ruxton GD, Neuhäuser M. Good practice in testing for an association in contingency tables. Behavioral Ecology and Sociobiology. 2010;64:1505–1513. doi: 10.1007/s00265-010-1014-0. [DOI] [Google Scholar]
  79. Saunders KO, Edwards RJ, Tilahun K, Manne K, Lu X, Cain DW, Wiehe K, Williams WB, Mansouri K, Hernandez GE, Sutherland L, Scearce R, Parks R, Barr M, DeMarco T, Eater CM, Eaton A, Morton G, Mildenberg B, Wang Y, Rountree RW, Tomai MA, Fox CB, Moody MA, Alam SM, Santra S, Lewis MG, Denny TN, Shaw GM, Montefiori DC, Acharya P, Haynes BF. Stabilized HIV-1 envelope immunization induces neutralizing antibodies to the CD4bs and protects macaques against mucosal infection. Science Translational Medicine. 2022;14:eabo5598. doi: 10.1126/scitranslmed.abo5598. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Schraiber JG, Evans SN, Slatkin M. Bayesian inference of natural selection from allele frequency time series. Genetics. 2016;203:493–511. doi: 10.1534/genetics.116.187278. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Shaffer JS, Moore PL, Kardar M, Chakraborty AK. Optimal immunization cocktails can promote induction of broadly neutralizing Abs against highly mutable pathogens. PNAS. 2016;113:E7039–E7048. doi: 10.1073/pnas.1614940113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Shimagaki K, Barton JP. Bézier interpolation improves the inference of dynamical models from data. Physical Review. E. 2023;107:024116. doi: 10.1103/PhysRevE.107.024116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Shimagaki KS, Barton JP. Paper-HIV-coevolution. swh:1:rev:02c8ee1b5251f3c9845c8b7f23f06b24d849af45. Software Heritage. 2025a. https://archive.softwareheritage.org/swh:1:dir:52a20181fdba7e0d804be70d40a8fe266a95ee02;origin=https://github.com/bartonlab/paper-HIV-coevolution;visit=swh:1:snp:5256292424b78685de10cb36d8c31a864c9ac8ae;anchor=swh:1:rev:02c8ee1b5251f3c9845c8b7f23f06b24d849af45
  84. Shimagaki KS, Barton JP. Efficient epistasis inference via higher-order covariance matrix factorization. Genetics. 2025b;230:iyaf118. doi: 10.1093/genetics/iyaf118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Sohail MS, Louie RHY, McKay MR, Barton JP. MPL resolves genetic linkage in fitness inference from complex evolutionary histories. Nature Biotechnology. 2021;39:472–479. doi: 10.1038/s41587-020-0737-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Sohail MS, Louie RHY, Hong Z, Barton JP, McKay MR. Inferring epistasis from genetic time-series data. Molecular Biology and Evolution. 2022;39:msac199. doi: 10.1093/molbev/msac199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Sok D, Burton DR. Recent progress in broadly neutralizing antibodies to HIV. Nature Immunology. 2018;19:1179–1188. doi: 10.1038/s41590-018-0235-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Sprenger KG, Louveau JE, Murugan PM, Chakraborty AK. Optimizing immunization protocols to elicit broadly neutralizing antibodies. PNAS. 2020;117:20077–20087. doi: 10.1073/pnas.1919329117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Steinrücken M, Bhaskar A, Song YS. A novel spectral method for inferring general diploid selection from time series genetic data. The Annals of Applied Statistics. 2014;8:2203. doi: 10.1214/14-AOAS764. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Tataru P, Simonsen M, Bataillon T, Hobolth A. Statistical inference in the wright-fisher model using allele frequency data. Systematic Biology. 2017;66:e30–e46. doi: 10.1093/sysbio/syw056. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Terhorst J, Schlötterer C, Song YS. Multi-locus analysis of genomic time series data from experimental evolution. PLOS Genetics. 2015;11:e1005069. doi: 10.1371/journal.pgen.1005069. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Tomaras GD, Yates NL, Liu P, Qin L, Fouda GG, Chavez LL, Decamp AC, Parks RJ, Ashley VC, Lucas JT, Cohen M, Eron J, Hicks CB, Liao H-X, Self SG, Landucci G, Forthal DN, Weinhold KJ, Keele BF, Hahn BH, Greenberg ML, Morris L, Karim SSA, Blattner WA, Montefiori DC, Shaw GM, Perelson AS, Haynes BF. Initial B-cell responses to transmitted human immunodeficiency virus type 1: virion-binding immunoglobulin M (IgM) and IgG antibodies followed by plasma anti-gp41 antibodies with ineffective control of initial viremia. Journal of Virology. 2008;82:12449–12463. doi: 10.1128/JVI.01708-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Wang S, Mata-Fink J, Kriegsman B, Hanson M, Irvine DJ, Eisen HN, Burton DR, Wittrup KD, Kardar M, Chakraborty AK. Manipulating the selection forces during affinity maturation to generate cross-reactive HIV antibodies. Cell. 2015;160:785–797. doi: 10.1016/j.cell.2015.01.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Wei X, Decker JM, Wang S, Hui H, Kappes JC, Wu X, Salazar-Gonzalez JF, Salazar MG, Kilby JM, Saag MS, Komarova NL, Nowak MA, Hahn BH, Kwong PD, Shaw GM. Antibody neutralization and escape by HIV-1. Nature. 2003;422:307–312. doi: 10.1038/nature01470. [DOI] [PubMed] [Google Scholar]
  95. Weigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein–protein interaction by message passing. PNAS. 2009;106:67–72. doi: 10.1073/pnas.0805923106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Wensing AM, Calvez V, Günthard HF, Johnson VA, Paredes R, Pillay D, Shafer RW, Richman DD. 2017 Update of the drug resistance mutations in HIV-1. Topics in Antiviral Medicine. 2016;24:132–133. [PMC free article] [PubMed] [Google Scholar]
  97. Williams KA, Pennings P. Drug resistance evolution in HIV in the late 1990s: hard sweeps, soft sweeps, clonal interference and the accumulation of drug resistance mutations. G3: Genes, Genomes, Genetics. 2020;10:1213–1223. doi: 10.1534/g3.119.400772. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Williams WB, Alam SM, Ofek G, Erdmann N, Montefiori D, Seaman MS, Wagh K, Korber B, Edwards RJ, Mansouri K, Eaton A, Cain DW, Martin M, Parks R, Barr M, Foulger A, Anasti K, Patel P, Sammour S, Parsons RJ, Huang X, Lindenberger J, Fetics S, Janowska K, Niyongabo A, Janus BM, Astavans A, Fox CB, Mohanty I, Evangelous T, Chen Y, Berry M, Kirshner H, Van Itallie E, Saunders K, Wiehe K, Cohen KW, McElrath MJ, Corey L, Acharya P, Walsh SR, Baden LR, Haynes BF. Vaccine Induction of Heterologous HIV-1 Neutralizing Antibody B Cell Lineages in Humans. bioRxiv. 2023 doi: 10.1101/2023.03.09.23286943. [DOI]
  99. Zanini F, Brodin J, Thebo L, Lanz C, Bratt G, Albert J, Neher RA. Population genomics of intrapatient HIV-1 evolution. eLife. 2015;4:e11282. doi: 10.7554/eLife.11282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Zhou T, Zhu J, Wu X, Moquin S, Zhang B, Acharya P, Georgiev IS, Altae-Tran HR, Chuang G-Y, Joyce MG, Kwon YD, Longo NS, Louder MK, Luongo T, McKee K, Schramm CA, Skinner J, Yang Y, Yang Z, Zhang Z, Zheng A, Bonsignori M, Haynes BF, Scheid JF, Nussenzweig MC, Simek M, Burton DR, Koff WC, NISC Comparative Sequencing Program. Mullikin JC, Connors M, Shapiro L, Nabel GJ, Mascola JR, Kwong PD. Multidonor analysis reveals structural elements, genetic determinants, and maturation pathway for HIV-1 neutralization by VRC01-class antibodies. Immunity. 2013;39:245–258. doi: 10.1016/j.immuni.2013.04.012. [DOI] [PMC free article] [PubMed] [Google Scholar]

eLife Assessment

Anne-Florence Bitbol 1

In this important quantitative study of HIV-1 evolution in humans and rhesus macaques, selection coefficients are inferred at scale over the HIV genome. Selection coefficients are similar in humans and macaques, providing compelling evidence that these coefficients are representative of the fitness landscapes of these viruses within hosts. This work will be of interest to the community working on quantitative evolution and fitness landscape inference, and the finding that rapid fitness gains in the HIV population predict bNAb emergence has significant implications for HIV vaccine design.

Reviewer #1 (Public review):

Anonymous

Summary:

The present work studies the coevolution of HIV-1 and the immune response in clinical patient data. Using the Marginal Path Likelihood (MPL) framework, the authors infer selection coefficients for HIV mutations from time-series data of virus sequences as they evolve in a given patient.

Strengths:

The authors analyze data from two human patients, consisting of HIV population sequence samples at various points in time during the infection. They inferred selection coefficients from the observed changes in sequence abundance using MPL. Most beneficial mutations appear in viral envelope proteins. The authors also analyze SHIV samples in rhesus macaques, and find selection coefficients that are compatible with those found in the corresponding human samples.

The manuscript is well written and organized.

Comments on revisions:

In their revised version the authors have addressed most of these points satisfactorily.

Reviewer #2 (Public review):

Anonymous

This paper combines a biological topic of interest with the demonstration of important theoretical/methodological advances. Fitness inference is the foundation of the quantitative analysis of adapting systems. It is a hard and important problem and this paper highlights a compelling approach (MPL) first presented in (1) and refined in (2), roughly summarized in equation 3.

The authors find that positive selection shapes the variable regions of env in shared patterns across two patient donors. The patterns of positive selection are interesting in and of themselves; they confirm the intuition that hyper-variation in env is the result of immune evasion rather than a broadly neutral landscape (flatness). They show that the immune evasion patterns due to CD8 T and naive B-cell selection are shared across patients. Furthermore, they suggest that a particular evolutionary history (larger flux to high fitness states) is associated with bNAb emergence. Mimicking this evolutionary pattern in vaccine design may help us elicit bNAbs in patients in the future.

The fitness landscape of env in multiple hosts is immensely valuable, especially because of how often SHIV has been used as a proxy for HIV. The strength of reversion-to-consensus selection is a known pattern of HIV post-infection populations, but it is nicely quantified here. Agreement between SHIV and HIV evolution is shown. They find selection is larger for autologous antibodies than the bNAbs themselves (perhaps bNAbs are just too small a component of the host response to drive the bulk of selection?), and that big fitness increases precede antibody breadth in rhesus macaques, suggesting that this fitness increase is the immune challenge required to draw forth a bNAb. All of this is of high interest to HIV researchers.

(1) Sohail, M. S., Louie, R. H., McKay, M. R. & Barton, J. P. Mpl resolves genetic linkage in fitness inference from complex evolutionary histories. Nature biotechnology 39, 472-479 (2021).

(2) Shimagaki, K. & Barton, J. P. Bézier interpolation improves the inference of dynamical models from data. Physical Review E 107, 024116 (2023).

Strength of evidence:

Equation 3 is a beautiful and intuitive tool that accounts for linkage and can be solved precisely even in the presence of detailed mutational and selection models. They have addressed my earlier concerns about how incomplete observations of the frequency bias fitness inference at rare sites.

Whether fitness increases occurred before or after the presence of the bNAb remains incompletely known. bNAb detection is different from bNAb presence, and the possibility that fitness increases occurred after the bNAbs appeared remains. Still, their conclusion is plausible and fits in with the other observations, which form a coherent and compelling picture.

Overall this is a convincing paper. It is a valuable introduction to a practical method of fitness inference at the scale of the entire env gene and how this information can be leveraged to learn some interesting biology.

Reviewer #3 (Public review):

Anonymous

Summary:

Shimagaki et al. investigate the virus-antibody coevolutionary processes that drive the development of broadly neutralizing antibodies (bnAbs). The study's primary goal is to characterize the evolutionary dynamics of HIV-1 within hosts that accompany the emergence of bnAbs, with a particular focus on inferring the landscape of selective pressures shaping viral evolution. To assess the generality of these evolutionary patterns, the study extends its analysis to rhesus macaques (RMs) infected with simian-human immunodeficiency viruses (SHIV) incorporating HIV-1 Env proteins derived from two human individuals.

Strengths:

A key strength of the study is its rigorous assessment of the similarity in evolutionary trajectories between humans and macaques. This cross-species comparison is particularly compelling, as it quantitatively establishes a shared pattern of viral evolution using a sophisticated inference method. The finding that similar selective pressures operate in both species adds robustness to the study's conclusions and suggests broader biological relevance. In the revised version, the authors included a simple but clear explanation of the statistical method for inferring the model's parameters in the main text. Moreover, I find the potential implications of the methodology, which were absent from the original submission, very interesting.

Conclusions:

Overall, the study presents a compelling analysis of HIV-1 evolution and its parallels in SHIV-infected macaques.

eLife. 2025 Nov 10;14:RP105466. doi: 10.7554/eLife.105466.3.sa4

Author response

Kai S Shimagaki 1, Rebecca Lynch 2, John P Barton 3

The following is the authors’ response to the original reviews.

Reviewer #1 (Public review):

Summary:

The present work studies the coevolution of HIV-1 and the immune response in clinical patient data. Using the Marginal Path Likelihood (MPL) framework, the authors infer selection coefficients for HIV mutations from time-series data of virus sequences as they evolve in a given patient.

Strengths:

The authors analyze data from two human patients, consisting of HIV population sequence samples at various points in time during the infection. They infer selection coefficients from the observed changes in sequence abundance using MPL. Most beneficial mutations appear in viral envelope proteins. The authors also analyze SHIV samples in rhesus macaques, and find selection coefficients that are compatible with those found in the corresponding human samples.

Weaknesses:

The MPL method used by the authors considers only additive effects of mutations, thus ignoring epistasis.

As suggested, we have now addressed this limitation by inferring epistatic fitness landscapes for CH505, CH848, SHIV.CH505, and SHIV.CH848. Indeed, the computational burden of the epistasis inference procedure was one constraint that motivated us to consider only additive fitness in the previous version of our paper. The original approach developed by Sohail et al. (2022) tested only sequences with <50 sites due to this limitation, far smaller than the ones we consider. Beyond this computational constraint, we also believed that (1) an additive fitness model may suffice to capture local fitness landscapes, and practically, (2) epistatic interactions are more challenging to validate than the effects of individual mutations, making the interpretation of the model more complex.

However, after performing the analyses described in this paper, we developed a new approach for identifying epistatic interactions that can scale to much longer sequences (Shimagaki et al., Genetics, in press). We therefore applied this method to infer an epistatic fitness landscape for the HIV and SHIV data sets that we studied. As in that work, we focused on short-range (<50 bp) interactions which we could more confidently estimate from data. We have added a section in the SI describing the epistatic fitness model and our analysis.

Overall, we found substantial agreement between the epistatic and purely additive models in terms of the estimated fitness effects of individual mutations (new Supplementary Fig. 8) and overall fitness (Supplementary Fig. 9). Consistent with our prior work, we did not find substantial evidence for very strong epistatic interactions (Supplementary Fig. 10). This does not necessarily mean that strong epistatic interactions do not exist; rather, this shows that strong interactions don’t substantially improve the fit of the model to data, and thus many are regularized toward zero. While the biological validation of epistatic interactions is challenging, we found that the largest epistatic interactions, which we defined as the top 1% of all short-range interactions, were modestly but significantly enriched in the CD4 binding site, V1, and V5 regions for CH505 and in the CD4 binding site, V4, and V5 for CH848. In addition, mutation pairs N280S/V281A and E275K/V281G, which confer resistance to CH235, ranked in the top 15% of all epistatic interactions in CH505.

We have now included an additional section in the Results, “Robustness of inferred selection to changes in the fitness model and finite sampling”, which discusses our epistatic analyses (page 6, lines 415-464), along with the above Supplementary Figures and a technical section in the SI summarizing the epistasis inference approach.

Although the evolution of broadly neutralizing antibodies (bnAbs) is a motivating question in the introduction and discussion sections (and the title), the relevance of the analysis and results to better understanding how bnAbs arise is not clear. The only result presented in direct connection to bnAbs is Figure 6.

It is true that, while bnAb development is a major motivator of our study, our analysis focuses on HIV-1 and does not directly consider antibody evolution. We have now brought attention to this point as a limitation directly in the Discussion. Following the suggestion below in the “Recommendations for the authors,” we have edited our manuscript to place more emphasis on viral fitness and somewhat reduce the emphasis on bnAbs, though this remains an important motivating factor. Specifically, the Abstract now begins

Human immunodeficiency virus (HIV)-1 evolves within individual hosts to escape adaptive immune responses while maintaining its capacity for replication. Coevolution between HIV-1 and the immune system generates extraordinary viral genetic diversity. In some individuals, this process also results in the development of broadly neutralizing antibodies (bnAbs) that can neutralize many viral variants, a key focus of HIV-1 vaccine design. However, a general understanding of the forces that shape virus-immune coevolution within and across hosts remains incomplete. Here we performed a quantitative study of HIV-1 evolution in humans and rhesus macaques, including individuals who developed bnAbs.

We have similarly modified the Discussion to focus first on viral fitness. In response to comments from Reviewer 3, we have also more clearly articulated how our work might contribute to the understanding of bnAb development in the Discussion.

Questions or suggestions for further discussion:

I list here a number of points for which I believe the paper would benefit if additional discussion/results were included.

The MPL method used by the authors considers only additive effects of mutations, thus ignoring epistasis. In Sohail et al (2022) MBE 39(10), p. msac199 (https://doi.org/10.1093/molbev/msac199) an extension of MPL is developed allowing one to infer epistasis. Can the authors comment on why this was not attempted here?

I presume one possible reason is that epistasis inference requires considerably more computational effort (and more data). However, since the authors find most beneficial mutations occurring in Env, perhaps restricting the analysis to Env genes only (e.g. the trimer shown in Figure 2) can lead to tractable inference of epistasis within this segment (instead of the full genome).

As described above, we have now addressed this comment by inferring epistatic fitness landscapes for the data sets that we consider. Our overall results using the epistatic fitness model are consistent with the ones that we previously obtained with an additive model.

Do the authors find correlations in the inferred selection coefficients of the two samples CH505 and CH848? I could not find any discussion of this in the manuscript. Only correlations between Humans and RM are discussed.

To address this question, we compared the fitness values and individual selection coefficients across CH505 and CH848 data sets. We found little correlation between CH505 and CH848 fitness values (shown in a new Supplementary Fig. 6) or selection coefficients. We found only 199 common mutations between HIV-1 amino acid sequences from CH505 and CH848 out of 868 and 1,406 total mutations, respectively. Thus, we were not surprised to find no strong relationship between fitness estimates from CH505 and CH848 data sets.

Reviewer #2 (Public review):

Summary:

This paper combines a biological topic of interest with the demonstration of important theoretical/methodological advances. Fitness inference is the foundation of the quantitative analysis of adapting systems. It is a hard and important problem and this paper highlights a compelling approach (MPL) first presented in (1) and refined in (2), roughly summarized in equation 12.

(1) Sohail, M. S., Louie, R. H., McKay, M. R. & Barton, J. P. Mpl resolves genetic linkage in fitness inference from complex evolutionary histories. Nature biotechnology 39, 472-479 (2021).

(2) Shimagaki, K. & Barton, J. P. Bézier interpolation improves the inference of dynamical models from data. Physical Review E 107, 024116 (2023).

The authors find that positive selection shapes the variable regions of env in shared patterns across two patient donors. The patterns of positive selection are interesting in and of themselves; they confirm the intuition that hyper-variation in env is the result of immune evasion rather than a broadly neutral landscape (flatness). They show that the immune evasion patterns due to CD8 T and naive B-cell selection are shared across patients. Furthermore, they suggest that a particular evolutionary history (larger flux to high fitness states) is associated with bNAb emergence. Mimicking this evolutionary pattern in vaccine design may help us elicit bNAbs in patients in the future.

There is a lot of information to be found in the full fitness landscape of env. The enormous strength of reversion-to-consensus selection is a known pattern of HIV post-infection populations, but it is nicely quantified here. Agreement between SHIV and HIV evolution is shown. They find selection is larger for autologous antibodies than the bNAbs themselves (perhaps bNAbs are just too small a component of the host response to drive the bulk of selection?), and that big fitness increases precede antibody breadth in rhesus macaques, suggesting that this fitness increase is the immune challenge required to draw forth a bNAb. This is all of high interest to HIV researchers.

Strength of evidence:

One limitation is, of course, that the fitness model is constant in time when the immune challenge is variable and changing. This simplification may complicate some interpretations.

We agree that this is a limitation of our current approach. In prior work, we have found that the constant fitness effects of mutations that we infer typically reflect the time-averaged fitness effect when the selection changes over time (Gao and Barton, PNAS 2025; Lee et al., Nat Commun 2025). It could be difficult, however, to capture changes in selection that fluctuate rapidly with underlying immune responses. We have added a new paragraph in the Discussion that more clearly sets out some of the limitations of our analysis, including our assumption of constant selection coefficients.

There are additional methodological and technical limitations that should be considered in the interpretation of our results. Most notably, we assume that the viral fitness landscape is static in time. While we do not expect selection for effective replication (“intrinsic” fitness) to change substantially over time, pressure for immune escape could vary along with the immune responses that drive it. In prior work, we have found that constant selection coefficients typically reflect the average fitness effect of a mutation when its true contribution to fitness is time-varying [42,43]. This may not adequately describe mutational effects that undergo large or rapid shifts in time. Future work should also examine temporal patterns in selection for individual mutations.

Equation 12 in the methods is really a beautiful tool because it is so simple, but accounts for linkage and can be solved precisely even in the presence of detailed mutational and selection models. However, the reliance on incomplete observations of the frequency leads to complications that must be carefully (re)addressed here.

For instance, the consistent finding of strong selection in hypervariable regions is biologically intuitive but so striking, that I worry that it might be the result of a bias for selection in high entropy regions.

Thank you for this suggestion. We agree that it is important to carefully interrogate these results. To assess the effects of general sequence variability on inferred selection, we computed a position-specific entropy measure, $H_i$, for each site $i$. We first defined the time-dependent entropy at each sample time, $H_i(t) = -\sum_a x_i(a, t) \log x_i(a, t)$, where $x_i(a, t)$ represents the frequency of amino acid/nucleotide $a$ at position $i$ and time $t$. We then computed $H_i$ as the average of $H_i(t)$ across all sample times. A new Supplementary Fig. 1 plots the entropy against the inferred selection coefficients. Although some sequence variation must be observed in order for us to infer that a mutation is beneficial, we did not find a systematic bias toward larger (more beneficial) selection coefficients at more variable sites. Overall, we found only a modest correlation between inferred selection coefficients and entropy (Pearson’s r = 0.33 and 0.29 for CH505 and CH848, respectively), which appears to be partly driven by the tendency for mutations inferred to be significantly deleterious to occur at sites with low entropy. In addition to the new Supplementary Figure, we have added a reference to this analysis in the main text:

To test whether our results might be biased by overall sequence variability, we examined the relationship between our inferred selection coefficients and entropy, a common measure of sequence variability. Overall, we found only a modest correlation between selection and entropy, suggesting that the signs of selection that we observe are not due to increased sequence variability alone (Supplementary Fig. 1).
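The entropy measure described in this response can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names, the dictionary layout of the samples, and the toy sequences are all assumptions, and each character (including any gap symbol) is simply treated as its own state.

```python
import math
from collections import Counter

def site_entropy(column):
    """Shannon entropy (natural log) of one alignment column at one time point."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())

def mean_site_entropy(samples_by_time, site):
    """Average of H_i(t) over all sample times for position `site`.

    samples_by_time: dict mapping sample time -> list of aligned sequences
    (hypothetical layout chosen for this sketch).
    """
    entropies = [
        site_entropy([seq[site] for seq in seqs])
        for seqs in samples_by_time.values()
    ]
    return sum(entropies) / len(entropies)

# Toy alignment: site 1 is polymorphic at the second time point only.
samples = {
    0: ["AAC", "AAC", "AAC", "AAC"],
    30: ["AAC", "AGC", "AGC", "AAC"],
}
entropy_per_site = [mean_site_entropy(samples, i) for i in range(3)]
```

In this toy example only site 1 has nonzero entropy, equal to half of log 2 because it is variable at one of the two time points. The per-site values could then be compared against inferred selection coefficients with any standard correlation routine.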

Mutational and covariance terms in equation 12 might be underestimated, due to finite sampling effect in highly diverse populations. Sampling effects lead to zeros in x(t) when actual frequency zeros might be rare at the population sizes of HIV viral loads and mutation rates. Both mutational flux and C underestimation will bias selection upward in eq. 12.

The prior papers (1) and (2) seem to show robustness to finite sampling effects, but, again, more care needs to be shown that this robustness transfers to the amino acid inference under these conditions. That synonymous sites are rarely selected for in the nucleotide level is a good sign, and it may be a matter of simply fully explaining the amino-acid level model.

As above, we agree that these tests are important. To assess the robustness of our results to finite sampling, we performed bootstrap sampling on the viral sequences and inferred selection coefficients using the resampled sequences. Specifically, we resampled the same number of sequences as in the original data at each time point and repeated this for all time points across all HIV-1 and SHIV data sets. A new Supplementary Fig. 11 shows a typical comparison of the original selection coefficients vs. those obtained through bootstrap resampling. Overall, we observe a high degree of consistency between the selection coefficients in each case, which is surely aided by the long time series in these data sets. As pointed out by the reviewer, uncertainty in low-frequency mutations is a particular concern, though the effects on inferred selection are mitigated by regularization.

We have added a section in the Results, “Robustness of inferred selection to changes in the fitness model and finite sampling”, which includes this analysis:

Finite sampling of sequence data could also affect our analyses. To further test the robustness of our results, we inferred selection coefficients using bootstrap resampling, where we resample sequences from the original ensemble, maintaining the same number of sequences for each time point and subject. The selection coefficients from the bootstrap samples are consistent with the original data (see Supplementary Fig. 11), with Pearson’s r values of around 0.85 for HIV-1 data sets and 0.95 for SHIV data sets, respectively.
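The resampling step described here can be illustrated as follows. This is a sketch under stated assumptions: the data layout, function names, and toy sequences are hypothetical, and the selection inference itself is not reproduced. In practice, the inference would be rerun on each resampled data set and the resulting coefficients compared with the originals, for example with the Pearson correlation helper shown.

```python
import math
import random

def bootstrap_resample(samples_by_time, rng):
    """Resample sequences with replacement, keeping each time point's sample size."""
    return {
        t: rng.choices(seqs, k=len(seqs))
        for t, seqs in samples_by_time.items()
    }

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy data (assumed layout): sample time -> list of aligned sequences.
rng = random.Random(0)
samples = {0: ["AAC", "AAC", "AGC"], 30: ["AGC", "AGC", "AAC", "AGT"]}
replicate = bootstrap_resample(samples, rng)
```

Each replicate keeps the original number of sequences per time point, so differences between replicate and original estimates reflect sampling noise rather than changes in sampling depth.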

Uncertainty propagates to the later parts of the paper, e.g., HIV and SIV shared patterns might be the result of shared biases in the method application. However, this worry does not extend to the apples-to-apples comparison of fitness trajectories across individuals (Figures 5 and 6), which I think are robust (for these sample sizes).

One way to address this uncertainty is to compare the fitness values and individual selection coefficients across CH505 and CH848 data sets, which was also requested by Reviewer 1. Overall, we found little correlation between CH505 and CH848 fitness values (shown in a new Supplementary Fig. 6) or selection coefficients. This suggests that similarities between HIV-1 and SHIV landscapes are not solely determined by potential biases in the inference approach. We have now added a reference to this point in the main text:

In contrast, the inferred fitness landscapes of CH505 and CH848, which share few mutations in common, are poorly correlated (Supplementary Fig. 6). This suggests that the similarities between viral fitness values in humans and RMs are not artifacts of the model, but rather stem from similarities in underlying evolutionary drivers.

The timing evidence is slightly weakened by the fact that bNAb detection is different from bNAb presence and the possibility that fitness increases occurred after the bNAbs appeared remains. Still, their conclusion is plausible and fits in with the other observations which form a coherent and compelling picture.

Yes, we agree that this is a limitation of our analysis — bNAbs may have been present at low levels before they were detected, and we cannot definitively reject selection by bNAbs. Nonetheless, in at least one case (RM5695), rapid fitness gains were substantially separated in time from bNAb detection (roughly 2 weeks after infection vs. 16 weeks, respectively). We have now added this point in a new paragraph in the Discussion:

While we found a strong relationship between viral fitness dynamics and the emergence of bnAbs, it may not be true that the former stimulates the latter. For example, bnAbs may have been present within each host before they were experimentally detected. Rapid viral fitness gains within hosts that developed broad antibody responses could then have been driven by undetected bnAb lineages. However, we did not find strong selection for known bnAb resistance mutations, and in at least one case (RM5695), rapid fitness gains (roughly 2 weeks after infection) substantially preceded bnAb detection (16 weeks). Still, given the limited size of the data set that we studied, the extent to which our results will transfer to larger and broader data sets is unclear.

Overall this is a convincing paper, part of a larger admirable project of accurately inferring complete fitness landscapes.

Reviewer #3 (Public review):

Summary:

Shimagaki et al. investigate the virus-antibody coevolutionary processes that drive the development of broadly neutralizing antibodies (bnAbs). The study's primary goal is to characterize the evolutionary dynamics of HIV-1 within hosts that accompany the emergence of bnAbs, with a particular focus on inferring the landscape of selective pressures shaping viral evolution. To assess the generality of these evolutionary patterns, the study extends its analysis to rhesus macaques (RMs) infected with simian-human immunodeficiency viruses (SHIV) incorporating HIV-1 Env proteins derived from two human individuals.

Strengths:

A key strength of the study is its rigorous assessment of the similarity in evolutionary trajectories between humans and macaques. This cross-species comparison is particularly compelling, as it quantitatively establishes a shared pattern of viral evolution using a sophisticated inference method. The finding that similar selective pressures operate in both species adds robustness to the study's conclusions and suggests broader biological relevance.

Weaknesses:

However, the study has some limitations. The most significant weakness is that the authors do not sufficiently discuss the implications of the observed similarities. While the identification of shared evolutionary patterns (e.g., Figure 5) is intriguing, the study would benefit from a more explicit discussion of what these findings mean, for instance in the context of HIV vaccine design, immunotherapy, or fundamental viral-host interactions. Even speculative interpretations could provide valuable insights into the broader significance of these results.

Thank you for this suggestion. We have now clarified the potential implications of our work in several areas. While speculative, one possible application is in vaccine design: it may be beneficial to design sequential immunogens to mimic the patterns of viral evolution associated with rapid fitness gains. This “population-based” design principle is different from typical approaches, which have focused on molecular details of virus surface proteins.

We have extended our discussion of our results in the context of viral evolution within and across hosts and related host species. Overall, our work suggests that there may be relatively few paths to significantly higher viral fitness in vivo. Evolutionary “contingencies” such as shifting immune pressure or epistatic interactions could influence the direction of evolution, but not so dramatically that the dynamics that we see in different hosts are not comparable. We have also connected our work more broadly to the literature in evolutionary parallelism in HIV-1 in different contexts.

A secondary, albeit less critical, limitation is the placement of methodological details in the Supplementary Information. While it is understandable that the authors focus on results in the main text - especially since the methodology is not novel and has been previously described in earlier publications - some readers might benefit from a more thorough presentation of the method within the main paper.

We have now modified the main text to add a new section, “Model overview,” that lays out the key steps of our approach. While we reserve technical details for the Methods, we believe that this new section provides more intuition about how our results were obtained (including a discussion of the important Eq. 12, now Eq. 3 in the main text) and our underlying assumptions.

Conclusions:

Overall, the study presents a compelling analysis of HIV-1 evolution and its parallels in SHIV-infected macaques. While the quantitative comparison between species is a notable contribution, a deeper discussion of its broader implications would strengthen the paper's impact.

Reviewer #1 (Recommendations for the authors):

I suggest de-emphasizing bnAbs and focusing on selection landscape inference, which seems to be the actual focus of the paper.

While we do not directly study antibody development in this work, bnAb development is certainly an important motivating factor. As described in the responses above, we have now modified the Abstract and Discussion to place relatively more emphasis on fitness comparisons and relatively less on bnAb development.

Reviewer #2 (Recommendations for the authors):

Please make sure that the MPL method is defined in this paper and its limitations are at least partially repeated.

As noted in responses above, we have now included more methodological details in the main text of the paper, which we hope will make the intuition and assumptions involved in our analysis clearer.

I'd like the code to better show or describe the model, I could not figure out the model details by looking at the code. It seems mostly just to be csv exporting for use with preexisting MPL code. A longer code readme would be helpful.

We have now updated the README on GitHub to include a conceptual overview of our inference approach, which references how each step is implemented in the code.

Reviewer #3 (Recommendations for the authors):

Try to give some more details (not necessarily giving the full mathematical derivation) on the statistical method utilized.

As noted above, we have now expanded our discussion of the statistical methods and assumptions in the main text.

Figures 3 and 4 are somewhat 'messy'. Although I do not have a constructive suggestion here, I feel that with a little more effort maybe the authors could come up with something more clean.

It is true that the mutation frequency dynamics are somewhat “choppy” and difficult to follow intuitively. To attempt to make these figures easier to parse visually, we have increased the transparency on the lines and added exponential smoothing to the mutation frequencies, resulting in smoother trajectories. The trajectories without smoothing are retained in Supplementary Fig. 3. Here we also note that this smoothing is for visual purposes only; we use the original frequency trajectories for inference, rather than the smoothed ones.
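For readers unfamiliar with the technique, simple exponential smoothing of a frequency trajectory can be sketched as below. This is an illustration only: the response does not state the smoothing parameter or whether uneven spacing between sample times is taken into account, so the function name, the `alpha` value, and the equal-spacing assumption are all hypothetical.

```python
def exponential_smoothing(values, alpha=0.5):
    """Simple exponential smoothing: s[0] = x[0], s[k] = alpha*x[k] + (1-alpha)*s[k-1].

    alpha close to 1 follows the raw data closely; smaller alpha smooths more.
    Assumes equally spaced samples (an assumption of this sketch).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if smoothed:
            x = alpha * x + (1 - alpha) * smoothed[-1]
        smoothed.append(x)
    return smoothed

# A deliberately choppy toy trajectory of mutation frequencies.
raw = [0.0, 1.0, 0.0, 1.0]
smooth = exponential_smoothing(raw, alpha=0.5)
```

As the response emphasizes, smoothing of this kind would be applied only for plotting; the raw trajectories remain the input to inference.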

Associated Data


    Supplementary Materials

    Supplementary file 1. CH103 and CH235 resistance mutations.
    elife-105466-supp1.xlsx (5.2KB, xlsx)
    Supplementary file 2. Strain-specific antibody resistance mutations in CH505.
    elife-105466-supp2.xlsx (5.3KB, xlsx)
    Supplementary file 3. DH272, DH475, and strain-specific antibody resistance mutations in CH848.
    elife-105466-supp3.xlsx (5.2KB, xlsx)
    Supplementary file 4. Biological effects of the HIV-1 mutations inferred to be the most beneficial in CH505.
    elife-105466-supp4.xlsx (6.3KB, xlsx)
    Supplementary file 5. Biological effects of the HIV-1 mutations inferred to be the most beneficial in CH848.
    elife-105466-supp5.xlsx (6.2KB, xlsx)
    Supplementary file 6. CH505 and SHIV.CH505 sequence statistics.
    Supplementary file 7. CH848 and SHIV.CH848 sequence statistics.
    elife-105466-supp7.xlsx (5.2KB, xlsx)
    Supplementary file 8. Biological effects of strongly selected SHIV.CH505 mutations.
    elife-105466-supp8.xlsx (5.5KB, xlsx)
    Supplementary file 9. Biological effects of strongly selected SHIV.CH848 mutations.
    elife-105466-supp9.xlsx (6.2KB, xlsx)
    Supplementary file 10. Selective advantage of mutations that confer resistance to antibodies in SHIV.CH505.
    elife-105466-supp10.xlsx (5.4KB, xlsx)
    Supplementary file 11. Selective advantage of mutations that confer resistance to antibodies in SHIV.CH848.
    elife-105466-supp11.xlsx (5.1KB, xlsx)
    Supplementary file 12. List of selected sites using LASSIE in SHIV.CH505.
    elife-105466-supp12.xlsx (5.7KB, xlsx)
    Supplementary file 13. List of selected sites using LASSIE in SHIV.CH848.
    elife-105466-supp13.xlsx (4.9KB, xlsx)
    MDAR checklist

    Data Availability Statement

    Data and code accompanying this manuscript are publicly available at the GitHub repository https://github.com/bartonlab/paper-HIV-coevolution (copy archived at Shimagaki and Barton, 2025a). This repository contains source files that process HIV-1 and SHIV sequences, infer selection coefficients, and identify and characterize mutations. The included Jupyter notebooks can be run to reproduce the figures presented here. The original HIV-1 sequences can be retrieved from the LANL database (https://www.hiv.lanl.gov/content/index), and SHIV sequences can be found at GenBank (https://www.ncbi.nlm.nih.gov/).


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
