eLife. 2024 Sep 25;13:RP91849. doi: 10.7554/eLife.91849

Antigenic drift and subtype interference shape A(H3N2) epidemic dynamics in the United States

Amanda C Perofsky 1,2, John Huddleston 3, Chelsea L Hansen 1,2, John R Barnes 4, Thomas Rowe 4, Xiyan Xu 4, Rebecca Kondor 4, David E Wentworth 4, Nicola Lewis 5, Lynne Whittaker 5, Burcu Ermetal 5, Ruth Harvey 5, Monica Galiano 5, Rodney Stuart Daniels 5, John W McCauley 5, Seiichiro Fujisaki 6, Kazuya Nakamura 6, Noriko Kishida 6, Shinji Watanabe 6, Hideki Hasegawa 6, Sheena G Sullivan 7, Ian G Barr 7, Kanta Subbarao 7, Florian Krammer 8,9, Trevor Bedford 2,3,10,11, Cécile Viboud 1
Editors: Talía Malagón 12, Diane M Harper 13
PMCID: PMC11424097  PMID: 39319780

Abstract

Influenza viruses continually evolve new antigenic variants, through mutations in epitopes of their major surface proteins, hemagglutinin (HA) and neuraminidase (NA). Antigenic drift potentiates the reinfection of previously infected individuals, but the contribution of this process to variability in annual epidemics is not well understood. Here, we link influenza A(H3N2) virus evolution to regional epidemic dynamics in the United States during 1997–2019. We integrate phenotypic measures of HA antigenic drift and sequence-based measures of HA and NA fitness to infer antigenic and genetic distances between viruses circulating in successive seasons. We estimate the magnitude, severity, timing, transmission rate, age-specific patterns, and subtype dominance of each regional outbreak and find that genetic distance based on broad sets of epitope sites is the strongest evolutionary predictor of A(H3N2) virus epidemiology. Increased HA and NA epitope distance between seasons correlates with larger, more intense epidemics, higher transmission, greater A(H3N2) subtype dominance, and a greater proportion of cases in adults relative to children, consistent with increased population susceptibility. Based on random forest models, A(H1N1) incidence impacts A(H3N2) epidemics to a greater extent than viral evolution, suggesting that subtype interference is a major driver of influenza A virus infection dynamics, presumably via heterosubtypic cross-immunity.

Research organism: Human, Virus

eLife digest

Seasonal influenza (flu) viruses cause outbreaks every winter. People infected with influenza typically develop mild respiratory symptoms. But flu infections can cause serious illness in young children, older adults and people with chronic medical conditions. Infected or vaccinated individuals develop some immunity, but the viruses evolve quickly to evade these defenses in a process called antigenic drift. As the viruses change, they can re-infect previously immune people. Scientists update the flu vaccine yearly to keep up with this antigenic drift.

The immune system fights flu infections by recognizing two proteins, known as antigens, on the virus’s surface, called hemagglutinin (HA) and neuraminidase (NA). However, mutations in the genes encoding these proteins can make them unrecognizable, letting the virus slip past the immune system. Scientists would like to know how these changes affect the size, severity and timing of annual influenza outbreaks.

Perofsky et al. show that tracking genetic changes in HA and NA may help improve flu season predictions. The experiments compared the severity of 22 flu seasons caused by the A(H3N2) subtype in the United States with how much HA and NA had evolved since the previous year. The A(H3N2) subtype experiences the fastest rates of antigenic drift and causes more cases and deaths than other seasonal flu viruses. Genetic changes in HA and NA were a better predictor of A(H3N2) outbreak severity than the blood tests for protective antibodies that epidemiologists traditionally use to track flu evolution. However, the prevalence of another subtype of influenza A circulating in the population, called A(H1N1), was an even better predictor of how severe A(H3N2) outbreaks would be.

Perofsky et al. are the first to show that genetic changes in NA contribute to the severity of flu seasons. Previous studies suggested a link between genetic changes in HA and flu season severity, and flu vaccines include the HA protein to help the body recognize new influenza strains. The results suggest that adding the NA protein to flu vaccines may improve their effectiveness. In the future, flu forecasters may want to analyze genetic changes in both NA and HA to make their outbreak predictions. Tracking how much of the A(H1N1) subtype is circulating may also be useful for predicting the severity of A(H3N2) outbreaks.

Introduction

Influenza viruses continually accumulate genetic changes in epitopes of two major surface proteins, hemagglutinin (HA) and neuraminidase (NA), in a process known as ‘antigenic drift’. Although individual hosts develop long-lasting immunity to specific influenza virus strains after infection, antigenic drift helps the virus to escape immune recognition, leaving previously exposed hosts susceptible to reinfection and necessitating regular updates to the antigens included in the influenza vaccine (Gerdil, 2003). While antigenic drift aids immune escape, prospective cohort studies and modeling of surveillance data also indicate that reinfection by antigenically homologous viruses occurs on average every 1–4 years, due to the waning of protection over time (He et al., 2015; Wraith et al., 2022).

Among the influenza virus types that routinely co-circulate in humans (A and B), type A viruses, particularly subtype A(H3N2), experience the fastest rates of antigenic evolution and cause the most substantial morbidity and mortality (Bedford et al., 2015; Bedford et al., 2014; Ferguson et al., 2005; Hay et al., 2001). Seasonal influenza A viruses (IAV) cause annual winter epidemics in temperate zones of the Northern and Southern Hemispheres and circulate year-round in tropical regions (Simonsen, 1999). Influenza A epidemic burden fluctuates substantially from year to year (Viboud et al., 2004), and there is much scientific interest in disentangling the relative roles of viral evolution, prior immunity, human behavior, and climatic factors in driving this seasonal variability. Climatic factors, such as humidity and temperature, have been implicated in the seasonality and timing of winter outbreaks in temperate regions (Chattopadhyay et al., 2018; Kramer and Shaman, 2019; Lee et al., 2018; Shaman and Kohn, 2009; Shaman et al., 2010), while contact and mobility patterns contribute to the seeding of new outbreaks and geographic spread (Bedford et al., 2010; Bedford et al., 2015; Charu et al., 2017; Chattopadhyay et al., 2018; Geoghegan et al., 2018; Pei et al., 2018; Viboud et al., 2006). A principal requirement for the recurrence of epidemics is a sufficient and continuous source of susceptible individuals, which is determined by the degree of cross-immunity between the surface antigens of currently circulating viruses and functional antibodies elicited by prior infection or vaccination in a population.

Because mutations to the HA1 region of the HA protein are considered to drive the majority of antigenic drift (Nelson and Holmes, 2007; Wiley et al., 1981), influenza virus genetic and antigenic surveillance have focused primarily on HA, and official influenza vaccine formulations prescribe the amount of HA (Fiore et al., 2009). Yet, evidence for the effect of HA drift on influenza epidemic dynamics remains conflicting. Theoretical and empirical studies have shown that HA drift between currently circulating viruses and the previous season’s viruses is expected to cause earlier, larger, more severe, or more synchronized epidemics; however, the majority of these studies were limited to the period before the 2009 influenza pandemic (Bedford et al., 2014; Boni et al., 2004; Geoghegan et al., 2018; Greene et al., 2006; Koelle et al., 2006; Koelle et al., 2009; Wolf et al., 2010; Wu et al., 2010). Information on HA evolution has been shown to improve forecasts of seasonal influenza dynamics in Israel (Axelsen et al., 2014) and the United States (Du et al., 2017), but recent research has also found that HA evolution is not predictive of epidemic size in Australia (Lam et al., 2020) or epidemic timing in the United States (Charu et al., 2017). A caveat is that many of these studies used binary indicators to study seasonal antigenic change, defined as seasons in which circulating viruses were antigenically distinct from the vaccine reference strain (Charu et al., 2017; Geoghegan et al., 2018; Greene et al., 2006; Lam et al., 2020; Smith et al., 2004). This may obscure epidemiologically relevant patterns, as positive selection in HA and NA is both episodic and continuous (Bedford et al., 2011; Bedford et al., 2014; Bhatt et al., 2011; Huddleston et al., 2020; Shih et al., 2007; Smith et al., 2004; Suzuki, 2008). Past research has also typically focused on serological and sequence-based measures of viral evolution in isolation, and the relative importance of these two approaches in predicting epidemic dynamics has not been systematically assessed. Further, to the best of our knowledge, the epidemiologic impact of NA evolution has not been explored.

There has been recent recognition of NA’s role in virus-inhibiting antibodies and its potential as a vaccine target (Chen et al., 2018; Eichelberger et al., 2018; Wohlbold et al., 2015). Although antibodies against NA do not prevent influenza infection, NA immunity attenuates the severity of infection by limiting viral replication (Brett and Johansson, 2005; Couch et al., 1974; Johansson et al., 1993; Kilbourne, 1976; Murphy et al., 1972; Schulman et al., 1968), and NA-specific antibody titers are an independent correlate of protection in both field studies and human challenge trials (Couch et al., 2013; Memoli et al., 2016; Monto et al., 2015). Lastly, the phenomenon of interference between influenza A subtypes, modulated by immunity to conserved T-cell epitopes (Grebe et al., 2008; Sridhar et al., 2013; Ulmer et al., 1998), has long been debated (Epstein, 2006; Sonoguchi et al., 1985). Interference effects are most pronounced during pandemic seasons, leading to troughs or even replacement of the resident subtype in some pandemics (Ferguson et al., 2003), but the contribution of heterosubtypic interference to annual dynamics is unclear (Cowling et al., 2014; Gatti et al., 2022; Goldstein et al., 2011; He et al., 2015; Steinhoff et al., 1993).

Here, we link A(H3N2) virus evolutionary dynamics to epidemiologic surveillance data in the United States over the course of 22 influenza seasons prior to the coronavirus disease 2019 (COVID-19) pandemic, considering the full diversity of viruses circulating in this period. We analyze a variety of antigenic and genetic markers of HA and NA evolution against multiple indicators characterizing the epidemiology and disease burden of annual outbreaks. Rather than characterize in situ evolution of A(H3N2) lineages circulating in the U.S., we study the epidemiological impacts of antigenic drift once A(H3N2) variants have arrived on U.S. soil and managed to establish and circulate at relatively high levels. We find a signature of both HA and NA antigenic drift in surveillance data, with a more pronounced relationship in epitope change rather than the serology-based indicator, along with a major effect of subtype interference. Our study has implications for surveillance of evolutionary indicators that are most relevant for population impact and for the prediction of influenza burden on inter-annual timeframes.

Methods

Our study focuses on the impact of A(H3N2) virus evolution on seasonal epidemics from seasons 1997–1998 to 2018–2019 in the U.S.; whenever possible, we make use of regionally disaggregated indicators and analyses. We start by identifying multiple indicators of influenza evolution each season based on changes in HA and NA. Next, we compile influenza virus subtype-specific incidence time series for U.S. Department of Health and Human Services (HHS) regions and estimate multiple indicators characterizing influenza A(H3N2) epidemic dynamics each season, including epidemic burden, severity, type/subtype dominance, timing, and the age distribution of cases. We then assess univariate relationships between national indicators of evolution and regional epidemic characteristics. Lastly, we use multivariable regression models and random forest models to measure the relative importance of viral evolution, heterosubtypic interference, and prior immunity in predicting regional A(H3N2) epidemic dynamics.

Influenza epidemic timing and burden

Epidemiological data processing and analysis were performed using R version 4.3 (R Development Core Team, 2023).

Influenza-like illness and virological surveillance data

We obtained weekly epidemiological and virological data for influenza seasons 1997–1998 to 2018–2019, at the U.S. HHS region level. We defined influenza seasons as calendar week 40 in a given year to calendar week 20 in the following year, with the exception of the 2008–2009 season, which ended in 2009 week 16 due to the emergence of the A(H1N1)pdm09 virus (Goldstein et al., 2011).
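The season definition above can be sketched as a small lookup. This is a minimal illustration, not the authors' code: the function name `influenza_season` and the "YYYY-YYYY" label format are assumptions made for this example.

```python
def influenza_season(year, week):
    """Return a season label for a calendar year/week, or None if off-season.

    Seasons run from week 40 of one year to week 20 of the next; the
    2008-2009 season is truncated at 2009 week 16, as stated in the text.
    """
    if week >= 40:
        return f"{year}-{year + 1}"
    # Assumption: the truncation applies only to the spring of 2009.
    last_week = 16 if year == 2009 else 20
    if week <= last_week:
        return f"{year - 1}-{year}"
    return None  # weeks between the end of one season and week 40
```

For example, 2009 week 16 maps to the 2008-2009 season, whereas 2009 week 17 falls outside any season under this definition.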

We extracted syndromic surveillance data for the 10 HHS regions from the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet) (Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases, 2023a). ILINet consists of approximately 3200 sentinel outpatient healthcare providers throughout the U.S. that report the total number of consultations for any reason and the number of consultations for influenza-like illness (ILI) every week. ILI is defined as fever (temperature of 100 °F [37.8 °C] or greater) and a cough and/or a sore throat. ILI rates are based on the weekly proportion of outpatient consultations for influenza-like illness and are available weighted or unweighted by regional population size. The number of ILI encounters by age group is also provided (0–4, 5–24, 25–64, and ≥65), but these data are not weighted by total encounters or population size.

We obtained data on weekly influenza virus type and subtype circulation from the U.S. CDC’s WHO Collaborating Center for Surveillance, Epidemiology and Control of Influenza (World Health Organization, 2023). Approximately 100 public health laboratories and 300 clinical laboratories located throughout the U.S. report influenza test results to the U.S. CDC, through either the U.S. WHO Collaborating Laboratories Systems or the National Respiratory and Enteric Virus Surveillance System (NREVSS). Clinical laboratories test respiratory specimens for diagnostic purposes whereas public health laboratories primarily test specimens to characterize influenza virus type, subtype, and lineage circulation. Public health laboratories often receive samples that have already tested positive for influenza at a clinical laboratory.

We estimated the weekly number of respiratory samples testing positive for influenza A(H3N2), A(H1N1), A(H1N1)pdm09, or B at the HHS region level. We combined pre-2009 seasonal A(H1N1) and A(H1N1)pdm09 as influenza A(H1N1) and the Victoria and Yamagata lineages of influenza B as influenza B. Beginning in the 2015/2016 season, reports from public health and clinical laboratories are presented separately in the CDC’s weekly influenza updates. From 2015 week 40 onwards, we used clinical laboratory data to estimate the proportion of respiratory samples testing positive for any influenza type/subtype and the proportion of samples testing positive for influenza A or B. We used public health laboratory data to estimate the proportion of influenza A isolates typed as A(H3N2) or A(H1N1) in each week. Untyped influenza A-positive isolates were assigned to either A(H3N2) or A(H1N1) according to their proportions among typed isolates.
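As an illustration of the proportional assignment of untyped influenza A isolates, the following sketch splits one week's untyped positives according to the subtype shares among typed isolates. The function name and the handling of weeks with no typed isolates are assumptions, since the text does not specify them.

```python
def allocate_untyped(h3_typed, h1_typed, untyped_a):
    """Assign untyped influenza A positives to A(H3N2) and A(H1N1)
    in proportion to the typed isolates from the same week."""
    typed_total = h3_typed + h1_typed
    if typed_total == 0:
        # Assumption: with no typed isolates the split is undefined,
        # so untyped positives are left unassigned.
        return float(h3_typed), float(h1_typed)
    h3_share = h3_typed / typed_total
    h3 = h3_typed + untyped_a * h3_share
    h1 = h1_typed + untyped_a * (1 - h3_share)
    return h3, h1


# 30 A(H3N2) and 10 A(H1N1) typed isolates, plus 20 untyped A positives
h3, h1 = allocate_untyped(h3_typed=30, h1_typed=10, untyped_a=20)
```

Here three quarters of the untyped isolates are attributed to A(H3N2), matching its share among typed isolates.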

We defined influenza A subtype dominance in each season based on the proportion of influenza A virus (IAV) positive samples typed as A(H3N2). Specifically, we categorized seasons as A(H3N2) or A(H1N1) dominant when ≥70% of IAV positive samples were typed as one IAV subtype and co-dominant when one IAV subtype comprised 50–69% of IAV positive samples. We applied a strict threshold for subtype dominance because seasons with <70% samples typed as one IAV subtype tended to have greater geographic heterogeneity in circulation, resulting in regions with dominant subtypes that were not nationally dominant.
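The dominance thresholds can be expressed as a short classification rule. This is a sketch under the stated thresholds; the function name and category labels are illustrative only.

```python
def classify_season(h3_positive, h1_positive):
    """Classify a season by the share of typed IAV samples per subtype."""
    total = h3_positive + h1_positive
    if total == 0:
        return "undetermined"  # assumption: no typed samples available
    if h3_positive / total >= 0.70:
        return "A(H3N2) dominant"
    if h1_positive / total >= 0.70:
        return "A(H1N1) dominant"
    # Neither subtype reaches 70%, so the leading subtype holds 50-69%.
    return "co-dominant"
```

A season in which 60% of typed IAV samples are A(H3N2) is therefore co-dominant rather than A(H3N2) dominant.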

For each HHS region, we estimated weekly incidences of influenza A(H3N2), A(H1N1), and B by multiplying the percentage of influenza-like illness among outpatient visits, weighted by regional population size, with the percentage of respiratory samples testing positive for each type/subtype (Figure 1, Figure 1—figure supplement 1). ILI × percent positive (ILI+) is considered a robust estimate of influenza activity and has been used in multiple prior modeling studies (Bedford et al., 2014; Goldstein et al., 2011; Pei et al., 2018). We used linear interpolation to estimate missing values for time spans of up to 4 consecutive weeks.
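A minimal sketch of the ILI+ calculation and gap filling follows. It assumes both inputs are expressed as proportions, represents missing weeks as `None`, and only fills gaps that are bounded by observed values on both sides; these choices are assumptions for illustration.

```python
def ili_plus(ili_prop, positive_prop):
    """Weekly ILI+ proxy: ILI proportion times proportion testing positive."""
    return [None if a is None or b is None else a * b
            for a, b in zip(ili_prop, positive_prop)]


def interpolate_gaps(values, max_gap=4):
    """Linearly interpolate runs of missing weeks no longer than max_gap."""
    out = list(values)
    i, n = 0, len(out)
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        gap = i - start
        # Fill only interior gaps of at most max_gap consecutive weeks.
        if start > 0 and i < n and gap <= max_gap:
            left, right = out[start - 1], out[i]
            for k in range(gap):
                out[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return out
```

Gaps longer than four weeks, and gaps at the start or end of a series, are left missing in this sketch.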

Figure 1. Annual influenza A(H3N2) epidemics in the United States, 1997 – 2019.

(A) Weekly incidence of influenza A(H1N1) (blue), A(H3N2) (red), and B (green) averaged across 10 HHS regions (Region 1: Boston; Region 2: New York City; Region 3: Washington, DC; Region 4: Atlanta; Region 5: Chicago; Region 6: Dallas; Region 7: Kansas City; Region 8: Denver; Region 9: San Francisco; Region 10: Seattle). Incidences are the proportion of influenza-like illness (ILI) visits among all outpatient visits, multiplied by the proportion of respiratory samples testing positive for each influenza type/subtype. Time series are 95% confidence intervals of regional incidence estimates. Vertical dashed lines indicate January 1 of each year. (B) Intensity of weekly influenza A(H3N2) incidence in 10 HHS regions. White tiles indicate weeks when influenza-like-illness data or virological data were not reported. Data for Region 10 are not available in seasons prior to 2009.

Figure 1—figure supplement 1. Annual influenza A(H1N1) and influenza B epidemics in the United States, 1997 - 2019.

Intensity of weekly (A) influenza A(H1N1) and (B) influenza B incidence in 10 HHS regions. Incidences are the proportion of influenza-like illness (ILI) visits among all outpatient visits, multiplied by the proportion of respiratory samples testing positive for each influenza type/subtype. Seasonal and pandemic A(H1N1) are combined as influenza A(H1N1), and the Victoria and Yamagata lineages of influenza B are combined as influenza B. White tiles indicate weeks when either influenza-like-illness cases or virological data were not reported. Data for Region 10 are not available in seasons prior to 2009.
Figure 1—figure supplement 2. Influenza test volume systematically increases in all HHS regions after the 2009 A(H1N1) pandemic.

Each point represents the total number of influenza tests in each HHS region in each season, as reported by the U.S. CDC WHO Collaborating Center for Surveillance, Epidemiology and Control of Influenza. In each boxplot, the whiskers extend to the first and third quartiles of the distribution, and the centre bar represents the median number of specimens. Data for Region 10 are not available in seasons prior to 2009.
Figure 1—figure supplement 3. Pairwise correlations between seasonal influenza A(H3N2), A(H1N1), and B epidemic metrics.

Spearman’s rank correlations among indicators of A(H3N2) epidemic timing, including onset week, peak week, regional variation (s.d.) in onset and peak timing, the number of days from epidemic onset to peak incidence, and seasonal duration, indicators of A(H3N2) epidemic magnitude, including epidemic intensity (i.e. the ‘sharpness’ of the epidemic curve), transmissibility (maximum effective reproduction number, Rt), subtype dominance, epidemic size, and peak incidence. Correlations between the circulation of other influenza types/subtypes and A(H3N2) epidemic burden and timing are also included. The color of each circle indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation). Stars within circles indicate statistical significance (adjusted p<0.05). The Benjamini and Hochberg method was used to adjust p-values for multiple testing.

The emergence of the A(H1N1)pdm09 virus in 2009 altered influenza testing and reporting patterns (Figure 1—figure supplement 2). Specifically, the U.S. CDC and WHO increased laboratory testing capacity and strengthened epidemiological networks, which led to substantial improvements to influenza surveillance that are still in place today (Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases, 2023b). For each HHS region, we adjusted weekly incidences for increases in reporting rates during the post-pandemic period – defined as the weeks after 2010 week 33 – by scaling pre-pandemic incidences by the ratio of mean weekly ILI+ in the post-pandemic period to that of the pre-pandemic period (1997 week 40–2009 week 17). Incidences for HHS Region 10 were not adjusted for pre- and post-pandemic reporting because surveillance data for this region were not available prior to 2009. To account for differences in reporting rates across HHS regions, we next scaled each region’s type/subtype incidences by its mean weekly ILI+ for the entire study period. Scaled incidences were used in all downstream analyses of epidemic burden and timing.
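The two scaling steps can be sketched as follows. The text does not give the exact arithmetic, so this example assumes that pre-pandemic values are multiplied by the post-to-pre ratio of mean weekly ILI+ and that regional scaling divides each series by its own mean; both function names are hypothetical.

```python
def rescale_pre_pandemic(pre, post):
    """Scale pre-pandemic weekly ILI+ by the ratio of mean post-pandemic
    ILI+ to mean pre-pandemic ILI+ within one region."""
    ratio = (sum(post) / len(post)) / (sum(pre) / len(pre))
    return [x * ratio for x in pre]


def scale_by_region_mean(series):
    """Assumed form of regional scaling: divide by the region's mean ILI+."""
    region_mean = sum(series) / len(series)
    return [x / region_mean for x in series]


pre_adjusted = rescale_pre_pandemic(pre=[1.0, 2.0, 3.0], post=[4.0, 4.0, 4.0])
scaled = scale_by_region_mean(pre_adjusted + [4.0, 4.0, 4.0])
```

After the first step the pre-pandemic mean equals the post-pandemic mean, and after the second step every region's series has a mean of 1, which makes magnitudes comparable across regions.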

Characteristics of seasonal influenza epidemics

Epidemic burden

We considered three complementary indicators of epidemic burden, separately for each influenza type/subtype, HHS region, and season. We defined peak incidence as the maximum weekly scaled incidence and epidemic size as the cumulative weekly scaled incidence. We estimated epidemic intensity based on a method previously developed to study variation in the shape (i.e. sharpness) of influenza epidemics across U.S. cities (Dalziel et al., 2018). Epidemic intensity increases when incidence is more concentrated in particular weeks and decreases when incidence is more evenly spread across weeks. Specifically, we defined the incidence distribution $p_{ij}$ as the fraction of influenza incidence in season $j$ that occurred during week $i$ in a given region, and epidemic intensity $v_j$ as the inverse of the Shannon entropy of the weekly incidence distribution:

$$v_j = \left(-\sum_i p_{ij} \ln p_{ij}\right)^{-1} \tag{1}$$

Epidemic intensity is intended to measure the shape and spread of an epidemic, regardless of the actual volume of cases in a given region or season. Following the methodology of Dalziel et al., epidemic intensity values were normalized to fall between 0 and 1 so that epidemic intensity is invariant to differences in reporting rates and/or attack rates across regions and seasons.
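A short sketch of Equation 1 is given below. The raw intensity follows the inverse-entropy definition directly; the min-max rescaling shown for the normalization step is an assumption, as the text only states that values were normalized to fall between 0 and 1.

```python
import math


def epidemic_intensity(weekly_incidence):
    """Inverse Shannon entropy of the weekly incidence distribution."""
    total = sum(weekly_incidence)
    # Weeks with zero incidence contribute nothing to the entropy.
    p = [x / total for x in weekly_incidence if x > 0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    if entropy == 0:
        return math.inf  # all incidence concentrated in a single week
    return 1.0 / entropy


def min_max_normalize(values):
    """Assumed normalization: rescale intensities to the unit interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


flat = epidemic_intensity([5, 5, 5, 5])     # evenly spread epidemic
sharp = epidemic_intensity([1, 17, 1, 1])   # same size, concentrated
```

Both example epidemics have the same cumulative incidence, but the concentrated one has lower entropy and therefore higher intensity.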

Transmission intensity

For each region in each season, we used semi-mechanistic epidemiological models to estimate A(H3N2) virus time-varying (instantaneous) reproduction numbers, Rt, by date of infection (Epidemia R package; Bhatt et al., 2023; Scott et al., 2021). Epidemia implements a Bayesian approach using the probabilistic programming language Stan (Carpenter et al., 2017). Prior to Rt estimation, we computed daily A(H3N2) case counts by disaggregating weekly incidence rates to daily rates (tempdisagg R package; Sax and Steiner, 2013) and rounding the resultant values to integers.

Model specifications

Formally, $R_t$ is modeled as:

$$R_t = \exp(\beta_0 + \epsilon_{t-1}) \tag{2}$$
$$\beta_0 \sim \text{Normal}(\log(R_0), 0.2) \tag{3}$$
$$\epsilon_{t-1} \sim \text{Normal}(0, \sigma_\epsilon) \tag{4}$$
$$\sigma_\epsilon \sim \text{HalfNormal}(0, 0.01) \tag{5}$$

where exp is the exponential function, the mean of the prior for the intercept $\beta_0$ is the natural log of the basic reproduction number $R_0$ of A(H3N2) virus (1.3) (Biggerstaff et al., 2014a), and $\epsilon_{t-1}$ is a daily random walk process. The steps of the daily walks $\epsilon_{t-1}$ are independent and centered around 0 with standard deviation $\sigma_\epsilon$.

Instead of using a renewal process to propagate infections, we modeled new infections $i_t$ as unknown latent parameters $i'_t$, because the additional variance around infections can account for uncertainty in initial growth rates, as well as superspreading events (Bhatt et al., 2023; Scott et al., 2021):

$$i_t \sim \text{Normal}(i'_t, d) \tag{6}$$
$$d \sim \text{Normal}(10, 2) \tag{7}$$

where $d$ is the coefficient of dispersion. This prior assumes that infections have conditional variance around 10 times the conditional mean (Scott et al., 2021).

The generation interval distribution gk is the probability that s days separate the moment of infection in an index case and in an offspring case. For the generation interval, we assumed a discretized Weibull distribution with mean 3.6 days and s.d. 1.6 days (Cowling et al., 2009).

Given the generation interval distribution gk, the number of new infections on day t is given by the convolution function:

$$i_t = R_t \sum_{s<t} i_s g_{t-s} \tag{8}$$

where Rt is the non-negative instantaneous reproduction number. Rt can be expressed as the number of new infections on day t relative to the cumulative sum of individuals infected s days before day t, weighted by the current infectiousness of those individuals (Cori et al., 2013; Gostic et al., 2020):

$$R_t = \frac{i_t}{\sum_{s<t} i_s g_{t-s}} \tag{9}$$
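The relationship between Equations 8 and 9 can be checked numerically with a deterministic toy example. This sketch is not the Bayesian model fitted with Epidemia: it omits the priors, latent infections, and observation model, and the discretization of the Weibull generation interval into daily bins (truncated at 14 days) is an assumption. All function names are illustrative.

```python
import math


def weibull_shape_scale(mean, sd):
    """Solve for the Weibull shape and scale matching a mean and s.d."""
    target_cv2 = (sd / mean) ** 2
    lo, hi = 0.5, 20.0  # the squared CV decreases as the shape increases
    for _ in range(200):
        mid = (lo + hi) / 2
        cv2 = math.gamma(1 + 2 / mid) / math.gamma(1 + 1 / mid) ** 2 - 1
        if cv2 > target_cv2:
            lo = mid
        else:
            hi = mid
    shape = (lo + hi) / 2
    return shape, mean / math.gamma(1 + 1 / shape)


def generation_interval(mean=3.6, sd=1.6, max_days=14):
    """Daily-binned Weibull generation interval; g[0] is a lag of 1 day."""
    shape, scale = weibull_shape_scale(mean, sd)

    def cdf(x):
        return 1 - math.exp(-((x / scale) ** shape))

    g = [cdf(s) - cdf(s - 1) for s in range(1, max_days + 1)]
    total = sum(g)
    return [x / total for x in g]


def infectiousness(infections, g, t):
    """Sum over s < t of i_s * g_{t-s}, the denominator of Equation 9."""
    first = max(0, t - len(g))
    return sum(infections[s] * g[t - s - 1] for s in range(first, t))


def simulate_infections(r, g, seed=10.0, days=30):
    """Propagate infections forward with Equation 8 at a constant R."""
    infections = [seed]
    for t in range(1, days):
        infections.append(r * infectiousness(infections, g, t))
    return infections


def estimate_rt(infections, g):
    """Recover R_t for t >= 1 by applying Equation 9."""
    return [infections[t] / infectiousness(infections, g, t)
            for t in range(1, len(infections))]


g = generation_interval()
rt = estimate_rt(simulate_infections(r=1.3, g=g), g)
```

Because the infections are generated with a constant reproduction number of 1.3, applying Equation 9 to the simulated series returns 1.3 on every day, up to floating-point error.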

The model is initialized with seeded infections $i_{v:0}$, $v<0$, which are treated as unknown parameters (Bhatt et al., 2023; Scott et al., 2021). The prior on $i_{v:0}$ assumes that daily seeds are constant over a seeding period of 6 days:

$$i_{-6:0} \sim \text{Exponential}(\tau^{-1}) \tag{10}$$
$$\tau \sim \text{Exponential}(\lambda_0) \tag{11}$$

where λ0>0 is a rate hyperparameter. λ0 is given an uninformative prior (0.03) so that seeds are primarily determined by initial transmission rates and the chosen start date of the epidemic (Bhatt et al., 2023; Scott et al., 2021).

Daily case counts $Y_t$ are modeled as deriving from past new infections $i_s$, $s<t$, assuming a negative binomial observation model with mean $y_t$ and overdispersion parameter $\phi$ and a constant infection ascertainment rate $\alpha$ of 0.45 (Biggerstaff et al., 2014b). The expected number of observed cases on day $t$ was mapped to past infections by convolving over the time distribution of infection to case observation $\pi_k$:

$$Y_t \sim \text{NegativeBinomial}(y_t, \phi) \tag{12}$$
$$\phi \sim \text{Normal}(10, 5) \tag{13}$$
$$\text{logit}(y_t) = \alpha \left( \sum_{s<t} i_s \pi_{t-s} \right) \tag{14}$$

We estimated πk by summing the incubation period distribution and the reporting delay distribution (i.e. the time period from symptom onset to case observation), assuming a lognormal-distributed incubation period with mean 1.4 days and s.d. 1.5 days (Lessler et al., 2009) and a lognormal-distributed reporting delay with mean 2 days and s.d. 1.5 days (Russell et al., 2018). Thus, the time distribution for infection-to-case-observation was:

$$\pi \sim \text{lognormal}(1.4, 1.5) + \text{lognormal}(2, 1.5) \tag{15}$$

Epidemic trajectories for each region and season were fit independently using Stan’s Hamiltonian Monte Carlo sampler (Hoffman and Gelman, 2014). For each model, we ran four chains, each for 10,000 iterations (including a burn-in period of 2000 iterations that was discarded), producing a total posterior sample size of 32,000. We verified convergence by confirming that all parameters had sufficiently low R-hat values (all R-hat <1.1) and sufficiently large effective sample sizes (>15% of the total sample size).

To generate seasonal indicators of transmission intensity, we extracted posterior draws of daily Rt estimates for each region and season, calculated the median value for each day, and averaged daily median values by epidemic week. For each region and season, we averaged Rt estimates from the weeks spanning epidemic onset to epidemic peak (initial Rt) and averaged the two highest Rt estimates (maximum Rt). Initial Rt and maximum Rt produced qualitatively equivalent results in downstream analyses, so we opted to report results for maximum Rt.
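The summary steps for transmission intensity can be sketched as below. The input layout (one list of posterior draws per day, with weeks formed from consecutive blocks of seven days) and the function names are assumptions made for illustration.

```python
from statistics import mean, median


def weekly_rt(posterior_draws_by_day, days_per_week=7):
    """Take the daily posterior median, then average medians by week."""
    daily_median = [median(draws) for draws in posterior_draws_by_day]
    return [mean(daily_median[i:i + days_per_week])
            for i in range(0, len(daily_median), days_per_week)]


def maximum_rt(weekly):
    """Average of the two highest weekly Rt estimates in a season."""
    return mean(sorted(weekly)[-2:])


def initial_rt(weekly, onset_week, peak_week):
    """Average weekly Rt from epidemic onset to peak, inclusive."""
    return mean(weekly[onset_week:peak_week + 1])


example_weekly = [1.0, 1.2, 1.4, 1.1, 0.9]
max_rt = maximum_rt(example_weekly)
init_rt = initial_rt(example_weekly, onset_week=1, peak_week=2)
```

In this toy series both summaries average the same two weeks, which illustrates why the two indicators can give similar results when transmission peaks between onset and peak incidence.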

Excess pneumonia and influenza deaths attributable to A(H3N2)

To measure the epidemic severity each season, we obtained estimates of seasonal excess mortality attributable to influenza A(H3N2) infections (Hansen et al., 2022). Excess mortality is a measure of the mortality burden of a given pathogen in excess of a seasonally adjusted baseline, obtained by regressing weekly deaths from broad disease categories against indicators of influenza virus circulation. Hansen et al. used pneumonia and influenza (P&I) excess deaths, which are considered the most specific indicator of influenza burden (Simonsen and Viboud, 2012). Deaths with a mention of P&I (ICD-10 codes J00-J18) were aggregated by week and age group (<1, 1–4, 5–49, 50–64, and ≥65) for seasons 1998–1999 to 2017–2018. Age-specific generalized linear models were fit to observed weekly P&I death rates, while accounting for influenza and respiratory syncytial virus (RSV) activity and seasonal and temporal trends. The weekly national number of excess A(H3N2)-associated deaths was estimated by subtracting the baseline death rate expected in the absence of A(H3N2) virus circulation (A(H3N2) model terms set to zero) from the observed P&I death rate. We summed the number of excess A(H3N2) deaths per 100,000 people from October to May to obtain seasonal age-specific estimates.

Epidemic timing

Epidemic onset and peak timing

We estimated the regional onsets of A(H3N2) virus epidemics by detecting breakpoints in A(H3N2) incidence curves at the beginning of each season. The timing of the breakpoint in incidence represents epidemic establishment (i.e. sustained transmission) rather than the timing of influenza introduction or arrival (Charu et al., 2017). We used two methods to estimate epidemic onsets: (1) piecewise regression, which models non-linear relationships with break points by iteratively fitting linear models to each segment (segmented R package; Muggeo, 2008; Muggeo, 2003), and (2) a Bayesian ensemble algorithm (BEAST – a Bayesian estimator of Abrupt change, Seasonal change, and Trend) that explicitly accounts for the time series nature of incidence data and allows for complex, non-linear trajectories interspersed with change points (Rbeast R package) (Zhao et al., 2019). For each region in each season, we limited the time period of breakpoint detection to epidemic week 40 to the first week of maximum incidence and did not estimate epidemic onsets for regions with insufficient signal, which we defined as fewer than three weeks of consecutive incidence and/or greater than 30% of weeks with missing data. We successfully estimated A(H3N2) onset timing for most seasons, except for three A(H1N1) dominant seasons: 2000–2001 (0 regions), 2002–2003 (3 regions), and 2009–2010 (0 regions). Estimates of epidemic onset weeks were similar when using piecewise regression versus the BEAST method, and downstream analyses of correlations between viral fitness indicators and onset timing produced equivalent results. We therefore report results from onsets estimated via piecewise regression. We defined epidemic peak timing as the first week of maximum incidence.
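To illustrate the idea of breakpoint-based onset detection, the sketch below performs an exhaustive search for a single breakpoint, fitting separate least-squares lines on either side and choosing the split with the smallest total squared error. This is a simplified stand-in, not the iterative algorithm of the segmented R package or the BEAST method, and the function names and `min_points` parameter are hypothetical.

```python
def line_sse(x, y):
    """Sum of squared residuals from an ordinary least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


def find_onset(incidence, min_points=3):
    """Return the index of the first week of the post-breakpoint segment.

    The series should already be truncated at the first week of maximum
    incidence, mirroring the search window described in the text.
    """
    weeks = list(range(len(incidence)))
    best_week, best_sse = None, float("inf")
    for b in range(min_points, len(weeks) - min_points + 1):
        sse = (line_sse(weeks[:b], incidence[:b])
               + line_sse(weeks[b:], incidence[b:]))
        if sse < best_sse:
            best_week, best_sse = b, sse
    return best_week


# Six weeks of baseline activity followed by sustained growth
onset = find_onset([1, 1, 1, 1, 1, 1, 4, 6, 8, 10, 12])
```

In the example the detected onset is the week at which incidence departs from the flat baseline and begins to grow steadily.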

Epidemic speed

To measure spatiotemporal synchrony of regional epidemic dynamics, we calculated the standard deviation (s.d.) of regional onset and peak timing in each season (Viboud et al., 2006; Wolf et al., 2010). To measure the speed of viral spread in each region in each season, we measured the number of days spanning onset and peak weeks and seasonal duration (the number of weeks of non-zero incidence). We used two-sided Wilcoxon rank-sum tests to compare the distributions of epidemic timing metrics between A(H3N2) and A(H1N1) dominant seasons.

Wavelet analysis

As a sensitivity analysis, we used wavelets to estimate timing differences between A(H3N2), A(H1N1), and B epidemics in each HHS region. Incidence time series were square root transformed and normalized and then padded with zeros to reduce edge effects. Wavelet coherence was used to determine the degree of synchrony between A(H3N2) versus A(H1N1) incidence and A(H3N2) versus B incidence within each region at multi-year time scales. Statistical significance was assessed using 10,000 Monte Carlo simulations. Coherence measures time- and frequency-specific associations between two wavelet transforms, with high coherence indicating that two non-stationary signals (time series) are associated at a particular time and frequency (Johansson et al., 2009).

Following methodology developed for influenza and other viruses (Grenfell et al., 2001; Johansson et al., 2009; Liebhold et al., 2004; Viboud et al., 2006; Weinberger et al., 2012), we used continuous wavelet transformations (Morlet) to calculate the phase of seasonal A(H3N2), A(H1N1), and B epidemics. We reconstructed weekly time series of phase angles using wavelet reconstruction (Torrence and Compo, 1998; Viboud et al., 2006) and extracted the major one-year seasonal component (period 0.8–1.2 years) of the Morlet decomposition of A(H3N2), A(H1N1), and B time series. To estimate the relative timing of A(H3N2) and A(H1N1) incidence or A(H3N2) and B incidence in each region, phase angle differences were calculated as phase in A(H3N2) minus phase in A(H1N1) (or B), with a positive value indicating that A(H1N1) (or B) lags A(H3N2).
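The sign convention for phase differences can be sketched as follows. Wrapping the difference into the interval [-π, π) is an assumption made here for illustration, and `phase_difference` is a hypothetical helper name; the wavelet decomposition itself is not shown.

```python
import math


def phase_difference(phase_h3, phase_other):
    """Phase of A(H3N2) minus phase of the other (sub)type, wrapped to [-pi, pi).

    A positive value indicates that the other (sub)type lags A(H3N2).
    """
    diff = phase_h3 - phase_other
    return (diff + math.pi) % (2 * math.pi) - math.pi


# Hypothetical phase angles (radians) for one region in one week.
small_lag = phase_difference(0.5, 0.2)
wrapped = phase_difference(3.0, -3.0)
```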

Influenza-like illness age patterns

We calculated the seasonal proportion of ILI encounters in each age group (0–4 years, 5–24 years, 25–64 years, and ≥65 years). Data for narrower age groups are available after 2009, but we chose these four categories to increase the number of seasons in our analysis.

Influenza vaccination coverage and A(H3N2) vaccine effectiveness

Influenza vaccination coverage and effectiveness vary between years and would be expected to affect the population impact of seasonal outbreaks, and in turn our epidemiologic indicators. We obtained seasonal estimates of national vaccination coverage for adults 18–49 years and adults ≥65 years from studies utilizing vaccination questionnaire data collected by the National Health Interview Survey (Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases, 2023b; Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases, 2019; Jang and Kang, 2021; Lu et al., 2019; Lu et al., 2013; National Health Interview Survey, 2008; Ward et al., 2015; Ward et al., 2016). We did not consider the effects of vaccination coverage in children, due to our inability to find published estimates for most influenza seasons in our study.

We obtained seasonal estimates of adjusted A(H3N2) vaccine effectiveness (VE) from 32 observational studies (Belongia et al., 2011; Bridges et al., 2000; Castilla et al., 2016; Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases, 2023b; Centers for Disease Control and Prevention (CDC), 2004; Flannery et al., 2019; Flannery et al., 2020; Flannery et al., 2016; Jackson et al., 2017; Janjua et al., 2012; Kawai et al., 2003; Kissling et al., 2013; Lester et al., 2003; McLean et al., 2014; Ohmit et al., 2014; Pebody et al., 2017; Rolfes et al., 2019; Simpson et al., 2015; Public Health Agency of Canada, 2005; Skowronski et al., 2017a; Skowronski et al., 2016; Skowronski et al., 2017b; Skowronski et al., 2010; Skowronski et al., 2009; Skowronski et al., 2014a; Skowronski et al., 2012; Skowronski et al., 2014b; Skowronski et al., 2022; Skowronski et al., 2007; Treanor et al., 2012; Valenciano et al., 2018; van Doorn et al., 2017; Zimmerman et al., 2016). Most studies had case-control test-negative designs (N=30) and took place in North America (N=25) or Europe (N=6). When possible, we limited VE estimates to those for healthy adults or general populations. When multiple VE studies were available for a given season, we calculated mean VE as the weighted average of m different VE point estimates:

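$$\frac{\sum_{i=1}^{m} \delta VE_i^{-1/2}\, VE_i}{\sum_{i=1}^{m} \delta VE_i^{-1/2}}, \tag{16}$$

where $\delta VE_i$ denotes the width of the 95% confidence interval (CI) for $VE_i$ (Ndifon et al., 2009).

The 95% CI for the weighted mean VE was calculated as:

$$\frac{1}{m}\sqrt{\sum_{i=1}^{m}\left(\delta VE_i\right)^{2}} \tag{17}$$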

Correlations between seasonal epidemic metrics

We used Spearman’s rank correlation coefficients to measure pairwise relationships between A(H3N2), A(H1N1), and B epidemiological indicators. We adjusted p-values for multiple testing using the Benjamini and Hochberg method (Benjamini and Hochberg, 1995).
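For reference, the Benjamini and Hochberg adjustment can be written in a few lines. This is a generic sketch of the standard step-up procedure with toy p-values, not the code used in the study; `benjamini_hochberg` is a hypothetical function name.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values from four hypothetical pairwise correlations.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_p)
```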

Indicators of influenza A(H3N2) evolution

We considered multiple indicators of influenza evolution based on genetic and phenotypic (serologic) data, separately for HA and NA (Figure 2, Table 1). Our choice of evolutionary indicators builds on earlier studies that found hemagglutination inhibition (HI) phenotype or HA sequence data beneficial in forecasting seasonal influenza virus evolution (Huddleston et al., 2020; Luksza and Lässig, 2014; Neher et al., 2016; Neher et al., 2014) or annual epidemic dynamics (Axelsen et al., 2014; Du et al., 2017; Wolf et al., 2010; Table 1).

Figure 2. Antigenic and genetic evolution of seasonal influenza A(H3N2) viruses, 1997 – 2019.

(A–B) Temporal phylogenies of (A) hemagglutinin (H3) and (B) neuraminidase (N2) gene segments. Tip color denotes the Hamming distance from the root of the tree, based on the number of substitutions at epitope sites in H3 (N=129 sites) and N2 (N=223 sites). Black ‘X’ marks indicate the phylogenetic positions of U.S. recommended vaccine strains. (C–D) Seasonal genetic and antigenic distances are the mean distance between A(H3N2) viruses circulating in the current season t and viruses circulating in the prior season (t – 1), measured by (C) five sequence-based metrics (HA epitope (N=129), HA receptor binding site (RBS) (N=7), HA stalk footprint (N=34), NA epitope (N=223 or N=53)) and (D) hemagglutination inhibition (HI) titer measurements. (E) The Shannon diversity of H3 and N2 local branching index (LBI) values in each season. Vertical bars in (C), (D), and (E) are 95% confidence intervals of seasonal estimates from five bootstrapped phylogenies.

Figure 2—source data 1. A/H3 sequence counts in five subsampled datasets.
We downloaded all H3 sequences and associated metadata from the GISAID EpiFlu database and focused our analysis on complete H3 sequences that were sampled between January 1, 1997, and October 1, 2019. To account for variation in sequence availability across global regions, we subsampled the selected sequences five times to representative sets of no more than 50 viruses per month, with preferential sampling for North America. Each month up to 25 viruses were selected from North America (when available) and up to 25 viruses were selected from nine other global regions (when available), with even sampling across the other global regions (China, Southeast Asia, West Asia, Japan and Korea, South Asia, Oceania, Europe, South America, and Africa).
Figure 2—source data 2. A/N2 sequence counts in five subsampled datasets.
We downloaded all N2 sequences and associated metadata from the GISAID EpiFlu database and focused our analysis on complete N2 sequences that were sampled between January 1, 1997, and October 1, 2019. To account for variation in sequence availability across global regions, we subsampled the selected sequences five times to representative sets of no more than 50 viruses per month, with preferential sampling for North America. Each month up to 25 viruses were selected from North America (when available) and up to 25 viruses were selected from nine other global regions (when available), with even sampling across the other global regions (China, Southeast Asia, West Asia, Japan and Korea, South Asia, Oceania, Europe, South America, and Africa).

Figure 2—figure supplement 1. The number of A/H3 sequences in five subsampled datasets in each month and in each influenza season.

In each figure, the five subsampled datasets are plotted individually but individual time series are difficult to discern due to minor differences in sequence counts across the datasets. (A) The number of sequences in subsampled datasets in each month collected in North America (purple) versus nine other world regions combined (dark green). (B) The total number of sequences in subsampled datasets collected in each month in all world regions combined. (C) The number of sequences in subsampled datasets in each season collected in North America (purple) versus nine other world regions combined (dark green). (D) The total number of sequences in subsampled datasets collected in each season in all world regions combined.
Figure 2—figure supplement 2. The number of A/N2 sequences in five subsampled datasets in each month and in each influenza season.

In each figure, the five subsampled datasets are plotted individually but individual time series are difficult to discern due to minor differences in sequence counts across the datasets. (A) The number of sequences in subsampled datasets in each month collected in North America (purple) versus nine other world regions combined (dark green). (B) The total number of sequences in subsampled datasets collected in each month in all world regions combined. (C) The number of sequences in subsampled datasets in each season collected in North America (purple) versus nine other world regions combined (dark green). (D) The total number of sequences in subsampled datasets in each season in all world regions combined.
Figure 2—figure supplement 3. Comparison of seasonal antigenic drift measured by substitutions at H3 epitope sites and HI log2 titer measurements, from seasons 1997–1998 to 2018–2019.

Spearman’s rank correlations between H3 epitope distance and HI log2 titer distance at (A) one-season lags and (B) two-season lags. Correlation coefficients and associated p-values are shown in the top right section of each plot. Seasonal antigenic distance is the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1 year, one-season lags) or two prior seasons ago (t – 2 years, two-season lags). Seasonal distances are scaled because H3 epitope distance and HI log2 titer distance use different units of measurement. Point labels indicate the current influenza season, and point color denotes the relative timing of influenza seasons, with earlier seasons shaded dark purple (e.g. 1997–1998) and later seasons shaded light yellow (e.g. 2018–2019). H3 epitope distance and HI log2 titer distance at two-season lags capture expected ‘jumps’ in antigenic drift during key seasons previously associated with major antigenic transitions (Smith et al., 2004), such as the SY97 cluster seasons (1997–1998, 1998–1999, 1999–2000), the FU02 cluster season (2003–2004), and the CA04 cluster season (2004–2005).
Figure 2—figure supplement 4. Pairwise correlations between H3 and N2 evolutionary indicators (one-season lags).

Spearman’s rank correlations between seasonal measures of H3 and N2 evolution, including H3 RBS distance, H3 epitope distance, H3 non-epitope distance, H3 stalk footprint distance, HI log2 titer distance, N2 epitope distance based on 223 or 53 epitope sites, N2 non-epitope distance, and the standard deviation (s.d.) and Shannon diversity of H3 and N2 local branching index (LBI) values in the current season t. Seasonal distances were estimated as the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1). The color of each circle indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation). Stars within circles indicate statistical significance (adjusted p<0.05). The Benjamini and Hochberg method was used to adjust p-values for multiple testing.
Figure 2—figure supplement 5. Pairwise correlations between H3 and N2 evolutionary indicators (two-season lags).

We measured Spearman’s rank correlations between seasonal measures of H3 and N2 evolution, including H3 RBS distance, H3 epitope distance, H3 non-epitope distance, H3 stalk footprint distance, HI log2 titer distance, N2 epitope distance based on 223 or 53 epitope sites, N2 non-epitope distance, and the standard deviation (s.d.) and Shannon diversity of H3 and N2 local branching index (LBI) values in the current season t. Seasonal distances were estimated as the mean distance between viruses circulating in the current season t and viruses circulating two prior seasons ago (t – 2). The color of each circle indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation). Stars within circles indicate statistical significance (adjusted p<0.05). The Benjamini and Hochberg method was used to adjust p-values for multiple testing.
Figure 2—figure supplement 6. Pairwise correlations between H3 and N2 evolutionary indicators (one- and two-season lags).

We measured Spearman’s rank correlations between seasonal measures of H3 and N2 evolution, including H3 RBS distance, H3 epitope distance, H3 non-epitope distance, H3 stalk footprint distance, HI log2 titer distance, N2 epitope distance based on 223 or 53 epitope sites, N2 non-epitope distance, and the standard deviation (s.d.) and Shannon diversity of H3 and N2 local branching index (LBI) values in the current season t. Seasonal distances were estimated as the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). The color of each circle indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation). Stars within circles indicate statistical significance (adjusted p<0.05). The Benjamini and Hochberg method was used to adjust p-values for multiple testing.
Figure 2—figure supplement 7. Comparison of seasonal antigenic drift measured by substitutions at H3 and N2 epitope sites, from seasons 1997–1998 to 2018–2019.

Spearman’s rank correlations between H3 epitope distance and N2 epitope distance at (A) one-season lags and (B) two-season lags. Correlation coefficients and associated p-values are shown in the top right section of each plot. Seasonal epitope distance is the mean distance between viruses circulating in the current season t and viruses circulating in the prior season t – 1 (one-season lag) or two prior seasons ago t – 2 (two-season lag). Point labels indicate the current influenza season, and point color denotes the relative timing of influenza seasons, with earlier seasons shaded dark purple (e.g. 1997–1998) and later seasons shaded light yellow (e.g. 2018–2019). H3 epitope distance at two-season lags and N2 epitope distance at one-season lags capture expected ‘jumps’ in antigenic drift during key seasons previously associated with major antigenic transitions (Smith et al., 2004), such as the SY97 cluster seasons (1997–1998, 1998–1999, 1999–2000), the FU02 cluster season (2003–2004), and the CA04 cluster season (2004–2005).

Table 1. Evolutionary indicators of seasonal viral fitness.

Evolutionary indicators are labeled by the influenza gene for which data are available (hemagglutinin, HA or neuraminidase, NA), the type of data they are based on, and the component of influenza fitness they represent.

| Evolutionary indicator | Influenza gene | Data type | Fitness category | Citations |
|---|---|---|---|---|
| HI log2 titer distance from the prior season | HA | Hemagglutination inhibition measurements using ferret sera | Antigenic drift | Huddleston et al., 2020; Neher et al., 2016 |
| Epitope distance from the prior season | HA and NA | Sequences | Antigenic drift | Bhatt et al., 2011; Bush et al., 1999; Krammer, 2023; Webster and Laver, 1980; Wiley et al., 1981; Wilson and Cox, 1990; Wolf et al., 2010 |
| Receptor binding site distance from the prior season | HA | Sequences | Antigenic drift | Koel et al., 2013 |
| Mutational load (non-epitope distance from the prior season) | HA and NA | Sequences | Functional constraint | Luksza and Lässig, 2014 |
| Stalk ‘footprint’ distance from the prior season | HA | Sequences | Negative control | Kirkpatrick et al., 2018 |
| Local branching index | HA and NA | Sequences | Rate of recent phylogenetic branching | Huddleston et al., 2020; Neher et al., 2014 |

Table format is adapted from Huddleston et al., 2020.

HA and NA sequence data

We downloaded all H3 sequences and associated metadata from the Global Initiative on Sharing All Influenza Data (GISAID) EpiFlu database (Shu and McCauley, 2017). We focused our analysis on complete H3 sequences that were sampled between January 1, 1997, and October 1, 2019. We prioritized viruses with corresponding HI titer measurements provided by the WHO Global Influenza Surveillance and Response System (GISRS) Collaborating Centers and excluded all egg-passaged viruses and sequences with ambiguous year, month, and day annotations. To account for variation in sequence availability across global regions, we subsampled the selected sequences five times to representative sets of no more than 50 viruses per month, with preferential sampling for North America. Each month up to 25 viruses were selected from North America (when available) and up to 25 viruses were selected from nine other global regions (when available), with even sampling across the other global regions (Africa, Europe, China, South Asia, Japan and Korea, Oceania, South America, Southeast Asia, and West Asia; Figure 2—figure supplement 1). To ensure proper topology early in the phylogeny, we included reference strains that had been collected no earlier than 5 years prior to January 1, 1997. The resultant sets of H3 sequences included 10,060–10,062 sequences spanning December 25, 1995 – October 1, 2019 (Figure 2—source data 1). Although our subsampling scheme entailed selecting up to 50 viruses per month, with up to 25 viruses per month collected in North America, each replicate dataset was comprised of approximately 40% North American sequences across all seasons combined (Figure 2—source data 1), due to low sequence volumes in the early years of our study.

As with the H3 analysis, we downloaded all N2 sequences and associated metadata from GISAID and selected complete N2 sequences that were sampled between January 1, 1997, and October 1, 2019. We excluded all sequences with ambiguous year, month, and day annotations, forced the inclusion of reference strains collected no earlier than 5 years prior to January 1, 1997, and compiled five replicate subsampled datasets with preferential sampling for North America (8815–8816 sequences; June 8, 1995 – October 1, 2019; Figure 2—figure supplement 2, Figure 2—source data 2). Similar to the H3 sequence datasets, each replicate dataset was comprised of approximately 40% North American sequences across all seasons combined (Figure 2—source data 2).

HA serologic data

Hemagglutination inhibition (HI) measurements from ferret sera were provided by WHO GISRS Collaborating Centers in London, Melbourne, Atlanta, and Tokyo. We converted raw two-fold dilution measurements to log2 titer drops normalized by the corresponding log2 autologous measurements (Huddleston et al., 2020; Neher et al., 2016).
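The titer normalization can be illustrated with a short sketch: the log2 titer drop is the log2 autologous titer (serum against its own reference virus) minus the log2 titer against the test virus. The function name and titer values are hypothetical.

```python
import math


def log2_titer_drop(titer, autologous_titer):
    """log2 titer drop of a test virus relative to the autologous measurement."""
    return math.log2(autologous_titer) - math.log2(titer)


# Hypothetical example: autologous titer 1280, test-virus titer 160,
# i.e. three two-fold dilutions lower.
example_drop = log2_titer_drop(titer=160, autologous_titer=1280)
```

Larger drops indicate weaker cross-reactivity between the reference serum and the test virus.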

Although a phenotypic assay exists for NA, NA inhibiting antibody titers are not routinely measured for influenza surveillance. Therefore, we could not include a phenotypic marker of NA evolution in our study.

Phylogenetic inference

For each set of H3 and N2 sequences, we aligned sequences with the augur align command (Hadfield et al., 2018) and MAFFT v7.407 (Katoh et al., 2002). We inferred initial phylogenies with IQ-TREE v1.6.10 (Nguyen et al., 2015). To reconstruct time-resolved phylogenies, we applied TreeTime v0.5.6 (Sagulenko et al., 2018) with the augur refine command (Huddleston et al., 2021).

Viral fitness metrics

We defined the following fitness metrics for each influenza season:

Antigenic drift

We estimated antigenic drift of each H3 sequence using either serologic or genetic data.

Historically, HI serological assays were considered the ‘gold standard’ for measuring immune cross-reactivity between viruses, yet measurements are available for only a subset of viruses. To overcome this limitation, we used a computational approach that maps HI titer measurements onto the HA phylogenetic tree to infer antigenic phenotypes (Huddleston et al., 2020; Neher et al., 2016). Importantly, this model infers the antigenicity of virus isolates that lack HI titer measurements, which comprise the majority of HA sequences in GISAID. To estimate antigenic drift with hemagglutination inhibition (HI) titer data, hereon HI log2 titer distance, we applied the phylogenetic tree model from Neher et al., 2016 to the H3 phylogeny and the available HI data for its sequences. The tree model estimates the antigenic drift per branch in units of log2 titer change.

Our sequence-based measures of drift counted substitutions at putative epitope sites in the globular head domains of HA and NA, identified through monoclonal antibody escape or protein crystal structure: 129 sites in HA epitope regions A to E (Bush et al., 1999; Webster and Laver, 1980; Wiley et al., 1981; Wilson and Cox, 1990; Wolf et al., 2006) (HA epitope distance), 7 sites adjacent to the HA receptor binding site (RBS) (Koel et al., 2013) (HA RBS distance), and 223 or 53 sites in NA epitope regions A to C (Bhatt et al., 2011; Krammer, 2023) (NA epitope distance). We also counted the number of substitutions at epitope sites in the HA stalk domain (HA stalk footprint distance) (Kirkpatrick et al., 2018). Although the majority of the antibody-mediated response to HA is directed to the immunodominant HA head, antibodies towards the highly conserved immunosubdominant stalk domain of HA are widely prevalent in older individuals, albeit at low levels (Krammer, 2019; Margine et al., 2013; Nachbagauer et al., 2016). We considered stalk footprint distance to be our ‘control’ metric for drift, given that the HA stalk evolves at a significantly slower rate than the HA head (Kirkpatrick et al., 2018).
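A site-restricted substitution count of this kind can be sketched as a Hamming distance over a chosen set of positions. The sequences and site indices below are toy values (0-based positions), not the actual epitope definitions, and `site_distance` is a hypothetical helper name.

```python
def site_distance(seq_a, seq_b, sites):
    """Count positions in `sites` at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for pos in sites if seq_a[pos] != seq_b[pos])


# Toy aligned amino acid sequences and hypothetical site sets.
prior_strain = "MKTIIALS"
current_strain = "MNTIVALS"
toy_epitope_sites = [1, 3, 4]
toy_non_epitope_sites = [0, 2, 5, 6, 7]

epitope_distance = site_distance(prior_strain, current_strain, toy_epitope_sites)
non_epitope_distance = site_distance(prior_strain, current_strain, toy_non_epitope_sites)
```

The same function yields an epitope, RBS, stalk footprint, or non-epitope distance depending only on which site list is supplied.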

Mutational load

To estimate mutational load for each H3 and N2 sequence, an inverse proxy of viral fitness (Huddleston et al., 2020; Luksza and Lässig, 2014), we implemented metrics that count substitutions at putative non-epitope sites in HA (N=200) and NA (N=246), hereon HA non-epitope distance and NA non-epitope distance. Mutational load produces higher values for viruses that are less fit compared to previously circulating strains.

Clade growth

The local branching index (LBI) measures the relative fitness of co-circulating clades, with high LBI values indicating recent rapid phylogenetic branching (Huddleston et al., 2020; Neher et al., 2014). To calculate LBI for each H3 and N2 sequence, we applied the LBI heuristic algorithm as originally described by Neher et al., 2014 to H3 and N2 phylogenetic trees, respectively. We set the neighborhood parameter τ to 0.4 and only considered viruses sampled between the current season t and the previous season t – 1 as contributing to recent clade growth in the current season t.

Variation in the phylogenetic branching rates of co-circulating A(H3N2) clades may affect the magnitude, intensity, onset, or duration of seasonal epidemics. For example, we expected that seasons dominated by a single variant with high fitness might have different epidemiological dynamics than seasons with multiple co-circulating clades with varying seeding and establishment times. We measured the diversity of clade growth rates of viruses circulating in each season by measuring the standard deviation (s.d.) and Shannon diversity of LBI values in each season. Given that LBI measures relative fitness among co-circulating clades, we did not compare overall clade growth rates (e.g. mean LBI) across seasons.

Each season’s distribution of LBI values is right-skewed and does not follow a normal distribution. We therefore bootstrapped the LBI values of each season in each replicate dataset 1000 times (1000 samples with replacement) and estimated the seasonal standard deviation of LBI from resamples, rather than directly from observed LBI values. We also estimated the seasonal standard deviation of LBI from log-transformed LBI values, which produced qualitatively equivalent results to bootstrapped LBI values in downstream analyses.
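A minimal sketch of the bootstrap step is shown below. Summarizing the resampled standard deviations by their mean is an assumption, as are the function name, the fixed seed, and the toy LBI values.

```python
import random
import statistics


def bootstrap_sd(values, n_boot=1000, seed=0):
    """Mean of the sample s.d. across bootstrap resamples (with replacement)."""
    rng = random.Random(seed)
    resampled_sds = []
    for _ in range(n_boot):
        resample = [rng.choice(values) for _ in values]
        resampled_sds.append(statistics.stdev(resample))
    return statistics.mean(resampled_sds)


# Toy right-skewed LBI values for one season (hypothetical).
toy_lbi = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.9, 1.8, 3.5]
seasonal_lbi_sd = bootstrap_sd(toy_lbi)
```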

As an alternative measure of seasonal LBI diversity, we binned raw H3 and N2 LBI values into categories based on their integer values (e.g. an LBI value of 0.5 is assigned to the (0,1] bin) and estimated the exponential of the Shannon entropy (Shannon diversity) of LBI categories (Hill, 1973; Shannon, 1948). The Shannon diversity of LBI considers both the richness and relative abundance of viral clades with different growth rates in each season and is calculated as follows:

$${}^{1}D = \exp\left(-\sum_{i=1}^{R} p_i \ln p_i\right), \tag{18}$$

where ${}^{q}D$ is the effective number of categories, or Hill number of order q (here, clades with different growth rates), with q defining the sensitivity of the true diversity to rare versus abundant categories (Hill, 1973); exp is the exponential function, $p_i$ is the proportion of LBI values belonging to the ith category, and R is richness (the total number of categories). Shannon diversity ${}^{1}D$ (q=1) estimates the effective number of categories in an assemblage using the geometric mean of their proportional abundances (Hill, 1973).
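The Shannon diversity calculation can be sketched directly from this definition. Binning by the upper integer edge (so that 0.5 falls in the (0,1] bin) follows the example in the text; the function names and toy values are hypothetical, and rarefaction is not included.

```python
import math
from collections import Counter


def lbi_bin(value):
    """Upper edge of the integer bin (k-1, k] containing the value; 0.5 -> 1."""
    return math.ceil(value)


def shannon_diversity(lbi_values):
    """Exponential of the Shannon entropy of binned LBI values (Hill number, q=1)."""
    counts = Counter(lbi_bin(v) for v in lbi_values)
    n = len(lbi_values)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return math.exp(entropy)


# Two equally abundant bins give an effective number of categories of 2.
two_bins = shannon_diversity([0.5, 0.7, 1.2, 1.5])
one_bin = shannon_diversity([0.2, 0.4, 0.9])
```

When all categories are equally abundant, the Shannon diversity equals the richness R; uneven abundances pull it below R.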

Because ecological diversity metrics are sensitive to sampling effort, we rarefied H3 and N2 sequence datasets prior to estimating Shannon diversity so that seasons had the same sample size. For each season in each replicate dataset, we constructed rarefaction and extrapolation curves of LBI Shannon diversity and extracted the Shannon diversity estimate of the sample size that was twice the size of the reference sample size (the smallest number of sequences obtained in any season during the study) (iNEXT R package; Chao et al., 2014). Chao et al. found that their diversity estimators work well for rarefaction and short-range extrapolation when the extrapolated sample size is up to twice the reference sample size. For H3, we estimated seasonal diversity using replicate datasets subsampled to 360 sequences/season; for N2, datasets were subsampled to 230 sequences/season.

Antigenic and genetic distance relative to prior seasons

For each replicate dataset, we estimated national-level genetic and antigenic distances between influenza viruses circulating in consecutive seasons by calculating the mean distance between viruses circulating in the current season t and viruses circulating during the prior season (t – 1 year; one-season lag) or two prior seasons ago (t – 2 years; two-season lag). We then averaged seasonal mean distances across the five replicate datasets. Seasonal genetic and antigenic distances are greater when currently circulating strains are more antigenically distinct from previously circulating strains. We used Spearman’s rank correlation coefficients to measure pairwise relationships between scaled H3 and N2 evolutionary indicators. We adjusted p-values for multiple testing using the Benjamini and Hochberg method (Benjamini and Hochberg, 1995).
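The two averaging steps can be sketched as follows, assuming the seasonal distance is the mean over all current-by-prior virus pairs and using a plain Hamming distance on toy sequences. The function names and data are hypothetical; in practice the pairwise distance would be one of the site-specific or HI-based metrics defined above.

```python
def hamming(seq_a, seq_b):
    """Number of differing positions between two aligned sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def seasonal_mean_distance(current, prior, distance=hamming):
    """Mean distance over all pairs of current-season and prior-season viruses."""
    total = sum(distance(a, b) for a in current for b in prior)
    return total / (len(current) * len(prior))


def average_over_replicates(replicates):
    """Average the seasonal mean distance across replicate datasets.

    `replicates` is a list of (current_viruses, prior_viruses) tuples.
    """
    per_replicate = [seasonal_mean_distance(cur, pri) for cur, pri in replicates]
    return sum(per_replicate) / len(per_replicate)


# Two toy replicate datasets for one season (t) and its prior season (t - 1).
toy_replicates = [
    (["AAB", "ABB"], ["AAA"]),         # pair distances 1 and 2, mean 1.5
    (["AAB", "AAB"], ["AAA", "AAB"]),  # pair distances 1, 0, 1, 0, mean 0.5
]
lag1_distance = average_over_replicates(toy_replicates)
```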

Univariate relationships between viral fitness, (sub)type interference and A(H3N2) epidemic impact

We measured univariate associations between national indicators of A(H3N2) viral fitness and regional A(H3N2) epidemic parameters: peak incidence, epidemic size, transmissibility (effective Rt), epidemic intensity, subtype dominance, excess P&I deaths, onset timing, peak timing, spatiotemporal synchrony, the number of weeks from onset to peak, and seasonal duration. All predictors were centered and scaled prior to measuring correlations or fitting regression models.

We first measured Spearman’s rank correlation coefficients between pairs of scaled evolutionary indicators and epidemic metrics using 1000 bootstrap replicates of the original dataset (1000 samples with replacement). Next, we fit regression models with different distribution families (Gaussian or Gamma) and link functions (identity, log, or inverse) to observed data and used Bayesian information criterion (BIC) to select the best fit model, with lower BIC values indicating a better fit to the data. For subtype dominance, epidemic intensity, and age-specific proportions of ILI cases, we fit Beta regression models with logit links. Beta regression models are appropriate when the variable of interest is continuous and restricted to the interval (0, 1) (Ferrari and Cribari-Neto, 2004). For each epidemic metric, we fit the best-performing regression model to 1000 bootstrap replicates of the original dataset.

To measure the effects of (sub)type interference on A(H3N2) epidemics, the same approach was applied to measure the univariate relationships between A(H1N1) or B epidemic size and A(H3N2) peak incidence, epidemic size, effective Rt, epidemic intensity, and excess mortality. As a sensitivity analysis, we evaluated univariate relationships between A(H3N2) epidemic metrics and A(H1N1) epidemic size during pre-2009 seasons (seasonal A(H1N1) viruses) and post-2009 seasons (A(H1N1)pdm09 viruses) separately.

Selecting relevant predictors of A(H3N2) epidemic impact

Next, we explored multivariable approaches that would shed light on the potential mechanisms driving annual epidemic impact. Considering that we had many predictors and relatively few observations (22 seasons × 9–10 HHS regions), several covariates were collinear, and our goal was explicative rather than predictive, we settled on methods that tend to select few covariates: conditional inference random forests and LASSO (least absolute shrinkage and selection operator) regression models. All predictors were centered and scaled prior to fitting models.

Preprocessing of predictor data

The starting set of candidate predictors included all viral fitness metrics: genetic and antigenic distances between current and previously circulating viruses and the standard deviation and Shannon diversity of H3 and N2 LBI values in the current season. To account for potential type or subtype interference, we included A(H1N1) or A(H1N1)pdm09 epidemic size and B epidemic size in the current and prior season and the dominant IAV subtype in the prior season (Lee et al., 2018). We included A(H3N2) epidemic size in the prior season as a proxy for prior natural immunity to A(H3N2). To account for vaccine-induced immunity, we considered four categories of predictors and included estimates for the current and prior seasons: national vaccination coverage among adults (18–49 years coverage × ≥65 years coverage), adjusted A(H3N2) vaccine effectiveness (VE), a combined metric of vaccination coverage and A(H3N2) VE (18–49 years coverage × ≥65 years coverage × VE), and H3 and N2 epitope distances between naturally circulating A(H3N2) viruses and the U.S. A(H3N2) vaccine strain in each season. We could not include a predictor for vaccination coverage in children or consider clade-specific VE estimates because these data were not available for most seasons in our study.

Random forest and LASSO regression models are not sensitive to redundant (highly collinear) features (Kuhn and Johnson, 2019), but we chose to downsize the original set of candidate predictors to minimize the impact of multicollinearity on variable importance scores. For both types of models, if there are highly collinear variables that are useful for predicting the target variable, the predictor chosen by the model becomes a random selection (Kuhn and Johnson, 2019). In random forest models, these highly collinear variables will be used in all splits across the forest of decision trees, and this redundancy dilutes variable importance scores (Kuhn and Johnson, 2019). We first confirmed that none of the candidate predictors had zero variance or near-zero variance. Because seasonal lags of each viral fitness metric are highly collinear, we included only one lag of each evolutionary predictor, with a preference for the lag that had the strongest univariate correlations with various epidemic metrics. We checked for multicollinearity among the remaining predictors by examining Spearman’s rank correlation coefficients between all pairs of predictors. If a particular pair of predictors was highly correlated (Spearman’s ρ>0.8), we retained only one predictor from that pair, with a preference for the predictor that had the strongest univariate correlations with various epidemic metrics. Lastly, we performed QR decomposition of the matrix of remaining predictors to determine if the matrix is full rank and identify sets of columns involved in linear dependencies. This step did not eliminate any additional predictors, given that we had already removed pairs of highly collinear variables based on Spearman correlation coefficients.
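The pairwise filtering step can be sketched with a small Spearman implementation. The `priority` scores are a hypothetical stand-in for the strength of each predictor's univariate correlations with epidemic metrics, the greedy ordering is an assumption, and the predictor values are toy data; predictors are assumed to have non-zero variance, as confirmed in the text.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def drop_collinear(predictors, priority, threshold=0.8):
    """Keep higher-priority predictors; drop any whose rho with a kept one exceeds threshold."""
    selected = []
    for name in sorted(predictors, key=lambda n: priority[n], reverse=True):
        if all(spearman(predictors[name], predictors[kept]) <= threshold for kept in selected):
            selected.append(name)
    return selected


toy_predictors = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 5, 8, 10], "c": [5, 1, 4, 2, 3]}
toy_priority = {"a": 0.9, "b": 0.5, "c": 0.4}  # hypothetical scores
retained = drop_collinear(toy_predictors, toy_priority)
```

Here predictor "b" is a monotone function of "a" (ρ = 1) and is dropped, whereas "c" is only weakly related to "a" and is retained.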

After these preprocessing steps, our final set of model predictors comprised 21 variables. These included 8 viral evolutionary indicators: H3 epitope distance (t – 2), HI log2 titer distance (t – 2), H3 RBS distance (t – 2), H3 non-epitope distance (t – 2), N2 epitope distance (t – 1), N2 non-epitope distance (t – 1), and H3 and N2 LBI diversity (s.d.) in the current season; 6 proxies for type/subtype interference and prior immunity: A(H1N1) and B epidemic sizes in the current and prior season, A(H3N2) epidemic size in the prior season, and the dominant IAV subtype in the prior season; and 7 proxies for vaccine-induced immunity: A(H3N2) VE in the current and prior season, H3 and N2 epitope distances between circulating viruses and the vaccine strain in each season, the combined metric of adult vaccination coverage × VE in the current and prior season, and adult vaccination coverage in the prior season.

Random forest models

We used conditional inference random forest models to select relevant predictors of A(H3N2) epidemic size, peak incidence, transmissibility (effective Rt), epidemic intensity, and subtype dominance (party and caret R packages; Hothorn et al., 2006; Kuhn, 2008; Strobl et al., 2008; Strobl et al., 2007). We did not conduct variable selection analysis for excess A(H3N2) mortality due to data limitations (one national estimate per season). Metrics related to epidemic timing were also excluded from this analysis because we found weak or non-statistically significant associations with most viral fitness metrics in univariate analyses. Lastly, we could not separate our analysis into pre- and post-2009 pandemic periods due to small sample sizes.

We created each forest by generating 3000 regression trees. To determine the best performing model for each epidemic metric, we used leave-one-season-out (jackknife) cross-validation to train models and measure model performance, wherein each ‘assessment’ set is one season of data predicted by the model, and the corresponding ‘analysis’ set contains the remaining seasons. This approach is roughly analogous to splitting data into training and test sets, but all seasons are used at some point in the training of each model (Kuhn and Johnson, 2019). Due to the small size of our dataset (~20 seasons), evaluating the predictive accuracy of random forest models on a quasi-independent test set of 2–3 seasons produced unstable estimates. Instead of testing model performance on an independent test set, we generated 10 bootstrap resamples (‘repeats’) of each analysis set (‘fold’) and averaged the predictions of models trained on resamples (Kuhn and Johnson, 2013; Kuhn and Johnson, 2019). For each epidemic metric, we report the mean root mean squared error (RMSE) and R2 of predictions from the best tuned model. We used permutation importance (N=50 permutations) to estimate the relative importance of each predictor in determining target outcomes. Permutation importance is the decrease in prediction accuracy when a single feature (predictor) is randomly permuted, with larger values indicating more important variables. Because many features were collinear, we used conditional permutation importance to compute feature importance scores, rather than the standard marginal procedure (Altmann et al., 2010; Debeer and Strobl, 2020; Strobl et al., 2008; Strobl et al., 2007).
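The resampling scheme and the permutation importance definition above can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: a one-predictor least-squares fit stands in for the conditional inference forest, the importance function uses the standard marginal permutation scheme rather than the conditional scheme used in the analysis, and all function names are hypothetical.

```python
import random
from statistics import mean


def leave_one_season_out(seasons):
    """Yield (analysis, assessment) index lists; each season is held out once."""
    for held_out in sorted(set(seasons)):
        analysis = [i for i, s in enumerate(seasons) if s != held_out]
        assessment = [i for i, s in enumerate(seasons) if s == held_out]
        yield analysis, assessment


def fit_simple_lm(x, y):
    """Least-squares line for one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def loso_rmse(seasons, x, y):
    """RMSE of held-out predictions pooled across all held-out seasons."""
    sq_errors = []
    for analysis, assessment in leave_one_season_out(seasons):
        b0, b1 = fit_simple_lm([x[i] for i in analysis], [y[i] for i in analysis])
        sq_errors += [(y[i] - (b0 + b1 * x[i])) ** 2 for i in assessment]
    return mean(sq_errors) ** 0.5


def permutation_importance(predict, rows, y, col, n_perm=50, seed=1):
    """Mean increase in mean squared error when column `col` is shuffled.

    Marginal scheme (an assumption of this sketch): the column is permuted
    across all rows, ignoring its correlation with other predictors.
    """
    rng = random.Random(seed)

    def mse(data):
        return mean((t - predict(r)) ** 2 for r, t in zip(data, y))

    baseline = mse(rows)
    increases = []
    for _ in range(n_perm):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        increases.append(mse(permuted) - baseline)
    return mean(increases)
```

A predictor that the model never uses has an importance of zero under this definition, because shuffling it leaves every prediction unchanged.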

Regression models

As an alternative method for variable selection, we performed LASSO regression on the same cross-validated dataset and report the mean RMSE and R2 of predictions from the best tuned model (glmnet and caret R packages; Friedman et al., 2010; Kuhn, 2008). Unlike random forest models, this modeling approach assumes linear relationships between predictors and the target variable. LASSO models (L1 penalty) are more restrictive than ridge models (L2 penalty) and elastic net models (combination of L1 and L2 penalties) and will arbitrarily retain one variable from a set of collinear variables.
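The behavior of the L1 penalty with collinear predictors can be illustrated with a minimal coordinate-descent sketch. It assumes centered predictors and a centered response, and minimizes (1/2n)·RSS + λ·Σ|β|, a common formulation of the LASSO objective. The function names are hypothetical, and this is not the glmnet implementation.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(columns, y, lam, n_iter=200):
    """Cyclic coordinate descent for the LASSO.

    columns: list of predictor columns, each centered (mean zero)
    y:       centered response
    lam:     L1 penalty weight
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residuals given the current coefficients
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of column j with the residual that excludes its own fit
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            scale = sum(c * c for c in col) / n
            new = soft_threshold(rho, lam) / scale
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta
```

With two identical columns, the first column visited absorbs the whole signal and the second stays at (numerically) zero. Reversing the column order reverses which one is retained, which is the sense in which the choice among collinear variables is arbitrary.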

To further reduce the set of predictors for each epidemic metric, we performed model selection with linear regression models that considered all combinations of the top 10 ranked predictors from conditional inference random forest models. Candidate models could include up to three predictors, and models were compared using BIC. We did not include HHS region or season as fixed or random effects because these variables either did not improve model fit (region) or caused overfitting and convergence issues (season).
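The exhaustive search over small linear models can be sketched as below, assuming ordinary least squares with an intercept and the Gaussian form of BIC up to an additive constant, n·ln(RSS/n) + k·ln(n). Function and variable names are hypothetical. The sketch assumes candidate predictors within a model are not perfectly collinear and that no model fits the data exactly (RSS > 0).

```python
from itertools import combinations
from math import log


def solve(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def ols_rss(cols, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    n = len(y)
    design = [[1.0] * n] + [list(c) for c in cols]  # stored column-wise
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(p * t for p, t in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * design[j][i] for j in range(len(design))) for i in range(n)]
    return sum((t - f) ** 2 for t, f in zip(y, fitted))


def bic(rss, n, k):
    """Gaussian BIC up to a constant shared by all candidate models."""
    return n * log(rss / n) + k * log(n)


def best_subset(predictors, y, max_size=3):
    """Return (BIC, names) for the lowest-BIC model with <= max_size predictors."""
    n = len(y)
    results = []
    for size in range(1, max_size + 1):
        for names in combinations(sorted(predictors), size):
            rss = ols_rss([predictors[m] for m in names], y)
            results.append((bic(rss, n, size + 1), names))  # +1 for the intercept
    return min(results)
```

With 10 candidate predictors and at most three per model, this search evaluates 10 + 45 + 120 = 175 models.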

Results

Indicators of influenza A(H3N2) evolution

We characterized seasonal patterns of genetic and antigenic evolution among A(H3N2) viruses circulating during 1997–2019, using HA and NA sequence data shared via the GISAID EpiFlu database (Shu and McCauley, 2017) and ferret hemagglutination inhibition (HI) assay data shared by WHO GISRS Collaborating Centers in London, Melbourne, Atlanta, and Tokyo. Time-resolved phylogenies of HA and NA genes are shown in Figure 2. Although our study is U.S.-focused, we used a global dataset because U.S.-collected sequences and HI titers were sometimes sparse during the earlier seasons of the study (Figure 2—figure supplements 1 and 2).

To measure antigenic distances between consecutive seasons, we calculated mean genetic distances at epitope sites or mean log2 titer distances from HI titer measurements (Figure 2) between viruses circulating in the current season t and viruses circulating in the prior season (t – 1; one-season lag) or two seasons prior (t – 2; two-season lag). These time windows generated seasonal antigenic distances consistent with empirical and theoretical studies characterizing transitions between H3 or N2 antigenic clusters (Bedford et al., 2014; Ferguson et al., 2003; Huddleston et al., 2020; Neher et al., 2014; Sandbulte et al., 2011; Smith et al., 2004). Specifically, H3 epitope distance and HI log2 titer distance at two-season lags, and N2 epitope distance at one-season lags, captured expected ‘jumps’ in antigenic drift during key seasons previously associated with major antigenic transitions (Smith et al., 2004), such as the seasons dominated by A/Sydney/5/1997-like strains (SY97) (1997–1998, 1998–1999, 1999–2000) and the 2003–2004 season dominated by A/Fujian/411/2002-like strains (FU02) (Figure 2—figure supplements 3 and 7). Prior studies explicitly linking antigenic drift to epidemic size or severity also support a 1-year (Bedford et al., 2014) or 2-year time window of drift (Koelle et al., 2006; Wolf et al., 2010). Given that protective immunity to homologous strains wanes after 1–4 years (He et al., 2015; Wraith et al., 2022), we would also expect these timeframes to return the greatest signal in epidemiological surveillance data.
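A lagged seasonal distance of this kind can be sketched as a mean pairwise Hamming distance restricted to a set of sites. This is an illustration only: the unweighted all-pairs mean, the zero-based site indices, and the function names are assumptions, and the toy sequences are not real HA or NA data.

```python
from itertools import product
from statistics import mean


def hamming_at_sites(seq_a, seq_b, sites):
    """Count differences between two aligned sequences at the given sites."""
    return sum(seq_a[i] != seq_b[i] for i in sites)


def seasonal_distance(current, previous, sites):
    """Mean distance over all (current-season, earlier-season) sequence pairs."""
    return mean(hamming_at_sites(a, b, sites) for a, b in product(current, previous))


def lagged_distances(by_season, sites, lag):
    """Distance between each season t and season t - lag.

    by_season: dict mapping season label -> list of aligned sequences;
               labels are assumed to sort chronologically
    sites:     hypothetical zero-based positions of epitope sites
    lag:       1 for a one-season lag, 2 for a two-season lag
    """
    seasons = sorted(by_season)
    return {
        seasons[i]: seasonal_distance(
            by_season[seasons[i]], by_season[seasons[i - lag]], sites
        )
        for i in range(lag, len(seasons))
    }
```

Calling `lagged_distances(by_season, sites, lag=2)` gives the two-season-lag series, so the first two seasons in the dataset have no value at that lag.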

We measured pairwise correlations between seasonal indicators of HA and NA evolution to assess their degree of concordance. As expected, we found moderate-to-strong associations between HA epitope distance and HI log2 titer distance (Figure 2—figure supplements 36) and HA RBS distance and HI log2 titer distance (Figure 2—figure supplements 46). Consistent with prior serological studies (Eichelberger et al., 2018; Kilbourne et al., 1990; Schulman and Kilbourne, 1969), epitope distances in HA and NA were not correlated at one-season lags (Spearman’s ρ=0.25, p=0.3) or two-season lags (ρ=0.15, p=0.5) (Figure 2—figure supplements 47). The seasonal diversity of HA and NA LBI values was negatively correlated with NA epitope distance (Figure 2—figure supplements 5 and 6), with high antigenic novelty coinciding with low genealogical diversity. This association suggests that selective sweeps tend to follow the emergence of drifted variants with high fitness, resulting in seasons dominated by a single A(H3N2) variant rather than multiple co-circulating clades.

Associations between A(H3N2) evolution and epidemic dynamics

We explored relationships between viral evolution and variation in A(H3N2) epidemic dynamics from seasons 1997–1998 to 2018–2019, excluding the 2009 A(H1N1) pandemic, using syndromic and virologic surveillance data collected by the U.S. CDC and WHO. We estimated weekly incidences of influenza A(H3N2), A(H1N1), and B in 10 HHS regions by multiplying the influenza-like illness (ILI) rate – the proportion of outpatient encounters for ILI, weighted by regional population size – by the regional proportion of respiratory samples testing positive for each influenza type/subtype (percent positive). Figure 1 and Figure 1—figure supplement 1 show variability in the timing and intensity of annual epidemics of A(H3N2), A(H1N1), and B viruses. Based on these incidence time series, we measured indicators of epidemic burden, intensity, severity, subtype dominance, timing, and age-specific patterns during each non-pandemic season (Table 2) and assessed their univariate relationships with each indicator of HA and NA evolution. Figure 1—figure supplement 3 shows pairwise correlations between epidemic metrics.
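The incidence proxy described above is a week-by-week product, sketched here under the assumption that both inputs are expressed as proportions between 0 and 1. The function name and the example values are hypothetical.

```python
def weekly_incidence(ili_rate, positive_fraction):
    """Multiply the weekly ILI rate by the fraction of samples testing positive.

    ili_rate:          weekly proportion of outpatient encounters for ILI
    positive_fraction: weekly proportion of respiratory samples positive for
                       one influenza type/subtype (same weeks, same region)
    """
    if len(ili_rate) != len(positive_fraction):
        raise ValueError("both series must cover the same weeks")
    return [rate * frac for rate, frac in zip(ili_rate, positive_fraction)]
```

For example, a week with an ILI rate of 0.04 and 25% of samples positive for A(H3N2) yields an A(H3N2) incidence proxy of 0.01.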

Table 2. Seasonal metrics of A(H3N2) epidemic dynamics.

Epidemic metrics are defined and labeled by which outcome category they represent.

| Epidemic outcome | Definition | Outcome category | Citations |
| --- | --- | --- | --- |
| Epidemic size | Cumulative weekly incidence | Burden | |
| Peak incidence | Maximum weekly incidence | Burden | |
| Maximum time-varying effective reproduction number, Rt | The number of secondary cases arising from a symptomatic index case, assuming conditions remain the same | Transmissibility | Scott et al., 2021; Bhatt et al., 2023 |
| Epidemic intensity | Inverse Shannon entropy of the weekly incidence distribution (i.e. the spread of incidence across the season) | Sharpness of the epidemic curve | Dalziel et al., 2018 |
| Subtype dominance | The proportion of influenza positive samples typed as A(H3N2) | Viral activity | |
| Excess pneumonia and influenza mortality attributable to A(H3N2) virus | Mortality burden in excess of a seasonally adjusted baseline | Severity | Hansen et al., 2022; Simonsen and Viboud, 2012 |
| Onset week | Winter changepoint in incidence | Timing | Charu et al., 2017 |
| Peak week | First week of maximum incidence | Timing | |
| Spatiotemporal synchrony | Regional variation (s.d.) in onset or peak timing | Speed | Viboud et al., 2006 |
| Onset to peak | Number of days between onset week and peak week | Speed | |
| Seasonal duration | Number of weeks with non-zero incidence | Speed | |
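Several of the simpler metrics in Table 2 follow directly from a weekly incidence series, as in this minimal sketch. The function names are hypothetical. Epidemic intensity is shown as the plain reciprocal of Shannon entropy, and any rescaling applied in the analysis is not reproduced here.

```python
from math import log


def epidemic_size(incidence):
    """Cumulative weekly incidence over the season."""
    return sum(incidence)


def peak_incidence(incidence):
    """Maximum weekly incidence."""
    return max(incidence)


def peak_week(incidence):
    """Zero-based index of the first week with maximum incidence."""
    return incidence.index(max(incidence))


def seasonal_duration(incidence):
    """Number of weeks with non-zero incidence."""
    return sum(1 for value in incidence if value > 0)


def epidemic_intensity(incidence):
    """Inverse Shannon entropy of the weekly incidence distribution.

    Incidence concentrated in few weeks gives low entropy and therefore
    high intensity; a season confined to a single week returns infinity.
    """
    total = sum(incidence)
    shares = [value / total for value in incidence if value > 0]
    entropy = -sum(p * log(p) for p in shares)
    return float("inf") if entropy == 0 else 1 / entropy
```

Under this definition, a sharply peaked season such as `[1, 6, 1]` has a higher intensity than a flatter season such as `[2, 4, 2]` with the same epidemic size.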

Two sequence-based measures derived from broad sets of epitope sites exhibited stronger relationships with seasonal A(H3N2) epidemic burden and transmissibility than the serology-based measure, HI log2 titer distance. Both H3 epitope distance (t – 2) and N2 epitope distance (t – 1) correlated with increased epidemic size (H3: adjusted R2=0.37, p=0.03; N2: R2=0.26, p=0.08) and peak incidence (H3: R2=0.4, p=0.02; N2: R2=0.33, p=0.04) and higher effective reproduction numbers, Rt (H3: R2=0.37, p=0.06; N2: R2=0.33, p=0.03; regression results: Figure 3; Spearman correlations: Figure 3—figure supplement 1). Excess pneumonia and influenza mortality attributable to A(H3N2) increased with H3 epitope distance, though this relationship was not statistically significant (Figure 3—figure supplement 2). HI log2 titer distance (t – 2) exhibited positive but non-significant associations with different measures of epidemic impact (Figure 3, Figure 3—figure supplement 1). Effective Rt and epidemic intensity were greater in seasons with low LBI diversity (Figure 3—figure supplement 1; Figure 3—figure supplement 3 and Figure 3—figure supplement 4). The remaining indicators of viral evolution, including H3 and N2 non-epitope distance (mutational load), H3 RBS distance, and H3 stalk footprint distance, had weaker, non-statistically significant correlations with epidemic impact (Figure 3—figure supplement 1).
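The regression fits reported here are summarized over 1000 bootstrap resamples (Figure 3). A minimal sketch of that idea for the simplest case, a linear model with one predictor, is given below. The percentile interval, the resampling of (x, y) pairs, and the function names are assumptions of this sketch, and the GLM variants used for other metrics are not shown.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares line; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def bootstrap_fits(x, y, n_boot=1000, seed=42):
    """Refit the line on n_boot resamples of (x, y) pairs drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    fits = []
    while len(fits) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined when all resampled x values are equal
        fits.append(fit_line(xs, [y[i] for i in idx]))
    return fits


def summarize_at(fits, x0):
    """Mean prediction and a 95% percentile interval at x0."""
    preds = sorted(b0 + b1 * x0 for b0, b1 in fits)
    lower = preds[int(0.025 * len(preds))]
    upper = preds[int(0.975 * len(preds)) - 1]
    return mean(preds), lower, upper
```

Evaluating `summarize_at` over a grid of x values traces out a mean regression line and a shaded interval band of the kind shown in the figures.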

Figure 3. Influenza A(H3N2) antigenic drift correlates with larger, more intense annual epidemics.

A(H3N2) epidemic size, peak incidence, transmissibility (effective reproduction number, Rt), and epidemic intensity increase with antigenic drift, measured by (A) hemagglutinin (H3) epitope distance, (B) neuraminidase (N2) epitope distance, and (C) hemagglutination inhibition (HI) log2 titer distance. Seasonal antigenic drift is the mean titer distance or epitope distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). Distances are scaled to aid in direct comparison of evolutionary indicators. Point color indicates the dominant influenza A virus (IAV) subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean A(H3N2) epidemic metric values were fit as a function of antigenic or genetic distance using LMs (epidemic size, peak incidence), Gaussian GLMs (effective Rt: inverse link), or Beta GLMs (epidemic intensity: logit link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top left section of each plot.

Figure 3—figure supplement 1. Univariate correlations between influenza A(H3N2) evolutionary indicators and epidemic impact.

Mean Spearman’s rank correlation coefficients, 95% confidence intervals of correlation coefficients, and corresponding p-values of bootstrapped (N=1000) evolutionary indicators (rows) and epidemic metrics (columns). Point color indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation), and stars indicate statistical significance (* p<0.05, ** p<0.01, *** p<0.001). Abbreviations: t – 1, one-season lag; t – 2, two-season lag; RBS, receptor binding site; HI, hemagglutination inhibition; s.d., standard deviation; LBI, local branching index.
Figure 3—figure supplement 2. Excess influenza A(H3N2) mortality increases with H3 and N2 epitope distance, but correlations are not statistically significant.

Relationships between seasonal excess influenza A(H3N2) mortality and epitope distance are organized by gene segment and age group: (A) H3 epitope distance and all age groups, (B) H3 epitope distance and individuals aged ≥65 years, (C) N2 epitope distance and all age groups, and (D) N2 epitope distance and individuals aged ≥65 years. The number of excess influenza deaths attributable to A(H3N2) (per 100,000 people) was estimated from a seasonal regression model fit to weekly pneumonia and influenza-coded deaths in the United States (Hansen et al., 2022). Seasonal epitope distance is the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). Distances are scaled to aid in direct comparison of evolutionary indicators. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of excess mortality model estimates. Seasonal national excess mortality estimates were fit as a function of H3 or N2 epitope distance using Gaussian GLMs (log link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top left section of each plot.
Figure 3—figure supplement 3. Low seasonal diversity in the clade growth rates of circulating A(H3N2) viruses, as measured by the standard deviation of local branching index values, correlates with higher transmissibility and greater epidemic intensity.

A(H3N2) effective Rt and epidemic intensity negatively correlate with the seasonal diversity of local branching index (LBI) values among circulating A(H3N2) lineages in the current season, measured by the standard deviation (s.d.) of (A) H3 LBI values, and (B) N2 LBI values. LBI values are scaled to aid in direct comparisons of H3 and N2 s.d. LBI values. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean A(H3N2) epidemic metric values were fit as a function of H3 or N2 LBI diversity using Gaussian GLMs (effective Rt: inverse link) or Beta GLMs (epidemic intensity: logit link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top right section of each plot.
Figure 3—figure supplement 4. Low seasonal diversity in the clade growth rates of circulating A(H3N2) viruses, as measured by the Shannon diversity of local branching index values, correlates with higher transmissibility and greater epidemic intensity.

A(H3N2) effective Rt and epidemic intensity negatively correlate with the seasonal diversity of local branching index (LBI) values among circulating A(H3N2) lineages in the current season, measured by the Shannon diversity of (A) H3 LBI values, and (B) N2 LBI values. LBI values are scaled to aid in direct comparisons of H3 and N2 LBI diversity values. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean A(H3N2) epidemic metric values were fit as a function of H3 or N2 LBI diversity using Gaussian GLMs (effective Rt: inverse link) or Beta GLMs (epidemic intensity: logit link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top right section of each plot.

We explored whether evolutionary changes in A(H3N2) may predispose this subtype to dominate influenza virus circulation in a given season. A(H3N2) subtype dominance – the proportion of influenza positive samples typed as A(H3N2) – increased with H3 epitope distance (t – 2) (R2=0.32, p=0.05) and N2 epitope distance (t – 1) (R2=0.34, p=0.03) (regression results: Figure 4; Spearman correlations: Figure 3—figure supplement 1). Figure 4 illustrates this relationship at the regional level across two seasons in which A(H3N2) was nationally dominant, but where antigenic change differed. In 2003–2004, we observed widespread dominance of A(H3N2) viruses after the emergence of the novel antigenic cluster, FU02 (A/Fujian/411/2002-like strains). In contrast, there was substantial regional heterogeneity in subtype circulation during 2007–2008, a season in which A(H3N2) viruses were antigenically similar to those circulating in the previous season. Patterns in type/subtype circulation across all influenza seasons in our study period are shown in Figure 4—figure supplement 1. As observed for the 2003–2004 season, widespread A(H3N2) dominance tended to coincide with major antigenic transitions (e.g. A/Sydney/5/1997 (SY97) seasons, 1997–1998 to 1999–2000; A/California/7/2004 (CA04) season, 2004–2005), although this was not universally the case (e.g. A/Perth/16/2009 (PE09) season, 2010–2011).

Figure 4. The proportion of influenza positive samples typed as A(H3N2) increases with antigenic drift.

(A–B) Seasonal A(H3N2) subtype dominance increases with (A) hemagglutinin (H3) and (B) neuraminidase (N2) epitope distance. Seasonal epitope distance is the mean epitope distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). Distances were scaled to aid in direct comparison of evolutionary indicators. Point color indicates the dominant influenza A virus (IAV) subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean A(H3N2) dominance was fit as a function of H3 or N2 epitope distance using Beta GLMs with 1000 bootstrap resamples. In (A) and (B), the dashed black line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the bottom right section of each plot. (C–D) Regional patterns of influenza type and subtype incidence during two seasons when A(H3N2) was nationally dominant. Pie charts represent the proportion of influenza positive samples typed as A(H3N2) (red), A(H1N1) (blue), or B (green) in each HHS region. The sizes of regional pie charts are proportional to the total number of influenza positive samples. Data for Region 10 (purple) are not available for seasons prior to 2009. (C) Widespread A(H3N2) dominance during 2003–2004 after the emergence of a novel antigenic cluster, FU02 (A/Fujian/411/2002-like strains). (D) Spatial heterogeneity in subtype circulation during 2007–2008, a season with low A(H3N2) antigenic novelty relative to the prior season.

Figure 4—figure supplement 1. Regional patterns of influenza type and subtype circulation during seasons 1997–1998 to 2018–2019.

Pie charts represent the proportion of influenza positive samples that were typed as A(H3N2), A(H1N1) or A(H1N1)pdm09, and B in each HHS region. Data for Region 10 (purple) are not available for seasons prior to 2009.

After the 2009 A(H1N1) pandemic, A(H3N2) dominant seasons still occurred more frequently than A(H1N1) dominant seasons, but the mean fraction of influenza positive cases typed as A(H3N2) in A(H3N2) dominant seasons was lower than in A(H3N2) dominant seasons prior to 2009 (Figure 4—figure supplement 1). Antigenically distinct 3c.2a and 3c.3a viruses began to co-circulate in 2012 and underwent further diversification during subsequent seasons in our study (https://nextstrain.org/seasonal-flu/h3n2/ha/12y@2024-05-13; Dhanasekaran et al., 2022; Huddleston et al., 2020; Yan et al., 2019). The decline in A(H3N2) predominance during the post-2009 period may be linked to the genetic and antigenic diversification of A(H3N2) viruses, wherein multiple lineages with similar fitness co-circulated in each season.

Next, we tested for associations between A(H3N2) evolution and various measures of epidemic timing (Table 2). Seasonal duration increased with H3 and N2 LBI diversity in the current season (H3: LBI Shannon diversity, R2=0.37, p=0.04; LBI s.d., R2=0.3, p=0.09; N2: LBI Shannon diversity, R2=0.38, p=0.04; LBI s.d., R2=0.36, p=0.06; regression results: Figure 5; Spearman correlations: Figure 5—figure supplement 1), while the number of days from epidemic onset to peak incidence shortened with increasing N2 epitope distance (t – 1) (R2=0.38, p=0.03; Figure 5—figure supplement 2). Onset and peak timing tended to be earlier in seasons with increased H3 and N2 antigenic novelty, but correlations between antigenic change and epidemic timing were not statistically significant (Figure 5—figure supplement 3). A(H3N2) evolution did not correlate with the degree of spatiotemporal synchrony across HHS regions (Figure 5—figure supplement 1).

Figure 5. Influenza A(H3N2) seasonal duration increases with the diversity of hemagglutinin (H3) and neuraminidase (N2) clade growth rates in each season.

Seasonal diversity of clade growth rates is measured as the (A) Shannon diversity or (B) standard deviation (s.d.) of H3 and N2 local branching index (LBI) values of viruses circulating in each season. LBI values are scaled to aid in direct comparisons of different LBI diversity metrics. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Mean seasonal duration was fit as a function of H3 or N2 LBI diversity using Gaussian GLMs (inverse link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top left section of each plot.

Figure 5—figure supplement 1. Univariate correlations between influenza A(H3N2) evolutionary indicators and epidemic timing.

Mean Spearman’s rank correlation coefficients, 95% confidence intervals of correlation coefficients, and corresponding p-values of bootstrapped (N=1000) evolutionary indicators (columns) and epidemic timing metrics (rows). Epidemic timing metrics are the week of epidemic onset, regional variation (s.d.) in onset timing, the week of epidemic peak, regional variation (s.d.) in peak timing, the number of days between epidemic onset and peak, and seasonal duration. Color indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation), and stars indicate statistical significance (* p<0.05, ** p<0.01, *** p<0.001). Abbreviations: t – 1, one-season lag; t – 2, two-season lag; RBS, receptor binding site; HI, hemagglutination inhibition; s.d., standard deviation; LBI, local branching index.
Figure 5—figure supplement 2. Epidemic speed increases with N2 antigenic drift.

N2 epitope distance significantly correlates with fewer days from epidemic onset to peak (A), while the relationship between H3 epitope distance and epidemic speed is weaker (B). Seasonal epitope distance is the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). Distances are scaled to aid in direct comparison of evolutionary indicators. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). The seasonal mean number of days from onset to peak was fit as a function of H3 or N2 epitope distance using Gamma GLMs (inverse link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top right section of each plot.
Figure 5—figure supplement 3. Influenza A(H3N2) epidemic onsets and peaks are earlier in seasons with high antigenic novelty, but correlations are not statistically significant.

(A) Epidemic onsets are earlier in seasons with increased H3 epitope distance (t – 2), but the correlation is not statistically significant. (B) Epidemic peaks are earlier in seasons with increased H3 epitope distance (t – 2) and N2 epitope distance (t – 1), but correlations are not statistically significant. Seasonal epitope distance is the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). Distances are scaled to aid in direct comparison of evolutionary indicators. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean epidemic onsets and peaks were fit as a function of H3 or N2 epitope distance using Gaussian GLMs (inverse link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top left section of each plot.

Lastly, we considered the effects of antigenic change on the age distribution of outpatient ILI cases, with the expectation that the proportion of cases in children would decrease in seasons with greater antigenic novelty, due to drifted variants’ increased ability to infect more immunologically experienced adults (Bedford et al., 2015; Gostic et al., 2019). Consistent with this hypothesis, N2 epitope distance was negatively correlated with the fraction of cases in children aged <5 years (one-season lag: R2=0.29, p=0.1; two-season lag: R2=0.59, p=0.003) and individuals aged 5–24 years (one-season lag: R2=0.38, p=0.04; two-season lag: R2=0.17, p=0.18) and positively correlated with the fraction of cases in adults aged 25–64 years (one-season lag: R2=0.36, p=0.05; two-season lag: R2=0.49, p=0.01) and ≥65 years (one-season lag: R2=0.39, p=0.01; two-season lag: R2=0.33, p=0.05) (regression results: Figure 6; Spearman correlations: Figure 6—figure supplement 1). Antigenic drift in H3 exhibited similar associations with age patterns of ILI cases, but correlations were weaker and non-significant (Figure 6, Figure 6—figure supplement 1).

Figure 6. The proportion of outpatient influenza-like illness (ILI) cases in adults increases with neuraminidase (N2) antigenic novelty.

N2 epitope distance, but not H3 epitope distance, significantly correlates with the age distribution of outpatient ILI cases. Seasonal epitope distance is the mean distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1) or two prior seasons ago (t – 2). Distances are scaled to aid in direct comparison of evolutionary indicators. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of regional age distribution estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). The seasonal mean fraction of cases in each age group was fit as a function of H3 or N2 epitope distance using Beta GLMs (logit link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top right section of each plot.

Figure 6—figure supplement 1. Univariate correlations between A(H3N2) antigenic change and the age distribution of outpatient influenza-like illness (ILI) cases.

Mean Spearman’s rank correlation coefficients, 95% confidence intervals of correlation coefficients, and corresponding p-values of bootstrapped (N=1000) evolutionary indicators (rows) and the proportion of ILI cases in individuals aged <5 years, 5–24 years, 25–64 years, and ≥65 years (columns). Color indicates the strength and direction of the association, from dark red (strong positive correlation) to dark blue (strong negative correlation), and stars indicate statistical significance (* p<0.05, ** p<0.01, *** p<0.001). Abbreviations: t – 1, one-season lag; t – 2, two-season lag; RBS, receptor binding site; HI, hemagglutination inhibition.

Effects of heterosubtypic viral interference on A(H3N2) epidemic burden and timing

We investigated the effects of influenza type/subtype interference – proxied by influenza A(H1N1) and B epidemic size – on A(H3N2) incidence during annual outbreaks. Across the entire study period, we observed moderate-to-strong, non-linear relationships between A(H1N1) epidemic size and A(H3N2) epidemic size (R2=0.65, p=0.01; Figure 7), peak incidence (R2=0.66, p=0.02; Figure 7), and excess mortality (R2=0.57, p=0.01; Figure 7—figure supplement 1), wherein A(H3N2) epidemic burden and excess mortality decreased as A(H1N1) incidence increased. A(H1N1) epidemic size was also significantly correlated with A(H3N2) transmissibility (effective Rt), exhibiting a negative, approximately linear relationship (R2=0.46, p=0.01; Figure 7). A(H3N2) epidemic intensity was negatively associated with A(H1N1) epidemic size, but this relationship was not statistically significant (R2=0.21, p=0.15; Figure 7). Influenza B epidemic size was not significantly correlated with any A(H3N2) epidemic metrics (Figure 7, Figure 7—figure supplement 1).
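
The non-linear fits reported here come from GLMs whose link functions are listed in the Figure 7 legend. As a reminder of what each link implies, the mean response can be written out for a single predictor. The notation is ours and assumed for illustration: x is A(H1N1) or B epidemic size, μ is the expected A(H3N2) epidemic metric, and β0 and β1 are regression coefficients.

```latex
\begin{align}
\text{Inverse link (epidemic size, peak incidence):}\quad
  & \frac{1}{\mu} = \beta_0 + \beta_1 x
  \;\Rightarrow\; \mu = \frac{1}{\beta_0 + \beta_1 x} \\
\text{Log link (effective } R_t\text{):}\quad
  & \log \mu = \beta_0 + \beta_1 x
  \;\Rightarrow\; \mu = e^{\beta_0 + \beta_1 x} \\
\text{Logit link (epidemic intensity):}\quad
  & \log \frac{\mu}{1 - \mu} = \beta_0 + \beta_1 x
  \;\Rightarrow\; \mu = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\end{align}
```

Under the inverse link with β1 > 0 (and β0 + β1 x > 0), μ falls hyperbolically as x grows, which is one way a steep initial decline followed by a plateau can arise.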

Figure 7. The effects of influenza A(H1N1) and B epidemic size on A(H3N2) epidemic burden.

(A) Influenza A(H1N1) epidemic size negatively correlates with A(H3N2) epidemic size, peak incidence, transmissibility (effective reproduction number, Rt), and epidemic intensity. (B) Influenza B epidemic size does not significantly correlate with A(H3N2) epidemic metrics. Point color indicates the dominant influenza A virus (IAV) subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical and horizontal bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean A(H3N2) epidemic metrics were fit as a function of mean A(H1N1) or B epidemic size using Gaussian GLMs (epidemic size and peak incidence: inverse link; effective Rt: log link) or Beta GLMs (epidemic intensity: logit link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top left section of each plot.

Figure 7—figure supplement 1. National excess influenza A(H3N2) mortality decreases with A(H1N1) epidemic size but not B epidemic size.

Relationships between seasonal excess influenza A(H3N2) mortality and the circulation of A(H1N1) or B viruses are organized by influenza type/subtype and age group: (A) A(H1N1) epidemic size and all age groups, (B) A(H1N1) epidemic size and individuals aged ≥65 years, (C) B epidemic size and all age groups, and (D) B epidemic size and individuals aged ≥65 years. Excess influenza deaths attributable to A(H3N2) (per 100,000 people) were estimated from a seasonal regression model fit to weekly pneumonia and influenza-coded deaths. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical bars are 95% confidence intervals of excess mortality model estimates. Seasonal national excess mortality estimates were fit as a function of A(H1N1) or B epidemic size using Gaussian GLMs (log link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top section of each plot.

Figure 7—figure supplement 2. The effect of influenza A(H1N1) epidemic size on A(H3N2) epidemic burden during the entire study period, pre-2009 seasons, and post-2009 seasons.

Influenza A(H1N1) epidemic size negatively correlates with A(H3N2) epidemic size, peak incidence, transmissibility (maximum effective reproduction number, Rt), and epidemic intensity during (A) the entire study period (1997 – 2019), (B) pre-2009 seasons, and (C) post-2009 seasons. Point color indicates the dominant influenza A virus (IAV) subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), and vertical and horizontal bars are 95% confidence intervals of regional estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). Seasonal mean A(H3N2) epidemic metrics were fit as a function of A(H1N1) epidemic size using Gaussian GLMs (epidemic size, peak incidence: inverse link; effective Rt: log link) or Beta GLMs (epidemic intensity: logit link) with 1000 bootstrap resamples. In each plot, the black dashed line represents the mean regression fit, and the gray shaded band shows the 95% confidence interval, based on 1000 bootstrap resamples. The R2 and associated p-value from the mean regression fit are in the top left section of each plot.

Figure 7—figure supplement 3. Wavelet analysis of influenza A(H3N2), A(H1N1), and B epidemic timing.

(A) A(H3N2) incidence precedes A(H1N1) incidence in most seasons. Although A(H1N1) incidence sometimes leads or is in phase with A(H3N2) incidence (negative or zero phase lags), the direction of seasonal phase lags is not clearly associated with A(H1N1) epidemic size. (B) A(H3N2) incidence leads B incidence (positive phase lag) during every season, irrespective of B epidemic size. Point color indicates the dominant influenza A subtype based on CDC influenza season summary reports (red: A(H3N2), blue: A(H1N1), purple: A(H1N1)pdm09, orange: A(H3N2)/A(H1N1)pdm09 co-dominant), vertical bars are 95% confidence intervals (CIs) of regional phase lag estimates, and horizontal bars are 95% CIs of regional epidemic size estimates (pre-2009 seasons: 9 regions; post-2009 seasons: 10 regions). To estimate the relative timing of influenza subtype incidences, phase angle differences were calculated as phase in A(H3N2) minus phase in A(H1N1) (or B), with a positive value indicating that A(H1N1) (or B) incidence lags A(H3N2) incidence. To calculate seasonal phase lags, we averaged pairwise phase angle differences from epidemic week 40 to epidemic week 20. Seasonal phase lags were fit as a function of A(H1N1) or B epidemic size using LMs with 1000 bootstrap resamples. In each plot, the R2 and associated p-value from the mean regression fit are in the top right section, and the black dashed line shows y=0 (the two time series are in phase).

The internal gene segments NS, M, NP, PA, and PB2 of A(H3N2) viruses and pre-2009 seasonal A(H1N1) viruses share a common ancestor (Webster et al., 1992) whereas A(H1N1)pdm09 viruses have a combination of gene segments derived from swine and avian reservoirs that were not reported prior to the 2009 pandemic (Garten et al., 2009; Smith et al., 2009). Non-glycoprotein genes are highly conserved between influenza A viruses and elicit cross-reactive antibody and T cell responses (Grebe et al., 2008; Sridhar, 2016). Because pre-2009 seasonal A(H1N1) viruses and A(H3N2) are more closely related, we hypothesized that seasonal A(H1N1) viruses could potentially limit the circulation of A(H3N2) viruses to a greater extent than A(H1N1)pdm09 viruses, due to greater T cell-mediated cross-protective immunity. As a sensitivity analysis, we measured correlations between A(H1N1) incidence and A(H3N2) epidemic metrics separately for pre- and post-2009 pandemic time periods. Relationships between different A(H3N2) epidemic metrics and A(H1N1) epidemic size were broadly similar for both periods, with slightly stronger correlations observed during the pre-2009 period (Figure 7—figure supplement 2).

We compared A(H3N2) epidemic timing across A(H3N2) and A(H1N1) dominant seasons, which we defined as seasons in which ≥70% of influenza A positive samples were typed as A(H3N2) or A(H1N1), respectively. A(H3N2) epidemic onsets and peaks occurred, on average, 3–4 weeks earlier in A(H3N2) dominant seasons than in A(H1N1) dominant seasons (Wilcoxon test, p<0.0001; Table 3). In A(H1N1) dominant seasons, regional A(H3N2) epidemics exhibited greater heterogeneity in epidemic timing (Wilcoxon tests, p<0.0001; Table 3) and were shorter in duration compared to A(H3N2) dominant seasons (median duration: 21.5 weeks versus 28 weeks; Wilcoxon test, p<0.0001; Table 3).
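
The ≥70% rule for labeling seasons can be sketched as a small function. This is an illustration only: the function name, the "co-circulating" label for seasons where neither subtype reaches the threshold, and the choice of subtyped influenza A positives as the denominator are assumptions made here.

```python
def classify_dominant_subtype(h3n2_positives, h1n1_positives, threshold=0.70):
    """Label a season by the share of subtyped influenza A positives.

    Returns "H3N2", "H1N1", or "co-circulating" (our own shorthand for
    seasons in which neither subtype reaches the threshold).
    """
    total = h3n2_positives + h1n1_positives
    if total == 0:
        raise ValueError("no subtyped influenza A positives")
    if h3n2_positives / total >= threshold:
        return "H3N2"
    if h1n1_positives / total >= threshold:
        return "H1N1"
    return "co-circulating"


# Hypothetical counts for three seasons.
for h3, h1 in [(800, 200), (150, 850), (600, 400)]:
    print(h3, h1, classify_dominant_subtype(h3, h1))
```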

Table 3. Comparison of influenza A(H3N2) epidemic timing between A(H3N2) and A(H1N1) dominant seasons.

We used two-sided Wilcoxon rank-sum tests to compare the distributions of epidemic timing metrics between A(H3N2) and A(H1N1) dominant seasons. We categorized seasons as A(H3N2) or A(H1N1) dominant when ≥70% of IAV positive samples were typed as one IAV subtype.

| A(H3N2) timing metric | Dominant IAV subtype: H3N2 | Dominant IAV subtype: H1N1 | Wilcoxon test: W | Wilcoxon test: p-value |
|---|---|---|---|---|
| Median onset week (from EW40) | 8 | 11 | 3590 | 2.95 × 10⁻⁷ |
| Median peak week (from EW40) | 17 | 20.5 | 5294.5 | 3.5 × 10⁻⁹ |
| Regional variation (s.d.) in onset timing | 9.6 | 16.3 | 4095 | 1.61 × 10⁻⁵ |
| Regional variation (s.d.) in peak timing | 12 | 22.6 | 6166 | 6.43 × 10⁻¹⁸ |
| Seasonal duration | 28 | 21.5 | 1977.5 | 6.25 × 10⁻⁶ |

Abbreviations: IAV, influenza A virus; EW40, epidemic week 40 (the start of the influenza season); s.d., standard deviation.
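
For readers unfamiliar with the test behind Table 3, a two-sided Wilcoxon rank-sum test can be sketched with the standard library alone. This sketch uses the tie-corrected normal approximation without a continuity correction, and takes W as the rank sum of the first sample minus its minimum possible value. Both choices are assumptions, since the excerpt does not state which software or settings produced the reported statistics, and the onset weeks below are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def wilcoxon_rank_sum(x, y):
    """Return (W, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    # W: rank sum of the first sample minus its smallest possible value.
    w = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of tied observations.
    tie_term = sum(t**3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation is tied
        return w, 1.0
    z = (w - n1 * n2 / 2) / math.sqrt(variance)
    return w, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical regional onset weeks (counted from EW40).
h3_dominant_onsets = [7, 8, 8, 9, 10]
h1_dominant_onsets = [10, 11, 11, 12, 13]
print(wilcoxon_rank_sum(h3_dominant_onsets, h1_dominant_onsets))
```

For small samples an exact test would usually be preferred over the normal approximation.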

We applied a wavelet approach to weekly time series of incidences to measure more fine-scale differences in the relative timing of type/subtype circulation (Figure 7—figure supplement 3). A(H3N2) incidence preceded A(H1N1) incidence during most seasons prior to 2009 and during the two seasons in which A(H1N1)pdm09 was dominant, potentially because A(H3N2) viruses are more globally prevalent and migrate between regions more frequently than A(H1N1) viruses (Bedford et al., 2015). There was not a clear relationship between the direction of seasonal phase lags and A(H1N1) epidemic size (R2=0.23, p=0.1; Figure 7—figure supplement 3). A(H3N2) incidence led influenza B incidence in all influenza seasons (positive phase lag), irrespective of influenza B epidemic size (R2=0.05, p=0.5; Figure 7—figure supplement 3).
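
Once weekly phase angles have been extracted from the wavelet transforms, the seasonal phase lag reduces to a short calculation. The sketch below assumes phases in radians, wraps each weekly difference to [-pi, pi], and takes a plain arithmetic mean over the season. The wrapping convention, the arithmetic (rather than circular) mean, and the function names are assumptions for illustration, and the wavelet transform itself is not shown.

```python
import math


def phase_difference(phase_a, phase_b):
    """Return phase_a - phase_b wrapped to the interval [-pi, pi]."""
    diff = phase_a - phase_b
    return math.atan2(math.sin(diff), math.cos(diff))


def seasonal_phase_lag(h3_phases, other_phases):
    """Mean weekly phase difference, A(H3N2) minus the other type/subtype.

    A positive value means the other series lags A(H3N2).
    """
    diffs = [phase_difference(a, b) for a, b in zip(h3_phases, other_phases)]
    return sum(diffs) / len(diffs)


# Hypothetical weekly phase angles (radians) for part of one season.
h3 = [0.5, 0.7, 0.9, 1.1]
h1 = [0.2, 0.3, 0.6, 0.9]
print(seasonal_phase_lag(h3, h1))
```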

The relative impacts of viral evolution, heterosubtypic interference, and prior immunity on A(H3N2) epidemic dynamics

We implemented conditional inference random forest models to assess the relative importance of viral evolution, type/subtype co-circulation, prior population immunity, and vaccine-related parameters in predicting regional A(H3N2) epidemic metrics (Figure 8).

Figure 8. Variable importance rankings from conditional inference random forest models predicting seasonal region-specific influenza A(H3N2) epidemic dynamics.

Ranking of variables in predicting regional A(H3N2) (A) epidemic size, (B) peak incidence, (C) transmissibility (maximum effective reproduction number, Rt), (D) epidemic intensity, and (E) subtype dominance. Each forest was created by generating 3000 regression trees from a repeated leave-one-season-out cross-validated sample of the data. Variables are ranked by their conditional permutation importance, with differences in prediction accuracy scaled by the total (null model) error. Black error bars are 95% confidence intervals of conditional permutation scores (N=50 permutations). Abbreviations: t – 1, one-season lag; t – 2, two-season lag; IAV, influenza A virus subtype; s.d., standard deviation; HI, hemagglutination inhibition; LBI, local branching index; distance to vaccine, epitope distance between currently circulating viruses and the recommended vaccine strain; VE, vaccine effectiveness.

Figure 8—figure supplement 1. Variable importance rankings from LASSO regression models predicting seasonal region-specific influenza A(H3N2) epidemic dynamics.

Ranking of variables in predicting regional A(H3N2) (A) epidemic size, (B) peak incidence, (C) transmissibility (maximum effective reproduction number, Rt), (D) epidemic intensity, and (E) subtype dominance. Models were tuned using a repeated leave-one-season-out cross-validated sample of the data. Variables are ranked by their coefficient estimates, with differences in prediction accuracy scaled by the total (null model) error. Abbreviations: t – 1, one-season lag; t – 2, two-season lag; IAV, influenza A virus subtype; s.d., standard deviation; HI, hemagglutination inhibition; LBI, local branching index; distance to vaccine, epitope distance between currently circulating viruses and the recommended vaccine strain; VE, vaccine effectiveness.

Based on variable importance scores, A(H1N1) epidemic size in the current season was the most informative predictor of A(H3N2) epidemic size and peak incidence, followed by H3 epitope distance (t – 2) and the dominant IAV subtype in the previous season or N2 epitope distance (t – 1) (Figure 8). For A(H3N2) subtype dominance, the highest ranked predictors were N2 epitope distance (t – 1), the dominant IAV subtype in the previous season, and H3 epitope distance (t – 2) (Figure 8). We note that we did not include A(H1N1) epidemic size as a predictor in this model, due to its confounding with the target outcome. For models of A(H3N2) transmissibility (effective Rt) and epidemic intensity, differences in variable importance scores across the set of candidate predictors were less discernible (Figure 8). For the model of effective Rt, A(H1N1) epidemic size in the current season, adult vaccination coverage in the previous season, and N2 epitope distance between circulating viruses and the vaccine strain were the highest ranked variables, while the most important predictors of epidemic intensity were vaccination coverage in the previous season, N2 epitope distance between circulating viruses and the vaccine strain, and N2 epitope distance (t – 1). Variable importance rankings from LASSO models were qualitatively similar to those from random forest models, with A(H1N1) epidemic size in the current season, H3 and N2 epitope distance, and the dominant IAV subtype in the previous season consistently retained across the best-tuned models of epidemic size, peak incidence, and subtype dominance (Figure 8—figure supplement 1). Vaccine-related parameters and H3 antigenic drift (either H3 epitope distance or HI log2 titer distance) were retained in the best-tuned LASSO models of effective Rt and epidemic intensity (Figure 8—figure supplement 1).
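
The leave-one-season-out resampling mentioned in the Figure 8 legend holds out every region-level observation from one season at a time, so that a model is never evaluated on a season it was trained on. A minimal sketch of the split logic follows. The function name and toy labels are hypothetical, and the repetition of the procedure, the conditional inference forests, and the conditional permutation importance are not reproduced here.

```python
def leave_one_season_out(seasons):
    """Yield (held_out_season, train_idx, test_idx) for each distinct season.

    `seasons` holds one season label per region-season observation.
    """
    for held_out in sorted(set(seasons)):
        train_idx = [i for i, s in enumerate(seasons) if s != held_out]
        test_idx = [i for i, s in enumerate(seasons) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical observations: 3 seasons x 2 regions.
labels = [
    "1997-1998", "1997-1998",
    "1998-1999", "1998-1999",
    "1999-2000", "1999-2000",
]
splits = list(leave_one_season_out(labels))
for season, train_idx, test_idx in splits:
    print(season, train_idx, test_idx)
```

Grouping folds by season rather than by individual observation avoids leaking information between regions that share the same circulating viruses.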

We measured correlations between observed values and model-predicted values at the HHS region level. Among the various epidemic metrics, random forest models produced the most accurate predictions of A(H3N2) subtype dominance (Spearman’s ρ=0.95, regional range = 0.85–0.97), peak incidence (ρ=0.91, regional range = 0.72–0.95), and epidemic size (ρ=0.9, regional range = 0.74–0.95), while predictions of effective Rt and epidemic intensity were less accurate (ρ=0.81, regional range = 0.65–0.91; ρ=0.78, regional range = 0.63–0.92, respectively) (Figure 9). Random forest models tended to underpredict most epidemic targets in seasons with substantial H3 antigenic transitions, in particular the SY97 cluster seasons (1998–1999, 1999–2000) and the FU02 cluster season (2003–2004) (Figure 9).

Figure 9. Observed versus predicted values of seasonal region-specific influenza A(H3N2) epidemic metrics from conditional inference random forest models.

(A) Epidemic size, (B) peak incidence, (C) transmissibility (maximum effective reproduction number, Rt), (D) epidemic intensity, and (E) subtype dominance. Results are faceted by HHS region and epidemic metric. Point color and size correspond to the mean H3 epitope distance between viruses circulating in the current season t and viruses circulating two seasons prior (t – 2). Large, yellow points indicate seasons with high antigenic novelty, and small blue points indicate seasons with low antigenic novelty. In each facet, the Spearman’s rank correlation coefficient and associated p-value are in the top left section, and the black dashed line shows y=x.

Figure 9—figure supplement 1. Relationships between the predictive accuracy of random forest models and seasonal H3 epitope distance.

Root mean squared errors between observed and model-predicted values were averaged across regions for each season, and results are faceted according to epidemic metric. Point color corresponds to the mean H3 epitope distance between viruses circulating in the current season t and viruses circulating two seasons prior (t – 2), with bright yellow points indicating seasons with greater antigenic novelty. In each facet, the Spearman’s rank correlation coefficient and associated p-value are in the top left section, and the black dashed line represents the linear regression fit.

Figure 9—figure supplement 2. Relationships between the predictive accuracy of random forest models and seasonal N2 epitope distance.

Root mean squared errors between observed and model-predicted values were averaged across regions for each season, and results are faceted according to epidemic metric. Point color corresponds to the mean N2 epitope distance between viruses circulating in the current season t and viruses circulating in the prior season (t – 1), with bright yellow points indicating seasons with greater antigenic novelty. In each facet, the Spearman’s rank correlation coefficient and associated p-value are in the top left section, and the black dashed line represents the linear regression fit.

For epidemic size and peak incidence, seasonal predictive error – the root-mean-square error (RMSE) across all regional predictions in a season – increased with H3 epitope distance (epidemic size, Spearman’s ρ=0.51, p=0.02; peak incidence, ρ=0.63, p=0.004) and N2 epitope distance (epidemic size, ρ=0.48, p=0.04; peak incidence, ρ=0.48, p=0.03) (Figure 9—figure supplements 1 and 2). For models of epidemic intensity, seasonal RMSE increased with N2 epitope distance (ρ=0.64, p=0.004) but not H3 epitope distance (ρ=0.06, p=0.8) (Figure 9—figure supplements 1 and 2). Seasonal RMSE of effective Rt and subtype dominance predictions did not correlate with H3 or N2 epitope distance (Figure 9—figure supplements 1 and 2).
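
The two quantities in this comparison, seasonal RMSE and its Spearman correlation with epitope distance, can be sketched as follows. The function names and toy values are hypothetical. Spearman's ρ is computed here as the Pearson correlation of average ranks, and the code assumes neither input is constant.

```python
import math


def seasonal_rmse(observed, predicted):
    """Root-mean-square error across one season's regional predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical per-season values: epitope distance and seasonal RMSE.
epitope_dist = [0.1, 0.4, 0.2, 0.8]
rmse_by_season = [1.0, 2.5, 1.2, 3.0]
print(spearman_rho(epitope_dist, rmse_by_season))
```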

To further refine our set of informative predictors, we performed multivariable regression with the top 10 ranked predictors from each random forest model and used BIC to select the best fit model for each epidemic metric, allowing each metric’s regression model to include up to three independent variables. This additional step of variable selection demonstrated that models with few predictors fit the observed data relatively well (adjusted R2: epidemic size, 0.69; peak incidence, 0.63; effective Rt, 0.63; epidemic intensity, 0.75), except for subtype dominance (adjusted R2=0.48) (Table 4). The set of variables retained after model selection was similar to those with high importance rankings in random forest models and LASSO regression models, with the exception that HI log2 titer distance, rather than H3 epitope distance, was included in the minimal models of effective Rt and epidemic intensity.
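
With 10 candidate predictors and at most three per model, this kind of exhaustive search covers 10 + 45 + 120 = 175 candidate models per epidemic metric. The sketch below illustrates such a search for an ordinary least squares model, scoring each subset with BIC computed as n·ln(RSS/n) + k·ln(n), where k counts the intercept and slopes. The function names, the toy data, and the use of plain Gaussian linear regression are assumptions for illustration, and the code assumes the predictors in each subset are not collinear.

```python
import math
from itertools import combinations


def ols_rss(rows, y):
    """Residual sum of squares from least squares via the normal equations."""
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def best_subset_by_bic(predictors, y, max_vars=3):
    """Return (bic, names) for the lowest-BIC model with at most max_vars predictors."""
    n = len(y)
    best = None
    for size in range(1, max_vars + 1):
        for names in combinations(sorted(predictors), size):
            rows = [[1.0] + [predictors[name][i] for name in names] for i in range(n)]
            rss = max(ols_rss(rows, y), 1e-12)  # guard against log(0) on exact fits
            bic = n * math.log(rss / n) + (size + 1) * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, names)
    return best


# Hypothetical candidate predictors; the outcome depends on "a" and "b" only.
candidates = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [2, 1, 4, 3, 6, 5, 8, 7],
    "c": [1, 0, 0, 1, 1, 0, 1, 0],
    "d": [3, 1, 4, 1, 5, 9, 2, 6],
}
outcome = [1 + 2 * a + 3 * b for a, b in zip(candidates["a"], candidates["b"])]
bic, chosen = best_subset_by_bic(candidates, outcome)
print(chosen)
```

Because BIC penalizes each added coefficient by ln(n), a third predictor is kept only when it reduces the residual error enough to offset that penalty.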

Table 4. Predictors of seasonal A(H3N2) epidemic size, peak incidence, transmissibility, epidemic intensity, and subtype dominance.

Variables retained in the best fit model for each epidemic outcome were determined by BIC.

| Outcome | Best minimal model¹ | R2 | Adj. R2 | RMSE |
|---|---|---|---|---|
| Epidemic size | H3 epitope distance (t – 2) + H1 epidemic size + H3 epidemic size (t – 1) | 0.74 | 0.69 | 9.88 |
| Peak incidence | H3 epitope distance (t – 2) + H1 epidemic size + dominant IAV subtype (t – 1) | 0.69 | 0.63 | 2.09 |
| Effective Rt | HI log2 titer distance (t – 2) + H1 epidemic size + N2 distance to vaccine strain | 0.69 | 0.63 | 0.11 |
| Epidemic intensity | HI log2 titer distance (t – 2) + N2 distance to vaccine strain + vaccination coverage (t – 1) | 0.79 | 0.75 | 0.07 |
| Subtype dominance | H3 epitope distance (t – 2) + N2 epitope distance (t – 1) + dominant IAV subtype (t – 1) | 0.56 | 0.48 | 0.2 |

¹Candidate models were limited to three independent variables and considered all combinations of the top 10 ranked predictors from conditional inference random forest models (Figure 8).

Discussion

Antigenic drift between currently circulating influenza viruses and the previous season’s viruses is expected to confer increased viral fitness, leading to earlier, larger, or more severe epidemics. However, prior evidence for the impact of antigenic drift on seasonal influenza outbreaks is mixed. Here, we systematically compare experimental and sequence-based measures of A(H3N2) evolution in predicting regional epidemic dynamics in the United States across 22 seasons, from 1997 to 2019. We also consider the effects of other co-circulating influenza viruses, prior immunity, and vaccine-related parameters, including vaccination coverage and effectiveness, on A(H3N2) incidence. Our findings indicate that evolution in both major surface proteins – hemagglutinin (HA) and neuraminidase (NA) – contributes to variability in epidemic magnitude across seasons, though viral fitness appears to be secondary to subtype interference in shaping annual outbreaks.

We first sought to determine which metrics of viral fitness have the strongest relationships with A(H3N2) epidemic burden and timing. Among our set of candidate evolutionary predictors, genetic distances based on broad sets of epitope sites (HA = 129 epitope sites; NA = 223 epitope sites) had the strongest, most consistent associations with A(H3N2) epidemic size, transmission rate, severity, subtype dominance, and age-specific patterns. Increased epitope distance in both H3 and N2 correlated with larger epidemics and increased transmissibility, with univariate analyses finding H3 distance more strongly correlated with epidemic size, peak incidence, transmissibility, and excess mortality, and N2 distance more strongly correlated with epidemic intensity (i.e. the ‘sharpness’ of the epidemic curve) and subtype dominance patterns. However, we note that minor differences in correlative strength between H3 and N2 epitope distance are not necessarily biologically relevant and could be attributed to noise in epidemiological or virological data or the limited number of influenza seasons in our study. The fraction of ILI cases in children relative to adults was negatively correlated with N2 epitope distance, consistent with the expectation that cases are more restricted to immunologically naive children in seasons with low antigenic novelty (Bedford et al., 2015; Gostic et al., 2019). Regarding epidemic timing, the number of days from epidemic onset to peak (a proxy for epidemic speed) decreased with N2 epitope distance, but other measures of epidemic timing, such as peak week, onset week, and spatiotemporal synchrony across HHS regions, were not significantly correlated with H3 or N2 antigenic change.

The local branching index (LBI) is traditionally used to predict the success of individual clades, with high LBI values indicating high viral fitness (Huddleston et al., 2020; Neher et al., 2014). In our epidemiological analysis, low diversity of H3 or N2 LBI in the current season correlated with greater epidemic intensity, higher transmission rates, and shorter seasonal duration. These associations suggest that low LBI diversity is indicative of a rapid selective sweep by one successful clade, while high LBI diversity is indicative of multiple co-circulating clades with variable seeding and establishment times over the course of an epidemic. A caveat is that LBI estimation is more sensitive to sequence sub-sampling schemes than strain-level measures. If an epidemic is short and intense (e.g. 1–2 months), a phylogenetic tree with our sub-sampling scheme (50 sequences per month) may not incorporate enough sequences to capture the true diversity of LBI values in that season.

Positive associations between H3 antigenic drift and population-level epidemic burden are consistent with previous observations from theoretical models (Bedford et al., 2012; Koelle et al., 2006; Koelle et al., 2009). For example, phylodynamic models of punctuated antigenic evolution have reproduced key features of A(H3N2) phylogenetic patterns and case dynamics, such as the sequential replacement of antigenic clusters, the limited standing diversity in HA after a cluster transition, and higher incidence and attack rates in cluster transition years (Bedford et al., 2012; Koelle et al., 2006; Koelle et al., 2009). Our results also corroborate empirical analyses of surveillance data (Bedford et al., 2014; Wilson and Cox, 1990; Wolf et al., 2010; Wu et al., 2010) and forecasting models of annual epidemics (Axelsen et al., 2014; Du et al., 2017) that found direct, quantitative links between HA antigenic novelty and the number of influenza cases or deaths in a season. Moving beyond the paradigm of antigenic clusters, Wolf et al., 2010 and Bedford et al., 2014 demonstrated that smaller, year-to-year changes in H3 antigenic drift also correlate with seasonal severity and incidence (Bedford et al., 2014; Wolf et al., 2010). A more recent study did not detect an association between antigenic drift and city-level epidemic size in Australia (Lam et al., 2020), though the authors used a binary indicator to signify seasons with major HA antigenic transitions and did not consider smaller, more gradual changes in antigenicity. While Lam and colleagues did not observe a consistent effect of antigenic change on epidemic magnitude, they found a negative relationship between the cumulative prior incidence of an antigenic variant and its probability of successful epidemic initiation in a city.

We did not observe a clear relationship between H3 receptor binding site (RBS) distance and epidemic burden, even though single substitutions at these seven amino acid positions are implicated in major antigenic transitions (Koel et al., 2013; Petrova and Russell, 2018). The outperformance of the RBS distance metric by a broader set of epitope sites could be attributed to the tempo of antigenic cluster changes. A(H3N2) viruses are characterized by both continuous and punctuated antigenic evolution, with transitions between antigenic clusters occurring every 2–8 years (Bedford et al., 2011; Bedford et al., 2014; Koel et al., 2013; Koelle et al., 2006; Koelle and Rasmussen, 2015; Shih et al., 2007; Smith et al., 2004; Suzuki, 2008; Wolf et al., 2006). Counting substitutions at only a few sites may fail to capture more modest, gradual changes in antigenicity that are on a time scale congruent with annual outbreaks. Further, a broader set of epitope sites may better capture the epistatic interactions that underpin antigenic change in HA (Kryazhimskiy et al., 2011). Although the seven RBS sites were responsible for the majority of antigenic phenotype in Koel and colleagues’ experimental study (Koel et al., 2013), their findings do not necessarily contradict studies that found broader sets of sites associated with antigenic change. Mutations at other epitope sites may collectively add to the decreased recognition of antibodies or affect viral fitness through alternate mechanisms (e.g. compensatory or permissive mutations) (Gong et al., 2013; Koel et al., 2013; Koelle et al., 2006; Kryazhimskiy et al., 2011; Myers et al., 2013; Neher et al., 2014; Shih et al., 2007; Smith et al., 2004).

A key result from our study is the direct link between NA antigenic drift and A(H3N2) incidence patterns. Although HA and NA both contribute to antigenicity (Nelson and Holmes, 2007; Webster et al., 1982) and undergo similar rates of positive selection (Bhatt et al., 2011), we expected antigenic change in HA to exhibit stronger associations with seasonal incidence, given its immunodominance relative to NA (Altman et al., 2015). H3 and N2 epitope distance were both moderately correlated with epidemic size, peak incidence, and subtype dominance patterns. Except for subtype dominance, however, H3 epitope distance had higher variable importance rankings in random forest models, and N2 epitope distance was not retained after post-hoc model selection of top ranked random forest features. In contrast, N2 epitope distance, but not H3 epitope distance, was associated with faster epidemic speed and a greater fraction of ILI cases in adults relative to children. Antigenic changes in H3 and N2 were independent across the 22 seasons of our study, consistent with previous research (Bhatt et al., 2011; Sandbulte et al., 2011; Schulman and Kilbourne, 1969). Thus, the similar predictive performance of HA and NA epitope distance for some epidemic metrics does not necessarily stem from the coevolution of HA and NA.

HI log2 titer distance was positively correlated with different measures of epidemic impact yet underperformed in comparison to H3 and N2 epitope distances. This outcome was surprising given that we expected our method for generating titer distances would produce more realistic estimates of immune cross-protection between viruses than epitope-based measures. Our computational approach for inferring HI phenotype dynamically incorporates newer titer measurements and assigns antigenic weight to phylogenetic branches rather than fixed sequence positions (Huddleston et al., 2020; Neher et al., 2016). In contrast, our method for calculating epitope distance assumes that the contributions of specific sites to antigenic drift are constant through time, even though beneficial mutations previously observed at these sites are contingent on historical patterns of viral fitness and host immunity (Huddleston et al., 2020; Koelle et al., 2006; Neher et al., 2014). HI titer measurements have been more useful than epitope substitutions in predicting future A(H3N2) viral populations (Huddleston et al., 2020) and vaccine effectiveness (Ndifon et al., 2009), with the caveat that these targets are more proximate to viral evolution than epidemic dynamics.

HI titer measurements may be more immunologically relevant than epitope-based measures, yet several factors could explain why substitutions at epitope sites outperformed HI titer distances in epidemiological predictions. First, epitope distances may capture properties that affect viral fitness (and in turn outbreak intensity) but are unrelated to immune escape, such as intrinsic transmissibility, ability to replicate, or epistatic interactions. A second set of factors concerns methodological issues associated with HI assays. The reference anti-sera for HI assays are routinely produced in ferrets recovering from their first influenza virus infection. Most humans are infected by different influenza virus strains over the course of their lifetimes, and one’s immune history influences the specificity of antibodies generated against drifted influenza virus strains (Hensley, 2014; Lee et al., 2019; Li et al., 2013; Miller et al., 2013). Thus, human influenza virus antibodies, especially those of adults, have more heterogeneous specificities than anti-sera from immunologically naive ferrets (Hensley, 2014).

A related methodological issue is that HI assays disproportionately measure anti-HA antibodies that bind near the receptor binding site and, similar to the RBS distance metric, may capture only a partial view of the antigenic change occurring in the HA protein (Gostic et al., 2019; Henry et al., 2019; Lam et al., 2020; Ranjeva et al., 2019). A recent study of longitudinal serological data found that HI titers are a good correlate of protective immunity for children, while time since infection is a better predictor of protection for adults (Ranjeva et al., 2019). This outcome is consistent with the concept of antigenic seniority, in which an individual’s first exposure to influenza virus during childhood leaves an immunological ‘imprint’, and exposure to new strains ‘back boosts’ one’s antibody response to strains of the same subtype encountered earlier in life (Cobey and Hensley, 2017; Gostic et al., 2019; Zhang et al., 2019). Ranjeva et al.’s study and others suggest that human influenza virus antibodies shift focus from the HA head to other more conserved epitopes as individuals age (Gostic et al., 2019; Henry et al., 2019). Given that HI assays primarily target epitopes adjacent to the RBS, HI assays using ferret or human serological data are not necessarily suitable for detecting the broader immune responses of adults. A third explanation for the underperformance of HI titers concerns measurement error. Recent A(H3N2) viruses have reduced binding efficiency in HI assays, which can skew estimates of immune cross-reactivity between viruses (Zost et al., 2017). These combined factors could obfuscate the relationship between the antigenic phenotypes inferred from HI assays and population-level estimates of A(H3N2) incidence.

Novel antigenic variants are expected to have higher infectivity in immune populations, leading to earlier epidemics and more rapid geographic spread (Viboud et al., 2006), but few studies have quantitatively linked antigenic drift to epidemic timing or geographic synchrony. Previous studies of pneumonia and influenza-associated mortality observed greater severity or geographic synchrony in seasons with major antigenic transitions (Greene et al., 2006; Wiley et al., 1981). A more recent Australian study of lab-confirmed cases also noted greater spatiotemporal synchrony during seasons when novel H3 antigenic variants emerged, although their assessment was based on virus typing alone (i.e. influenza A or B; Geoghegan et al., 2018). A subsequent Australian study with finer-resolution data on subtype incidence and variant circulation determined that more synchronous epidemics were not associated with drifted A(H3N2) strains (Lam et al., 2020), and a U.S.-based analysis of ILI data also failed to detect a relationship between HA antigenic cluster transitions and geographic synchrony (Charu et al., 2017). In our study, the earliest epidemics tended to occur in seasons with transitions between H3 antigenic clusters (e.g. the emergence of the FU02 cluster in 2003–2004) or vaccine mismatches (e.g. N2 mismatch in 1999–2000, H3 mismatch in 2014–2015; Sandbulte et al., 2011; Smith et al., 2004; Xie et al., 2015), but there was not a statistically significant correlation between antigenic change and earlier epidemic onsets or peaks. Regarding epidemic speed, the length of time from epidemic onset to peak decreased with N2 epitope distance but not H3 epitope distance. The relationship between antigenic drift and epidemic timing may be ambiguous because external seeding events or climatic factors, such as temperature and absolute humidity, are more important in driving influenza seasonality and the onsets of winter epidemics (Bedford et al., 2015; Charu et al., 2017; Chattopadhyay et al., 2018; Kramer and Shaman, 2019; Lee et al., 2018; Shaman and Kohn, 2009; Shaman et al., 2010). Alternatively, the resolution of our epidemiological surveillance data (HHS regions) may not be granular enough to detect a signature of antigenic drift in epidemic timing, though studies of city-level influenza dynamics were also unable to identify a clear relationship (Charu et al., 2017; Lam et al., 2020).

After exploring individual correlations between evolutionary indicators and annual epidemics, we considered the effects of influenza A(H1N1) incidence and B incidence on A(H3N2) virus circulation within a season. We detected strong negative associations between A(H1N1) incidence and A(H3N2) epidemic size, peak incidence, transmissibility, and excess mortality, consistent with previous animal, epidemiological, phylodynamic, and theoretical studies that found evidence for cross-immunity between IAV subtypes (Cowling et al., 2010; Epstein, 2006; Ferguson et al., 2003; Gatti et al., 2022; Goldstein et al., 2011; Sonoguchi et al., 1985). For example, individuals recently infected with seasonal influenza A viruses are less likely to become infected during subsequent pandemic waves (Cowling et al., 2010; Epstein, 2006; Fox et al., 2017; Laurie et al., 2015; Sridhar et al., 2013), and the early circulation of one influenza virus type or subtype is associated with a reduced total incidence of the other type/subtypes within a season (Goldstein et al., 2011; Lam et al., 2020). Due to the shared evolutionary history of their internal genes (Webster et al., 1992) and in turn greater T cell-mediated cross-protective immunity, pre-2009 seasonal A(H1N1) viruses may impact A(H3N2) virus circulation to a greater extent than A(H1N1)pdm09 viruses, which have a unique combination of genes that were not identified in animals or humans prior to 2009 (Garten et al., 2009; Smith et al., 2009). We observed similar relationships between A(H3N2) epidemic metrics and A(H1N1) incidence during pre- and post-2009 pandemic seasons, with slightly stronger correlations observed during the pre-2009 period. However, given the small sample size (12 pre-2009 seasons and 9 post-2009 seasons), we cannot fully answer this question.

In our study, univariate correlations between A(H1N1) and A(H3N2) incidence were more pronounced than those observed between A(H3N2) incidence and evolutionary indicators, and A(H1N1) epidemic size was the highest ranked feature by random forest models predicting epidemic size, peak incidence, and transmissibility (effective Rt). Consequently, interference between the two influenza A subtypes may be more impactful than viral evolution in determining the size of annual A(H3N2) outbreaks. Concerning epidemic timing, we did not detect a relationship between A(H3N2) antigenic change and the relative timing of A(H3N2) and A(H1N1) cases; specifically, A(H3N2) incidence did not consistently lead A(H1N1) incidence in seasons with greater H3 or N2 antigenic change. Overall, we did not find any indication that influenza B incidence affects A(H3N2) epidemic burden or timing, which is not unexpected, given that few T and B cell epitopes are shared between the two virus types (Terajima et al., 2013).

Lastly, we used random forest models and multivariable linear regression models to assess the relative importance of viral evolution, prior population immunity, co-circulation of other influenza viruses, and vaccine-related parameters in predicting regional A(H3N2) epidemic dynamics. We chose conditional inference random forest models as our primary method of variable selection because several covariates were collinear, relationships between some predictors and target variables were nonlinear, and our goal was inferential rather than predictive. We performed leave-one-season-out cross-validation to tune each model, but, due to the limited number of seasons in our dataset, we were not able to test predictive performance on an independent test set. With the caveat that models were likely overfit to historical data, random forest models produced accurate predictions of regional epidemic size, peak incidence, and subtype dominance patterns, while predictions of epidemic intensity and transmission rates were less exact. The latter two measures could be more closely tied to climatic factors, the timing of influenza case importations from abroad, or mobility patterns (Bedford et al., 2015; Charu et al., 2017; Shaman and Kohn, 2009; Shaman et al., 2010) or they may be inherently more difficult to predict because their values are more constrained. Random forest models tended to underpredict epidemic burden in seasons with major antigenic transitions, particularly the SY97 seasons (1998–1999, 1999–2000) and the FU02 season (2003–2004), potentially because antigenic jumps of these magnitudes were infrequent during our 22-season study period. An additional step of post-hoc model selection demonstrated that models with only three covariates could also produce accurate fits to observed epidemiological data.
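The leave-one-season-out scheme described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not our analysis code: the record fields, the toy values, and the mean-based stand-in model are all hypothetical, and the stand-in simply takes the place of a conditional inference random forest so that the example stays self-contained. The point it shows is that all regional observations from one season are held out together, so a model is never evaluated on a season it was fitted to.

```python
from statistics import mean


def leave_one_season_out(records):
    """Yield (held_out_season, train, test), holding out one full season at a time."""
    seasons = sorted({r["season"] for r in records})
    for held_out in seasons:
        train = [r for r in records if r["season"] != held_out]
        test = [r for r in records if r["season"] == held_out]
        yield held_out, train, test


def fit_mean_model(train):
    """Hypothetical stand-in for a fitted model: always predicts the training mean."""
    avg = mean(r["epidemic_size"] for r in train)
    return lambda record: avg


def cross_validated_rmse(records):
    """Root mean squared error pooled over all held-out seasons."""
    squared_errors = []
    for _, train, test in leave_one_season_out(records):
        model = fit_mean_model(train)
        squared_errors.extend((model(r) - r["epidemic_size"]) ** 2 for r in test)
    return mean(squared_errors) ** 0.5


# Made-up toy data: three seasons, two regions each (values are not from the study).
toy_records = [
    {"season": "2015-2016", "region": 1, "epidemic_size": 10.0},
    {"season": "2015-2016", "region": 2, "epidemic_size": 12.0},
    {"season": "2016-2017", "region": 1, "epidemic_size": 30.0},
    {"season": "2016-2017", "region": 2, "epidemic_size": 34.0},
    {"season": "2017-2018", "region": 1, "epidemic_size": 50.0},
    {"season": "2017-2018", "region": 2, "epidemic_size": 56.0},
]
print(cross_validated_rmse(toy_records))
```

Because every season serves as a held-out fold in turn, no data remain for a truly independent test set, which is the limitation noted above.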

Our study is subject to several limitations, specifically regarding geographic resolution and data availability. First, our analysis is limited to one country with a temperate climate and its findings concerning interactions between A(H3N2), A(H1N1), and type B viruses may not be applicable to tropical or subtropical countries, which experience sporadic epidemics of all three viruses throughout the year (Yang et al., 2020). Second, our measure of population-level influenza incidence is derived from regional CDC outpatient data because those data are publicly available starting with the 1997–1998 season. State-level outpatient data are not available until after the 2009 A(H1N1) pandemic, and finer resolution data from electronic health records are accessible in theory but not in the public domain. Access to ILI cases aggregated at the state or city level, collected over the course of decades, would increase statistical power and enable us to add more location-specific variables to our analysis, such as climatic and environmental factors. A third limitation is that we measured influenza incidence by multiplying the rate of influenza-like illness by the percentage of tests positive for influenza, which does not completely eliminate the possibility of capturing the activity of other co-circulating respiratory pathogens (Kramer and Shaman, 2019). Surveillance data based on more specific diagnosis codes would ensure the exclusion of patients with non-influenza respiratory conditions. Fourth, our data on the age distribution of influenza cases are derived from ILI encounters across four broad age groups and do not include test positivity status, virus type/subtype, or denominator information. Despite the coarseness of these data, we found statistically significant correlations in the expected directions between N2 antigenic change and the fraction of cases in children relative to adults. Lastly, a serological assay exists for NA, but NA titer measurements are not widely available because the assay is labor-intensive and inter-lab variability is high. Thus, we could not test the performance of NA antigenic phenotype in predicting epidemic dynamics.
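As a worked illustration of the incidence measure described in the third limitation above, the sketch below multiplies an ILI rate by the fraction of tests positive for influenza. The function name, argument names, and counts are hypothetical, and the sketch omits any weighting or subtype-specific steps; it is meant only to show the arithmetic and why the result remains a proxy rather than a direct count of influenza cases.

```python
def incidence_proxy(ili_visits: int, total_visits: int,
                    positive_tests: int, total_tests: int) -> float:
    """Approximate influenza incidence as ILI rate times fraction of tests positive."""
    if total_visits <= 0 or total_tests <= 0:
        raise ValueError("visit and test denominators must be positive")
    ili_rate = ili_visits / total_visits            # share of outpatient visits for ILI
    fraction_positive = positive_tests / total_tests  # share of tests positive for influenza
    return ili_rate * fraction_positive


# Made-up weekly counts for a single region (not data from the study):
# 500 ILI visits out of 10,000 visits, and 200 positive tests out of 800 tests.
weekly_proxy = incidence_proxy(500, 10_000, 200, 800)
print(round(weekly_proxy, 4))
```

With these toy counts, an ILI rate of 5% and a test positivity of 25% give a proxy of about 1.25% of visits. ILI visits caused by other respiratory pathogens still enter the ILI rate, which is why the product cannot fully exclude their activity.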

Beginning in early 2020, non-pharmaceutical interventions (NPIs), including lockdowns, school closures, physical distancing, and masking, were implemented in the United States and globally to slow the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for the COVID-19 pandemic. These mitigation measures disrupted the transmission of seasonal influenza viruses and other directly-transmitted respiratory viruses throughout 2020 and 2021 (Cowling et al., 2020; Huang et al., 2021; Olsen et al., 2020; Olsen et al., 2021; Qi et al., 2021; Tempia et al., 2021), and population immunity to influenza is expected to have decreased substantially during this period of low circulation (Ali et al., 2022; Baker et al., 2020). COVID-19 NPIs relaxed during 2021 and 2022 and co-circulation of A(H3N2) and A(H1N1)pdm09 viruses in the United States resumed during the 2022–2023 influenza season. Our study concludes with the 2018–2019 season, and thus it is unclear whether our modeling approach would be useful in projecting seasonal burden during the post-pandemic period, without an additional component to account for COVID-19-related perturbations to influenza transmission. Further studies will need to determine whether ecological interactions between influenza viruses have changed or if the effects of viral evolution and subtype interference on seasonal outbreaks are different in the post-pandemic period.

In conclusion, relationships between A(H3N2) antigenic drift, epidemic impact, and age dynamics are moderate, with genetic distances based on broad sets of H3 and N2 epitope sites having greater predictive power than serology-based antigenic distances for the timeframe analyzed. Influenza epidemiological patterns are consistent with increased population susceptibility in seasons with high antigenic novelty, and our study is the first to link NA antigenic drift to epidemic burden, timing, and the age distribution of cases. It is well established that anti-HA and anti-NA antibodies are independent correlates of immunity (Couch et al., 2013; Gaglani et al., 2016; Gill and Murphy, 1977; Hope-Simpson, 1971; Memoli et al., 2016; Monto et al., 2015; Murphy et al., 1972), and the influenza research community has advocated for NA-based vaccines (Eichelberger et al., 2018; Krammer et al., 2018). The connection between NA drift and seasonal incidence further highlights the importance of monitoring evolution in both HA and NA to inform vaccine strain selection and epidemic forecasting efforts. Although antigenic change in both HA and NA was correlated with epidemic dynamics, ecological interactions between influenza A subtypes appear to be more influential than viral evolution in determining the intensity of annual A(H3N2) epidemics. The aim of our study was to retrospectively assess the potential drivers of annual A(H3N2) epidemics, yet we cautiously suggest that one could project the size or intensity of future epidemics based on sequence data and A(H1N1)pdm09 incidence alone (Goldstein et al., 2011; Wolf et al., 2010).

Acknowledgements

We thank the Influenza Division at the U.S. Centers for Disease Control and Prevention, the Victorian Infectious Diseases Reference Laboratory at the Australian Peter Doherty Institute for Infection and Immunity, the Influenza Virus Research Center at the Japan National Institute of Infectious Diseases, and the Crick Worldwide Influenza Centre at the UK Francis Crick Institute for sharing HI titer data. We gratefully acknowledge the authors, originating laboratories, and submitting laboratories of the sequences from the GISAID EpiFlu Database on which this research is based (listed in Appendix 1). We thank members of the Fogarty International Center’s Division of International Epidemiology and Population Studies (DIEPS) and the Bedford Lab for useful discussions. ACP, CLH, and CV were supported by the in-house research division of the Fogarty International Center, U.S. National Institutes of Health. ACP was supported by the NSF Infectious Disease Evolution Across Scales (IDEAS) Research Collaboration Network. JH was supported by NIH NIAID awards F31 AI140714 and R01 AI165821. The work done at the Crick Worldwide Influenza Centre was supported by the Francis Crick Institute receiving core funding from Cancer Research UK (FC001030), the Medical Research Council (FC001030) and the Wellcome Trust (FC001030). SF, KN, NK, SW and HH were supported by the Ministry of Health, Labour and Welfare, Japan (10110400 and 10111800). SW was supported by the Japan Agency for Medical Research and Development (JP22fk0108118 and JP23fk0108662). The WHO Collaborating Centre for Reference and Research on Influenza is supported by the Australian Government Department of Health and Aged Care. The Melbourne WHO Collaborating Centre for Reference and Research on Influenza is supported by the Australian Government Department of Health. Influenza virus work in the Krammer laboratory was partially supported by the NIAID Centers of Excellence for Influenza Research and Surveillance (CEIRS) contract HHSN272201400008C, NIAID Centers of Excellence for Influenza Research and Response (CEIRR) contract 75N93021C00014 (FK), and NIAID CIVIC contract (75N93019C00051). TB was supported by NIH awards NIGMS R35 GM119774 and NIAID R01 AI127893. TB is an Investigator of the Howard Hughes Medical Institute. Funding sources were not involved in study design, data collection and interpretation, or the decision to submit the work for publication.

Appendix 1

GISAID Acknowledgements

WHO Collaborating Centre for Reference and Research on Influenza, Victorian Infectious Diseases Reference Laboratory, Australia; WHO Collaborating Centre for Reference and Research on Influenza, Chinese National Influenza Center, China; WHO Collaborating Centre for Reference and Research on Influenza, National Institute of Infectious Diseases, Japan; The Crick Worldwide Influenza Centre, The Francis Crick Institute, United Kingdom; WHO Collaborating Centre for the Surveillance, Epidemiology and Control of Influenza, Centers for Disease Control and Prevention, United States; Aalesund Sjukehus, Norway; ADImmune Corporation, Taiwan; ADPH Bureau of Clinical Laboratories, United States; Aichi Prefectural Institute of Public Health, Japan; Akershus University Hospital, Norway; Akita Research Center for Public Health and Environment, Japan; Alabama State Laboratory, United States; Alaska State Public Health Laboratory, United States; Alaska State Virology Lab, United States; Alfred Hospital, Australia; Aomori Prefectural Institute of Public Health and Environment, Japan; Aristotelian University of Thessaloniki, Greece; Arizona Department of Health Services, United States; Arkansas Department of Health, United States; Atlanta VA Medical Center, United States; Auckland Healthcare, New Zealand; Auckland Hospital, New Zealand; Austin Health, Australia; Baylor College of Medicine, United States; Baylor Scott and White Health, United States; California Department of Health Services, United States; Canberra Hospital, Australia; Cantacuzino Institute, Romania; Canterbury Health Services, New Zealand; Caribbean Epidemiology Center, Trinidad and Tobago; CDC Central Asia Office; CDC GAP Nigeria, Nigeria; CDC Kenya, Kenya; CEMIC University Hospital, Argentina; CENETROP, Bolivia; Center For Medical Microbiology, College of Public Health, University of the Philippines, Philippines; Center for Public Health and Environment, Hiroshima Prefectural Technology Research Institute, Japan; 
Center of Hygiene And Epidemiology, Kirov Oblast, Russian Federation; Center of Hygiene And Epidemiology, Yamalo-Nenets Autonomous Okrug, Russian Federation; Center of Hygiene And Epidemiology, The Republic Of Dagestan, Russian Federation; Central Health Laboratory, Mauritius; Central Laboratory of Public Health, Paraguay; Central Public Health Laboratory, Ministry of Health, Oman; Central Public Health Laboratory, Palestinian Territory; Central Public Health Laboratory, Papua New Guinea; Central Public Health Reference Laboratory, Sierra Leone; Central Research Institute for Epidemiology, Russian Federation; Central Virology Laboratory, Israel; Centre de Recherche Médicale et Sanitaire, Niger; Centre for Diseases Control and Prevention, Armenia; Centre for Infections, Health Protection Agency, United Kingdom; Centre National de Référence des Virus des Infections Respiratoires, France; Centre National de Référence du Virus Influenza Région Sud, France; Centre Pasteur du Cameroun, Cameroon; Centro de Investigación Regional Dr. 
Hideyo Noguchi, Mexico; Chiba City Institute of Health and Environment, Japan; Chiba Prefectural Institute of Public Health, Japan; Children’s Mercy Hospital, United States; Children’s Hospital Westmead, Australia; Chuuk State Hospital, Micronesia, Federated States of; City of El Paso Dept of Public Health, United States; City of Milwaukee Health Department, United States; Clinical Virology Unit, CDIM, Australia; Colorado Department of Health Lab, United States; Connecticut Department of Public Health, United States; Contiguo a Hospital Rosales, El Salvador; CSL Ltd, United States; Dallas County Health and Human Services, United States; DC Public Health Lab, United States; Delaware Public Health Lab, United States; Departamento de Laboratorio de Salud Publica, Uruguay; Department of Clinical Virology, University College London Hospitals NHS Foundation Trust, United Kingdom; Department of Health, Hong Kong; Department of Public Health, Niigata City, Japan; Department of Virology, Medical University Vienna, Austria; Disease Investigation Centre Wates (BBVW), Australia; Dorevitch Pathology, Australia; Drammen Hospital/Vestreviken HF, Norway; Ehime Prefecture Institute of Public Health and Environmental Science, Japan; Erasmus Medical Center, Netherlands; Erasmus University of Rotterdam, Netherlands; Ethiopian Health and Nutrition Research Institute (EHNRI), Ethiopia; Ethiopian Public Health Institute, Ethiopia; Evandro Chagas Institute, Brazil; Facultad de Medicina, Spain; FBUZ Center for Hygiene and Epidemiology, Russian Federation; Florida Department of Health, United States; Fred Hutchinson Cancer Research Center, United States; Fukui Prefectural Institute of Public Health, Japan; Fukuoka City Institute for Hygiene and the Environment, Japan; Fukuoka Institute of Public Health and Environmental Sciences, Japan; Fukushima Prefectural Institute of Public Health, Japan; Gart Naval General Hospital, United Kingdom; Georgia Public Health Laboratory, United States; Gifu 
Municipal Institute of Public Health, Japan; Gifu Prefectural Institute of Health and Environmental Sciences, Japan; Government Virus Unit, Hong Kong; Gunma Prefectural Institute of Public Health and Environmental Sciences, Japan; Hackensack University Medical Center, United States; Hamamatsu City Health Environment Research Center, Japan; Haukeland University Hospital, Dept. of Microbiology, Norway; Health Forde, Department of Microbiology, Norway; Health Protection Agency, United Kingdom; Health Protection Inspectorate, Estonia; Hellenic Pasteur Institute, Greece; Helsinki University Central Hospital, Finland; Hiroshima City Institute of Public Health, Japan; Hobart Pathology, Australia; Hokkaido Institute of Public Health, Japan; Hôpital Cantonal Universitaire de Geneves, Switzerland; Hôpital Charles Nicolle, Tunisia; Hôpital Georges L. Dumont, Canada; Hospital Clinic de Barcelona, Spain; Houston Department of Health and Human Services, United States; Hyogo Prefectural Institute of Public Health and Consumer Sciences, Japan; Ibaraki Prefectural Institute of Public Health, Japan; International Centre For Diarrhoeal Disease Research, Bangladesh; Illinois Department of Public Health, United States; Indiana State Department of Health Laboratories, United States; Infectology Center of Latvia, Latvia; Innlandet Hospital Trust, Division Lillehammer, Department for Microbiology, Norway; INRB Service De Virologie, Democratic Republic of the Congo; Institut Fédératif de Recherche Lyon, France; Institut Louis Malardé Clinical Laboratory, French Polynesia; Institut National d'Hygiène, Morocco; Instituto Nacional de Investigación en Salud Pública, Ecuador; Institut National de Recherches en Sante Publique, Mauritania; Institut de Recherche en Sciences de la Santé, Burkina Faso; Institut Pasteur d’Algerie, Algeria; Institut Pasteur de Bangui, Central African Republic; Institut Pasteur de Dakar, Senegal; Institut Pasteur de Madagascar, Madagascar; Institut Pasteur in
Cambodia, Cambodia; Institut Pasteur New Caledonia, New Caledonia; Institut Pasteur, France; Institut Penyelidikan Perubatan, Malaysia; Institute National D’Hygiene, Togo; Institute of Environmental Science and Research, New Zealand; Institute of Environmental Science and Research, Tonga; Institute For Biomedical Sciences, Suriname; Institute of Environmental Science & Research, New Zealand; Institute of Epidemiology and Infectious Diseases, Ukraine; Institute of Epidemiology Disease Control and Research, Bangladesh; Institute of Immunology and Virology Torlak, Serbia; Institute of Medical and Veterinary Science (IMVS), Australia; Institute of Public Health, Serbia; Institute of Public Health, Albania; Institute of Public Health, Montenegro; Institute Pasteur du Cambodia, Cambodia; Instituto Adolfo Lutz, Brazil; Instituto Conmemorativo Gorgas de Estudios de la Salud, Panama; Instituto De Diagnostico Y Referencia Epidemiologicos, Mexico; Instituto de Salud Carlos III, Spain; Instituto de Salud Publica de Chile, Chile; Instituto Nacional de Enfermedades Infecciosas, Argentina; Instituto Nacional de Higiene Rafael Rangel, Venezuela, Bolivia; Instituto Nacional de Laboratorios de Salud (INLASA), Bolivia; Instituto Nacional de Salud de Columbia, Colombia; Instituto Nacional de Saude, Portugal; Iowa State Hygienic Laboratory, United States; IRSS, Burkina Faso; Ishikawa Prefectural Institute of Public Health and Environmental Science, Japan; ISS, Italy; Istanbul University, Turkey; Istituto Di Igiene, Italy; Istituto Superiore di Sanità, Italy; Ivanovsky Research Institute of Virology RAMS, Russian Federation; Jiangsu Provincial Center for Disease Control and Prevention, China; John Hunter Hospital, Australia; Kagawa Prefectural Research Institute for Environmental Sciences and Public Health, Japan; Kagoshima Prefectural Institute for Environmental Research and Public Health, Japan; Kanagawa Prefectural Institute of Public Health, Japan; Kansas Department of Health and
Environment, United States; Kawasaki City Institute of Public Health, Japan; KEMRI Wellcome Trust Research Programme, Kenya; Kentucky Division of Laboratory Services, United States; Kitakyusyu City Institute of Environmental Sciences, Japan; Klinisk Mikrobiologi, Hallands Sjukhus Halmstad, Sweden; Klinisk Mikrobiologi, Karolinska Universitetslaboratoriet, Karolinska Universitetssjukhuset Solna, Sweden; Klinisk Mikrobiologi, Laboratoriemedicin, Norrlands Universitetssjukhus Umea, Sweden; Klinisk Mikrobiologi, Sahlgrenska Universitetssjukhuset Goteborg, Sweden; Kobe Institute of Health, Japan; Kochi Public Health and Sanitation Institute, Japan; Kumamoto City Environmental Research Center, Japan; Kumamoto Prefectural Institute of Public Health and Environmental Science, Japan; Kyoto City Institute of Health and Environmental Sciences, Japan; Kyoto Prefectural Institute of Public Health and Environment, Japan; Laboratoire De Santé Publique Du Québec, Canada; Laboratoire National de Sante Publique, Haiti; Laboratoire National de Sante, Luxembourg; Laboratório Central do Estado do Paraná, Brazil; Laboratorio Central do Estado do Rio de Janeiro, Brazil; Laboratorio de Investigacion/Centro de Educacion Medica y Amistad Dominico Japones (CEMADOJA), Dominican Republic; Laboratorio De Isolamento Viral, Mozambique; Laboratorio De Referencia Nacional Virus Respiratorios, Instituto Nacional De Salud, Peru; Laboratorio De Saude Publico, Macao; Laboratorio de Virologia, Direccion de Microbiologia, Nicaragua; Laboratorio de Virus Respiratorio, Mexico; Laboratorio Di Virologia, Azienda Ospedaliero Universitaria Ospedali Riuniti Ancona, Italy; Laboratorio Nacional de Influenza, Costa Rica; Laboratorio Nacional De Salud Guatemala, Guatemala; Laboratorio Nacional de Virologia, Honduras; Laboratory Directorate, Jordan; Laboratory for Virology, National Institute of Public Health, Slovenia; Laboratory of Influenza and ILI, Belarus; LACEN/ES Laboratório Central de Saúde Pública do Estado 
do Espirito Santo, Brazil; LACEN/RS - Laboratório Central de Saúde Pública do Rio Grande do Sul, Brazil; LACEN-SC - Laboratório Central de Saúde Pública do Estado de Santa Catarina; Landspitali - University Hospital, Iceland; Lismore Base Hospital, Australia; Lithuanian AIDS Center Laboratory, Lithuania; Los Angeles Quarantine Station, CDC Quarantine Epidemiology and Surveillance Team, United States; Louisiana Department of Health and Hospitals, United States; Maine Health and Environmental Testing Laboratory, United States; Marshfield Clinic Research Foundation, United States; Maryland Department of Health and Mental Hygiene, United States; Massachusetts Department of Public Health, United States; Mater Dei Hospital, Malta; Medical Research Institute, Sri Lanka; Medical University Vienna, Austria; Melbourne Pathology, Australia; Michigan Department of Community Health, United States; Microbiology Services Colindale, Public Health England, United Kingdom; Mie Prefecture Health and Environment Research Institute, Japan; Mikrobiologisk laboratorium, Sykehuset i Vestfold, Norway; Ministry of Health and Population, Egypt; Ministry of Health of Ukraine, Ukraine; Ministry of Health, Bahrain; Ministry of Health, Kiribati; Ministry of Health, Lao, People’s Democratic Republic; Ministry of Health, NIHRD, Indonesia; Ministry of Health, Maldives; Ministry of Health, Oman; Ministry of Health Riyadh, Saudi Arabia; Ministry of Health, Singapore; Ministry of Health, Thailand; Minnesota Department of Health, United States; Mississippi Public Health Laboratory, United States; Missouri Department of Health and Senior Services, United States; Miyagi Prefectural Institute of Public Health and Environment, Japan; Miyazaki Prefectural Institute for Public Health and Environment, Japan; Molde Hospital, Laboratory for Medical Microbiology, Norway; Monash Medical Centre, Australia; Montana Laboratory Services Bureau, United States; Montana Public Health Laboratory, United States; Nagano 
City Health Center, Japan; Nagano Environmental Conservation Research Institute, Japan; Nagasaki Prefectural Institute For Environment Research and Public Health, Japan; Nagoya City Public Health Research Institute, Japan; NAMRU-2 U.S. Naval Medical Research Unit-2, Cambodia; NAMRU-2 U.S. Naval Medical Research Unit-2, Indonesia; NAMRU-6 U.S. Naval Medical Research Unit-6, Peru; Nara Prefectural Institute for Hygiene and Environment, Japan; National Center for Communicable Diseases, Mongolia; National Center For Epidemiology, National Influenza Center, Hungary; National Center for Laboratory and Epidemiology, Laos; National Centre for Disease Control and Public Health, Georgia; National Centre for Preventive Medicine, Moldova, Republic of; National Centre for Scientific Services for Virology and Vector Borne Diseases, Fiji; National Health Laboratory, Japan; National Health Laboratory, Myanmar; National Influenza Center CVD-Mali, Mali; National Influenza Center French Guiana and French Indies, French Guiana; National Influenza Center, Brazil; National Influenza Center, Mongolia; National Influenza Centre for Northern Greece, Greece; National Influenza Centre of Iraq, Iraq; National Influenza Lab, Tanzania, United Republic of; National Influenza Reference Laboratory, Nigeria; National Institute for Communicable Disease, South Africa; National Institute for Health and Welfare, Finland; National Institute For Medical Research, United Kingdom; National Institute For Public Health and The Environment (RIVM), Netherlands; National Institute of Health, Korea, Republic of; National Institute of Health, Pakistan; National Institute of Hygiene and Epidemiology, Vietnam; National Institute of Infectious Diseases (NIID), Japan; National Institute of Public Health of Kosova, Kosovo; National Institute of Public Health - National Institute of Hygiene, Poland; National Institute of Public Health, Czech Republic; National Institute of Virology, India; National Microbiology 
Laboratory, Health Canada, Canada; National Public Health Institute of Slovakia, Slovakia; National Public Health Laboratory, Cambodia; National Public Health Laboratory, Ministry of Health, Singapore, Singapore; National Public Health Laboratory, Nepal; National Public Health Laboratory, Singapore; National Reference Laboratory, Kazakhstan; National Referral Hospital, Solomon Islands; National University Hospital, Singapore; National Virology Laboratory, Center Microbiological Investigations, Kyrgyzstan; National Veterinary Institute, Sri Lanka; National Virus Reference Laboratory, Ireland; Naval Health Research Center, United States; NCDC Public Health Reference Laboratory, Nigeria; Nebraska Public Health Lab, United States; Nevada State Health Laboratory, United States; New Hampshire Public Health Laboratories, United States; New Jersey Department of Health and Senior Services, United States; New Mexico Department of Health, United States; New York City Department of Health, United States; New York Presbyterian Hospital Columbia University Medical Center, Microbiology Department, United States; New York State Department of Health, United States; Nicosia General Hospital, Cyprus; Niigata City Institute of Public Health and Environment, Japan; Niigata Prefectural Institute of Public Health and Environmental Sciences, Japan; Niigata University, Japan; Ningbo International Travel Healthcare Center, China; North Carolina State Laboratory of Public Health, United States; North Dakota Department of Health, United States; Norwegian Institute of Public Health, Norway; Ohio Department of Health Laboratories, United States; Oita Prefectural Institute of Health and Environment, Japan; Okayama Prefectural Institute for Environmental Science and Public Health, Japan; Okinawa Prefectural Institute of Health and Environment, Japan; Oklahoma State Department of Health, United States; Ontario Agency for Health Protection and Promotion (OAHPP), Canada; Oregon Public Health
Laboratory, United States; Osaka City Institute of Public Health and Environmental Sciences, Japan; Osaka Prefectural Institute of Public Health, Japan; Oslo University Hospital, Ulleval Hospital, Dept. of Microbiology, Norway; Ostfold Hospital - Fredrikstad, Dept. of Microbiology, Norway; Oswaldo Cruz Institute - FIOCRUZ - Laboratory of Respiratory Viruses and Measles (LVRS), Brazil; Papua New Guinea Institute of Medical Research, Papua New Guinea; Pasteur Institut of Côte D'ivoire, Côte D'ivoire; Pasteur Institute of Ho Chi Minh City, Vietnam; Pasteur Institute, Influenza Laboratory, Vietnam; Pathwest QE II Medical Centre, Australia; Pennsylvania Department of Health, United States; Prince of Wales Hospital, Australia; Princess Margaret Hospital for Children, Australia; Provincial Laboratory For Public Health For Northern Alberta, Canada; Provincial Laboratory for Public Health, Alberta Health Services, Canada; Provincial Laboratory of Public Health For Southern Alberta, Canada; Public Health Agency of Sweden, Sweden; Public Health Laboratory Services Branch, Centre for Health Protection, Hong Kong; Public Health Laboratory, Barbados; Public Health Laboratory, Virology Unit, Kuwait; Public Health Ontario, Canada; Public Health Wales Microbiology, United Kingdom; Puerto Rico Department of Health, Puerto Rico; Queensland Health Forensic and Scientific Services, Australia; Queensland Health Scientific Services, Australia; Rafic Hariri University Hospital, Lebanon; Refik Saydam National Public Health Agency, Turkey; Regent Seven Seas Cruises, United States; Republic Institute For Health Protection, Macedonia; Republic of Nauru Hospital, Nauru; Republican Anti Plague Station, Azerbaijan, Republic of; Research Institute for Environmental Sciences and Public Health of Iwate Prefecture, Japan; Research Institute of Health Sciences (IRSS), Burkina Faso; Research Institute of Tropical Medicine, Philippines; Rhode Island Department of Health, United States; 
Robert-Koch-Institute, Germany; Roy Romanow Provincial Laboratory, Canada; Royal Centre For Disease Control, Bhutan; Royal Children’s Hospital, Australia; Royal Darwin Hospital, Australia; Royal Hobart Hospital, Australia; Royal Melbourne Hospital, Australia; Russian Academy of Medical Sciences, Russian Federation; Rwanda Biomedical Center, National Reference Laboratory, Rwanda; Saga Prefectural Institute of Public Health and Pharmaceutical Research, Japan; Sagamihara City Laboratory of Public Health, Japan; Saitama City Institute of Health Science and Research, Japan; Saitama Institute of Public Health, Japan; Saitama Medical University, Japan; Sakai City Institute of Public Health, Japan; San Antonio Metropolitan Health, United States; Sandringham, National Institute for Communicable Disease, South Africa; Sapporo City Institute of Public Health, Japan; Sciensano, Scientific Institute of Public Health, Belgium; Scientific Institute of Public Health, Belgium; Seattle and King County Public Health Lab, United States; Sendai City Institute of Public Health, Japan; Servicio de Microbiología Complejo Hospitalario de Navarra, Spain; Servicio de Microbiología Hospital Donostia, Spain; Servicio de Microbiología Hospital Meixoeiro, Spain; Servicio de Microbiología Hospital Miguel Servet, Spain; Servicio de Microbiología Hospital Ramón y Cajal, Spain; Servicio de Microbiología Hospital San Pedro de Alcántara, Spain; Servicio de Microbiología Hospital Universitario de Gran Canaria Doctor Negrín, Spain; Servicio de Microbiología Hospital Universitario Son Espases, Spain; Servicio de Microbiología Hospital Virgen de las Nieves, Spain; Servicio de Virosis Respiratorias INEI-ANLIS Carlos G. 
Malbran, Argentina; Seychelles Public Health Laboratory, Seychelles; Sheikh Khalifa Medical City, United Arab Emirates; Shanghai International Travel Healthcare Center, China; Shiga Prefectural Institute of Public Health, Japan; Shimane Prefectural Institute of Public Health and Environmental Science, Japan; Shizuoka City Institute of Environmental Sciences and Public Health, Japan; Shizuoka Institute of Environment and Hygiene, Japan; Singapore General Hospital, Singapore; Sorlandet Sykehus HF, Dept. of Medical Microbiology, Norway; South Carolina Department of Health, United States; South Dakota Public Health Lab, United States; Southern Nevada Public Health Lab, United States; Spokane Regional Health District, United States; St. Jude’s Children’s Research Hospital, United States; St. Olavs Hospital HF, Dept. of Medical Microbiology, Norway; State Agency, Infectology Center of Latvia, Latvia; State of Hawaii Department of Health, United States; State of Idaho Bureau of Laboratories, United States; State Research Center of Virology and Biotechnology Vector, Russian Federation; Statens Serum Institute, Denmark; Stavanger Universitetssykehus, Avd. for Medisinsk Mikrobiologi, Norway; Supreme Health Council, Qatar; Swedish Institute for Infectious Disease Control, Sweden; Taiwan CDC, Taiwan; Tan Tock Seng Hospital, Singapore; Tarrant County Public Health, Texas, United States; Tehran University of Medical Sciences, Iran; Tennessee Department of Health Laboratory-Nashville, United States; Texas Children’s Hospital, United States; Texas Department of State Health Services, United States; Texas Department of State Health Services, South Texas Laboratory, United States; Thai National Influenza Center, Thailand; Thailand MOPH-U.S. 
CDC Collaboration (IEIP), Thailand; The Nebraska Medical Center, United States; The NIAID Influenza Genome Sequencing Consortium, United States; Thüringer Landesamt für Verbraucherschutz, Germany; Tochigi Prefectural Institute of Public Health and Environmental Science, Japan; Tokushima Prefectural Centre for Public Health and Environmental Sciences, Japan; Tokyo Metropolitan Institute of Public Health, Japan; Tottori Prefectural Institute of Public Health and Environmental Science, Japan; Toyama Institute of Health, Japan; U.S. Air Force School of Aerospace Medicine, United States; U.S. AMC AFRIMS Department of Virology, Thailand; U.S. Army Medical Research Unit, Kenya (USAMRU-K), Geis Human Influenza Program, Kenya; Uganda Virus Research Institute (UVRI), National Influenza Center, Uganda; Unilabs Laboratoriemedicin Stockholm Solna, Sweden; Unilabs Laboratoriemedicin Vastra Gotaland, Skaraborgs Sjukhus Skovde, Sweden; Universidad de Valladolid, Spain; Universitetssykehuset Nord-Norge HF, Norway; University Malaya, Malaysia; University of Genoa, Italy; University of Ghana, Ghana; University of Michigan SPH EPID, United States; University of The West Indies, Jamaica; University of Vienna, Austria; University of Virginia, Medical Labs/Microbiology, United States; University Teaching Hospital, Zambia; Uoc Policlinico Di Bari Dimo, Italy; UPMC-CLB Dept of Microbiology, United States; Utah Department of Health, United States; Utah Public Health Laboratory, United States; Utsunomiya City Institute of Public Health and Environment Science, Japan; VACSERA, Egypt; Vanderbilt University Medical Center, United States; Vefa Center, Tajikistan; Vermont Department of Health Laboratory, United States; Victorian Infectious Diseases Reference Laboratory, Australia; Virginia Division of Consolidated Laboratories, United States; Virus Research Center, Sendai Medical Center, Japan; Wakayama City Institute of Public Health, Japan; Wakayama Prefectural Research Center of Environment 
and Public Health, Japan; Walter Reed Army Institute of Research, United States; Washington State Public Health Laboratory, United States; West Virginia Office of Laboratory Services, United States; Westchester County Department of Laboratories and Research, United States; Westmead Hospital, Australia; National Influenza Centre Russian Federation, Russian Federation; WHO National Influenza Centre, National Institute of Medical Research (NIMR), Thailand; WHO National Influenza Centre, Norway; Wisconsin State Laboratory of Hygiene, United States; Wyoming Public Health Laboratory, United States; Yamagata Prefectural Institute of Public Health, Japan; Yamaguchi Prefectural Institute of Public Health and Environment, Japan; Yamanashi Institute for Public Health, Japan; Yap State Hospital, Micronesia; Yokohama City Institute of Health, Japan; Yokosuka Institute of Public Health, Japan

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Contributor Information

Amanda C Perofsky, Email: amanda.perofsky@nih.gov.

Talía Malagón, McGill University, Canada.

Diane M Harper, University of Michigan—Ann Arbor, United States.

Funding Information

This paper was supported by the following grants:

  • Fogarty International Center to Amanda C Perofsky, Chelsea L Hansen, Cécile Viboud.

  • National Science Foundation 1354890 to Amanda C Perofsky.

  • National Institutes of Health F31 AI140714 to John Huddleston.

  • National Institutes of Health R01 AI165821 to John Huddleston.

  • Cancer Research UK FC001030 to Nicola Lewis, Lynne Whittaker, Burcu Ermetal, Ruth Harvey, Monica Galiano, Rodney Stuart Daniels, John W McCauley.

  • Medical Research Council FC001030 to Nicola Lewis, Lynne Whittaker, Burcu Ermetal, Ruth Harvey, Monica Galiano, Rodney Stuart Daniels, John W McCauley.

  • Wellcome Trust FC001030 to Nicola Lewis, Lynne Whittaker, Burcu Ermetal, Ruth Harvey, Monica Galiano, Rodney Stuart Daniels, John W McCauley.

  • Ministry of Health, Labour and Welfare 10110400 to Seiichiro Fujisaki, Kazuya Nakamura, Noriko Kishida, Shinji Watanabe, Hideki Hasegawa.

  • Ministry of Health, Labour and Welfare 10111800 to Seiichiro Fujisaki, Kazuya Nakamura, Noriko Kishida, Shinji Watanabe, Hideki Hasegawa.

  • Japan Agency for Medical Research and Development JP22fk0108118 to Shinji Watanabe.

  • Japan Agency for Medical Research and Development JP23fk0108662 to Shinji Watanabe.

  • Department of Health and Aged Care, Australian Government to Sheena G Sullivan, Ian G Barr, Kanta Subbarao.

  • Department of Health, Government of Western Australia to Sheena G Sullivan, Ian G Barr, Kanta Subbarao.

  • National Institutes of Health HHSN272201400008C to Florian Krammer.

  • National Institutes of Health 75N93021C00014 to Florian Krammer.

  • National Institutes of Health 75N93019C00051 to Florian Krammer.

  • National Institutes of Health R35 GM119774 to Trevor Bedford.

  • National Institutes of Health R01 AI127893 to Trevor Bedford.

  • Howard Hughes Medical Institute to Trevor Bedford.

Additional information

Competing interests

No competing interests declared.

Received personal fees from Sanofi outside the submitted work.

Received consulting fees, honoraria, and travel support from Sanofi Pasteur and Seqirus.

The WHO Collaborating Centre for Reference and Research on Influenza in Melbourne has a collaborative research and development agreement (CRADA) with CSL Seqirus for isolation of candidate vaccine viruses in cells and an agreement with IFPMA for isolation of candidate vaccine viruses in eggs. SGS reports honoraria from CSL Seqirus, Moderna, Pfizer, and Evo Health.

The WHO Collaborating Centre for Reference and Research on Influenza in Melbourne has a collaborative research and development agreement (CRADA) with CSL Seqirus for isolation of candidate vaccine viruses in cells and an agreement with IFPMA for isolation of candidate vaccine viruses in eggs.

The Icahn School of Medicine at Mount Sinai has filed patent applications relating to influenza virus vaccines (U.S. patent numbers: 12030928, 11865173, 11266734, 11254733, 10736956, 10583188, 10137189, 10131695, 9968670, 9371366; publication numbers: 20230181715, 20220403358, 20220249652, 20220242935, 20220153873, 20210260179, 20190125859, 20190106461, 20180333479), SARS-CoV-2 serological assays (publication number: 20240210415), and SARS-CoV-2 vaccines (publication numbers: 20230310583, 20230226171), which list FK as co-inventor. FK has consulted for Merck and Pfizer (before 2020), and is currently consulting for Pfizer, Seqirus, 3rd Rock Ventures, GSK and Avimex. The Krammer laboratory is also collaborating with Pfizer on animal models of SARS‐CoV‐2 and with Dynavax on universal influenza virus vaccines.

Received honoraria for serving as an Editor in Chief of the journal Epidemics (Elsevier).

Author contributions

Conceptualization, Data curation, Software, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – review and editing.

Resources, Data curation, Software, Formal analysis, Investigation, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Investigation, Methodology, Writing – review and editing.

Resources, Data curation, Funding acquisition, Investigation, Writing – review and editing.

Conceptualization, Resources, Software, Supervision, Funding acquisition, Methodology, Project administration.

Conceptualization, Resources, Software, Supervision, Funding acquisition, Methodology, Project administration, Writing – review and editing.

Ethics

The human surveillance data and viral sequence data used in this study are anonymous and were openly available to the public prior to the initiation of this study. Therefore, this research does not constitute human subjects research. Influenza syndromic and virologic surveillance data can be obtained from the U.S. Centers for Disease Control and Prevention (CDC) FluView Interactive dashboard (https://www.cdc.gov/flu/weekly/fluviewinteractive.htm). Influenza viral sequence data can be obtained from the Global Initiative on Sharing All Influenza Data (GISAID) database (https://gisaid.org/). The GISAID Initiative ensures that open access to data in GISAID is provided free-of-charge to all individuals that agreed to identify themselves and agreed to uphold the GISAID sharing mechanism governed through its Database Access Agreement. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines for cross-sectional studies.

Additional files

Supplementary file 1. GISAID accessions and metadata for influenza H3 and N2 sequences, including originating labs and submitting labs.
elife-91849-supp1.xlsx (1.2MB, xlsx)
MDAR checklist

Data availability

Sequence data are available from GISAID using accession IDs provided in Supplementary file 1. Source code for phylogenetic analyses, inferred HI titers from serological measurements, and evolutionary fitness measurements are available in the GitHub repository https://github.com/blab/perofsky-ili-antigenicity (copy archived at Huddleston, 2024). The five replicate trees for HA and NA can be found at https://nextstrain.org/groups/blab/ under the keyword "perofsky-ili-antigenicity". Epidemiological data, datasets combining seasonal evolutionary fitness measurements and epidemic metrics, and source code for calculating epidemic metrics and performing statistical analyses are available at https://doi.org/10.5281/zenodo.11188848 and https://github.com/aperofsky/H3N2_Antigenic_Epi (copy archived at Perofsky, 2024). Raw serological measurements are restricted from public distribution by previous data sharing agreements.

The following dataset was generated:

Perofsky A. 2024. aperofsky/H3N2_Antigenic_Epi: Initial release (v1.0.0). Zenodo.

References

  1. Ali ST, Lau YC, Shan S, Ryu S, Du Z, Wang L, Xu XK, Chen D, Xiong J, Tae J, Tsang TK, Wu P, Lau EHY, Cowling BJ. Prediction of upcoming global infection burden of influenza seasons after relaxation of public health and social measures during the COVID-19 pandemic: a modelling study. The Lancet. Global Health. 2022;10:e1612–e1622. doi: 10.1016/S2214-109X(22)00358-8.
  2. Altman MO, Bennink JR, Yewdell JW, Herrin BR. Lamprey VLRB response to influenza virus supports universal rules of immunogenicity and antigenicity. eLife. 2015;4:e07467. doi: 10.7554/eLife.07467.
  3. Altmann A, Toloşi L, Sander O, Lengauer T. Permutation importance: a corrected feature importance measure. Bioinformatics. 2010;26:1340–1347. doi: 10.1093/bioinformatics/btq134.
  4. Axelsen JB, Yaari R, Grenfell BT, Stone L. Multiannual forecasting of seasonal influenza dynamics reveals climatic and evolutionary drivers. PNAS. 2014;111:9538–9542. doi: 10.1073/pnas.1321656111.
  5. Baker RE, Park SW, Yang W, Vecchi GA, Metcalf CJE, Grenfell BT. The impact of COVID-19 nonpharmaceutical interventions on the future dynamics of endemic infections. PNAS. 2020;117:30547–30553. doi: 10.1073/pnas.2013182117.
  6. Bedford T, Cobey S, Beerli P, Pascual M. Global migration dynamics underlie evolution and persistence of human influenza A (H3N2). PLOS Pathogens. 2010;6:e1000918. doi: 10.1371/journal.ppat.1000918.
  7. Bedford T, Cobey S, Pascual M. Strength and tempo of selection revealed in viral gene genealogies. BMC Evolutionary Biology. 2011;11:220. doi: 10.1186/1471-2148-11-220.
  8. Bedford T, Rambaut A, Pascual M. Canalization of the evolutionary trajectory of the human influenza virus. BMC Biology. 2012;10:38. doi: 10.1186/1741-7007-10-38.
  9. Bedford T, Suchard MA, Lemey P, Dudas G, Gregory V, Hay AJ, McCauley JW, Russell CA, Smith DJ, Rambaut A. Integrating influenza antigenic dynamics with molecular evolution. eLife. 2014;3:e01914. doi: 10.7554/eLife.01914.
  10. Bedford T, Riley S, Barr IG, Broor S, Chadha M, Cox NJ, Daniels RS, Gunasekaran CP, Hurt AC, Kelso A, Klimov A, Lewis NS, Li X, McCauley JW, Odagiri T, Potdar V, Rambaut A, Shu Y, Skepner E, Smith DJ, Suchard MA, Tashiro M, Wang D, Xu X, Lemey P, Russell CA. Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature. 2015;523:217–220. doi: 10.1038/nature14460.
  11. Belongia EA, Kieke BA, Donahue JG, Coleman LA, Irving SA, Meece JK, Vandermause M, Lindstrom S, Gargiullo P, Shay DK. Influenza vaccine effectiveness in Wisconsin during the 2007-08 season: comparison of interim and final results. Vaccine. 2011;29:6558–6563. doi: 10.1016/j.vaccine.2011.07.002.
  12. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B. 1995;57:289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x.
  13. Bhatt S, Holmes EC, Pybus OG. The genomic rate of molecular adaptation of the human influenza A virus. Molecular Biology and Evolution. 2011;28:2443–2451. doi: 10.1093/molbev/msr044.
  14. Bhatt S, Ferguson N, Flaxman S, Gandy A, Mishra S, Scott JA. Semi-mechanistic Bayesian modelling of COVID-19 with renewal processes. Journal of the Royal Statistical Society Series A. 2023;186:601–615. doi: 10.1093/jrsssa/qnad030.
  15. Biggerstaff M, Cauchemez S, Reed C, Gambhir M, Finelli L. Estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. BMC Infectious Diseases. 2014a;14:480. doi: 10.1186/1471-2334-14-480.
  16. Biggerstaff M, Jhung MA, Reed C, Fry AM, Balluz L, Finelli L. Influenza-like illness, the time to seek healthcare, and influenza antiviral receipt during the 2010-2011 influenza season-United States. The Journal of Infectious Diseases. 2014b;210:535–544. doi: 10.1093/infdis/jiu224.
  17. Boni MF, Gog JR, Andreasen V, Christiansen FB. Influenza drift and epidemic size: the race between generating and escaping immunity. Theoretical Population Biology. 2004;65:179–191. doi: 10.1016/j.tpb.2003.10.002.
  18. Brett IC, Johansson BE. Immunization against influenza A virus: comparison of conventional inactivated, live-attenuated and recombinant baculovirus produced purified hemagglutinin and neuraminidase vaccines in a murine model system. Virology. 2005;339:273–280. doi: 10.1016/j.virol.2005.06.006.
  19. Bridges CB, Thompson WW, Meltzer MI, Reeve GR, Talamonti WJ, Cox NJ, Lilac HA, Hall H, Klimov A, Fukuda K. Effectiveness and cost-benefit of influenza vaccination of healthy working adults: A randomized controlled trial. JAMA. 2000;284:1655–1663. doi: 10.1001/jama.284.13.1655.
  20. Bush RM, Bender CA, Subbarao K, Cox NJ, Fitch WM. Predicting the evolution of human influenza A. Science. 1999;286:1921–1925. doi: 10.1126/science.286.5446.1921.
  21. Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker MA, Guo J, Li P, Riddell A. Stan: a probabilistic programming language. Journal of Statistical Software. 2017;76:1–32. doi: 10.18637/jss.v076.i01.
  22. Castilla J, Navascués A, Fernández-Alonso M, Reina G, Pozo F, Casado I, Guevara M, Martínez-Baz I, Barricarte A, Ezpeleta C, Primary Health Care Sentinel Network, Network for Influenza Surveillance in Hospitals of Navarra. Effectiveness of subunit influenza vaccination in the 2014-2015 season and residual effect of split vaccination in previous seasons. Vaccine. 2016;34:1350–1357. doi: 10.1016/j.vaccine.2016.01.054.
  23. Centers for Disease Control and Prevention (CDC). Assessment of the effectiveness of the 2003-04 influenza vaccine among children and adults—Colorado, 2003. MMWR. Morbidity and Mortality Weekly Report. 2004;53:707–710.
  24. Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases. Flu Vaccination Coverage, United States, 2018–19 Influenza Season. 2019. [March 20, 2023]. https://www.cdc.gov/flu/fluvaxview/coverage-1819estimates.htm
  25. Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases. FluView Interactive. 2023a. [October 20, 2023]. https://www.cdc.gov/flu/weekly/fluviewinteractive.htm
  26. Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases. U.S. influenza surveillance: purpose and methods. 2023b. [April 24, 2022]. https://www.cdc.gov/flu/weekly/overview.htm
  27. Chao A, Gotelli NJ, Hsieh TC, Sander EL, Ma KH, Colwell RK, Ellison AM. Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecological Monographs. 2014;84:45–67. doi: 10.1890/13-0133.1.
  28. Charu V, Zeger S, Gog J, Bjørnstad ON, Kissler S, Simonsen L, Grenfell BT, Viboud C. Human mobility and the spatial transmission of influenza in the United States. PLOS Computational Biology. 2017;13:e1005382. doi: 10.1371/journal.pcbi.1005382.
  29. Chattopadhyay I, Kiciman E, Elliott JW, Shaman JL, Rzhetsky A. Conjunction of factors triggering waves of seasonal influenza. eLife. 2018;7:e30756. doi: 10.7554/eLife.30756.
  30. Chen YQ, Wohlbold TJ, Zheng NY, Huang M, Huang Y, Neu KE, Lee J, Wan H, Rojas KT, Kirkpatrick E, Henry C, Palm AKE, Stamper CT, Lan LYL, Topham DJ, Treanor J, Wrammert J, Ahmed R, Eichelberger MC, Georgiou G, Krammer F, Wilson PC. Influenza infection in humans induces broadly cross-reactive and protective neuraminidase-reactive antibodies. Cell. 2018;173:417–429. doi: 10.1016/j.cell.2018.03.030.
  31. Cobey S, Hensley SE. Immune history and influenza virus susceptibility. Current Opinion in Virology. 2017;22:105–111. doi: 10.1016/j.coviro.2016.12.004.
  32. Cori A, Ferguson NM, Fraser C, Cauchemez S. A new framework and software to estimate time-varying reproduction numbers during epidemics. American Journal of Epidemiology. 2013;178:1505–1512. doi: 10.1093/aje/kwt133.
  33. Couch RB, Kasel JA, Gerin JL, Schulman JL, Kilbourne ED. Induction of partial immunity to influenza by a neuraminidase-specific vaccine. The Journal of Infectious Diseases. 1974;129:411–420. doi: 10.1093/infdis/129.4.411.
  34. Couch RB, Atmar RL, Franco LM, Quarles JM, Wells J, Arden N, Niño D, Belmont JW. Antibody correlates and predictors of immunity to naturally occurring influenza in humans and the importance of antibody to the neuraminidase. The Journal of Infectious Diseases. 2013;207:974–981. doi: 10.1093/infdis/jis935.
  35. Cowling BJ, Fang VJ, Riley S, Malik Peiris JS, Leung GM. Estimation of the serial interval of influenza. Epidemiology. 2009;20:344–347. doi: 10.1097/EDE.0b013e31819d1092.
  36. Cowling BJ, Ng S, Ma ESK, Cheng CKY, Wai W, Fang VJ, Chan K-H, Ip DKM, Chiu SS, Peiris JSM, Leung GM. Protective efficacy of seasonal influenza vaccination against seasonal and pandemic influenza virus infection during 2009 in Hong Kong. Clinical Infectious Diseases. 2010;51:1370–1379. doi: 10.1086/657311.
  37. Cowling BJ, Perera RAPM, Fang VJ, Chan K-H, Wai W, So HC, Chu DKW, Wong JY, Shiu EY, Ng S, Ip DKM, Peiris JSM, Leung GM. Incidence of influenza virus infections in children in Hong Kong in a 3-year randomized placebo-controlled vaccine study, 2009-2012. Clinical Infectious Diseases. 2014;59:517–524. doi: 10.1093/cid/ciu356.
  38. Cowling BJ, Ali ST, Ng TWY, Tsang TK, Li JCM, Fong MW, Liao Q, Kwan MY, Lee SL, Chiu SS, Wu JT, Wu P, Leung GM. Impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in Hong Kong: an observational study. The Lancet. Public Health. 2020;5:e279–e288. doi: 10.1016/S2468-2667(20)30090-6.
  39. Dalziel BD, Kissler S, Gog JR, Viboud C, Bjørnstad ON, Metcalf CJE, Grenfell BT. Urbanization and humidity shape the intensity of influenza epidemics in U.S. cities. Science. 2018;362:75–79. doi: 10.1126/science.aat6030.
  40. Debeer D, Strobl C. Conditional permutation importance revisited. BMC Bioinformatics. 2020;21:307. doi: 10.1186/s12859-020-03622-2.
  41. Dhanasekaran V, Sullivan S, Edwards KM, Xie R, Khvorov A, Valkenburg SA, Cowling BJ, Barr IG. Human seasonal influenza under COVID-19 and the potential consequences of influenza lineage elimination. Nature Communications. 2022;13:1721. doi: 10.1038/s41467-022-29402-5.
  42. Du X, King AA, Woods RJ, Pascual M. Evolution-informed forecasting of seasonal influenza A (H3N2). Science Translational Medicine. 2017;9:eaan5325. doi: 10.1126/scitranslmed.aan5325.
  43. Eichelberger MC, Morens DM, Taubenberger JK. Neuraminidase as an influenza vaccine antigen: a low hanging fruit, ready for picking to improve vaccine effectiveness. Current Opinion in Immunology. 2018;53:38–44. doi: 10.1016/j.coi.2018.03.025.
  44. Epstein SL. Prior H1N1 influenza infection and susceptibility of Cleveland Family Study participants during the H2N2 pandemic of 1957: an experiment of nature. The Journal of Infectious Diseases. 2006;193:49–53. doi: 10.1086/498980.
  45. Ferguson NM, Galvani AP, Bush RM. Ecological and immunological determinants of influenza evolution. Nature. 2003;422:428–433. doi: 10.1038/nature01509.
  46. Ferguson NM, Cummings DAT, Cauchemez S, Fraser C, Riley S, Meeyai A, Iamsirithaworn S, Burke DS. Strategies for containing an emerging influenza pandemic in Southeast Asia. Nature. 2005;437:209–214. doi: 10.1038/nature04017.
  47. Ferrari S, Cribari-Neto F. Beta regression for modelling rates and proportions. Journal of Applied Statistics. 2004;31:799–815. doi: 10.1080/0266476042000214501.
  48. Fiore AE, Shay DK, Broder K, Iskander JK, Uyeki TM, Mootrey G, Bresee JS, Cox NJ, Centers for Disease Control and Prevention. Prevention and control of seasonal influenza with vaccines: recommendations of the Advisory Committee on Immunization Practices (ACIP), 2009. MMWR. Recommendations and Reports. 2009;58:1–52.
  49. Flannery B, Zimmerman RK, Gubareva LV, Garten RJ, Chung JR, Nowalk MP, Jackson ML, Jackson LA, Monto AS, Ohmit SE, Belongia EA, McLean HQ, Gaglani M, Piedra PA, Mishin VP, Chesnokov AP, Spencer S, Thaker SN, Barnes JR, Foust A, Sessions W, Xu X, Katz J, Fry AM. Enhanced genetic characterization of influenza A(H3N2) viruses and vaccine effectiveness by genetic group, 2014-2015. The Journal of Infectious Diseases. 2016;214:1010–1019. doi: 10.1093/infdis/jiw181.
  50. Flannery B, Chung JR, Monto AS, Martin ET, Belongia EA, McLean HQ, Gaglani M, Murthy K, Zimmerman RK, Nowalk MP, Jackson ML, Jackson LA, Rolfes MA, Spencer S, Fry AM, Investigators USFV. Influenza vaccine effectiveness in the United States during the 2016-2017 season. Clinical Infectious Diseases. 2019;68:1798–1806. doi: 10.1093/cid/ciy775.
  51. Flannery B, Kondor RJG, Chung JR, Gaglani M, Reis M, Zimmerman RK, Nowalk MP, Jackson ML, Jackson LA, Monto AS, Martin ET, Belongia EA, McLean HQ, Kim SS, Blanton L, Kniss K, Budd AP, Brammer L, Stark TJ, Barnes JR, Wentworth DE, Fry AM, Patel M. Spread of antigenically drifted influenza A(H3N2) viruses and vaccine effectiveness in the United States during the 2018-2019 season. The Journal of Infectious Diseases. 2020;221:8–15. doi: 10.1093/infdis/jiz543.
  52. Fox SJ, Miller JC, Meyers LA. Seasonality in risk of pandemic influenza emergence. PLOS Computational Biology. 2017;13:e1005749. doi: 10.1371/journal.pcbi.1005749.
  53. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software. 2010;33:1–22.
  54. Gaglani M, Pruszynski J, Murthy K, Clipper L, Robertson A, Reis M, Chung JR, Piedra PA, Avadhanula V, Nowalk MP, Zimmerman RK, Jackson ML, Jackson LA, Petrie JG, Ohmit SE, Monto AS, McLean HQ, Belongia EA, Fry AM, Flannery B. Influenza vaccine effectiveness against 2009 pandemic influenza A(H1N1) virus differed by vaccine type during 2013-2014 in the United States. The Journal of Infectious Diseases. 2016;213:1546–1556. doi: 10.1093/infdis/jiv577.
  55. Garten RJ, Davis CT, Russell CA, Shu B, Lindstrom S, Balish A, Sessions WM, Xu X, Skepner E, Deyde V, Okomo-Adhiambo M, Gubareva L, Barnes J, Smith CB, Emery SL, Hillman MJ, Rivailler P, Smagala J, de Graaf M, Burke DF, Fouchier RAM, Pappas C, Alpuche-Aranda CM, López-Gatell H, Olivera H, López I, Myers CA, Faix D, Blair PJ, Yu C, Keene KM, Dotson PD, Jr, Boxrud D, Sambol AR, Abid SH, St George K, Bannerman T, Moore AL, Stringer DJ, Blevins P, Demmler-Harrison GJ, Ginsberg M, Kriner P, Waterman S, Smole S, Guevara HF, Belongia EA, Clark PA, Beatrice ST, Donis R, Katz J, Finelli L, Bridges CB, Shaw M, Jernigan DB, Uyeki TM, Smith DJ, Klimov AI, Cox NJ. Antigenic and genetic characteristics of swine-origin 2009 A(H1N1) influenza viruses circulating in humans. Science. 2009;325:197–201. doi: 10.1126/science.1176225.
  56. Gatti L, Koenen MH, Zhang JD, Anisimova M, Verhagen LM, Schutten M, Osterhaus A, van der Vries E. Cross-reactive immunity potentially drives global oscillation and opposed alternation patterns of seasonal influenza A viruses. Scientific Reports. 2022;12:8883. doi: 10.1038/s41598-022-08233-w.
  57. Geoghegan JL, Saavedra AF, Duchêne S, Sullivan S, Barr I, Holmes EC. Continental synchronicity of human influenza virus epidemics despite climatic variation. PLOS Pathogens. 2018;14:e1006780. doi: 10.1371/journal.ppat.1006780.
  58. Gerdil C. The annual production cycle for influenza vaccine. Vaccine. 2003;21:1776–1779. doi: 10.1016/s0264-410x(03)00071-9.
  59. Gill PW, Murphy AM. Naturally acquired immunity to influenza type A: A further prospective study. The Medical Journal of Australia. 1977;2:761–765. doi: 10.5694/j.1326-5377.1977.tb99276.x.
  60. Goldstein E, Cobey S, Takahashi S, Miller JC, Lipsitch M. Predicting the epidemic sizes of influenza A/H1N1, A/H3N2, and B: A statistical method. PLOS Medicine. 2011;8:e1001051. doi: 10.1371/journal.pmed.1001051.
  61. Gong LI, Suchard MA, Bloom JD. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife. 2013;2:e00631. doi: 10.7554/eLife.00631.
  62. Gostic KM, Bridge R, Brady S, Viboud C, Worobey M, Lloyd-Smith JO. Childhood immune imprinting to influenza A shapes birth year-specific risk during seasonal H1N1 and H3N2 epidemics. PLOS Pathogens. 2019;15:e1008109. doi: 10.1371/journal.ppat.1008109.
  63. Gostic KM, McGough L, Baskerville EB, Abbott S, Joshi K, Tedijanto C, Kahn R, Niehus R, Hay JA, De Salazar PM, Hellewell J, Meakin S, Munday JD, Bosse NI, Sherrat K, Thompson RN, White LF, Huisman JS, Scire J, Bonhoeffer S, Stadler T, Wallinga J, Funk S, Lipsitch M, Cobey S. Practical considerations for measuring the effective reproductive number, Rt. PLOS Computational Biology. 2020;16:e1008409. doi: 10.1371/journal.pcbi.1008409.
  64. Grebe KM, Yewdell JW, Bennink JR. Heterosubtypic immunity to influenza A virus: where do we stand? Microbes and Infection. 2008;10:1024–1029. doi: 10.1016/j.micinf.2008.07.002.
  65. Greene SK, Ionides EL, Wilson ML. Patterns of influenza-associated mortality among US elderly by geographic region and virus subtype, 1968-1998. American Journal of Epidemiology. 2006;163:316–326. doi: 10.1093/aje/kwj040. [DOI] [PubMed] [Google Scholar]
  66. Grenfell BT, Bjørnstad ON, Kappey J. Travelling waves and spatial hierarchies in measles epidemics. Nature. 2001;414:716–723. doi: 10.1038/414716a. [DOI] [PubMed] [Google Scholar]
  67. Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, Sagulenko P, Bedford T, Neher RA. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34:4121–4123. doi: 10.1093/bioinformatics/bty407. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Hansen CL, Chaves SS, Demont C, Viboud C. Mortality associated with influenza and respiratory syncytial virus in the US, 1999-2018. JAMA Network Open. 2022;5:e220527. doi: 10.1001/jamanetworkopen.2022.0527. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Hay AJ, Gregory V, Douglas AR, Lin YP. The evolution of human influenza viruses. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2001;356:1861–1870. doi: 10.1098/rstb.2001.0999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. He D, Lui R, Wang L, Tse CK, Yang L, Stone L. Global spatio-temporal patterns of influenza in the post-pandemic era. Scientific Reports. 2015;5:11013. doi: 10.1038/srep11013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Henry C, Zheng N-Y, Huang M, Cabanov A, Rojas KT, Kaur K, Andrews SF, Palm A-KE, Chen Y-Q, Li Y, Hoskova K, Utset HA, Vieira MC, Wrammert J, Ahmed R, Holden-Wiltse J, Topham DJ, Treanor JJ, Ertl HC, Schmader KE, Cobey S, Krammer F, Hensley SE, Greenberg H, He X-S, Wilson PC. Influenza virus vaccination elicits poorly adapted B cell responses in elderly individuals. Cell Host & Microbe. 2019;25:357–366. doi: 10.1016/j.chom.2019.01.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Hensley SE. Challenges of selecting seasonal influenza vaccine strains for humans with diverse pre-exposure histories. Current Opinion in Virology. 2014;8:85–89. doi: 10.1016/j.coviro.2014.07.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Hill MO. Diversity and evenness: a unifying notation and its consequences. Ecology. 1973;54:427–432. doi: 10.2307/1934352. [DOI] [Google Scholar]
  74. Hoffman MD, Gelman A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research: JMLR. 2014;15:1593–1623. [Google Scholar]
  75. Hope-Simpson RE. Hong Kong influenza variant. British Medical Journal. 1971;3:531. doi: 10.1136/bmj.3.5773.531-b. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Hothorn T, Bühlmann P, Dudoit S, Molinaro A, van der Laan MJ. Survival ensembles. Biostatistics. 2006;7:355–373. doi: 10.1093/biostatistics/kxj011. [DOI] [PubMed] [Google Scholar]
  77. Huang QS, Wood T, Jelley L, Jennings T, Jefferies S, Daniells K, Nesdale A, Dowell T, Turner N, Campbell-Stokes P, Balm M, Dobinson HC, Grant CC, James S, Aminisani N, Ralston J, Gunn W, Bocacao J, Danielewicz J, Moncrieff T, McNeill A, Lopez L, Waite B, Kiedrzynski T, Schrader H, Gray R, Cook K, Currin D, Engelbrecht C, Tapurau W, Emmerton L, Martin M, Baker MG, Taylor S, Trenholme A, Wong C, Lawrence S, McArthur C, Stanley A, Roberts S, Rahnama F, Bennett J, Mansell C, Dilcher M, Werno A, Grant J, van der Linden A, Youngblood B, Thomas PG, NPIsImpactOnFlu Consortium, Webby RJ. Impact of the COVID-19 nonpharmaceutical interventions on influenza and other respiratory viral infections in New Zealand. Nature Communications. 2021;12:1001. doi: 10.1038/s41467-021-21157-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Huddleston J, Barnes JR, Rowe T, Xu X, Kondor R, Wentworth DE, Whittaker L, Ermetal B, Daniels RS, McCauley JW, Fujisaki S, Nakamura K, Kishida N, Watanabe S, Hasegawa H, Barr I, Subbarao K, Barrat-Charlaix P, Neher RA, Bedford T. Integrating genotypes and phenotypes improves long-term forecasts of seasonal influenza A/H3N2 evolution. eLife. 2020;9:e60067. doi: 10.7554/eLife.60067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Huddleston J, Hadfield J, Sibley TR, Lee J, Fay K, Ilcisin M, Harkins E, Bedford T, Neher RA, Hodcroft EB. Augur: a bioinformatics toolkit for phylogenetic analyses of human pathogens. Journal of Open Source Software. 2021;6:2906. doi: 10.21105/joss.02906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Huddleston J. Perofsky-ili-antigenicity. swh:1:rev:b6591ed767bff9c7f9caec216ebe72e031d20042. Software Heritage. 2024. https://archive.softwareheritage.org/swh:1:dir:dacc6ef9ae3eb7dd200f5c0cc82be349488e099a;origin=https://github.com/blab/perofsky-ili-antigenicity;visit=swh:1:snp:ecc390c30ff7f27e799b29262e126dcff41d2d7e;anchor=swh:1:rev:b6591ed767bff9c7f9caec216ebe72e031d20042
  81. Jackson ML, Chung JR, Jackson LA, Phillips CH, Benoit J, Monto AS, Martin ET, Belongia EA, McLean HQ, Gaglani M, Murthy K, Zimmerman R, Nowalk MP, Fry AM, Flannery B. Influenza vaccine effectiveness in the United States during the 2015-2016 Season. The New England Journal of Medicine. 2017;377:534–543. doi: 10.1056/NEJMoa1700153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Jang SH, Kang J. Factors associated with influenza vaccination uptake among U.S. adults: focus on nativity and race/ethnicity. International Journal of Environmental Research and Public Health. 2021;18:5349. doi: 10.3390/ijerph18105349. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Janjua NZ, Skowronski DM, De Serres G, Dickinson J, Crowcroft NS, Taylor M, Winter A-L, Hottes TS, Fonseca K, Charest H, Drews SJ, Sabaiduc S, Bastien N, Li Y, Gardy JL, Petric M. Estimates of influenza vaccine effectiveness for 2007-2008 from Canada’s sentinel surveillance system: cross-protection against major and minor variants. The Journal of Infectious Diseases. 2012;205:1858–1868. doi: 10.1093/infdis/jis283. [DOI] [PubMed] [Google Scholar]
  84. Johansson BE, Grajower B, Kilbourne ED. Infection-permissive immunization with influenza virus neuraminidase prevents weight loss in infected mice. Vaccine. 1993;11:1037–1039. doi: 10.1016/0264-410x(93)90130-p. [DOI] [PubMed] [Google Scholar]
  85. Johansson MA, Cummings DAT, Glass GE. Multiyear climate variability and dengue--El Niño southern oscillation, weather, and dengue incidence in Puerto Rico, Mexico, and Thailand: a longitudinal data analysis. PLOS Medicine. 2009;6:e1000168. doi: 10.1371/journal.pmed.1000168. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research. 2002;30:3059–3066. doi: 10.1093/nar/gkf436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Kawai N, Ikematsu H, Iwaki N, Satoh I, Kawashima T, Tsuchimoto T, Kashiwagi S. A prospective, Internet-based study of the effectiveness and safety of influenza vaccination in the 2001-2002 influenza season. Vaccine. 2003;21:4507–4513. doi: 10.1016/s0264-410x(03)00508-5. [DOI] [PubMed] [Google Scholar]
  88. Kilbourne ED. Comparative efficacy of neuraminidase-specific and conventional influenza virus vaccines in induction of antibody to neuraminidase in humans. The Journal of Infectious Diseases. 1976;134:384–394. doi: 10.1093/infdis/134.4.384. [DOI] [PubMed] [Google Scholar]
  89. Kilbourne ED, Johansson BE, Grajower B. Independent and disparate evolution in nature of influenza A virus hemagglutinin and neuraminidase glycoproteins. PNAS. 1990;87:786–790. doi: 10.1073/pnas.87.2.786. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Kirkpatrick E, Qiu X, Wilson PC, Bahl J, Krammer F. The influenza virus hemagglutinin head evolves faster than the stalk domain. Scientific Reports. 2018;8:10432. doi: 10.1038/s41598-018-28706-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Kissling E, Valenciano M, Larrauri A, Oroszi B, Cohen JM, Nunes B, Pitigoi D, Rizzo C, Rebolledo J, Paradowska-Stankiewicz I, Jiménez-Jorge S, Horváth JK, Daviaud I, Guiomar R, Necula G, Bella A, O’Donnell J, Głuchowska M, Ciancio BC, Nicoll A, Moren A. Low and decreasing vaccine effectiveness against influenza A(H3) in 2011/12 among vaccination target groups in Europe: results from the I-MOVE multicentre case-control study. Euro Surveillance. 2013;18:20390. doi: 10.2807/ese.18.05.20390-en. [DOI] [PubMed] [Google Scholar]
  92. Koel BF, Burke DF, Bestebroer TM, van der Vliet S, Zondag GCM, Vervaet G, Skepner E, Lewis NS, Spronken MIJ, Russell CA, Eropkin MY, Hurt AC, Barr IG, de Jong JC, Rimmelzwaan GF, Osterhaus A, Fouchier RAM, Smith DJ. Substitutions near the receptor binding site determine major antigenic change during influenza virus evolution. Science. 2013;342:976–979. doi: 10.1126/science.1244730. [DOI] [PubMed] [Google Scholar]
  93. Koelle K, Cobey S, Grenfell B, Pascual M. Epochal evolution shapes the phylodynamics of interpandemic influenza A (H3N2) in humans. Science. 2006;314:1898–1903. doi: 10.1126/science.1132745. [DOI] [PubMed] [Google Scholar]
  94. Koelle K, Kamradt M, Pascual M. Understanding the dynamics of rapidly evolving pathogens through modeling the tempo of antigenic change: influenza as a case study. Epidemics. 2009;1:129–137. doi: 10.1016/j.epidem.2009.05.003. [DOI] [PubMed] [Google Scholar]
  95. Koelle K, Rasmussen DA. The effects of a deleterious mutation load on patterns of influenza A/H3N2’s antigenic evolution in humans. eLife. 2015;4:e07361. doi: 10.7554/eLife.07361. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Kramer SC, Shaman J. Development and validation of influenza forecasting for 64 temperate and tropical countries. PLOS Computational Biology. 2019;15:e1006742. doi: 10.1371/journal.pcbi.1006742. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Krammer F, Fouchier RAM, Eichelberger MC, Webby RJ, Shaw-Saliba K, Wan H, Wilson PC, Compans RW, Skountzou I, Monto AS. NAction! How can neuraminidase-based immunity contribute to better influenza virus vaccines? mBio. 2018;9:e02332-17. doi: 10.1128/mBio.02332-17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Krammer F. The human antibody response to influenza A virus infection and vaccination. Nature Reviews. Immunology. 2019;19:383–397. doi: 10.1038/s41577-019-0143-6. [DOI] [PubMed] [Google Scholar]
  99. Krammer F. Unpublished influenza N2 epitope sites. afa8e58. GitHub. 2023. https://github.com/blab/perofsky-ili-antigenicity/blob/master/config/distance_maps/h3n2/na/krammer.json
  100. Kryazhimskiy S, Dushoff J, Bazykin GA, Plotkin JB. Prevalence of epistasis in the evolution of influenza A surface proteins. PLOS Genetics. 2011;7:e1001301. doi: 10.1371/journal.pgen.1001301. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Kuhn M. Building predictive models in R using the caret package. Journal of Statistical Software. 2008;28:1–26. doi: 10.18637/jss.v028.i05. [DOI] [Google Scholar]
  102. Kuhn M, Johnson K. Applied Predictive Modeling. Springer; 2013. [DOI] [Google Scholar]
  103. Kuhn M, Johnson K. Feature engineering and selection: a practical approach for predictive models. Chapman and Hall/CRC; 2019. [Google Scholar]
  104. Lam EKS, Morris DH, Hurt AC, Barr IG, Russell CA. The impact of climate and antigenic evolution on seasonal influenza virus epidemics in Australia. Nature Communications. 2020;11:2741. doi: 10.1038/s41467-020-16545-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Laurie KL, Guarnaccia TA, Carolan LA, Yan AWC, Aban M, Petrie S, Cao P, Heffernan JM, McVernon J, Mosse J, Kelso A, McCaw JM, Barr IG. Interval between infections and viral hierarchy are determinants of viral interference following influenza virus infection in a ferret model. The Journal of Infectious Diseases. 2015;212:1701–1710. doi: 10.1093/infdis/jiv260. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Lee EC, Arab A, Goldlust SM, Viboud C, Grenfell BT, Bansal S. Deploying digital health data to optimize influenza surveillance at national and local scales. PLOS Computational Biology. 2018;14:e1006020. doi: 10.1371/journal.pcbi.1006020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Lee JM, Eguia R, Zost SJ, Choudhary S, Wilson PC, Bedford T, Stevens-Ayers T, Boeckh M, Hurt AC, Lakdawala SS, Hensley SE, Bloom JD. Mapping person-to-person variation in viral mutations that escape polyclonal serum targeting influenza hemagglutinin. eLife. 2019;8:e49324. doi: 10.7554/eLife.49324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Lessler J, Reich NG, Brookmeyer R, Perl TM, Nelson KE, Cummings DAT. Incubation periods of acute respiratory viral infections: a systematic review. The Lancet. Infectious Diseases. 2009;9:291–300. doi: 10.1016/S1473-3099(09)70069-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Lester RT, McGeer A, Tomlinson G, Detsky AS. Use of, effectiveness of, and attitudes regarding influenza vaccine among house staff. Infection Control & Hospital Epidemiology. 2003;24:839–844. doi: 10.1086/502146. [DOI] [PubMed] [Google Scholar]
  110. Li Y, Myers JL, Bostick DL, Sullivan CB, Madara J, Linderman SL, Liu Q, Carter DM, Wrammert J, Esposito S, Principi N, Plotkin JB, Ross TM, Ahmed R, Wilson PC, Hensley SE. Immune history shapes specificity of pandemic H1N1 influenza antibody responses. The Journal of Experimental Medicine. 2013;210:1493–1500. doi: 10.1084/jem.20130212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Liebhold A, Koenig WD, Bjørnstad ON. Spatial synchrony in population dynamics. Annual Review of Ecology, Evolution, and Systematics. 2004;35:467–490. doi: 10.1146/annurev.ecolsys.34.011802.132516. [DOI] [Google Scholar]
  112. Lu PJ, Singleton JA, Euler GL, Williams WW, Bridges CB. Seasonal influenza vaccination coverage among adult populations in the United States, 2005-2011. American Journal of Epidemiology. 2013;178:1478–1487. doi: 10.1093/aje/kwt158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  113. Lu PJ, Hung MC, O’Halloran AC, Ding H, Srivastav A, Williams WW, Singleton JA. Seasonal influenza vaccination coverage trends among adult populations, U.S., 2010-2016. American Journal of Preventive Medicine. 2019;57:458–469. doi: 10.1016/j.amepre.2019.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Luksza M, Lässig M. A predictive fitness model for influenza. Nature. 2014;507:57–61. doi: 10.1038/nature13087. [DOI] [PubMed] [Google Scholar]
  115. Margine I, Hai R, Albrecht RA, Obermoser G, Harrod AC, Banchereau J, Palucka K, García-Sastre A, Palese P, Treanor JJ, Krammer F. H3N2 influenza virus infection induces broadly reactive hemagglutinin stalk antibodies in humans and mice. Journal of Virology. 2013;87:4728–4737. doi: 10.1128/JVI.03509-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. McLean HQ, Thompson MG, Sundaram ME, Meece JK, McClure DL, Friedrich TC, Belongia EA. Impact of repeated vaccination on vaccine effectiveness against influenza A(H3N2) and B during 8 seasons. Clinical Infectious Diseases. 2014;59:1375–1385. doi: 10.1093/cid/ciu680. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Memoli MJ, Shaw PA, Han A, Czajkowski L, Reed S, Athota R, Bristol T, Fargis S, Risos K, Powers JH, Davey RT, Taubenberger JK. Evaluation of antihemagglutinin and antineuraminidase antibodies as correlates of protection in an influenza A/H1N1 virus healthy human challenge model. mBio. 2016;7:e00417. doi: 10.1128/mBio.00417-16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Miller MS, Gardner TJ, Krammer F, Aguado LC, Tortorella D, Basler CF, Palese P. Neutralizing antibodies against previously encountered influenza virus strains increase over time: A longitudinal analysis. Science Translational Medicine. 2013;5:198ra107. doi: 10.1126/scitranslmed.3006637. [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Monto AS, Petrie JG, Cross RT, Johnson E, Liu M, Zhong W, Levine M, Katz JM, Ohmit SE. Antibody to influenza virus neuraminidase: An independent correlate of protection. The Journal of Infectious Diseases. 2015;212:1191–1199. doi: 10.1093/infdis/jiv195. [DOI] [PubMed] [Google Scholar]
  120. Muggeo VMR. Estimating regression models with unknown break-points. Statistics in Medicine. 2003;22:3055–3071. doi: 10.1002/sim.1545. [DOI] [PubMed] [Google Scholar]
  121. Muggeo V. Segmented: an R package to fit regression models with broken-line relationships. R News. 2008;8:20–25. [Google Scholar]
  122. Murphy BR, Kasel JA, Chanock RM. Association of serum anti-neuraminidase antibody with resistance to influenza in man. The New England Journal of Medicine. 1972;286:1329–1332. doi: 10.1056/NEJM197206222862502. [DOI] [PubMed] [Google Scholar]
  123. Myers JL, Wetzel KS, Linderman SL, Li Y, Sullivan CB, Hensley SE. Compensatory hemagglutinin mutations alter antigenic properties of influenza viruses. Journal of Virology. 2013;87:11168–11172. doi: 10.1128/JVI.01414-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Nachbagauer R, Choi A, Izikson R, Cox MM, Palese P, Krammer F. Age dependence and isotype specificity of influenza virus hemagglutinin stalk-reactive antibodies in humans. mBio. 2016;7:e01996. doi: 10.1128/mBio.01996-15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. National Health Interview Survey. TABLE: Self-reported influenza vaccination coverage trends 1989-2008 among adults by age group, risk group, race/ethnicity, health-care worker status, and pregnancy status. 2008. [April 24, 2008]. https://www.cdc.gov/flu/pdf/fluvaxview/nhis89_08fluvaxtrendtab.pdf
  126. Ndifon W, Dushoff J, Levin SA. On the use of hemagglutination-inhibition for influenza surveillance: surveillance data are predictive of influenza vaccine effectiveness. Vaccine. 2009;27:2447–2452. doi: 10.1016/j.vaccine.2009.02.047. [DOI] [PubMed] [Google Scholar]
  127. Neher RA, Russell CA, Shraiman BI. Predicting evolution from the shape of genealogical trees. eLife. 2014;3:e03568. doi: 10.7554/eLife.03568. [DOI] [PMC free article] [PubMed] [Google Scholar]
  128. Neher RA, Bedford T, Daniels RS, Russell CA, Shraiman BI. Prediction, dynamics, and visualization of antigenic phenotypes of seasonal influenza viruses. PNAS. 2016;113:E1701–E1709. doi: 10.1073/pnas.1525578113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Nelson MI, Holmes EC. The evolution of epidemic influenza. Nature Reviews. Genetics. 2007;8:196–205. doi: 10.1038/nrg2053. [DOI] [PubMed] [Google Scholar]
  130. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular Biology and Evolution. 2015;32:268–274. doi: 10.1093/molbev/msu300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  131. Ohmit SE, Thompson MG, Petrie JG, Thaker SN, Jackson ML, Belongia EA, Zimmerman RK, Gaglani M, Lamerato L, Spencer SM, Jackson L, Meece JK, Nowalk MP, Song J, Zervos M, Cheng P-Y, Rinaldo CR, Clipper L, Shay DK, Piedra P, Monto AS. Influenza vaccine effectiveness in the 2011-2012 season: protection against each circulating virus and the effect of prior vaccination on estimates. Clinical Infectious Diseases. 2014;58:319–327. doi: 10.1093/cid/cit736. [DOI] [PMC free article] [PubMed] [Google Scholar]
  132. Olsen SJ, Azziz-Baumgartner E, Budd AP, Brammer L, Sullivan S, Pineda RF, Cohen C, Fry AM. Decreased influenza activity during the COVID-19 pandemic-United States. American Journal of Transplantation. 2020;20:3681–3685. doi: 10.1111/ajt.16381. [DOI] [PMC free article] [PubMed] [Google Scholar]
  133. Olsen SJ, Winn AK, Budd AP, Prill MM, Steel J, Midgley CM, Kniss K, Burns E, Rowe T, Foust A, Jasso G, Merced-Morales A, Davis CT, Jang Y, Jones J, Daly P, Gubareva L, Barnes J, Kondor R, Sessions W, Smith C, Wentworth DE, Garg S, Havers FP, Fry AM, Hall AJ, Brammer L, Silk BJ. Changes in influenza and other respiratory virus activity during the COVID-19 pandemic - United States, 2020-2021. MMWR. Morbidity and Mortality Weekly Report. 2021;70:1013–1019. doi: 10.15585/mmwr.mm7029a1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  134. Pebody R, Warburton F, Ellis J, Andrews N, Potts A, Cottrell S, Reynolds A, Gunson R, Thompson C, Galiano M, Robertson C, Gallagher N, Sinnathamby M, Yonova I, Correa A, Moore C, Sartaj M, de Lusignan S, McMenamin J, Zambon M. End-of-season influenza vaccine effectiveness in adults and children, United Kingdom, 2016/17. Euro Surveillance. 2017;22:17-00306. doi: 10.2807/1560-7917.ES.2017.22.44.17-00306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  135. Pei S, Kandula S, Yang W, Shaman J. Forecasting the spatial transmission of influenza in the United States. PNAS. 2018;115:2752–2757. doi: 10.1073/pnas.1708856115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  136. Perofsky A. H3N2 antigenic epi. swh:1:rev:903dd3f1c669ef4c1687b15230d6c79ca9740a98. Software Heritage. 2024. https://archive.softwareheritage.org/swh:1:dir:74c0de0f04e1a8fda9b100ff33216847f825f7e2;origin=https://github.com/aperofsky/H3N2_Antigenic_Epi;visit=swh:1:snp:b005c94d8e2ef5db3a29a79b512788634fbda7c0;anchor=swh:1:rev:903dd3f1c669ef4c1687b15230d6c79ca9740a98
  137. Petrova VN, Russell CA. The evolution of seasonal influenza viruses. Nature Reviews. Microbiology. 2018;16:47–60. doi: 10.1038/nrmicro.2017.118. [DOI] [PubMed] [Google Scholar]
  138. Public Health Agency of Canada. Effectiveness of vaccine against medical consultation due to laboratory-confirmed influenza: results from a sentinel physician pilot project in British Columbia, 2004-2005. Canada Communicable Disease Report. 2005;31:181–191. [PubMed] [Google Scholar]
  139. Qi Y, Shaman J, Pei S. Quantifying the impact of COVID-19 nonpharmaceutical interventions on influenza transmission in the United States. The Journal of Infectious Diseases. 2021;224:1500–1508. doi: 10.1093/infdis/jiab485. [DOI] [PMC free article] [PubMed] [Google Scholar]
  140. Ranjeva S, Subramanian R, Fang VJ, Leung GM, Ip DKM, Perera RAPM, Peiris JSM, Cowling BJ, Cobey S. Age-specific differences in the dynamics of protective immunity to influenza. Nature Communications. 2019;10:1660. doi: 10.1038/s41467-019-09652-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  141. R Development Core Team. Vienna, Austria: R Foundation for Statistical Computing; 2023. https://www.R-project.org/ [Google Scholar]
  142. Rolfes MA, Flannery B, Chung JR, O’Halloran A, Garg S, Belongia EA, Gaglani M, Zimmerman RK, Jackson ML, Monto AS, Alden NB, Anderson E, Bennett NM, Billing L, Eckel S, Kirley PD, Lynfield R, Monroe ML, Spencer M, Spina N, Talbot HK, Thomas A, Torres SM, Yousey-Hindes K, Singleton JA, Patel M, Reed C, Fry AM, McLean HQ, King JP, Nowalk MP, Balasubramani GK, Bear TM, Hickey R, Williams JV, Reis EC, Moehling KK, Eng H, Jackson LA, Smith M, Raiyani C, Clipper L, Murthy K, Chen W, Reis M, Petrie JG, Malosh RE, McSpadden EJ, Segaloff HE, Cheng CK, Truscon R, Johnson E, Lamerato LE, Rosenblum B, Ford S, Johnson M, Raviotta JM, Sax T, Steele J, Susick M, Chabra R, Garofolo E, Iozzi P, Kevish B, Middleton DB, Urbanski L, Ponder T, Crumbaker T, Iosefo I, Sleeth P, Gandy V, Bounds K, Kylberg M, Rao A, Fader R, Walker K, Volz M, Ray J, Price D, Thomas J, Wehbe-Janek H, Beeram M, Boyd J, Walkowiak J, Probe R, Couchman G, Motakef S, Arroliga A, Kaniclides A, Bouldin E, Baker C, Berke K, Smith M, Rajesh N, Alleman E, Bauer S, Groesbeck M, Brundidge K, Hafeez N, Jackson J, Anastasia I, Kadoo G, Petnic S, Ryan A, Maslar A, Meek J, Chen R, Stephens S, Thomas S, Segler S, Openo K, Fawcett E, Farley M, Martin A, Ryan P, Sunkel R, Lutich T, Perlmutter R, Grace B, Blood T, Zerrlaut C, McMahon M, Strain A, Christensen J, Angeles K, Butler L, Khanlian S, Mansmann R, McMullen C, Pradhan E, Manzi K, Felsen C, Gaitan M, Long K, Fisher N, Hawley E, O’Shaughnessy R, Scott M, Crawford C, Schaffner W, Markus T, Leib K, Dyer K, Santibanez T, Zhai Y, Lu P, Srivastav A, Hung M-C, US Influenza Vaccine Effectiveness (Flu VE) Network, the Influenza Hospitalization Surveillance Network, and the Assessment Branch, Immunization Services Division, Centers for Disease Control and Prevention. Effects of influenza vaccination in the United States during the 2017–2018 influenza season. Clinical Infectious Diseases. 2019;69:1845–1853. doi: 10.1093/cid/ciz075. [DOI] [PMC free article] [PubMed] [Google Scholar]
  143. Russell KE, Fowlkes A, Stockwell MS, Vargas CY, Saiman L, Larson EL, LaRussa P, Di Lonardo S, Popowich M, St George K, Steffens A, Reed C. Comparison of outpatient medically attended and community-level influenza-like illness-New York City, 2013-2015. Influenza and Other Respiratory Viruses. 2018;12:336–343. doi: 10.1111/irv.12540. [DOI] [PMC free article] [PubMed] [Google Scholar]
  144. Sagulenko P, Puller V, Neher RA. TreeTime: maximum-likelihood phylodynamic analysis. Virus Evolution. 2018;4:vex042. doi: 10.1093/ve/vex042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Sandbulte MR, Westgeest KB, Gao J, Xu X, Klimov AI, Russell CA, Burke DF, Smith DJ, Fouchier RAM, Eichelberger MC. Discordant antigenic drift of neuraminidase and hemagglutinin in H1N1 and H3N2 influenza viruses. PNAS. 2011;108:20748–20753. doi: 10.1073/pnas.1113801108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  146. Sax C, Steiner P. Temporal disaggregation of time series. The R Journal. 2013;5:80. doi: 10.32614/RJ-2013-028. [DOI] [Google Scholar]
  147. Schulman JL, Khakpour M, Kilbourne ED. Protective effects of specific immunity to viral neuraminidase on influenza virus infection of mice. Journal of Virology. 1968;2:778–786. doi: 10.1128/JVI.2.8.778-786.1968. [DOI] [PMC free article] [PubMed] [Google Scholar]
  148. Schulman JL, Kilbourne ED. Independent variation in nature of hemagglutinin and neuraminidase antigens of influenza virus: distinctiveness of hemagglutinin antigen of Hong Kong-68 virus. PNAS. 1969;63:326–333. doi: 10.1073/pnas.63.2.326. [DOI] [PMC free article] [PubMed] [Google Scholar]
  149. Scott JA, Gandy A, Mishra S, Bhatt S, Flaxman S, Unwin HJT, Ish-Horowicz J. Epidemia: an R package for semi-mechanistic Bayesian modelling of infectious diseases using point processes. arXiv. 2021. doi: 10.48550/arXiv.2110.12461. [DOI]
  150. Shaman J, Kohn M. Absolute humidity modulates influenza survival, transmission, and seasonality. PNAS. 2009;106:3243–3248. doi: 10.1073/pnas.0806852106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  151. Shaman J, Pitzer VE, Viboud C, Grenfell BT, Lipsitch M. Absolute humidity and the seasonal onset of influenza in the continental United States. PLOS Biology. 2010;8:e1000316. doi: 10.1371/journal.pbio.1000316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  152. Shannon CE. A mathematical theory of communication. Bell System Technical Journal. 1948;27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x. [DOI] [Google Scholar]
  153. Shih AC-C, Hsiao T-C, Ho M-S, Li W-H. Simultaneous amino acid substitutions at antigenic sites drive influenza A hemagglutinin evolution. PNAS. 2007;104:6283–6288. doi: 10.1073/pnas.0701396104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  154. Shu Y, McCauley J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveillance. 2017;22:30494. doi: 10.2807/1560-7917.ES.2017.22.13.30494. [DOI] [PMC free article] [PubMed] [Google Scholar]
  155. Simonsen L. The global impact of influenza on morbidity and mortality. Vaccine. 1999;17:S3–S10. doi: 10.1016/S0264-410X(99)00099-7. [DOI] [PubMed] [Google Scholar]
  156. Simonsen L, Viboud C. The art of modeling the mortality impact of winter-seasonal pathogens. Journal of Infectious Diseases. 2012;206:625–627. doi: 10.1093/infdis/jis419. [DOI] [PubMed] [Google Scholar]
  157. Simpson CR, Lone NI, Kavanagh K, Ritchie LD, Robertson C, Sheikh A, McMenamin J. Trivalent inactivated seasonal influenza vaccine effectiveness for the prevention of laboratory-confirmed influenza in a Scottish population 2000 to 2009. Eurosurveillance. 2015;20:21043. doi: 10.2807/1560-7917.ES2015.20.8.21043. [DOI] [PubMed] [Google Scholar]
  158. Skowronski DM, Masaro C, Kwindt TL, Mak A, Petric M, Li Y, Sebastian R, Chong M, Tam T, De Serres G. Estimating vaccine effectiveness against laboratory-confirmed influenza using a sentinel physician network: results from the 2005-2006 season of dual A and B vaccine mismatch in Canada. Vaccine. 2007;25:2842–2851. doi: 10.1016/j.vaccine.2006.10.002. [DOI] [PubMed] [Google Scholar]
  159. Skowronski DM, De Serres G, Dickinson J, Petric M, Mak A, Fonseca K, Kwindt TL, Chan T, Bastien N, Charest H, Li Y. Component-specific effectiveness of trivalent influenza vaccine as monitored through a sentinel surveillance network in Canada, 2006-2007. The Journal of Infectious Diseases. 2009;199:168–179. doi: 10.1086/595862. [DOI] [PubMed] [Google Scholar]
  160. Skowronski DM, De Serres G, Crowcroft NS, Janjua NZ, Boulianne N, Hottes TS, Rosella LC, Dickinson JA, Gilca R, Sethi P, Ouhoummane N, Willison DJ, Rouleau I, Petric M, Fonseca K, Drews SJ, Rebbapragada A, Charest H, Hamelin M-E, Boivin G, Gardy JL, Li Y, Kwindt TL, Patrick DM, Brunham RC, Canadian SAVOIR Team. Association between the 2008-09 seasonal influenza vaccine and pandemic H1N1 illness during Spring-Summer 2009: four observational studies from Canada. PLOS Medicine. 2010;7:e1000258. doi: 10.1371/journal.pmed.1000258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  161. Skowronski DM, Janjua NZ, De Serres G, Winter AL, Dickinson JA, Gardy JL, Gubbay J, Fonseca K, Charest H, Crowcroft NS, Fradet MD, Bastien N, Li Y, Krajden M, Sabaiduc S, Petric M. A sentinel platform to evaluate influenza vaccine effectiveness and new variant circulation, Canada 2010-2011 season. Clinical Infectious Diseases. 2012;55:332–342. doi: 10.1093/cid/cis431. [DOI] [PubMed] [Google Scholar]
  162. Skowronski DM, Janjua NZ, De Serres G, Sabaiduc S, Eshaghi A, Dickinson JA, Fonseca K, Winter A-L, Gubbay JB, Krajden M, Petric M, Charest H, Bastien N, Kwindt TL, Mahmud SM, Van Caeseele P, Li Y. Low 2012-13 influenza vaccine effectiveness associated with mutation in the egg-adapted H3N2 vaccine strain not antigenic drift in circulating viruses. PLOS ONE. 2014a;9:e92153. doi: 10.1371/journal.pone.0092153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  163. Skowronski DM, Janjua NZ, Sabaiduc S, De Serres G, Winter AL, Gubbay JB, Dickinson JA, Fonseca K, Charest H, Bastien N, Li Y, Kwindt TL, Mahmud SM, Van Caeseele P, Krajden M, Petric M. Influenza A/subtype and B/lineage effectiveness estimates for the 2011-2012 trivalent vaccine: cross-season and cross-lineage protection with unchanged vaccine. The Journal of Infectious Diseases. 2014b;210:126–137. doi: 10.1093/infdis/jiu048. [DOI] [PubMed] [Google Scholar]
  164. Skowronski DM, Chambers C, Sabaiduc S, De Serres G, Winter AL, Dickinson JA, Krajden M, Gubbay JB, Drews SJ, Martineau C, Eshaghi A, Kwindt TL, Bastien N, Li Y. A perfect storm: Impact of genomic variation and serial vaccination on low influenza vaccine effectiveness during the 2014-2015 season. Clinical Infectious Diseases. 2016;63:21–32. doi: 10.1093/cid/ciw176. [DOI] [PMC free article] [PubMed] [Google Scholar]
  165. Skowronski DM, Chambers C, De Serres G, Sabaiduc S, Winter AL, Dickinson JA, Gubbay JB, Fonseca K, Drews SJ, Charest H, Martineau C, Krajden M, Petric M, Bastien N, Li Y, Smith DJ. Serial vaccination and the antigenic distance hypothesis: Effects on influenza vaccine effectiveness during A(H3N2) epidemics in Canada, 2010-2011 to 2014-2015. The Journal of Infectious Diseases. 2017a;215:1059–1099. doi: 10.1093/infdis/jix074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  166. Skowronski DM, Chambers C, Sabaiduc S, Dickinson JA, Winter A-L, De Serres G, Drews SJ, Jassem A, Gubbay JB, Charest H, Balshaw R, Bastien N, Li Y, Krajden M. Interim estimates of 2016/17 vaccine effectiveness against influenza A(H3N2), Canada, January 2017. Euro Surveillance. 2017b;22:30460. doi: 10.2807/1560-7917.ES.2017.22.6.30460. [DOI] [PMC free article] [PubMed] [Google Scholar]
  167. Skowronski DM, Leir S, Sabaiduc S, Chambers C, Zou M, Rose C, Olsha R, Dickinson JA, Winter A-L, Jassem A, Gubbay JB, Drews SJ, Charest H, Chan T, Hickman R, Bastien N, Li Y, Krajden M, De Serres G. Influenza vaccine effectiveness by A(H3N2) phylogenetic subcluster and prior vaccination history: 2016-2017 and 2017-2018 epidemics in Canada. The Journal of Infectious Diseases. 2022;225:1387–1398. doi: 10.1093/infdis/jiaa138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  168. Smith DJ, Lapedes AS, de Jong JC, Bestebroer TM, Rimmelzwaan GF, Osterhaus A, Fouchier RAM. Mapping the antigenic and genetic evolution of influenza virus. Science. 2004;305:371–376. doi: 10.1126/science.1097211. [DOI] [PubMed] [Google Scholar]
  169. Smith GJD, Vijaykrishna D, Bahl J, Lycett SJ, Worobey M, Pybus OG, Ma SK, Cheung CL, Raghwani J, Bhatt S, Peiris JSM, Guan Y, Rambaut A. Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature. 2009;459:1122–1125. doi: 10.1038/nature08182. [DOI] [PubMed] [Google Scholar]
  170. Sonoguchi T, Naito H, Hara M, Takeuchi Y, Fukumi H. Cross-subtype protection in humans during sequential, overlapping, and/or concurrent epidemics caused by H3N2 and H1N1 influenza viruses. The Journal of Infectious Diseases. 1985;151:81–88. doi: 10.1093/infdis/151.1.81. [DOI] [PubMed] [Google Scholar]
  171. Sridhar S, Begom S, Bermingham A, Hoschler K, Adamson W, Carman W, Bean T, Barclay W, Deeks JJ, Lalvani A. Cellular immune correlates of protection against symptomatic pandemic influenza. Nature Medicine. 2013;19:1305–1312. doi: 10.1038/nm.3350. [DOI] [PubMed] [Google Scholar]
  172. Sridhar S. Heterosubtypic T-cell immunity to influenza in humans: challenges for universal T-cell influenza vaccines. Frontiers in Immunology. 2016;7:195. doi: 10.3389/fimmu.2016.00195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  173. Steinhoff MC, Fries LF, Karron RA, Clements ML, Murphy BR. Effect of heterosubtypic immunity on infection with attenuated influenza A virus vaccines in young children. Journal of Clinical Microbiology. 1993;31:836–838. doi: 10.1128/jcm.31.4.836-838.1993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  174. Strobl C, Boulesteix AL, Zeileis A, Hothorn T. Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics. 2007;8:25. doi: 10.1186/1471-2105-8-25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  175. Strobl C, Boulesteix AL, Kneib T, Augustin T, Zeileis A. Conditional variable importance for random forests. BMC Bioinformatics. 2008;9:307. doi: 10.1186/1471-2105-9-307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  176. Suzuki Y. Positive selection operates continuously on hemagglutinin during evolution of H3N2 human influenza A virus. Gene. 2008;427:111–116. doi: 10.1016/j.gene.2008.09.012. [DOI] [PubMed] [Google Scholar]
  177. Tempia S, Walaza S, Bhiman JN, McMorrow ML, Moyes J, Mkhencele T, Meiring S, Quan V, Bishop K, McAnerney JM, von Gottberg A, Wolter N, Du Plessis M, Treurnicht FK, Hellferscee O, Dawood H, Naby F, Variava E, Siwele C, Baute N, Nel J, Reubenson G, Zar HJ, Cohen C. Decline of influenza and respiratory syncytial virus detection in facility-based surveillance during the COVID-19 pandemic, South Africa, January to October 2020. Euro Surveillance. 2021;26:2001600. doi: 10.2807/1560-7917.ES.2021.26.29.2001600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  178. Terajima M, Babon JAB, Co MDT, Ennis FA. Cross-reactive human B cell and T cell epitopes between influenza A and B viruses. Virology Journal. 2013;10:244. doi: 10.1186/1743-422X-10-244. [DOI] [PMC free article] [PubMed] [Google Scholar]
  179. Torrence C, Compo GP. A practical guide to wavelet analysis. Bulletin of the American Meteorological Society. 1998;79:61–78. doi: 10.1175/1520-0477(1998)079<0061:APGTWA>2.0.CO;2. [DOI] [Google Scholar]
  180. Treanor JJ, Talbot HK, Ohmit SE, Coleman LA, Thompson MG, Cheng PY, Petrie JG, Lofthus G, Meece JK, Williams JV, Berman L, Breese Hall C, Monto AS, Griffin MR, Belongia E, Shay DK, Network USFV. Effectiveness of seasonal influenza vaccines in the United States during a season with circulation of all three vaccine strains. Clinical Infectious Diseases. 2012;55:951–959. doi: 10.1093/cid/cis574. [DOI] [PMC free article] [PubMed] [Google Scholar]
  181. Ulmer JB, Fu TM, Deck RR, Friedman A, Guan L, DeWitt C, Liu X, Wang S, Liu MA, Donnelly JJ, Caulfield MJ. Protective CD4+ and CD8+ T cells against influenza virus induced by vaccination with nucleoprotein DNA. Journal of Virology. 1998;72:5648–5653. doi: 10.1128/JVI.72.7.5648-5653.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  182. Valenciano M, Kissling E, Larrauri A, Nunes B, Pitigoi D, O’Donnell J, Reuss A, Horváth JK, Paradowska-Stankiewicz I, Rizzo C, Falchi A, Daviaud I, Brytting M, Meijer A, Kaic B, Gherasim A, Machado A, Ivanciuc A, Domegan L, Schweiger B, Ferenczi A, Korczyńska M, Bella A, Vilcu A-M, Mosnier A, Zakikhany K, de Lange M, Kurečić Filipović S, Johansen K, Moren A, I-MOVE primary care multicentre case-control team. Exploring the effect of previous inactivated influenza vaccination on seasonal influenza vaccine effectiveness against medically attended influenza: Results of the European I-MOVE multicentre test-negative case-control study, 2011/2012-2016/2017. Influenza and Other Respiratory Viruses. 2018;12:567–581. doi: 10.1111/irv.12562. [DOI] [PMC free article] [PubMed] [Google Scholar]
  183. van Doorn E, Darvishian M, Dijkstra F, Donker GA, Overduin P, Meijer A, Hak E. Influenza vaccine effectiveness estimates in the Dutch population from 2003 to 2014: the test-negative design case-control study with different control groups. Vaccine. 2017;35:2831–2839. doi: 10.1016/j.vaccine.2017.04.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  184. Viboud C, Boëlle PY, Pakdaman K, Carrat F, Valleron AJ, Flahault A. Influenza epidemics in the United States, France, and Australia, 1972-1997. Emerging Infectious Diseases. 2004;10:32–39. doi: 10.3201/eid1001.020705. [DOI] [PMC free article] [PubMed] [Google Scholar]
  185. Viboud C, Bjørnstad ON, Smith DL, Simonsen L, Miller MA, Grenfell BT. Synchrony, waves, and spatial hierarchies in the spread of influenza. Science. 2006;312:447–451. doi: 10.1126/science.1125237. [DOI] [PubMed] [Google Scholar]
  186. Ward B, Clarke T, Freeman G, Schiller J. Early Release of Selected Estimates Based on Data From the 2014 National Health Interview Survey. National Center for Health Statistics; 2015. https://www.cdc.gov/nchs/nhis/releases/released201506.htm [Google Scholar]
  187. Ward B, Clarke T, Nugent C, Schiller J. Early Release of Selected Estimates Based on Data From the 2015 National Health Interview Survey. National Center for Health Statistics; 2016. https://www.cdc.gov/nchs/nhis/releases/released201605.htm [Google Scholar]
  188. Webster RG, Laver WG. Determination of the number of nonoverlapping antigenic areas on Hong Kong (H3N2) influenza virus hemagglutinin with monoclonal antibodies and the selection of variants with potential epidemiological significance. Virology. 1980;104:139–148. doi: 10.1016/0042-6822(80)90372-4. [DOI] [PubMed] [Google Scholar]
  189. Webster RG, Laver WG, Air GM, Schild GC. Molecular mechanisms of variation in influenza viruses. Nature. 1982;296:115–121. doi: 10.1038/296115a0. [DOI] [PubMed] [Google Scholar]
  190. Webster RG, Bean WJ, Gorman OT, Chambers TM, Kawaoka Y. Evolution and ecology of influenza A viruses. Microbiological Reviews. 1992;56:152–179. doi: 10.1128/mr.56.1.152-179.1992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  191. Weinberger DM, Krause TG, Mølbak K, Cliff A, Briem H, Viboud C, Gottfredsson M. Influenza epidemics in Iceland over 9 decades: changes in timing and synchrony with the United States and Europe. American Journal of Epidemiology. 2012;176:649–655. doi: 10.1093/aje/kws140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  192. Wiley DC, Wilson IA, Skehel JJ. Structural identification of the antibody-binding sites of Hong Kong influenza haemagglutinin and their involvement in antigenic variation. Nature. 1981;289:373–378. doi: 10.1038/289373a0. [DOI] [PubMed] [Google Scholar]
  193. Wilson IA, Cox NJ. Structural basis of immune recognition of influenza virus hemagglutinin. Annual Review of Immunology. 1990;8:737–771. doi: 10.1146/annurev.iy.08.040190.003513. [DOI] [PubMed] [Google Scholar]
  194. Wohlbold TJ, Nachbagauer R, Xu H, Tan GS, Hirsh A, Brokstad KA, Cox RJ, Palese P, Krammer F. Vaccination with adjuvanted recombinant neuraminidase induces broad heterologous, but not heterosubtypic, cross-protection against influenza virus infection in mice. mBio. 2015;6:e02556. doi: 10.1128/mBio.02556-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  195. Wolf YI, Viboud C, Holmes EC, Koonin EV, Lipman DJ. Long intervals of stasis punctuated by bursts of positive selection in the seasonal evolution of influenza A virus. Biology Direct. 2006;1:34. doi: 10.1186/1745-6150-1-34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  196. Wolf YI, Nikolskaya A, Cherry JL, Viboud C, Koonin E, Lipman DJ. Projection of seasonal influenza severity from sequence and serological data. PLOS Currents. 2010;2:RRN1200. doi: 10.1371/currents.RRN1200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  197. World Health Organization. Global Influenza Programme: FluNet. 2023. [September 5, 2024]. https://www.who.int/tools/flunet
  198. Wraith S, Balmaseda A, Carrillo FAB, Kuan G, Huddleston J, Kubale J, Lopez R, Ojeda S, Schiller A, Lopez B, Sanchez N, Webby R, Nelson MI, Harris E, Gordon A. Homotypic protection against influenza in a pediatric cohort in Managua, Nicaragua. Nature Communications. 2022;13:1190. doi: 10.1038/s41467-022-28858-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  199. Wu A, Peng Y, Du X, Shu Y, Jiang T. Correlation of influenza virus excess mortality with antigenic variation: application to rapid estimation of influenza mortality burden. PLOS Computational Biology. 2010;6:e1000882. doi: 10.1371/journal.pcbi.1000882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  200. Xie H, Wan XF, Ye Z, Plant EP, Zhao Y, Xu Y, Li X, Finch C, Zhao N, Kawano T, Zoueva O, Chiang MJ, Jing X, Lin Z, Zhang A, Zhu Y. H3N2 Mismatch of 2014-15 northern hemisphere influenza vaccines and head-to-head comparison between human and ferret antisera derived antigenic maps. Scientific Reports. 2015;5:15279. doi: 10.1038/srep15279. [DOI] [PMC free article] [PubMed] [Google Scholar]
  201. Yan L, Neher RA, Shraiman BI. Phylodynamic theory of persistence, extinction and speciation of rapidly adapting pathogens. eLife. 2019;8:e44205. doi: 10.7554/eLife.44205. [DOI] [PMC free article] [PubMed] [Google Scholar]
  202. Yang W, Lau EHY, Cowling BJ. Dynamic interactions of influenza viruses in Hong Kong during 1998-2018. PLOS Computational Biology. 2020;16:e1007989. doi: 10.1371/journal.pcbi.1007989. [DOI] [PMC free article] [PubMed] [Google Scholar]
  203. Zhang A, Stacey HD, Mullarkey CE, Miller MS. Original antigenic sin: how first exposure shapes lifelong anti-influenza virus immune responses. Journal of Immunology. 2019;202:335–340. doi: 10.4049/jimmunol.1801149. [DOI] [PubMed] [Google Scholar]
  204. Zhao K, Wulder MA, Hu T, Bright R, Wu Q, Qin H, Li Y, Toman E, Mallick B, Zhang X, Brown M. Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: A Bayesian ensemble algorithm. Remote Sensing of Environment. 2019;232:111181. doi: 10.1016/j.rse.2019.04.034. [DOI] [Google Scholar]
  205. Zimmerman RK, Nowalk MP, Chung J, Jackson ML, Jackson LA, Petrie JG, Monto AS, McLean HQ, Belongia EA, Gaglani M, Murthy K, Fry AM, Flannery B, Investigators USFV. 2014-2015 influenza vaccine effectiveness in the United States by vaccine type. Clinical Infectious Diseases. 2016;63:1564–1573. doi: 10.1093/cid/ciw635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  206. Zost SJ, Parkhouse K, Gumina ME, Kim K, Diaz Perez S, Wilson PC, Treanor JJ, Sant AJ, Cobey S, Hensley SE. Contemporary H3N2 influenza viruses have a glycosylation site that alters binding of antibodies elicited by egg-adapted vaccine strains. PNAS. 2017;114:12578–12583. doi: 10.1073/pnas.1712377114. [DOI] [PMC free article] [PubMed] [Google Scholar]

eLife assessment

Talía Malagón 1

This paper explores the relationships among evolutionary and epidemiological quantities in influenza, and presents fundamental findings that substantially advance our understanding of the drivers of influenza epidemics. The authors use a rich set of data sources to gather and analyze compelling evidence on the roles of genetic distance, other influenza dynamics and epidemiological indicators in predicting influenza epidemics. The central findings highlight the significant influence of genetic distance on A(H3N2) virus epidemiology and emphasize the role of A(H1N1) virus incidence in shaping A(H3N2) epidemics, suggesting subtype interference as a key factor. This paper also makes relevant data available to the research community.

Reviewer #1 (Public review):

Anonymous

Summary:

The authors aimed to investigate the contribution of antigenic drift in the HA and NA genes of seasonal influenza A(H3N2) virus to their epidemic dynamics. Analyzing 22 influenza seasons before the COVID-19 pandemic, the study explored various antigenic and genetic markers, comparing them against indicators characterizing the epidemiology of annual outbreaks. The central findings highlight the significant influence of genetic distance on A(H3N2) virus epidemiology and emphasize the role of A(H1N1) virus incidence in shaping A(H3N2) epidemics, suggesting subtype interference as a key factor.

Major Strengths:

The paper is well-organized, written with clarity, and presents a comprehensive analysis. The study design, incorporating a span of 22 seasons, provides a robust foundation for understanding influenza dynamics. The inclusion of diverse antigenic and genetic markers enhances the depth of the investigation, and the exploration of subtype interference adds valuable insights.

Major Weaknesses:

While the analysis is thorough, some aspects require deeper interpretation, particularly in the discussion of certain results. Clarity and depth could be improved in the presentation of findings, and minor adjustments are suggested. Furthermore, the evolving dynamics of H3N2 predominance post-2009 need better elucidation.

Comments on revised version:

The authors have addressed each of the comments well. I have no further comments.

eLife. 2024 Sep 25;13:RP91849. doi: 10.7554/eLife.91849.3.sa2

Author response

Amanda C Perofsky 1, John Huddleston 2, Chelsea L Hansen 3, John R Barnes 4, Thomas Rowe 5, Xiyan Xu 6, Rebecca Kondor 7, David E Wentworth 8, Nicola Lewis 9, Lynne Whittaker 10, Burcu Ermetal 11, Ruth Harvey 12, Monica Galiano 13, Rodney Stuart Daniels 14, John W McCauley 15, Seiichiro Fujisaki 16, Kazuya Nakamura 17, Noriko Kishida 18, Shinji Watanabe 19, Hideki Hasegawa 20, Sheena G Sullivan 21, Ian Barr 22, Kanta Subbarao 23, Florian Krammer 24, Trevor Bedford 25, Cécile Viboud 26

The following is the authors’ response to the original reviews.

Public Reviews:

Reviewer #1 (Public Review):

Summary:

The authors aimed to investigate the contribution of antigenic drift in the HA and NA genes of seasonal influenza A(H3N2) virus to their epidemic dynamics. Analyzing 22 influenza seasons before the COVID-19 pandemic, the study explored various antigenic and genetic markers, comparing them against indicators characterizing the epidemiology of annual outbreaks. The central findings highlight the significant influence of genetic distance on A(H3N2) virus epidemiology and emphasize the role of A(H1N1) virus incidence in shaping A(H3N2) epidemics, suggesting subtype interference as a key factor.

Major Strengths:

The paper is well-organized, written with clarity, and presents a comprehensive analysis. The study design, incorporating a span of 22 seasons, provides a robust foundation for understanding influenza dynamics. The inclusion of diverse antigenic and genetic markers enhances the depth of the investigation, and the exploration of subtype interference adds valuable insights.

Major Weaknesses:

While the analysis is thorough, some aspects require deeper interpretation, particularly in the discussion of certain results. Clarity and depth could be improved in the presentation of findings. Furthermore, the evolving dynamics of H3N2 predominance post-2009 need better elucidation.

Reviewer #2 (Public Review):

Summary: This paper aims to achieve a better understanding of how the antigenic or genetic compositions of the dominant influenza A viruses in circulation at a given time are related to key features of seasonal influenza epidemics in the US. To this end, the authors analyze an extensive dataset with a range of statistical, data science and machine learning methods. They find that the key drivers of influenza A epidemiological dynamics are interference between influenza A subtypes and genetic divergence, relative to the previous one or two seasons, in a broader range of antigenically related sites than previously thought.

Strengths: A thorough investigation of a large and complex dataset.

Weaknesses: The dataset covers a 21-year period, which is substantial by epidemiological standards but quite small from a statistical or machine learning perspective. In particular, it was not possible to follow the usual process and test predictive performance of the random forest model with an independent dataset.

Reviewer #3 (Public Review):

Summary:

This paper explores the relationships among evolutionary and epidemiological quantities in influenza, using a wide range of datasets and features, and using both correlations and random forests to examine, primarily, what are the drivers of influenza epidemics. It's a strong paper representing a thorough and fascinating exploration of potential drivers, and it makes a trove of relevant data readily available to the community.

Strengths:

This paper makes links between epidemiological and evolutionary data for influenza. Placing each in the context of the other is crucial for understanding influenza dynamics and evolution, and this paper does a thorough job of this, with many analyses and nuances. The results on the extent to which evolutionary factors relate to epidemic burden, and on interference among influenza types, are particularly interesting. The GitHub repository associated with the paper is clear, comprehensive, and well-documented.

Weaknesses:

The format of the results section can be hard to follow, and we suggest improving readability by restructuring and simplifying in some areas. There is a range of choices made about data preparation and scaling; the authors could explore the sensitivity of the results to some of these.

Response to public reviews

We appreciate the positive comments from the reviewers and have implemented or responded to all of the reviewers’ recommendations.

In response to Reviewer 1, we expand on the potential drivers and biological implications of the findings pointed out in their specific recommendations. For example, we now explicitly mention that antigenically distinct 3c.2a and 3c.3a viruses began to co-circulate in 2012 and underwent further diversification during subsequent seasons in our study. We note that, after the 2009 A(H1N1) pandemic, the mean fraction of influenza positive cases typed as A(H3N2) in A(H3N2) dominant seasons is lower compared to A(H3N2) dominant seasons prior to 2009. We propose that the weakening of A(H3N2) predominance may be linked to the diversification of A(H3N2) viruses during the 2010s, wherein multiple antigenically distinct clades with similar fitness circulated in each season, as opposed to a single variant with high fitness.

In response to Reviewer 2, we agree that it would be ideal and best practice to measure model performance with an independent test set, but our dataset includes only ~20 seasons. Predictions of independent test sets of 2-3 seasons had unstable performance, which indicates we do not have sufficient power to measure model performance with a test set this small. In the revised manuscript, we provide more justification and clarification of our methodology. Instead of testing model performance on an independent test set, we use leave-one-season-out cross-validation to train models and measure model performance, wherein each “assessment” set contains one season of data (predicted by the model), and the corresponding “analysis” set (“fold”) contains the remaining seasons. This approach is roughly analogous to splitting data into training and test sets, but all seasons are used at some point in the training of the model (Kuhn & Johnson, 2019).
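The leave-one-season-out scheme described in the response to Reviewer 2 can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields (`season`, `y`), the helper names, and the mean-only placeholder model are assumptions made for illustration, and the toy values are invented. The paper's models are random forests; any `fit`/`predict` pair could be substituted.

```python
from statistics import mean


def leave_one_season_out(records):
    """Yield (held_out_season, analysis_set, assessment_set), one fold per season."""
    seasons = sorted({r["season"] for r in records})
    for held_out in seasons:
        analysis = [r for r in records if r["season"] != held_out]
        assessment = [r for r in records if r["season"] == held_out]
        yield held_out, analysis, assessment


def cross_validated_rmse(records, fit, predict):
    """Train on each analysis set, predict its held-out season, pool the errors."""
    squared_errors = []
    for _, analysis, assessment in leave_one_season_out(records):
        model = fit(analysis)
        for r in assessment:
            squared_errors.append((predict(model, r) - r["y"]) ** 2)
    return mean(squared_errors) ** 0.5


# Placeholder model (an assumption, not the paper's random forest):
# always predict the mean outcome of the analysis set.
def fit_mean(analysis):
    return mean(r["y"] for r in analysis)


def predict_mean(model, record):
    return model


# Invented toy records: one row per season-region with an outcome y.
toy = [
    {"season": "A", "region": 1, "y": 1.0},
    {"season": "A", "region": 2, "y": 3.0},
    {"season": "B", "region": 1, "y": 5.0},
    {"season": "B", "region": 2, "y": 7.0},
    {"season": "C", "region": 1, "y": 9.0},
]

rmse = cross_validated_rmse(toy, fit_mean, predict_mean)
```

Because every season serves as the assessment set exactly once, each observation contributes one out-of-sample prediction, which is why this scheme suits a dataset of only ~20 seasons better than a single small test set.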

In response to Reviewer 3, we follow the reviewer’s advice to put the Methods section before the Results section. Concerning Reviewer 3’s question about the sensitivity of our results to data preparation and rescaling, we provide more justification and clarification of our methodology in the revised manuscript. In our study, we adjust influenza type/subtype incidences for differences in reporting between the pre- and post-2009 pandemic periods and across HHS regions. We adjust for differences in reporting between the pre- and post-2009 periods because the US CDC and WHO increased laboratory testing capacity in response to the 2009 A(H1N1) pandemic, which led to substantial, long-lasting improvements to influenza surveillance that are still in place today. Figure 1 - figure supplement 2 shows systematic increases in influenza test volume in all HHS regions after the 2009 pandemic. Given the substantial increase in test volume after 2009, we opted to keep the time trend adjustment for the pre- and post-2009 pandemic periods and evaluate whether adjusting for regional reporting differences affects our results. When estimating univariate correlations between various A(H3N2) epidemic metrics and evolutionary indicators, we found qualitatively equivalent results when adjusting for both pre- and post-2009 pandemic reporting and regional reporting versus only adjusting for the pre- and post-2009 pandemic reporting.

Reviewer #1 (Recommendations For The Authors):

Specific comments:

(1) Line 155-156. Request for a reference for: "Given that protective immunity wanes after 1-4 years"

We now include two references (He et al. 2015 and Wraith et al. 2022), which were cited at the beginning of the introduction when referring to the duration of protective immunity for antigenically homologous viruses. (Lines 640-642 in revised manuscript)

(2) Line 162-163: Request a further explanation of the negative correlation between seasonal diversity of HA and NA LBI values and NA epitope distance. Clarify biological implications to aid reader understanding.

In the revised manuscript we expand on the biological implications of A(H3N2) virus populations characterized by high antigenic novelty and low LBI diversity.

Lines 649-653:

“The seasonal diversity of HA and NA LBI values was negatively correlated with NA epitope distance (Figure 2 – figure supplements 5 – 6), with high antigenic novelty coinciding with low genealogical diversity. This association suggests that selective sweeps tend to follow the emergence of drifted variants with high fitness, resulting in seasons dominated by a single A(H3N2) variant rather than multiple cocirculating clades.”

(3) Figure S3 legend t-2 may be marked as t-1.

Thank you for catching this. We have fixed this typo. Note: Figure S3 is now Figure 2 – figure supplement 5.

(4) Lines 201-214. The key takeaways from the analysis of subtype dominance are ultimately not clear. It also misses the underlying dynamics that H3N2 predominance following an evolutionary change has waned since 2009.

In the revised manuscript we elaborate on key takeaways concerning the relationship between antigenic drift and A(H3N2) dominance. We also add a caveat noting that A(H3N2) predominance is weaker during the post-2009 period, which may be linked to the diversification of A(H3N2) lineages after 2012. We do not know of a reference that links the diversification of A(H3N2) viruses in the 2010s to a particular evolutionary change. Therefore, we do not attribute the diversification of A(H3N2) viruses to a specific evolutionary change in A(H3N2) variants circulating at the time (A/Perth/16/2009-like strains (PE09)). Instead, we allude to the potential role of A(H3N2) diversification in creating multiple co-circulating lineages that may have less of a fitness advantage.

Lines 681-703:

“We explored whether evolutionary changes in A(H3N2) may predispose this subtype to dominate influenza virus circulation in a given season. A(H3N2) subtype dominance – the proportion of influenza positive samples typed as A(H3N2) – increased with H3 epitope distance (t – 2) (R² = 0.32, P = 0.05) and N2 epitope distance (t – 1) (R² = 0.34, P = 0.03) (regression results: Figure 4; Spearman correlations: Figure 3 – figure supplement 1). Figure 4 illustrates this relationship at the regional level across two seasons in which A(H3N2) was nationally dominant, but where antigenic change differed. In 2003-2004, we observed widespread dominance of A(H3N2) viruses after the emergence of the novel antigenic cluster, FU02 (A/Fujian/411/2002-like strains). In contrast, there was substantial regional heterogeneity in subtype circulation during 2007-2008, a season in which A(H3N2) viruses were antigenically similar to those circulating in the previous season. Patterns in type/subtype circulation across all influenza seasons in our study period are shown in Figure 4 – figure supplement 1. As observed for the 2003-2004 season, widespread A(H3N2) dominance tended to coincide with major antigenic transitions (e.g., A/Sydney/5/1997 (SY97) seasons, 1997-1998 to 1999-2000; A/California/7/2004 (CA04) season, 2004-2005), though this was not universally the case (e.g., A/Perth/16/2009 (PE09) season, 2010-2011).

After the 2009 A(H1N1) pandemic, A(H3N2) dominant seasons still occurred more frequently than A(H1N1) dominant seasons, but the mean fraction of influenza positive cases typed as A(H3N2) in A(H3N2) dominant seasons was lower compared to A(H3N2) dominant seasons prior to 2009. Antigenically distinct 3c.2a and 3c.3a viruses began to co-circulate in 2012 and underwent further diversification during subsequent seasons in our study (https://nextstrain.org/seasonal-flu/h3n2/ha/12y@2024-05-13)(Dhanasekaran et al., 2022; Huddleston et al., 2020; Yan et al., 2019). The decline in A(H3N2) predominance during the post-2009 period may be linked to the genetic and antigenic diversification of A(H3N2) viruses, wherein multiple lineages with similar fitness co-circulated in each season.”
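As a hedged illustration of the quantities in the quoted passage, the sketch below computes subtype dominance as the proportion of influenza-positive samples typed as A(H3N2) and then a Spearman rank correlation against an evolutionary indicator. The function names and all numbers are invented for illustration and are not taken from the study, which also reports regression results that this sketch does not reproduce.

```python
def subtype_dominance(h3n2_positives, all_positives):
    """Proportion of influenza-positive samples typed as A(H3N2)."""
    if all_positives <= 0:
        raise ValueError("all_positives must be positive")
    return h3n2_positives / all_positives


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: one epitope distance and one (H3N2, total) count per season.
epitope_distance = [1.0, 2.0, 3.0, 4.0]
counts = [(20, 100), (50, 100), (40, 100), (90, 100)]
dominance = [subtype_dominance(h3, total) for h3, total in counts]
rho = spearman(epitope_distance, dominance)
```

A rank-based measure is used here because it does not assume a linear relationship between epitope distance and dominance, only a monotonic one.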

(5) Line 253-255: It would be beneficial to provide a more detailed interpretation of the statement that "pre-2009 seasonal A(H1N1) viruses may limit the circulation of A(H3N2) viruses to a greater extent than A(H1N1)pdm09 viruses." Elaborate on the cause-and-effect relationship within this statement.

In the revised manuscript we suggest that seasonal A(H1N1) viruses may interfere with the circulation of A(H3N2) viruses to a greater extent than A(H1N1)pdm09 viruses, because seasonal A(H1N1) viruses and A(H3N2) are more closely related, and thus may elicit stronger cross-reactive T cell responses.

Lines 738-745:

“The internal gene segments NS, M, NP, PA, and PB2 of A(H3N2) viruses and pre-2009 seasonal A(H1N1) viruses share a common ancestor (Webster et al., 1992) whereas A(H1N1)pdm09 viruses have a combination of gene segments derived from swine and avian reservoirs that were not reported prior to the 2009 pandemic (Garten et al., 2009; Smith et al., 2009). Non-glycoprotein genes are highly conserved between influenza A viruses and elicit cross-reactive antibody and T cell responses (Grebe et al., 2008; Sridhar, 2016). Because pre-2009 seasonal A(H1N1) viruses and A(H3N2) are more closely related, we hypothesized that seasonal A(H1N1) viruses could potentially limit the circulation of A(H3N2) viruses to a greater extent than A(H1N1)pdm09 viruses, due to greater T cell-mediated cross-protective immunity.”

(6) In the results section, many statements report statistical results of correlation analyses. Consider providing further interpretations of these results, such as the implications of nonsignificant correlations and how they support or contradict the hypothesis or previous studies. For example, the statement on line 248 regarding the lack of significant correlation between influenza B epidemic size and A(H3N2) epidemic metrics would benefit from additional discussion on what this non-significant correlation signifies and how it relates to the hypothesis or previous research.

In the Discussion section, we suggest that the lack of an association between influenza B circulation and A(H3N2) epidemic metrics is due to few T and B cell epitopes shared between influenza A and B viruses (Terajima et al., 2013).

Lines 1005-1007 in revised manuscript (Lines 513-515 in original manuscript):

“Overall, we did not find any indication that influenza B incidence affects A(H3N2) epidemic burden or timing, which is not unexpected, given that few T and B cell epitopes are shared between the two virus types (Terajima et al., 2013).”

Minor comments:

(1) Line 116-122: Include a summary statistical description of all collected data sets, detailing the number of HA and NA sequence data and their sources. Briefly describe subsampled data sets, specifying preferences (e.g., the number of HA or NA sequence data collected from each region).

In our revised manuscript we now include supplementary tables that summarize the number of A/H3 and A/N2 sequences in each subsampled dataset, aggregated by world region, for all seasons combined (Figure 2 - table supplements 1 - 2). We also include supplementary figures showing the number of sequences collected in each month and each season in North America versus the other nine world regions combined (Figure 2 - figure supplements 1 - 2). Subsampled datasets are plotted individually in the figures below but individual time series are difficult to discern due to minor differences in sequence counts across the datasets.

(2) Figure 7A: Due to space limitations, consider rounding numbers on the x-axis to whole numbers for clarity.

Thank you for this suggestion. In the revised manuscript we round numbers in the axes of Figure 7A (Figure 9A in the revised manuscript) so that the axes are less crowded.

(3) Figure 4C & Figure 4D: Note that Region 10 (purple) data were unavailable for seasons before 2009 (lines 1483-1484). Label each region on the map with its respective region number (1 to 10) and indicate this in the legend for easy identification.

In our original submission, the legend for Figure 4 included “Data for Region 10 (purple) were not available for seasons prior to 2009” at the end of the caption. We have moved this sentence, as well as other descriptions that apply to both C and D, so that they follow the sentence “C-D. Regional patterns of influenza type and subtype incidence during two seasons when A(H3N2) was nationally dominant.”

In our revised manuscript, Figure 4, and Figure 4 - figure supplement 1 (Figure S10 in original submission) include labels for each HHS region.

We did not receive specific recommendations from Reviewer #2. However, our responses to Reviewer #3 address the study’s weaknesses mentioned by Reviewer #2.

Reviewer #3 (Recommendations For The Authors):

This paper explores the relationships among evolutionary and epidemiological quantities in influenza, using a wide range of datasets and features, and using both correlations and random forests to examine, primarily, what are the drivers of influenza epidemics.

This is a workhorse of a paper, in the volumes of data that are analyzed and the extensive analysis that is done. The data that are provided are a treasure trove resource for influenza modelers and for anyone interested in seeing influenza surveillance data in the context of evolution, and evolutionary information in the context of epidemiology.

L53 - end of sentence "and antigenic drift": not sure this fits, explain? I thought this sentence was in contrast to antigenic drift.

Thank you for catching this. We did not intend to include “and antigenic drift” at the end of this sentence and have removed it (Line 59).

Para around L115: would using primarily US data be a limitation, because it's global immunity that shapes success of strains? Or, how much does each country's immunity and vaccination and so on actually shape what strains succeed there, compared to global/international factors?

The HA and NA phylogenetic trees in our study are enriched with U.S. sequences because our study focuses on epidemiological dynamics in the U.S., and we wanted to prioritize A(H3N2) viruses that the U.S. human population encountered in each season. We agree with the reviewer that the world population may be the right scale to understand how immunity, acquired by vaccination or natural infection, may shape the emergence and success of new lineages that will go on to circulate globally. However, our study assesses the overall impact of antigenic drift on regional A(H3N2) epidemic dynamics in the U.S. In other words, our driving question is whether we can predict the population-level impact of an A(H3N2) variant in the U.S., conditional on this particular lineage having established in the U.S. and circulating at relatively high levels. We do not assess the global or population-level factors that may influence which A(H3N2) virus lineages are successful in a given location or season.

We have added a clarifying sentence to the end of the Introduction to narrow the scope of the paper for the reader.

Line 114-116: “Rather than characterize in situ evolution of A(H3N2) lineages circulating in the U.S., we study the epidemiological impacts of antigenic drift once A(H3N2) variants have arrived on U.S. soil and managed to establish and circulate at relatively high levels.”

In the Results section, I found the format hard to follow, because of the extensive methodological details, numbers with CIs and long sentences. Sentences sometimes included the question, definitions of variables, and lists. For example at line 215 we have: "Next, we tested for associations between A(H3N2) evolution and epidemic timing, including onset week, defined as the winter changepoint in incidence [16], and peak week, defined as the first week of maximum incidence; spatiotemporal synchrony, measured as the variation (standard deviation, s.d.) in regional onset and peak timing; and epidemic speed, including seasonal duration and the number of weeks from onset to peak (Table 2, Figure S11)". I would suggest putting the methods section first, using shorter sentences, separating lists from the question being asked, and stating what was found without also putting in all the extra detail. Putting the methods section before the results might reduce the sense that you have to explain what you did and how in the results section too.

Thank you for suggesting how to improve the readability of the Results section. In the revised manuscript, we follow the reviewer’s advice to put the Methods section before the Results section. Although eLife formatting requirements specify the order: Introduction, Results, Discussion, and Methods, the journal allows for the Methods section to follow the Introduction when it makes sense to do so. We agree with the reviewer that putting the Methods section before the Results section makes our results easier to follow because we no longer need to introduce methodological details at the beginning of each set of results.

L285 in the RF you remove variables without significant correlations with the target variables, but isn't one of the aims of RF to uncover relationships where a correlation might not be evident, and in part to reveal combinations of features that give the targeted outcome? Also with the RF, I am a bit concerned that you could not use the leave-one-out approach because it was "unstable" - presumably that means that you obtain quite different results if you leave out a season. How robust are these results, and what are the most sensitive aspects? Are the same variables typically high in importance if you leave out a season, for example? What does the scatterplot of observed vs predicted epidemic size (as in Fig 7) look like if each prediction is for the one that was left out (i.e. from a model trained on all the rest)? In my experience, where the RF is "unstable", that can look pretty terrible even if the model trained on all the data looks great (as does Figure 7). In any case I think it's worth discussing sensitivity.

(1) In response to the reviewer’s first question, we explain our rationale for not including all candidate predictors in random forest and penalized regression models.

Models trained with different combinations of predictors can have similar performance, and these combinations of predictors can include variables that do not necessarily have strong univariate associations with the target variable. The performance of random forest and LASSO regression models is not sensitive to redundant or irrelevant predictors (see Figure 10.2 in Kuhn & Johnson, 2019). However, if our goal is variable selection rather than strictly model performance, it is considered best practice to remove collinear, redundant, and/or irrelevant variables prior to training models (see section 11.3 in Kuhn & Johnson, 2019). In both random forest and LASSO regression models, if there are highly collinear variables that are useful for predicting the target variable, the predictor chosen by the model becomes a random selection. In random forest models, these highly collinear variables will be used in all splits across the forest of decision trees, and this redundancy dilutes variable importance scores. Thus, failing to minimize multicollinearity prior to model training could result in some variables having low rankings and the appearance of being unimportant, because their importance scores are overshadowed by those of the highly correlated variables. Our rationale for preprocessing predictor data follows the philosophy of Kuhn & Johnson, 2019, who recommend including the minimum possible set of variables that does not compromise model performance. Even if a particular model is insensitive to extra predictors, Kuhn and Johnson explain that “removing predictors can reduce the cost of acquiring data or improve the throughput of the software used to make predictions.”

In the revised manuscript, we include more details about our steps for preprocessing predictor data. We also follow the reviewer’s suggestion to include all evolutionary predictors in variable selection analyses, regardless of whether they have strong univariate correlations with target outcomes, because the performance of random forest and LASSO regression models is not affected by redundant predictors.

Including additional predictors in our variable selection analyses does not change our conclusions. As reported in our original manuscript, predictors with strong univariate correlations with various epidemic metrics were the highest ranked features in both random forest and LASSO regression models.

Lines 523-563:

“Preprocessing of predictor data: The starting set of candidate predictors included all viral fitness metrics: genetic and antigenic distances between current and previously circulating strains and the standard deviation and Shannon diversity of H3 and N2 LBI values in the current season. To account for potential type or subtype interference, we included A(H1N1) or A(H1N1)pdm09 epidemic size and B epidemic size in the current and prior season and the dominant IAV subtype in the prior season (Lee et al., 2018). We included A(H3N2) epidemic size in the prior season as a proxy for prior natural immunity to A(H3N2). To account for vaccine-induced immunity, we considered four categories of predictors and included estimates for the current and prior seasons: national vaccination coverage among adults (18-49 years coverage × ≥ 65 years coverage), adjusted A(H3N2) vaccine effectiveness (VE), a combined metric of vaccination coverage and A(H3N2) VE (18-49 years coverage × ≥ 65 years coverage × VE), and H3 and N2 epitope distances between naturally circulating A(H3N2) viruses and the U.S. A(H3N2) vaccine strain in each season. We could not include a predictor for vaccination coverage in children or consider clade-specific VE estimates, because these data were not available for most seasons in our study.

Random forest and LASSO regression models are not sensitive to redundant (highly collinear) features (Kuhn & Johnson, 2019), but we chose to downsize the original set of candidate predictors to minimize the impact of multicollinearity on variable importance scores. For both types of models, if there are highly collinear variables that are useful for predicting the target variable, the predictor chosen by the model becomes a random selection (Kuhn & Johnson, 2019). In random forest models, these highly collinear variables will be used in all splits across the forest of decision trees, and this redundancy dilutes variable importance scores (Kuhn & Johnson, 2019). We first confirmed that none of the candidate predictors had zero variance or near-zero variance. Because seasonal lags of each viral fitness metric are highly collinear, we included only one lag of each evolutionary predictor, with a preference for the lag that had the strongest univariate correlations with various epidemic metrics. We checked for multicollinearity among the remaining predictors by examining Spearman’s rank correlation coefficients between all pairs of predictors. If a particular pair of predictors was highly correlated (Spearman’s 𝜌 >0.8), we retained only one predictor from that pair, with a preference for the predictor that had the strongest univariate correlations with various epidemic metrics. Lastly, we performed QR decomposition of the matrix of remaining predictors to determine if the matrix is full rank and identify sets of columns involved in linear dependencies. This step did not eliminate any additional predictors, given that we had already removed pairs of highly collinear variables based on Spearman correlation coefficients.

After these preprocessing steps, our final set of model predictors included 21 variables, including 8 viral evolutionary indicators: H3 epitope distance (t – 2), HI log2 titer distance (t – 2), H3 RBS distance (t – 2), H3 non-epitope distance (t – 2), N2 epitope distance (t – 1), N2 non-epitope distance (t – 1), and H3 and N2 LBI diversity (s.d.) in the current season; 6 proxies for type/subtype interference and prior immunity: A(H1N1) and B epidemic sizes in the current and prior season, A(H3N2) epidemic size in the prior season, and the dominant IAV subtype in the prior season; and 7 proxies for vaccine-induced immunity: A(H3N2) VE in the current and prior season, H3 and N2 epitope distances between circulating strains and the vaccine strain in each season, the combined metric of adult vaccination coverage × VE in the current and prior season, and adult vaccination coverage in the prior season.”
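The pairwise collinearity filter described in the quoted Methods can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the use of a single target variable (the manuscript refers to "various epidemic metrics"), and the use of the absolute value of Spearman's 𝜌 are all assumptions made for the example. The zero-variance check and the QR rank check are not reproduced here.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks.

    Assumes neither input has zero variance (the Methods state this
    was checked before the collinearity step).
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def drop_collinear(predictors, target, threshold=0.8):
    """Keep one predictor from each highly correlated pair.

    predictors: dict mapping a (hypothetical) predictor name to its values.
    From each pair with |rho| > threshold, the member with the weaker
    univariate |rho| against the target is dropped (assumed tie-break rule).
    """
    strength = {name: abs(spearman(vals, target))
                for name, vals in predictors.items()}
    kept = dict(predictors)
    for a, b in combinations(list(predictors), 2):
        if a in kept and b in kept:
            if abs(spearman(kept[a], kept[b])) > threshold:
                weaker = a if strength[a] < strength[b] else b
                del kept[weaker]
    return list(kept)
```

Calling `drop_collinear` with a dictionary of candidate predictors and one epidemic metric returns the names of the predictors that survive the filter, in their original order.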

(2) Next, we clarify our model training methodology to address the reviewer’s second point about using a leave-one-out cross-validation approach.

We believe the reviewer is mistaken; we use a leave-one-season-out validation approach which lends some robustness to the predictions. In our original submission, we stated “We created each forest by generating 3,000 regression trees from 10 repeats of a leave-one-season-out (jackknife) cross-validated sample of the data. Due to the small size of our dataset, evaluating the predictive accuracy of random forest models on a quasi-independent test set produced unstable estimates.” (Lines 813-816 in the original manuscript)

To clarify, we use leave-one-season-out cross-validation to train models and measure model performance, wherein each “assessment” set contains one season of data (predicted by the model), and the corresponding “analysis” set (“fold”) contains the remaining seasons. This approach is roughly analogous to splitting data into training and test sets, but all seasons are used at some point in the training of the model (see Section 3.4 in Kuhn & Johnson, 2019). To reduce noise, we generated 10 bootstrap resamples of each fold and averaged the RMSE and R² values of model predictions from resamples.

Although it would be ideal and best practice to measure model performance with an independent test set, our dataset includes only ~20 seasons. We found that predictions of independent test sets of 2-3 seasons had unstable performance, which indicates we do not have sufficient power to measure model performance with a test set this small. Further, we suspect that large antigenic jumps in a small subset of seasons further contribute to variation in prediction accuracy across randomly selected test sets. Our rationale for using cross-validation instead of an independent test set is best described in Section 4.3 of Kuhn and Johnson’s book “Applied Predictive Modeling” (Kuhn & Johnson, 2013):

“When the number of samples is not large, a strong case can be made that a test set should be avoided because every sample may be needed for model building. Additionally, the size of the test set may not have sufficient power or precision to make reasonable judgements. Several researchers (Molinaro 2005; Martin and Hirschberg 1996; Hawkins et al. 2003) show that validation using a single test set can be a poor choice. Hawkins et al. (2003) concisely summarize this point: “holdout samples of tolerable size [...] do not match the cross-validation itself for reliability in assessing model fit and are hard to motivate.” Resampling methods, such as cross-validation, can be used to produce appropriate estimates of model performance using the training set. These are discussed in length in Sect. 4.4. Although resampling techniques can be misapplied, such as the example shown in Ambroise and McLachlan (2002), they often produce performance estimates superior to a single test set because they evaluate many alternate versions of the data.”

In our revised manuscript, we provide additional clarification of our methods (Lines 574-590):

“We created each forest by generating 3,000 regression trees. To determine the best performing model for each epidemic metric, we used leave-one-season-out (jackknife) cross-validation to train models and measure model performance, wherein each “assessment” set is one season of data predicted by the model, and the corresponding “analysis” set contains the remaining seasons. This approach is roughly analogous to splitting data into training and test sets, but all seasons are used at some point in the training of each model (Kuhn & Johnson, 2019). Due to the small size of our dataset (~20 seasons), evaluating the predictive accuracy of random forest models on a quasi-independent test set of 2-3 seasons produced unstable estimates. Instead of testing model performance on an independent test set, we generated 10 bootstrap resamples (“repeats”) of each analysis set (“fold”) and averaged the predictions of models trained on resamples (Kuhn & Johnson, 2013, 2019). For each epidemic metric, we report the mean root mean squared error (RMSE) and R² of predictions from the best tuned model. We used permutation importance (N=50 permutations) to estimate the relative importance of each predictor in determining target outcomes. Permutation importance is the decrease in prediction accuracy when a single feature (predictor) is randomly permuted, with larger values indicating more important variables. Because many features were collinear, we used conditional permutation importance to compute feature importance scores, rather than the standard marginal procedure (Altmann et al., 2010; Debeer & Strobl, 2020; Strobl et al., 2008; Strobl et al., 2007).”
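The resampling scheme in the quoted passage (one season held out per fold, 10 bootstrap resamples of each analysis set, predictions averaged across resamples) can be sketched as below. This is an illustration under stated assumptions: a one-variable least-squares line stands in for the random forest, the row layout (`season`, `x`, `y`) and the function names are hypothetical, and model tuning and conditional permutation importance are not shown.

```python
import random
from statistics import mean


def fit_line(rows):
    """Least-squares fit of y = a + b*x; a simple stand-in for the random forest."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # Fall back to a flat line if a resample happens to contain a single x value.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def loso_cv(rows, n_boot=10, seed=0):
    """Leave-one-season-out cross-validation with bootstrap resamples.

    For each held-out season, models are fit to n_boot bootstrap resamples
    of the remaining seasons (the analysis set), and their predictions for
    the held-out rows (the assessment set) are averaged.
    Returns the (row, prediction) pairs and the overall RMSE.
    """
    rng = random.Random(seed)
    seasons = sorted({r["season"] for r in rows})
    preds = []
    for held_out in seasons:
        analysis = [r for r in rows if r["season"] != held_out]
        assessment = [r for r in rows if r["season"] == held_out]
        models = [fit_line(rng.choices(analysis, k=len(analysis)))
                  for _ in range(n_boot)]
        for r in assessment:
            preds.append((r, mean(m(r["x"]) for m in models)))
    rmse = mean((r["y"] - p) ** 2 for r, p in preds) ** 0.5
    return preds, rmse
```

Every prediction returned by `loso_cv` comes from models that never saw the season being predicted, which is the property the reviewer asked about.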

(3) In response to the reviewer’s question about the sensitivity of results when one season is left out, we clarify that the variable importance scores in Figure 8 and model predictions in Figure 9 were generated by models tuned using leave-one-season-out cross-validation.

As explained above, in our leave-one-season-out cross-validation approach, each “assessment” set contains one season of data predicted by the model, and the corresponding “analysis” set (“fold”) contains the remaining seasons. We generated predictions of epidemic metrics and variable importance rankings by averaging the model output of 10 bootstrap resamples of each cross-validation fold.

In Lines 791-806, we describe which epidemic metrics have the highest prediction accuracy and report that random forest models tend to underpredict most epidemic metrics in seasons with high antigenic novelty:

“We measured correlations between observed values and model-predicted values at the HHS region level. Among the various epidemic metrics, random forest models produced the most accurate predictions of A(H3N2) subtype dominance (Spearman’s 𝜌 = 0.95, regional range = 0.85 – 0.97), peak incidence (𝜌 = 0.91, regional range = 0.72 – 0.95), and epidemic size (𝜌 = 0.9, regional range = 0.74 – 0.95), while predictions of effective Rt and epidemic intensity were less accurate (𝜌 = 0.81, regional range = 0.65 – 0.91; 𝜌 = 0.78, regional range = 0.63 – 0.92, respectively) (Figure 9). Random forest models tended to underpredict most epidemic targets in seasons with substantial H3 antigenic transitions, in particular the SY97 cluster seasons (1998-1999, 1999-2000) and the FU02 cluster season (2003-2004) (Figure 9).

For epidemic size and peak incidence, seasonal predictive error – the root-mean-square error (RMSE) across all regional predictions in a season – increased with H3 epitope distance (epidemic size, Spearman’s 𝜌 = 0.51, P = 0.02; peak incidence, 𝜌 = 0.63, P = 0.004) and N2 epitope distance (epidemic size, 𝜌 = 0.48, P = 0.04; peak incidence, 𝜌 = 0.48, P = 0.03) (Figure 9 – figure supplements 1 – 2). For models of epidemic intensity, seasonal RMSE increased with N2 epitope distance (𝜌 = 0.64, P = 0.004) but not H3 epitope distance (𝜌 = 0.06, P = 0.8) (Figure 9 – figure supplements 1 – 2). Seasonal RMSE of effective Rt and subtype dominance predictions did not correlate with H3 or N2 epitope distance (Figure 9 – figure supplements 1 – 2).”
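Seasonal predictive error, as defined in the quoted text (the RMSE across all regional predictions in a season), can be computed as in this small sketch. The record layout and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def seasonal_rmse(records):
    """Root-mean-square error per season.

    records: iterable of (season, observed, predicted) tuples,
    one per HHS region and season.
    """
    squared_errors = defaultdict(list)
    for season, observed, predicted in records:
        squared_errors[season].append((observed - predicted) ** 2)
    return {season: (sum(errs) / len(errs)) ** 0.5
            for season, errs in squared_errors.items()}
```

The resulting per-season values are what would then be correlated with H3 or N2 epitope distance.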

I think the competition (interference) results are really interesting, perhaps among the most interesting aspects of this work.

Thank you! We agree that our finding that subtype interference has a greater impact than viral evolution on A(H3N2) epidemics is one of the more interesting results in the study.

Have you seen the paper by Barrat-Charlaix et al? They found that LBI was not good at predicting frequency dynamics (see https://pubmed.ncbi.nlm.nih.gov/33749787/); instead, LBI was high for sequences like the consensus sequence, which was near to future strains. LBI also was not positively correlated with epidemic impact in Figure S7.

The local branching index (LBI) measures the rate of recent phylogenetic branching and approximates relative fitness among viral clades, with high LBI values representing greater fitness (Neher et al. 2014).

Two of this study’s co-authors (John Huddleston and Trevor Bedford) are also co-authors of Barrat-Charlaix et al. 2021. Barrat-Charlaix et al. 2021 assessed the performance of LBI in predicting the frequency dynamics and fixation of individual amino acid substitutions in A(H3N2) viruses. Our study is not focused on predicting the future success of A(H3N2) clades or the frequency dynamics or probability of fixation of individual substitutions. Instead, we use the standard deviation and Shannon diversity of LBI values in each season as a proxy for genealogical (clade-level) diversity. We find that, at a seasonal level, low diversity of H3 or N2 LBI values in the current season correlates with greater epidemic intensity, higher transmission rates, and shorter seasonal duration.

In the Discussion we provide an explanation for these correlation results (Lines 848-857):

“The local branching index (LBI) is traditionally used to predict the success of individual clades, with high LBI values indicating high viral fitness (Huddleston et al., 2020; Neher et al., 2014). In our epidemiological analysis, low diversity of H3 or N2 LBI in the current season correlated with greater epidemic intensity, higher transmission rates, and shorter seasonal duration. These associations suggest that low LBI diversity is indicative of a rapid selective sweep by one successful clade, while high LBI diversity is indicative of multiple co-circulating clades with variable seeding and establishment times over the course of an epidemic. A caveat is that LBI estimation is more sensitive to sequence sub-sampling schemes than strain-level measures. If an epidemic is short and intense (e.g., 1-2 months), a phylogenetic tree with our sub-sampling scheme (50 sequences per month) may not incorporate enough sequences to capture the true diversity of LBI values in that season.”

Figure 1 - LBI goes up over time. Is that partly to do with sampling? Overall how do higher sampling volumes in later years impact this analysis? (though you choose a fixed number of sequences so I guess you downsample to cope with that). I note that LBI is likely to be sensitive to sequencing density.

Thank you for pointing this out. We realized that increasing LBI Shannon diversity over the course of the study period was indeed an artefact of increasing sequence volume over time. Our sequence subsampling scheme involves selecting a random sample of up to 50 viruses per month, with up to 25 viruses selected from North America (if available) and the remaining sequences evenly divided across nine other global regions. In early seasons of the study (late 1990s/early 2000s), sampling was often too sparse to meet the 25 viruses/month threshold for North America or for the other global regions combined (H3: Figure 2 - figure supplement 1; N2: Figure 2 - figure supplement 2). Ecological diversity metrics are sensitive to sample size, which explains why LBI Shannon diversity appeared to steadily increase over time in our original submission. In our revised manuscript, we correct for uneven sample sizes across seasons before estimating Shannon diversity and clarify our methodology.
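The monthly subsampling rule described above can be sketched as follows. This is an illustrative reading, not the pipeline used in the study: the function name and arguments are hypothetical, and the example assumes that the quota left after the North American draw is split evenly (by integer division) across the other regions present.

```python
import random


def subsample_month(by_region, total=50, focal="North America",
                    focal_cap=25, seed=0):
    """Pick up to `total` strains for one month.

    by_region: dict mapping a region name to the list of strains
    sampled there in that month. Up to `focal_cap` strains come from
    the focal region; the remaining quota is divided evenly across the
    other regions (assumption: any remainder from the integer division
    is left unused).
    """
    rng = random.Random(seed)
    focal_pool = by_region.get(focal, [])
    picked = rng.sample(focal_pool, min(focal_cap, len(focal_pool)))
    others = [region for region in by_region if region != focal]
    if others:
        per_region = (total - len(picked)) // len(others)
        for region in others:
            pool = by_region[region]
            picked += rng.sample(pool, min(per_region, len(pool)))
    return picked
```

When a month has few sequences, as in the late 1990s and early 2000s, the function simply returns fewer than 50 strains, which is the uneven sample size that the rarefaction step later corrects for.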

Lines 443-482:

“Clade growth: The local branching index (LBI) measures the relative fitness of co-circulating clades, with high LBI values indicating recent rapid phylogenetic branching (Huddleston et al., 2020; Neher et al., 2014). To calculate LBI for each H3 and N2 sequence, we applied the LBI heuristic algorithm as originally described by Neher et al., 2014 to H3 and N2 phylogenetic trees, respectively. We set the neighborhood parameter 𝜏 to 0.4 and only considered viruses sampled between the current season 𝑡 and the previous season 𝑡 – 1 as contributing to recent clade growth in the current season 𝑡.

Variation in the phylogenetic branching rates of co-circulating A(H3N2) clades may affect the magnitude, intensity, onset, or duration of seasonal epidemics. For example, we expected that seasons dominated by a single variant with high fitness might have different epidemiological dynamics than seasons with multiple co-circulating clades with varying seeding and establishment times. We measured the diversity of clade growth rates of viruses circulating in each season by measuring the standard deviation (s.d.) and Shannon diversity of LBI values in each season. Given that LBI measures relative fitness among co-circulating clades, we did not compare overall clade growth rates (e.g., mean LBI) across seasons.

Each season’s distribution of LBI values is right-skewed and does not follow a normal distribution. We therefore bootstrapped the LBI values of each season in each replicate dataset 1000 times (1000 samples with replacement) and estimated the seasonal standard deviation of LBI from resamples, rather than directly from observed LBI values. We also tested the seasonal standard deviation of LBI from log transformed LBI values, which produced qualitatively equivalent results to bootstrapped LBI values in downstream analyses.

As an alternative measure of seasonal LBI diversity, we binned raw H3 and N2 LBI values into categories based on their integer values (e.g. an LBI value of 0.5 is assigned to the (0,1] bin) and estimated the exponential of the Shannon entropy (Shannon diversity) of LBI categories (Hill, 1973; Shannon, 1948). The Shannon diversity of LBI considers both the richness and relative abundance of viral clades with different growth rates in each season and is calculated as follows:

$${}^{1}D = \exp\left(-\sum_{i=1}^{R} p_i \ln p_i\right)$$

where ${}^{q}D$ is the effective number of categories or Hill numbers of order 𝑞 (here, clades with different growth rates), with 𝑞 defining the sensitivity of the true diversity to rare versus abundant categories (Hill, 1973). exp is the exponential function, $p_i$ is the proportion of LBI values belonging to the 𝑖th category, and 𝑅 is richness (the total number of categories). Shannon diversity ${}^{1}D$ (𝑞 = 1) estimates the effective number of categories in an assemblage using the geometric mean of their proportional abundances $p_i$ (Hill, 1973).

Because ecological diversity metrics are sensitive to sampling effort, we rarefied H3 and N2 sequence datasets prior to estimating Shannon diversity so that seasons had the same sample size. For each season in each replicate dataset, we constructed rarefaction and extrapolation curves of LBI Shannon diversity and extracted the Shannon diversity estimate of the sample size that was twice the size of the reference sample size (the smallest number of sequences obtained in any season during the study) (iNEXT R package) (Chao et al., 2014). Chao et al. found that their diversity estimators work well for rarefaction and short-range extrapolation when the extrapolated sample size is up to twice the reference sample size. For H3, we estimated seasonal diversity using replicate datasets subsampled to 360 sequences/season; for N2, datasets were subsampled to 230 sequences/season.”
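The Shannon diversity calculation described in the quoted Methods (bin LBI values by integer, then take the exponential of the Shannon entropy of the bin proportions) can be illustrated with the short sketch below. The function names are hypothetical, and the rarefaction and extrapolation step performed with the iNEXT R package is not reproduced.

```python
import math
from collections import Counter


def lbi_bin(value):
    """Upper edge of the integer bin (k-1, k]; e.g. 0.5 falls in (0, 1] -> 1."""
    return math.ceil(value)


def shannon_diversity(lbi_values):
    """Hill number of order 1: exp of the Shannon entropy of bin proportions."""
    counts = Counter(lbi_bin(v) for v in lbi_values)
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return math.exp(entropy)
```

A season whose LBI values all fall in one bin has a diversity of 1, and a season split evenly across two bins has a diversity of 2, which matches the interpretation of Shannon diversity as an effective number of categories.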

Estimating the Shannon diversity of LBI from datasets with even sampling across seasons removes the previous secular trend of increasing LBI diversity over time (Figure 2 in revised manuscript).

Figure 3 - I wondered what about the co-dominant times?

In Figure 3, orange points correspond to seasons in which A(H3N2) and A(H1N1) were codominant. We are not sure of the reviewer’s specific question concerning codominant seasons, but if it concerns whether antigenic drift is linked to epidemic magnitude among codominant seasons alone, we cannot perform separate regression analyses for these seasons because there are only two codominant seasons during the 22-season study period.

Figure 4 - Related to drift and epidemic size, dominance, etc. -- when is drift measured, and (if it's measured in season t), would larger populations create more drift, simply by having access to more opportunity (via a larger viral population size)? This is a bit 'devil's advocate' but what if some epidemiological/behavioural process causes a larger and/or later peak, and those gave rise to higher drift?

Seasonal drift is measured as the genetic or antigenic distance between viruses circulating during season t and viruses circulating in the prior season (𝑡 – 1) or two seasons ago (𝑡 – 2).
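The lagged comparison can be illustrated with a small sketch. It assumes, for illustration only, that distance is the mean pairwise count of differing residues at a chosen set of sites between viruses in season t and viruses in season t – lag; the function names, the integer season keys, and the toy sequences are hypothetical, and the distance measures used in the study (e.g., HI titer distances) are not reproduced.

```python
from statistics import mean


def site_distance(seq_a, seq_b, sites):
    """Number of the given (0-based) sites at which two aligned sequences differ."""
    return sum(seq_a[i] != seq_b[i] for i in sites)


def seasonal_distance(seqs_by_season, season, lag, sites):
    """Mean pairwise site distance between season t and season t - lag.

    seqs_by_season: dict mapping an integer season label to a list of
    aligned sequences of equal length.
    """
    current = seqs_by_season[season]
    previous = seqs_by_season[season - lag]
    return mean(site_distance(a, b, sites)
                for a in current for b in previous)
```

With `lag=1` the function compares a season with the one before it, and with `lag=2` it compares a season with the one two seasons earlier, mirroring the (𝑡 – 1) and (𝑡 – 2) lags in the text.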

Concerning the question about whether larger human populations lead to greater rates of antigenic drift, phylogeographic studies have repeatedly found that East-South-Southeast Asia are the source populations for A(H3N2) viruses (Bedford et al., 2015; Lemey et al., 2014), in part because these regions have tropical or subtropical climates and larger human populations, which enable year-round circulation and higher background infection rates. Larger viral populations (via larger host population sizes) and uninterrupted transmission may increase the efficiency of selection and the probability of strain survival and global spread (Wen et al., 2016). After A(H3N2) variants emerge in East-South-Southeast Asia and spread to other parts of the world, A(H3N2) viruses circulate via overlapping epidemics rather than local persistence (Bedford et al., 2015; Rambaut et al., 2008). Each season, A(H3N2) outbreaks in the US (and other temperate regions) are seeded by case importations from outside the US, genetic diversity peaks during the winter, and a strong genetic bottleneck typically occurs at the end of the season (Rambaut et al., 2008).

Due to their faster rates of antigenic evolution, A(H3N2) viruses undergo more rapid clade turnover and dissemination than A(H1N1) and B viruses, despite similar global migration networks across A(H3N2), A(H1N1), and B viruses (Bedford et al., 2015). Bedford et al. speculate that there is typically little geographic differentiation in A(H3N2) viruses circulating in each season because A(H3N2) viruses tend to infect adults, and adults are more mobile than children. Compared to A(H3N2) viruses, A(H1N1) and B viruses tend to have greater genealogical diversity, geographic differentiation, and longer local persistence times (Bedford et al., 2015; Rambaut et al., 2008). Thus, some A(H1N1) and B epidemics are reseeded by viruses that have persisted locally since prior epidemics (Bedford et al., 2015).

Theoretical models have shown that epidemiological processes can influence rates of antigenic evolution (Recker et al., 2007; Wen et al., 2016; Zinder et al., 2013), though the impact of flu epidemiology on viral evolution is likely constrained by the virus’s intrinsic mutation rate.

In conclusion, larger host population sizes and flu epidemiology can indeed influence rates of antigenic evolution. However, given that our study is US-centric and focuses on A(H3N2) viruses, these factors are likely not at play in our study, due to intrinsic biological characteristics of A(H3N2) viruses and the geographic location of our study.

We have added a clarifying sentence to the end of the Introduction to narrow the scope of the paper for the reader.

Lines 114-116: “Rather than characterize in situ evolution of A(H3N2) lineages circulating in the U.S., we study the epidemiological impacts of antigenic drift once A(H3N2) variants have arrived on U.S. soil and managed to establish and circulate at relatively high levels.”

Methods --

L 620 about rescaling and pre- vs post-pandemic times : tell us more - how has reporting changed? could any of this not be because of reporting but because of NPIs or otherwise? Overall there is a lot of rescaling going on. How sensitive are the results to it?

it would be unreasonable to ask for a sensitivity analysis for all the results for all the choices around data preparation, but some idea where there is a reason to think there might be a dependence on one of these choices would be great.

In response to the 2009 A(H1N1) pandemic, the US CDC and WHO increased laboratory testing capacity and strengthened epidemiological networks, leading to substantial, long-lasting improvements to influenza surveillance that are still in place today (https://www.cdc.gov/flu/weekly/overview.htm). At the beginning of the COVID-19 pandemic, influenza surveillance networks were quickly adapted to detect and understand the spread of SARS-CoV-2. The 2009 pandemic occurred over a time span of less than one year, and strict non-pharmaceutical interventions (NPIs), such as lockdowns and mask mandates, were not implemented. Thus, we attribute increases in test volume during the post-2009 period to improved virologic surveillance and laboratory testing capacity rather than changes in care-seeking behavior. In the revised manuscript, we include a figure (Figure 1 - figure supplement 2) that shows systematic increases in test volume in all HHS regions after the 2009 pandemic.

Given the substantial increase in influenza test volume after 2009, we opted to keep the time trend adjustment for the pre- and post-2009 pandemic periods and evaluate whether adjusting for regional reporting differences affects our results. When estimating univariate correlations between various A(H3N2) epidemic metrics and evolutionary indicators, we found qualitatively equivalent results for Spearman correlations and regression models when adjusting for both the pre- and post-2009 pandemic time periods and regional reporting versus adjusting only for the pre-/post-2009 pandemic time periods. Below, we share adjusted versions of Figure 3 (regression results) and Figure 3 - figure supplement 1 (Spearman correlations). Each figure adjusts only for differences in pre- and post-2009 pandemic reporting.

Author response image 1. Adjustment for pre- and post-2009 pandemic only.

Author response image 2. Adjustment for pre- and post-2009 pandemic only.

L635 - Why discretize the continuous LBI distribution and then use Shannon entropy when you could just use the variance and/or higher moments? (or quantiles)? Similarly, why not use the duration of the peak, rather than Shannon entropy? (though there, because presumably data are already binned weekly, and using duration would involve defining start and stop times, it's more natural than with LBI)

We realize that we failed to mention in the methods that we calculated the standard deviation of LBI in each season, in addition to the exponential of the Shannon entropy (Shannon diversity) of LBI. Both the Shannon diversity of LBI values and the standard deviation of LBI values were negatively correlated with effective Rt and epidemic intensity and positively correlated with seasonal duration. The two measures were similarly correlated with effective Rt and epidemic intensity (Figure 3 - figure supplements 2 - 3), while the Shannon diversity of LBI had slightly stronger correlations with seasonal duration than s.d. LBI (Figure 5). Thus, both measures of LBI diversity appear to capture potentially biologically important heterogeneities in clade growth rates.
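A minimal sketch of the two LBI diversity measures compared above. It assumes the continuous LBI values are discretized into equal-width bins before the entropy is computed. The bin count (`n_bins`), the function name, and the toy values are hypothetical and are not taken from the analysis code.

```python
import math
from collections import Counter
from statistics import stdev


def shannon_diversity(values, n_bins=10):
    """Exponential of the Shannon entropy of binned values.

    Assumption: equal-width bins spanning the observed range.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 1.0  # every value falls in one bin: one effective class
    width = (hi - lo) / n_bins
    # Assign each value to a bin; the maximum value goes in the last bin.
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return math.exp(entropy)


# Hypothetical LBI values for one season.
lbi = [0.05, 0.10, 0.12, 0.40, 0.45, 0.80, 0.85, 0.90]
print(shannon_diversity(lbi, n_bins=4), stdev(lbi))
```

The exponential of the Shannon entropy can be read as an effective number of equally common bins, so it is bounded by the bin count, whereas the standard deviation is on the scale of LBI itself.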

Separately, we use the inverse Shannon entropy of the incidence distribution to measure the spread of an A(H3N2) epidemic during the season, following the methods of Dalziel et al. 2018. The peak of an epidemic is a single time point at which the maximum incidence occurs. We have not encountered “the duration of the peak” before in epidemiology terminology, and, to our knowledge, there is not a robust way to measure the “duration of a peak,” unless one were to measure the time span between multiple points of maximum incidence or designate an arbitrary threshold for peak incidence that is not strictly the maximum incidence. Given that Shannon entropy is based on the normalized incidence distribution over the course of the entire influenza season (week 40 to week 20), it does not require designating an arbitrary threshold to describe epidemic intensity.

L642 - again why normalize epidemic intensities, and how sensitive are the results to this? I would imagine given that the RF results were unstable under leave-one-out analysis that some of those results could be quite sensitive to choices of normalization and scaling.

Epidemic intensity, defined as the inverse Shannon entropy of the incidence distribution, measures the spread of influenza cases across the weeks in a season. Following Dalziel et al. 2018, we estimated epidemic intensity from normalized incidence distributions rather than raw incidences so that epidemic intensity is invariant under differences in reporting rates and/or attack rates across regions and seasons. If we were to use raw incidences instead, HHS regions or seasons could have the appearance of greater or lower epidemic intensity (i.e., incidence concentrated within a few weeks or spread out over several weeks), due to differences in attack rates or test volume, rather than fundamental differences in the shapes of their epidemic curves. In other words, epidemic intensity is intended to measure the shape and spread of an epidemic, regardless of the actual volume of cases in a given region or season.

In the methods section, we provide further clarification for why epidemic intensities are based on normalized incidence distributions rather than raw incidences.

Lines 206-209: “Epidemic intensity is intended to measure the shape and spread of an epidemic, regardless of the actual volume of cases in a given region or season. Following the methodology of Dalziel et al. 2018, epidemic intensity values were normalized to fall between 0 and 1 so that epidemic intensity is invariant to differences in reporting rates and/or attack rates across regions and seasons.”
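The invariance argument above can be illustrated with a short sketch. It assumes that epidemic intensity is the reciprocal of the Shannon entropy of the weekly incidence proportions, and that the final rescaling to [0, 1] is a min-max rescaling across region-seasons. The rescaling choice, the function names, and the toy incidence values are assumptions made for illustration.

```python
import math


def inverse_entropy(weekly_incidence):
    """Reciprocal of the Shannon entropy of the normalized incidence distribution."""
    total = sum(weekly_incidence)
    if total <= 0:
        raise ValueError("season has no incidence")
    # Normalize to proportions; weeks with zero incidence contribute nothing.
    props = [x / total for x in weekly_incidence if x > 0]
    entropy = -sum(p * math.log(p) for p in props)
    if entropy == 0:
        raise ValueError("all incidence falls in one week; inverse entropy is undefined")
    return 1.0 / entropy


def rescale_unit_interval(values):
    """Min-max rescaling to [0, 1] (assumed form of the normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale identical values")
    return [(v - lo) / (hi - lo) for v in values]


sharp = [0, 0, 10, 80, 10, 0, 0]      # incidence concentrated in a few weeks
flat = [10, 10, 10, 10, 10, 10, 10]   # incidence spread evenly across weeks
moderate = [0, 5, 20, 50, 20, 5, 0]
raw = [inverse_entropy(s) for s in (sharp, flat, moderate)]
print(rescale_unit_interval(raw))
```

Because the incidence is divided by its seasonal total before the entropy is taken, multiplying every week by a constant (for example, a higher reporting rate) leaves the value unchanged.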

L643 - more information about what goes into Epidemia (variables, priors) such that it's replicable/understandable without the code would be good.

We now include additional information concerning the epidemic models used to estimate Rt, including all model equations, variables, and priors (Lines 210-276 in Methods).

L667 did you do breakpoint detection? Why linear models? Was log(incidence) used?

In our original submission, we estimated epidemic onsets using piecewise regression models (Lines 666-674 in original manuscript), which model non-linear relationships with breakpoints by iteratively fitting linear models (Muggeo, 2003). Piecewise regression falls under the umbrella of parametric methods for breakpoint detection.

We did not include results from linear models fit to log(incidence) or GLMs with Gaussian error distributions and log links, for two reasons. First, models fit to log-transformed data require non-zero values as inputs. Although breakpoint detection does not necessarily require weeks of zero incidence leading up to the start of an outbreak, limiting the time period for breakpoint detection to weeks with non-zero incidence (so that we could use log-transformed incidence) substantially pushed back our previous, more biologically plausible estimates of epidemic onset weeks. Second, as an alternative to limiting the dataset to weeks with non-zero incidence, we tried adding a small positive number to weekly incidences so that we could fit models to log-transformed incidence for the whole time period spanning epidemic week 40 (the start of the influenza season) to the first week of maximum incidence. Fitting models to log-transformed incidences produced unrealistic breakpoint locations, potentially because log transformations (1) linearize data, and (2) stabilize variance by reducing the impact of extreme values. Due to the short time span used for breakpoint detection, log transforming incidence diminishes abrupt changes in incidence at the beginning of outbreaks, making it difficult for models to estimate biologically plausible breakpoint locations. Log transformations of incidence may be more useful when analyzing time series spanning multiple seasons, rather than short time spans with sharp changes in incidence (i.e., the exponential growth phase of a single flu outbreak).

As an alternative to piecewise regression, our revised manuscript also estimates epidemic onsets using a Bayesian ensemble algorithm that accounts for the time series nature of incidence data and allows for complex, non-linear trajectories interspersed with change points (BEAST - a Bayesian estimator of Abrupt change, Seasonal change, and Trend; Zhao et al., 2019). Although a few regional onset times differed across the two methods, our conclusions did not change concerning correlations between viral fitness and epidemic onset timing.

We have rewritten the methods section for estimating epidemic onsets to clarify our methodology and to include the BEAST method (Lines 292-308):

“We estimated the regional onsets of A(H3N2) virus epidemics by detecting breakpoints in A(H3N2) incidence curves at the beginning of each season. The timing of the breakpoint in incidence represents epidemic establishment (i.e., sustained transmission) rather than the timing of influenza introduction or arrival (Charu et al., 2017). We used two methods to estimate epidemic onsets: (1) piecewise regression, which models non-linear relationships with break points by iteratively fitting linear models to each segment (segmented R package) (Muggeo, 2008; Muggeo, 2003), and (2) a Bayesian ensemble algorithm (BEAST – a Bayesian estimator of Abrupt change, Seasonal change, and Trend) that explicitly accounts for the time series nature of incidence data and allows for complex, non-linear trajectories interspersed with change points (Rbeast R package) (Zhao et al., 2019). For each region in each season, we limited the time period of breakpoint detection to epidemic week 40 to the first week of maximum incidence and did not estimate epidemic onsets for regions with insufficient signal, which we defined as fewer than three weeks of consecutive incidence and/or greater than 30% of weeks with missing data. We successfully estimated A(H3N2) onset timing for most seasons, except for three A(H1N1) dominant seasons: 2000-2001 (0 regions), 2002-2003 (3 regions), and 2009-2010 (0 regions). Estimates of epidemic onset weeks were similar when using piecewise regression versus the BEAST method, and downstream analyses of correlations between viral fitness indicators and onset timing produced equivalent results. We therefore report results from onsets estimated via piecewise regression.”
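The idea of locating an onset as a breakpoint between two linear segments can be sketched as follows. This is a brute-force grid search that fits two independent least-squares lines on either side of each candidate breakpoint. It is a deliberately simpler stand-in, not the iterative estimation implemented in the segmented R package (Muggeo, 2003) and not the BEAST algorithm. The function names, the `min_points` parameter, and the toy incidence series are hypothetical, and the sketch omits the screening for insufficient signal and missing data.

```python
def line_sse(xs, ys):
    """Sum of squared residuals around the ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def estimate_onset(weeks, incidence, min_points=3):
    """Return the first week of the second segment at the best-fitting split.

    Each segment must contain at least `min_points` weeks so that a line
    cannot fit a tiny segment perfectly by construction.
    """
    best_week, best_sse = None, float("inf")
    for k in range(min_points, len(weeks) - min_points + 1):
        sse = line_sse(weeks[:k], incidence[:k]) + line_sse(weeks[k:], incidence[k:])
        if sse < best_sse:
            best_week, best_sse = weeks[k], sse
    return best_week


# Toy series: flat baseline from week 40, then steady growth.
weeks = list(range(40, 50))
incidence = [1, 1, 1, 1, 1, 5, 8, 11, 14, 17]
print(estimate_onset(weeks, incidence))
```

Fitting on the raw incidence scale, as in this toy series, keeps the abrupt change at the start of growth visible, which is the property that the log transformation was reported to diminish.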

L773 national indicators -- presumably this is because you don't have regional-level information, but it might be worth saying that earlier so it doesn't read like there are other indicators now, called national indicators, that we should have heard of

In the revised manuscript, we move a paragraph that was at the beginning of the Results to the beginning of the Methods.

Lines 123-132:

“Our study focuses on the impact of A(H3N2) virus evolution on seasonal epidemics from seasons 1997-1998 to 2018-2019 in the U.S.; whenever possible, we make use of regionally disaggregated indicators and analyses. We start by identifying multiple indicators of influenza evolution each season based on changes in HA and NA. Next, we compile influenza virus subtype-specific incidence time series for U.S. Department of Health and Human Service (HHS) regions and estimate multiple indicators characterizing influenza A(H3N2) epidemic dynamics each season, including epidemic burden, severity, type/subtype dominance, timing, and the age distribution of cases. We then assess univariate relationships between national indicators of evolution and regional epidemic characteristics. Lastly, we use multivariable regression models and random forest models to measure the relative importance of viral evolution, heterosubtypic interference, and prior immunity in predicting regional A(H3N2) epidemic dynamics.”

In Lines 484-487 in the Methods, we now mention that measures of seasonal antigenic and genetic distance are at the national level.

“For each replicate dataset, we estimated national-level genetic and antigenic distances between influenza viruses circulating in consecutive seasons by calculating the mean distance between viruses circulating in the current season 𝑡 and viruses circulating during the prior season (𝑡 – 1 year; one season lag) or two prior seasons ago (𝑡 – 2 years; two season lag).”
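As a rough illustration of the seasonal lag calculation, the sketch below averages a pairwise distance over all pairs of viruses drawn from the current season and a lagged season. It assumes an unweighted mean and uses Hamming distance at a chosen set of sites as the distance measure. The function names, the toy sequences, and the site indices are hypothetical, and the study's actual indicators (for example, HI titer distances) are computed differently.

```python
from itertools import product


def hamming(seq_a, seq_b, sites=None):
    """Count differing positions, optionally restricted to 0-based site indices."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    positions = range(len(seq_a)) if sites is None else sites
    return sum(seq_a[i] != seq_b[i] for i in positions)


def mean_seasonal_distance(current, lagged, sites=None):
    """Unweighted mean distance over all (current, lagged) virus pairs."""
    pairs = list(product(current, lagged))
    return sum(hamming(a, b, sites) for a, b in pairs) / len(pairs)


# Hypothetical aligned amino acid fragments, keyed by season.
seasons = {
    "2016-2017": ["NKSTG", "NKSTG"],
    "2017-2018": ["NKSAG", "NKSTG"],
    "2018-2019": ["DKSAG", "DKSAE"],
}
epitope_sites = [0, 3, 4]  # hypothetical subset of positions
one_season_lag = mean_seasonal_distance(seasons["2018-2019"], seasons["2017-2018"], epitope_sites)
two_season_lag = mean_seasonal_distance(seasons["2018-2019"], seasons["2016-2017"], epitope_sites)
print(one_season_lag, two_season_lag)
```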

L782 Why Beta regression and what is "the resampled dataset" ?

Beta regression is appropriate for models of subtype dominance, epidemic intensity, and age-specific proportions of ILI cases because these data are continuous and restricted to the interval (0, 1) (Ferrari & Cribari-Neto, 2004). “The resampled dataset” refers to the “1000 bootstrap replicates of the original dataset (1000 samples with replacement)” mentioned in Lines 777-778 of the original manuscript.

In the revised manuscript, we include more background information about Beta regression models, and explicitly mention that regression models were fit to 1000 bootstrap replicates of the original dataset.

Lines 503-507:

“For subtype dominance, epidemic intensity, and age-specific proportions of ILI cases, we fit Beta regression models with logit links. Beta regression models are appropriate when the variable of interest is continuous and restricted to the interval (0, 1) (Ferrari & Cribari-Neto, 2004). For each epidemic metric, we fit the best-performing regression model to 1000 bootstrap replicates of the original dataset.”
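A minimal sketch of the two ingredients named above: the log-likelihood of a Beta regression with a logit link, and bootstrap resampling with replacement. It assumes the mean-precision parameterization of Ferrari & Cribari-Neto (2004), in which the response follows Beta(μφ, (1 − μ)φ), and a single predictor. The function names, the parameter values, and the toy data are hypothetical. Fitting would additionally require a numerical optimizer to maximize this log-likelihood on each replicate, which is not shown.

```python
import math
import random


def beta_loglik(y, x, intercept, slope, precision):
    """Log-likelihood of a Beta regression with a logit link (mean-precision form)."""
    total = 0.0
    for yi, xi in zip(y, x):
        mu = 1.0 / (1.0 + math.exp(-(intercept + slope * xi)))  # inverse logit
        a, b = mu * precision, (1.0 - mu) * precision            # Beta shape parameters
        total += (math.lgamma(precision) - math.lgamma(a) - math.lgamma(b)
                  + (a - 1.0) * math.log(yi) + (b - 1.0) * math.log(1.0 - yi))
    return total


def bootstrap_replicates(rows, n_replicates=1000, seed=0):
    """Resample rows with replacement; each replicate has the original size."""
    rng = random.Random(seed)
    return [rng.choices(rows, k=len(rows)) for _ in range(n_replicates)]


# Toy data: a proportion in (0, 1) that increases with a predictor.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.2, 0.4, 0.6, 0.8]
print(beta_loglik(y, x, intercept=-1.4, slope=0.9, precision=50.0))

replicates = bootstrap_replicates(list(zip(x, y)))
print(len(replicates), len(replicates[0]))
```

The logit link keeps the modeled mean strictly inside (0, 1), which is why responses that are exactly 0 or 1 would need separate handling before this likelihood could be evaluated.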

The github is clear, comprehensive and well-documented, at least at a brief glance.

Thank you! At the time of resubmission, our GitHub repository is updated to incorporate feedback from the reviewers.

References

Altmann, A., Tolosi, L., Sander, O., & Lengauer, T. (2010). Permutation importance: a corrected feature importance measure. Bioinformatics, 26(10), 1340-1347. https://doi.org/10.1093/bioinformatics/btq134

Barrat-Charlaix, P., Huddleston, J., Bedford, T., & Neher, R. A. (2021). Limited Predictability of Amino Acid Substitutions in Seasonal Influenza Viruses. Mol Biol Evol, 38(7), 2767-2777. https://doi.org/10.1093/molbev/msab065

Bedford, T., Riley, S., Barr, I. G., Broor, S., Chadha, M., Cox, N. J., Daniels, R. S., Gunasekaran, C. P., Hurt, A. C., Kelso, A., Klimov, A., Lewis, N. S., Li, X., McCauley, J. W., Odagiri, T., Potdar, V., Rambaut, A., Shu, Y., Skepner, E., . . . Russell, C. A. (2015). Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature, 523(7559), 217-220. https://doi.org/10.1038/nature14460

Chao, A., Gotelli, N. J., Hsieh, T. C., Sander, E. L., Ma, K. H., Colwell, R. K., & Ellison, A. M. (2014). Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecological Monographs, 84(1), 45-67. https://doi.org/10.1890/13-0133.1

Charu, V., Zeger, S., Gog, J., Bjornstad, O. N., Kissler, S., Simonsen, L., Grenfell, B. T., & Viboud, C. (2017). Human mobility and the spatial transmission of influenza in the United States. PLOS Comput Biol, 13(2), e1005382. https://doi.org/10.1371/journal.pcbi.1005382

Dalziel, B. D., Kissler, S., Gog, J. R., Viboud, C., Bjornstad, O. N., Metcalf, C. J. E., & Grenfell, B. T. (2018). Urbanization and humidity shape the intensity of influenza epidemics in U.S. cities. Science, 362(6410), 75-79. https://doi.org/10.1126/science.aat6030

Debeer, D., & Strobl, C. (2020). Conditional permutation importance revisited. BMC Bioinformatics, 21(1), 307. https://doi.org/10.1186/s12859-020-03622-2

Dhanasekaran, V., Sullivan, S., Edwards, K. M., Xie, R., Khvorov, A., Valkenburg, S. A., Cowling, B. J., & Barr, I. G. (2022). Human seasonal influenza under COVID-19 and the potential consequences of influenza lineage elimination. Nat Commun, 13(1), 1721. https://doi.org/10.1038/s41467-022-29402-5

Ferrari, S., & Cribari-Neto, F. (2004). Beta Regression for Modelling Rates and Proportions. Journal of Applied Statistics, 31(7), 799-815. https://doi.org/10.1080/0266476042000214501

Garten, R. J., Davis, C. T., Russell, C. A., Shu, B., Lindstrom, S., Balish, A., Sessions, W. M., Xu, X., Skepner, E., Deyde, V., Okomo-Adhiambo, M., Gubareva, L., Barnes, J., Smith, C. B., Emery, S. L., Hillman, M. J., Rivailler, P., Smagala, J., de Graaf, M., . . . Cox, N. J. (2009). Antigenic and genetic characteristics of swine-origin 2009 A(H1N1) influenza viruses circulating in humans. Science, 325(5937), 197-201. https://doi.org/10.1126/science.1176225

Grebe, K. M., Yewdell, J. W., & Bennink, J. R. (2008). Heterosubtypic immunity to influenza A virus: where do we stand? Microbes Infect, 10(9), 1024-1029. https://doi.org/10.1016/j.micinf.2008.07.002

Hill, M. O. (1973). Diversity and Evenness: A Unifying Notation and Its Consequences. Ecology, 54(2), 427-432. https://doi.org/10.2307/1934352

Huddleston, J., Barnes, J. R., Rowe, T., Xu, X., Kondor, R., Wentworth, D. E., Whittaker, L., Ermetal, B., Daniels, R. S., McCauley, J. W., Fujisaki, S., Nakamura, K., Kishida, N., Watanabe, S., Hasegawa, H., Barr, I., Subbarao, K., Barrat-Charlaix, P., Neher, R. A., & Bedford, T. (2020). Integrating genotypes and phenotypes improves long-term forecasts of seasonal influenza A/H3N2 evolution. Elife, 9, e60067. https://doi.org/10.7554/eLife.60067

Kuhn, M., & Johnson, K. (2013). Applied predictive modeling (Vol. 26). Springer.

Kuhn, M., & Johnson, K. (2019). Feature engineering and selection: A practical approach for predictive models. Chapman and Hall/CRC.

Lee, E. C., Arab, A., Goldlust, S. M., Viboud, C., Grenfell, B. T., & Bansal, S. (2018). Deploying digital health data to optimize influenza surveillance at national and local scales. PLoS Comput Biol, 14(3), e1006020. https://doi.org/10.1371/journal.pcbi.1006020

Lemey, P., Rambaut, A., Bedford, T., Faria, N., Bielejec, F., Baele, G., Russell, C. A., Smith, D. J., Pybus, O. G., Brockmann, D., & Suchard, M. A. (2014). Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza H3N2. PLOS Pathog, 10(2), e1003932. https://doi.org/10.1371/journal.ppat.1003932

Muggeo, V. (2008). Segmented: An R Package to Fit Regression Models With Broken-Line Relationships. R News, 8, 20-25.

Muggeo, V. M. (2003). Estimating regression models with unknown break-points. Stat Med, 22(19), 3055-3071. https://doi.org/10.1002/sim.1545

Neher, R. A., Russell, C. A., & Shraiman, B. I. (2014). Predicting evolution from the shape of genealogical trees. Elife, 3, e03568. https://doi.org/10.7554/eLife.03568

Rambaut, A., Pybus, O. G., Nelson, M. I., Viboud, C., Taubenberger, J. K., & Holmes, E. C. (2008). The genomic and epidemiological dynamics of human influenza A virus. Nature, 453(7195), 615-619. https://doi.org/10.1038/nature06945

Recker, M., Pybus, O. G., Nee, S., & Gupta, S. (2007). The generation of influenza outbreaks by a network of host immune responses against a limited set of antigenic types. PNAS, 104(18), 7711-7716. https://doi.org/10.1073/pnas.0702154104

Shannon, C. E. (1948). A mathematical theory of communication. The Bell system technical journal, 27(3), 379-423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x

Smith, G. J., Vijaykrishna, D., Bahl, J., Lycett, S. J., Worobey, M., Pybus, O. G., Ma, S. K., Cheung, C. L., Raghwani, J., Bhatt, S., Peiris, J. S., Guan, Y., & Rambaut, A. (2009). Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature, 459(7250), 1122-1125. https://doi.org/10.1038/nature08182

Sridhar, S. (2016). Heterosubtypic T-Cell Immunity to Influenza in Humans: Challenges for Universal T-Cell Influenza Vaccines. Front Immunol, 7, 195. https://doi.org/10.3389/fimmu.2016.00195

Strobl, C., Boulesteix, A. L., Kneib, T., Augustin, T., & Zeileis, A. (2008). Conditional variable importance for random forests. BMC Bioinformatics, 9, 307. https://doi.org/10.1186/1471-2105-9-307

Strobl, C., Boulesteix, A. L., Zeileis, A., & Hothorn, T. (2007). Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics, 8, 25. https://doi.org/10.1186/1471-2105-8-25

Terajima, M., Babon, J. A., Co, M. D., & Ennis, F. A. (2013). Cross-reactive human B cell and T cell epitopes between influenza A and B viruses. Virol J, 10, 244. https://doi.org/10.1186/1743-422x-10-244

Webster, R. G., Bean, W. J., Gorman, O. T., Chambers, T. M., & Kawaoka, Y. (1992). Evolution and ecology of influenza A viruses. Microbiological Reviews, 56(1), 152-179. https://doi.org/10.1128/mr.56.1.152-179.1992

Wen, F., Bedford, T., & Cobey, S. (2016). Explaining the geographical origins of seasonal influenza A(H3N2). Proc Biol Sci, 283(1838). https://doi.org/10.1098/rspb.2016.1312

Yan, L., Neher, R. A., & Shraiman, B. I. (2019). Phylodynamic theory of persistence, extinction and speciation of rapidly adapting pathogens. Elife, 8. https://doi.org/10.7554/eLife.44205

Zhao, K., Wulder, M. A., Hu, T., Bright, R., Wu, Q., Qin, H., Li, Y., Toman, E., Mallick, B., Zhang, X., & Brown, M. (2019). Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: A Bayesian ensemble algorithm. Remote Sensing of Environment, 232, 111181. https://doi.org/10.1016/j.rse.2019.04.034

Zinder, D., Bedford, T., Gupta, S., & Pascual, M. (2013). The Roles of Competition and Mutation in Shaping Antigenic and Genetic Diversity in Influenza. PLOS Pathogens, 9(1). https://doi.org/10.1371/journal.ppat.1003104

Associated Data

    Data Citations

    1. Perofsky A. 2024. aperofsky/H3N2_Antigenic_Epi: Initial release (v1.0.0) Zenodo.

    Supplementary Materials

    Figure 2—source data 1. A/H3 sequence counts in five subsampled datasets.

    We downloaded all H3 sequences and associated metadata from the GISAID EpiFlu database and focused our analysis on complete H3 sequences that were sampled between January 1, 1997, and October 1, 2019. To account for variation in sequence availability across global regions, we subsampled the selected sequences five times to representative sets of no more than 50 viruses per month, with preferential sampling for North America. Each month up to 25 viruses were selected from North America (when available) and up to 25 viruses were selected from nine other global regions (when available), with even sampling across the other global regions (China, Southeast Asia, West Asia, Japan and Korea, South Asia, Oceania, Europe, South America, and Africa).

    Figure 2—source data 2. A/N2 sequence counts in five subsampled datasets.

    We downloaded all N2 sequences and associated metadata from the GISAID EpiFlu database and focused our analysis on complete N2 sequences that were sampled between January 1, 1997, and October 1, 2019. To account for variation in sequence availability across global regions, we subsampled the selected sequences five times to representative sets of no more than 50 viruses per month, with preferential sampling for North America. Each month up to 25 viruses were selected from North America (when available) and up to 25 viruses were selected from nine other global regions (when available), with even sampling across the other global regions (China, Southeast Asia, West Asia, Japan and Korea, South Asia, Oceania, Europe, South America, and Africa).

    Supplementary file 1. GISAID accessions and metadata for influenza H3 and N2 sequences, including originating labs and submitting labs.
    elife-91849-supp1.xlsx (1.2MB, xlsx)
    MDAR checklist

    Data Availability Statement

    Sequence data are available from GISAID using accession ids provided in Supplementary file 1. Source code for phylogenetic analyses, inferred HI titers from serological measurements, and evolutionary fitness measurements are available in the GitHub repository https://github.com/blab/perofsky-ili-antigenicity (copy archived at Huddleston, 2024). The five replicate trees for HA and NA can be found at https://nextstrain.org/groups/blab/ under the keyword "perofsky-ili-antigenicity". Epidemiological data, datasets combining seasonal evolutionary fitness measurements and epidemic metrics, and source code for calculating epidemic metrics and performing statistical analyses are available at https://doi.org/10.5281/zenodo.11188848 and https://github.com/aperofsky/H3N2_Antigenic_Epi (copy archived at Perofsky, 2024). Raw serological measurements are restricted from public distribution by previous data sharing agreements.

    The following dataset was generated:

    Perofsky A. 2024. aperofsky/H3N2_Antigenic_Epi: Initial release (v1.0.0) Zenodo.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
