Abstract
Current vaccines provide limited protection against rapidly evolving viruses. For example, Centers for Disease Control and Prevention estimates show that the overall influenza vaccine effectiveness against outpatient illness in the United States averaged below 40% between 2012 and 2021. Moreover, the clinical outcomes of a vaccine can be assessed only retrospectively. Here we propose an in silico method named VaxSeer that predicts the antigenic match of vaccine candidates with circulating viruses, in the context of the viruses’ relative dominance in the future influenza season. Based on 10 years of retrospective evaluation using sequencing and antigenicity data, our approach consistently selects strains with better empirical antigenic matches to circulating viruses than annual recommendations. Finally, our predicted estimate of antigenic match exhibits a strong correlation with influenza vaccine effectiveness and reduction in disease burden, highlighting the promise of this framework to drive the vaccine selection process.
Subject terms: Machine learning, Epidemiology, Vaccines
By matching antigenicity prediction with a forecast of circulating strains in the next season, a model is shown to outperform standard recommendations for vaccine design in terms of vaccine effectiveness and disease burden in multiple evaluations throughout different influenza seasons.
Main
Twice a year, a panel of experts from the World Health Organization (WHO) gathers to recommend vaccine strains for the upcoming influenza seasons. To reduce the disease burden of influenza, the goal of the panel is to optimize ‘vaccine effectiveness’, which is defined as the reduction in the odds of influenza infection among vaccinated individuals compared to those who are not vaccinated1. After each season, influenza vaccine effectiveness is estimated by institutions such as the Centers for Disease Control and Prevention (CDC) through observational test-negative studies involving participants with medically attended outpatient illness2. If the vaccine strains that were selected turned out to be well aligned with the distribution of viral strains that were circulating during the influenza season, the effectiveness of an inactivated influenza vaccine may reach up to 40–60% in that season3. However, despite decades of research in prevention and surveillance, current influenza vaccines provide limited protection. CDC estimates from the US Influenza Vaccine Effectiveness Network reveal that overall vaccine effectiveness across subtypes and age groups dropped below 40% in five of the 10 years from 2012 to 2021 (refs. 4–12), with a concurrent rise in influenza-related hospitalization rates (Extended Data Fig. 1). For instance, during the 2014–2015 winter season, vaccine effectiveness was only 19%6.
Extended Data Fig. 1. Declining vaccine effectiveness, rising hospitalization rates in the U.S., and antigenic drift related to influenza.
(a) The vaccine effectiveness is obtained from the CDC, which represents the overall effectiveness across all influenza subtypes (excluding 2020 due to unavailable data). (b) The hospitalization rates are obtained from the Influenza Hospitalization Surveillance Network (FluSurv-NET)85. The 2019 and 2020 seasons are excluded due to the incomplete data influenced by SARS-CoV-2. (c) Antigenic drift of A/H3N2 between the period of vaccine selection and the following Northern Hemisphere Influenza Season. Prior to the vaccine strain selection, the prevailing clade was 3C.2a2. However, in the subsequent influenza season, the dominant clades shifted to 3C.2a1b.1, 3C.3a.1, and 3C.2a1b.2. The density of each clade is calculated by aggregating the occurrence of HA sequences belonging to that clade. HA sequences are obtained from GISAID, and their clades are annotated by Nextclade86.
Many factors influence vaccine effectiveness, including the population’s immune history and the platforms and formulations used in vaccine production13–18. However, the antigenic match of vaccine strains to circulating viral strains is particularly important. This work focuses on improving this antigenic match, a necessary condition for vaccine effectiveness11,15,19.
Antigenic match can be assessed using two types of data. First, surveillance datasets such as the Global Initiative on Sharing All Influenza Data (GISAID)20 gather the sequences of viral proteins along with their collection time, providing the distribution of viral genotypes in the season of interest. Based on these data, we can compute a viral strain’s ‘dominance’, which is the frequency of its occurrence during a particular season. Second, ‘antigenicity’ data measure the inhibition capacity of antibodies produced by a vaccine to a specific viral strain. For example, in vitro assays such as hemagglutination inhibition (HI) tests21 using postinfection ferret antisera are conducted by WHO Collaborating Centres (WHO CCs) to quantitatively analyze the antigenicity of candidate vaccines to circulating viruses. Both data sources contribute to antigenic match. Specifically, we quantify a vaccine’s antigenic match as the average of its antigenicity across circulating viral strains, weighted by each of their dominance. In this paper, we refer to this measure of antigenic match as the ‘coverage score’.
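The verbal definition of the coverage score can be written compactly. The notation below is introduced here for illustration only and is not taken from the paper: v is a candidate vaccine strain, X is the set of circulating viral strains, p_t(x) is the dominance of strain x in season t, and a(v, x) is the antigenicity of v against x. How a(v, x) is derived from an HI measurement, including its sign convention, is left unspecified in this sketch.

```latex
\[
  \mathrm{CS}_t(v) \;=\; \sum_{x \in \mathcal{X}} p_t(x)\, a(v, x),
  \qquad \text{with} \quad \sum_{x \in \mathcal{X}} p_t(x) = 1 .
\]
```

Because the dominance values sum to one, the sum is a weighted average: a strain that accounts for half of the circulating viruses contributes half of the score.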
Because inactivated influenza vaccines require a production period of 6–9 months22, we need to evaluate the candidate vaccine strains prospectively. However, viral strain distributions at the time of vaccine selection may differ from those in the upcoming influenza season10,23,24 (illustrated in Extended Data Fig. 1c). Moreover, it is prohibitively expensive to validate the antigenicity of all candidate vaccines against every virus potentially circulating in the future season, and there are limited viral specimens available for study. In vitro assays such as the HI test21 can be run on only a limited number of vaccine candidates, typically fewer than 10 (ref. 25).
We hypothesize that the coverage score can be prospectively predicted using machine learning models trained on the aforementioned data from past seasons. In this paper, we propose VaxSeer, which integrates predictors for two components of the coverage score: one for dominance and another for antigenicity. Because the hemagglutinin (HA) protein plays a critical role in viral infection and immune response, we represent vaccine and virus strains solely through their HA protein sequences26,27. Our dominance predictor estimates the probability of an input HA protein sequence occurring in a given season. Dominance changes over time due to the competition between viral strains, and the rate of its change depends on the combined effect of all mutations in the protein sequence. Traditional epidemiological studies estimate the rate of change as the sum of independent contributions from single amino acid mutations28–37. These approaches may not fully capture higher-level properties, such as protein stability or host interactions, which depend on complex interactions between residues across the protein sequence38. In this study, we leveraged protein language models and an ordinary differential equation (ODE) to automatically capture the relationship between protein sequences and their dominance. Although existing protein language models assume a static fitness landscape39–43, our approach accounts for dynamic dominance shifts, making it better suited for rapidly evolving viruses such as influenza. In addition to the dominance prediction model, we also developed an antigenicity prediction model, which takes as input a pair of protein sequences (virus and vaccine) and outputs their predicted HI test outcome. Our antigenicity prediction model enables the in silico prediction of HI test outcomes for any vaccine–virus pair, reducing the need for time-consuming antigenicity experiments.
Using our model, we conducted a 10-year retrospective evaluation on two influenza subtypes: A/H3N2 and A/H1N1. We examined the correlation between the coverage score and real-world vaccine effectiveness estimated by three independent institutions across the United States, Europe and Canada as well as the number of symptomatic illnesses and medical visits averted by influenza vaccines in the United States, as estimated by the CDC. We then studied the ability of VaxSeer to prospectively identify vaccine strains with high antigenic match. In particular, we evaluated our proposed vaccine strains against the WHO’s annual recommendations, based on their empirical coverage scores, computed retrospectively from observed dominance and experimental HI data. Overall, our evaluations highlight the potential of VaxSeer to aid vaccine selection by prioritizing high-antigenic-match candidates from vast viral pools for resource-intensive laboratory or clinical validation.
Results
Overview of VaxSeer
Selecting vaccine strains based on predicted coverage score
To rank vaccine strains during the selection process, we predict their coverage scores with an in silico approach (Fig. 1). Given a set of circulating viral proteins and a set of candidate vaccine proteins, the dominance predictor estimates the expected dominance of each viral protein in the upcoming season. In addition, the antigenicity predictor uses a pairwise sequence alignment of viral and vaccine proteins as input and predicts the corresponding HI test results. Together, these predictions provide the predicted coverage score for each candidate vaccine by averaging its predicted antigenicity across multiple circulating viruses, weighted by their predicted dominance.
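The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the VaxSeer implementation: the function names, the strain labels and all numbers are hypothetical, the two predictors are replaced by lookup tables, and antigenicity values are assumed to be oriented so that a higher value means a better match (if the HI-derived value is oriented the other way, the sort order should be reversed).

```python
from typing import Dict, List, Tuple


def coverage_score(vaccine: str,
                   dominance: Dict[str, float],
                   antigenicity: Dict[Tuple[str, str], float]) -> float:
    """Dominance-weighted average of antigenicity for one candidate vaccine."""
    total_weight = sum(dominance.values())
    if total_weight <= 0:
        raise ValueError("dominance weights must sum to a positive value")
    weighted_sum = sum(weight * antigenicity[(vaccine, virus)]
                       for virus, weight in dominance.items())
    # Dividing by the total weight keeps the result an average even if the
    # supplied dominance values are not exactly normalised.
    return weighted_sum / total_weight


def rank_candidates(candidates: List[str],
                    dominance: Dict[str, float],
                    antigenicity: Dict[Tuple[str, str], float]
                    ) -> List[Tuple[str, float]]:
    """Return (vaccine, score) pairs sorted from highest to lowest score."""
    scored = [(v, coverage_score(v, dominance, antigenicity)) for v in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up toy inputs standing in for the outputs of the two predictors.
toy_dominance = {"virus_A": 0.6, "virus_B": 0.3, "virus_C": 0.1}
toy_antigenicity = {
    ("vaccine_X", "virus_A"): 1.0, ("vaccine_X", "virus_B"): 4.0,
    ("vaccine_X", "virus_C"): 4.0,
    ("vaccine_Y", "virus_A"): 4.0, ("vaccine_Y", "virus_B"): 1.0,
    ("vaccine_Y", "virus_C"): 1.0,
}
ranking = rank_candidates(["vaccine_X", "vaccine_Y"], toy_dominance, toy_antigenicity)
```

In this toy case vaccine_Y ranks first even though it matches fewer strains, because the one strain it matches well is the most dominant.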
Fig. 1. VaxSeer features a two-track model for predicting the coverage score.
For each circulating virus strain, its HA protein sequence is fed into the dominance predictor, which outputs the dominance of this protein at the forecasting time (for example, the next influenza season). For a candidate vaccine strain, its HA protein and that of a circulating virus strain are aligned and input into the antigenicity predictor to obtain the antigenicity (quantified by HI values). The predicted coverage score of a candidate vaccine strain is then calculated by averaging its antigenicity across multiple circulating viruses, weighted by their respective dominance.
Training dominance predictor
The dominance predictor learns the relationship between HA sequences and the change of their dominance over time, enabling more accurate predictions of the future viral landscape. We train our dominance predictor using the dataset of protein sequences collected before the vaccine selection time, along with their respective collection dates. For each protein sequence, two language models44 predict the initial dominance and its rate of change, which are then used in an ODE to derive its dominance at the collection time. These two models are trained by aligning the predicted dominance with the actual protein distributions.
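To make the role of the ODE concrete, the sketch below uses the simplest possible form, dn/dt = r·n, solved in closed form and renormalized across sequences. This form is an assumption made for illustration; the actual equation and parameterization are given in the Methods. The two language models are replaced by fixed per-sequence numbers, and all names and values are hypothetical.

```python
import math
from typing import Dict


def dominance_at(t: float,
                 initial: Dict[str, float],
                 rate: Dict[str, float]) -> Dict[str, float]:
    """Dominance of each sequence at time t under dn_i/dt = rate_i * n_i.

    `initial` and `rate` stand in for the outputs of the two language models
    (initial dominance and rate of change). The closed-form solution
    n_i(t) = n_i(0) * exp(rate_i * t) is renormalised so that the
    dominance values sum to one at every time point.
    """
    unnormalised = {seq: initial[seq] * math.exp(rate[seq] * t) for seq in initial}
    total = sum(unnormalised.values())
    return {seq: value / total for seq, value in unnormalised.items()}


# Made-up example: an established sequence and a rarer, faster-growing one.
toy_initial = {"seq_old": 0.9, "seq_new": 0.1}
toy_rate = {"seq_old": 0.0, "seq_new": 1.0}
at_start = dominance_at(0.0, toy_initial, toy_rate)
later = dominance_at(3.0, toy_initial, toy_rate)
```

Even in this reduced form, the sketch shows why a static frequency estimate can mislead: a sequence that is rare at selection time can become dominant by the target season if its rate of change is high enough.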
Training antigenicity predictor
The antigenicity predictor learns to predict the HI test results from vaccine and virus proteins. We train the antigenicity predictor using HI test data21 for vaccine–virus pairs, with both vaccine and viral proteins collected before the vaccine selection time. The HI test results are inversely proportional to the amount of vaccine-induced antibodies that can inhibit a specific virus strain. Because antigenicity depends on both the vaccine and virus strains, the antigenicity predictor takes the pair of HA sequences as input. To model relationships between locations both within a single sequence and across the pair of sequences, the antigenicity predictor is based on neural network architectures used to encode protein multiple sequence alignments (MSAs)45. The model is trained as a regression, fitting the predicted antigenicity to the measured antigenicity. Additional details on the dominance and antigenicity predictors are provided in the Methods.
Experimental setup
Data construction
We conducted retrospective validation of VaxSeer on the A/H1N1 and A/H3N2 subtypes of influenza. To quantify viral dominance during each season, we downloaded influenza HA protein sequences with associated collection times from the GISAID20. For antigenicity measurements, we collected HI test results based on postinfection ferret antisera from the reports of a WHO CC (Francis Crick Institute) prepared for the annual consultations on the composition of influenza vaccines from 2003 to February 2023.
To validate the correlation between coverage scores and disease outcomes, we gathered influenza vaccine effectiveness for historically used vaccines from annual CDC publications4–11. This vaccine effectiveness is estimated based on patients with medically attended outpatient acute respiratory illness (ARI) in the US Influenza Vaccine Effectiveness Network. We also gathered the annual number of influenza illnesses and medical visits averted by the influenza vaccine in the United States, as estimated by the CDC46,47. For the analysis of vaccine effectiveness, we excluded the vaccine strains without subtype-specific effectiveness. We also excluded the years 2020 and 2021 to avoid potential bias caused by SARS-CoV-2 vaccines48,49, resulting in 10 past recommended vaccines with available effectiveness data for two subtypes. For additional comparisons, we also included vaccine effectiveness estimated from two other sources: Influenza Monitoring Vaccine Effectiveness (I-MOVE) in Europe50–55 and the Sentinel Practitioner Surveillance Network (SPSN) in Canada56–63. I-MOVE estimates influenza vaccine effectiveness based on patients with ARI or influenza-like illness (ILI) who present to a general practitioner across multiple centers in Europe. SPSN estimates vaccine effectiveness based on patients with ILI who present to community-based sentinel practitioners in Canada. Throughout the ‘Results’, vaccine effectiveness refers to CDC estimates, unless explicitly stated otherwise (for example, SPSN or I-MOVE). Further details on effectiveness estimates are available in the Methods and in Extended Data Table 1.
Extended Data Table 1.
Vaccine effectiveness estimates from the CDC in the United States, SPSN in Canada and I-MOVE in Europe, summarized from related publications during the 2012–2021 evaluation seasons in this study
Adjusted subtype-specific vaccine effectiveness (VE) across all age and gender groups, obtained from CDC4–12, SPSN56–63,87 and I-MOVE50–55 publications, is shown. Subtypes without available VE data are marked with ‘-’. For the CDC, overall adjusted VE across all subtypes (A/H3N2, A/H1N1, B/Yamagata and B/Victoria) is also included. No VE data were available for 2020–2021 due to low influenza circulation during the SARS-CoV-2 pandemic.
Evaluation metric
The main challenge in evaluating different vaccine selections is the lack of data on their real-world effectiveness. After all, we can only measure vaccine effectiveness and its impact on clinical endpoints for vaccines that were actually administered to the population. Instead, we propose to use a surrogate measure, the ‘empirical’ coverage score, to counterfactually evaluate vaccine strains that were not selected in past seasons. The empirical coverage score is an empirical approximation of the ground truth coverage score, which is computed retrospectively using observed frequencies of viral strains from surveillance data (GISAID) and experimental HI test results from WHO CCs. Although reliant on retrospective data, the empirical coverage score can be computed for any candidate vaccine strains that have been adequately tested with HI assays in our evaluation. In addition, although antigenic match is only one of many factors, the empirical coverage score is highly correlated with vaccine effectiveness for strains with available data from the CDC, exhibiting a Pearson correlation of 0.895 and a Spearman rank correlation of 0.976, with both P < 0.001 (Fig. 2a, blue). The positive correlation between empirical coverage scores and vaccine effectiveness is also observed across independent European and Canadian effectiveness studies (Extended Data Fig. 2). The empirical coverage score also demonstrates a positive correlation with the disease burden (symptomatic illnesses and medical visits) prevented by the vaccines in the United States, as estimated by the CDC (Extended Data Fig. 3a,b). The definitions of vaccine effectiveness, empirical coverage score and predicted coverage score are distinguished in Extended Data Table 2.
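For readers less familiar with the two correlation measures used throughout, the following self-contained sketch computes both. It is a textbook implementation in plain Python, not the analysis code behind the reported values, and it does not compute P values. The numbers in the example are made up and are not data from this study.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustration: scores that rise monotonically but not linearly.
toy_scores = [0.1, 0.4, 0.5, 0.9]
toy_effectiveness = [10.0, 30.0, 35.0, 80.0]
r = pearson(toy_scores, toy_effectiveness)
rho = spearman(toy_scores, toy_effectiveness)
```

Spearman depends only on the ordering of the points, so it reaches 1 for any strictly increasing relationship, whereas Pearson reaches 1 only when the relationship is linear. This is why the two can diverge when antigenic match and effectiveness are related non-linearly.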
Fig. 2. VaxSeer selecting vaccine strains with better antigenic match.
a, Both empirical and predicted coverage scores exhibit positive correlations with real-world vaccine effectiveness, as estimated by the CDC (two-sided Spearman rank correlation ρ with P = 1.46 × 10−6 for empirical and P = 0.0005 for predicted coverage scores, two-sided Pearson correlation r with P = 0.0005 for empirical and P = 0.0014 for predicted coverage scores and linear regression slope m). The empirical coverage scores serve as a surrogate metric for evaluating vaccine selection owing to their strong correlation with vaccine effectiveness. b, Vaccine strains selected by VaxSeer have higher empirical coverage scores than those recommended by the WHO for A/H3N2. The statistical significance of this improvement is confirmed by a one-sided Wilcoxon signed-rank test (P = 4.1 × 10−5). VaxSeer frequently selects the strains with the highest empirical coverage scores. Violin plots depict the distribution of predicted coverage scores among all candidate vaccines, including those with low experimental coverage. c,d, The predicted antigenicity (HI values) of circulating A/H3N2 strains in the 2019 Northern Hemisphere season, with respect to the vaccine recommended by the WHO (c) and VaxSeer (d). Our recommended vaccine covers a larger variety of circulating viruses (3C.2a1b.1a/b and 3C.2a1b.2b/a) than the WHO’s recommendation (3C.3a1). The red circle encompasses viruses from clade 3C.3a1, whereas the yellow and orange circles represent viruses from clade 3C.2a1b.1a/b and clade 3C.2a1b.2b/a, respectively. e, The phylogenetic tree of A/H3N2 during and after the 2019 Northern Hemisphere season, constructed by nextflu78 (version dated 21 November 2024). Each dot represents a viral strain, and black crosses indicate vaccine strains selected by the WHO. The vaccine strain chosen for the 2019 winter season is marked with an arrow. VaxSeer’s recommended vaccine strain covers the dominant clades (3C.2a1b.2a/b and 3C.2a1b.1a/b). Furthermore, clade 3C.2a1b.2a continued to expand in subsequent seasons, as depicted by the upper right gray branches. All pred, predicted coverage scores for all candidate vaccine strains.
Extended Data Fig. 2. The correlation between coverage scores and vaccine effectiveness (VE) estimated by the Influenza-Monitoring Vaccine Effectiveness (I-MOVE) in Europe and Sentinel Practitioner Surveillance Network (SPSN) in Canada.
The two-sided Pearson correlation R and two-sided Spearman rank correlation ρ, as well as their P-values, are illustrated. (a, b) Vaccine effectiveness is estimated by I-MOVE based on patients with acute respiratory illness or influenza-like illness consulting a general practitioner across multiple centers in Europe50–55. Both the empirical and predicted coverage scores exhibit positive and statistically significant correlations with vaccine effectiveness (P < 0.05)84. (c, d) SPSN estimates VE based on patients with influenza-like illness presenting to community-based sentinel practitioners in Canada56–63. The empirical coverage score shows strong and statistically significant correlations with VE, with both R and ρ exceeding 0.8 and P < 0.05 (ref. 84). The predicted coverage score also demonstrates a positive and statistically significant Spearman rank correlation with VE (P < 0.05). The larger P-value (0.084) observed for the Pearson correlation may be attributed to a potential non-linear relationship between the antigenic match and VE29.
Extended Data Fig. 3. Correlations between coverage scores and disease burden prevention, and comparison of A/H1N1 vaccine strains selected by VaxSeer versus WHO using empirical coverage scores.
(a, b) The empirical coverage score is correlated with disease burden prevention, as reflected by the estimated number of medical visits (a) and influenza illnesses (b) averted by vaccination in the U.S., according to the CDC. Enhancing the antigenic match, as measured by the empirical coverage score, may help reduce disease burden. (c) Vaccine strains selected by VaxSeer exhibit higher empirical coverage scores than those recommended by the WHO for A/H1N1. VaxSeer frequently selects the strains with the highest empirical coverage scores. Violin plots depict the distribution of predicted coverage scores among all candidate vaccines collected in the previous three years, including those with low experimental coverage. (d) The predicted coverage score is correlated with the estimated number of influenza illnesses averted by vaccination in the U.S., according to the CDC. Two-sided Pearson correlations (R) and corresponding P-values (P) are illustrated in the text.
Extended Data Table 2.
Glossary of terms used throughout the paper
Retrospective evaluation settings
Our evaluation spans the 2012–2021 winter seasons, starting in October of each year and continuing to March of the next year. We conduct evaluations only during the winter seasons due to the availability of vaccine effectiveness data. In line with the WHO’s recommendation schedule, we train our models on strains collected and HI tests conducted up to 8 months prior to the start of the season (before 1 February). To construct the sets of candidate vaccine strains and circulating virus strains, we consider all virus strains that were isolated at least five times within the past 3 years, similar to strategies for vaccines recommended by the WHO. For example, to evaluate the 2021–2022 winter season (October 2021 to March 2022), we train our models on any data collected before 1 February 2021, and we predict coverage scores for candidate vaccine strains isolated between February 2018 and February 2021. To ensure the accurate assessment of empirical coverage scores for those candidate vaccine strains, we only evaluate vaccines with HI test results against at least 40% of circulating viral sequences during the season of interest. A total of 51 candidate vaccines for A/H3N2 and 50 candidate vaccines for A/H1N1 were subject to comparison. The predicted and empirical coverage scores, along with the percentages of viral samples included in the calculation of empirical coverage score, are presented in Supplementary Table 1. Further details on data construction and evaluation settings are provided in the Methods.
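The selection rules in this paragraph (a 1 February training cutoff, a 3-year candidate window, at least five isolations, and HI results against at least 40% of circulating viral sequences) can be expressed as simple filters. The sketch below is illustrative only: the function names and data layout are hypothetical, and the exact treatment of boundary dates (inclusive start, exclusive cutoff) is an assumption.

```python
from collections import Counter
from datetime import date
from typing import Iterable, List, Set, Tuple


def training_cutoff(season_start_year: int) -> date:
    """Data collected before 1 February of the season's starting year are usable."""
    return date(season_start_year, 2, 1)


def candidate_strains(isolates: Iterable[Tuple[str, date]],
                      season_start_year: int,
                      window_years: int = 3,
                      min_count: int = 5) -> Set[str]:
    """Strains isolated at least `min_count` times in the window before the cutoff.

    `isolates` is a sequence of (strain identifier, collection date) pairs.
    """
    cutoff = training_cutoff(season_start_year)
    window_start = date(cutoff.year - window_years, cutoff.month, cutoff.day)
    counts = Counter(name for name, collected in isolates
                     if window_start <= collected < cutoff)
    return {name for name, count in counts.items() if count >= min_count}


def has_enough_hi(tested_viruses: Set[str],
                  season_viruses: List[str],
                  min_fraction: float = 0.4) -> bool:
    """True if HI results exist for at least `min_fraction` of the season's samples."""
    if not season_viruses:
        return False
    covered = sum(1 for virus in season_viruses if virus in tested_viruses)
    return covered / len(season_viruses) >= min_fraction


# Made-up isolates for the 2021-2022 season (cutoff 1 February 2021).
toy_isolates = ([("strain_A", date(2019, 6, 1))] * 5
                + [("strain_B", date(2020, 1, 15))] * 4
                + [("strain_C", date(2021, 3, 1))] * 9)
toy_candidates = candidate_strains(toy_isolates, 2021)
```

Here strain_B is dropped because it was isolated only four times, and strain_C is dropped because all of its isolates postdate the cutoff, which mirrors the requirement that nothing collected after vaccine selection time may inform the prediction.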
Vaccine selection based on predicted coverage score
As illustrated in Fig. 2b and Extended Data Fig. 3c, the vaccine strains selected based on our models’ predicted coverage scores outperform the WHO’s recommendations in six out of 10 years for A/H1N1 and in nine out of 10 years for A/H3N2, when evaluated using empirical coverage scores. For the majority of remaining years, our recommendations are similar to those of the WHO. Overall (two subtypes, 10 years), our predictions achieve a statistically significant improvement in empirical coverage scores over the WHO’s recommendations (one-sided Wilcoxon signed-rank test, P = 4.1 × 10−5). In fact, VaxSeer successfully recommends the ‘best’ vaccine strain (top empirical coverage score) in seven out of 10 years for A/H1N1 and in five out of 10 years for A/H3N2, whereas the WHO’s recommendation matches the best vaccine strain in just three out of 10 years for A/H1N1 and in zero out of 10 years for A/H3N2. Due to experimental constraints of WHO CCs, only a subset of candidate vaccines is tested broadly (over 40% of influenza viruses), allowing for the assessment of their empirical coverage scores. Thus, the strains selected in Fig. 2b and Extended Data Fig. 3c only reflect these limited choices of candidate vaccines. If we consider the distributions of predicted coverage scores over all candidate strains circulating in the previous 3 years (violin plots in Fig. 2b and Extended Data Fig. 3c), there are strains that scored even higher but were not subjected to sufficient experimental validation by WHO CCs. This highlights the possibility that there may exist even more effective vaccine strains waiting to be discovered.
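The one-sided Wilcoxon signed-rank test used for this paired comparison can be illustrated with a small exact implementation. This is a textbook sketch that assumes no zero differences and no tied absolute differences; it is not the code used to obtain the reported P value, and a statistics library would normally be used in practice because it handles ties. The scores in the example are made up.

```python
from typing import Sequence, Tuple


def wilcoxon_one_sided(ours: Sequence[float],
                       baseline: Sequence[float]) -> Tuple[int, float]:
    """Exact one-sided signed-rank test that `ours` tends to exceed `baseline`.

    Returns (W+, p), where W+ is the sum of ranks of the positive paired
    differences and p is P(W >= W+) when every sign is equally likely.
    """
    diffs = [a - b for a, b in zip(ours, baseline)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    magnitudes = sorted(abs(d) for d in diffs)
    if len(set(magnitudes)) != len(magnitudes):
        raise ValueError("tied absolute differences are not handled in this sketch")
    rank = {m: i + 1 for i, m in enumerate(magnitudes)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)

    n = len(diffs)
    max_sum = n * (n + 1) // 2
    # counts[s] = number of sign assignments whose positive ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        # Iterate downwards so that each rank is used at most once.
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    p_value = sum(counts[w_plus:]) / 2 ** n
    return w_plus, p_value


# Made-up paired scores: differences are 4, 1, -2, 3 and 6.
toy_ours = [14, 11, 9, 13, 16]
toy_baseline = [10, 10, 11, 10, 10]
w_plus, p_value = wilcoxon_one_sided(toy_ours, toy_baseline)
```

The test uses only the signs and ranks of the paired differences, so it makes no assumption that the score differences are normally distributed, which suits a comparison across heterogeneous seasons and subtypes.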
In certain instances, our model can anticipate suitable vaccine strains a year in advance of the WHO. For example, in the 2016 winter season for A/H1N1, although the WHO recommendation (A/California/07/2009) has a decent empirical coverage score, VaxSeer proposes an alternative with an even higher score (A/Michigan/45/2015, collected earliest on 7 September 2015). Promisingly, A/Michigan/45/2015 was recommended by the WHO for the subsequent winter season.
To visualize the breadth in coverage of different vaccine strains, we plot the predicted antigenicity of A/H3N2 strains circulating in the 2019 winter season. The strain recommended by the WHO aligns with one emerging clade, 3C.3a1 (Fig. 2c). By contrast, the strain recommended by VaxSeer offers a complementary antigenic profile, mainly covering the 3C.2a1b.1a/b and 3C.2a1b.2b/a clades (Fig. 2d). Although the WHO selects the vaccine covering the new clade54, our model tends to select the vaccine strain that is effective against the majority of circulating clades and the clade undergoing further expansion (3C.2a1b.2a), as depicted in the phylogenetic tree (Fig. 2e).
Correlation of predicted coverage score with vaccine effectiveness and clinical endpoints
Correlation with vaccine effectiveness
As shown in Fig. 2a (orange), the predicted coverage scores significantly correlate with vaccine effectiveness against outpatient ARI, as estimated by the CDC (r = 0.861, P = 0.0014 and ρ = 0.891, P = 0.0005). This correlation is consistently observed with vaccine effectiveness estimates from I-MOVE in Europe and SPSN in Canada, as shown in Extended Data Fig. 2. Further analyses in Fig. 3a and Extended Data Fig. 4a highlight the superior performance of our approach that combines antigenicity and dominance, along with accurate future dominance predictions, compared to alternative metrics based solely on viral dominance or antigenicity.
Fig. 3. The predicted coverage score is correlated with vaccine effectiveness, reduction of disease burden and empirical coverage score.
a, The two-sided Pearson correlation and associated P values between vaccine effectiveness and various scoring strategies for the vaccine strains selected by the WHO from 2012 to 2019 for A/H1N1 and A/H3N2. Evaluation was conducted on n = 10 data points, where each point represents a vaccine strain selected by the WHO for a specific year and subtype, with corresponding vaccine effectiveness estimates from the CDC. Our predicted coverage score shows the strongest correlation. b, The vaccines with higher effectiveness (>40) have higher predicted coverage scores than vaccines with lower effectiveness (≤40). The medians of the coverage scores are depicted by the center lines within the boxes. The box spans from the first to the third quartile, and the whiskers cover 1.5 times the interquartile range. Outliers beyond the whiskers are represented by individual marks. The P value, calculated by a one-sided independent t-test, is illustrated. c, Correlations between the predicted coverage scores and the estimated number of influenza medical visits averted by vaccination in the United States. Two-sided Pearson correlations (r) and corresponding P values (P) are illustrated in the text. d, Two-sided Spearman rank correlation and the associated P values between the empirical coverage scores and predicted coverage scores based on different dominance predictors39,42, evaluated across candidate vaccine strains of A/H1N1 (50 strains) and A/H3N2 (51 strains) from 2012 to 2021. Each data point represents a distinct strain for a specific year and subtype. Given our antigenicity predictor, our dominance model achieves the best performance. predict, predicted.
Extended Data Fig. 4. Correlations between vaccine effectiveness and various scoring strategies for the vaccine strains selected by WHO from 2012 to 2019 for A/H1N1 and A/H3N2.
(a) The two-sided Spearman rank correlation and associated P-values are illustrated. Our predicted coverage score shows the strongest correlation. (b, c) Vaccine effectiveness of past WHO-recommended vaccines plotted against two baseline scoring metrics: (b) Dominance of vaccine strain (last season): the frequency of the vaccine strain occurring during the previous influenza season; (c) Average antigenicity: the average HI values against a set of expert-selected circulating viruses. The two-sided Pearson correlation R and two-sided Spearman rank correlation ρ and their P-values are illustrated. The dominance of the vaccine strain shows no significant correlation with effectiveness, as vaccines with low dominance exhibit a wide range of effectiveness, highlighting the need to model their antigenicity with circulating viruses. Average antigenicity is less predictive in the cases of lower vaccine effectiveness, emphasizing the importance of considering future viral dominance in the presence of antigenic drift.
First, without accounting for antigenicity, the dominance of vaccine strains in the last season (‘Dominance of vaccine (last season)’ in Fig. 3a) is poorly correlated with vaccine effectiveness (Pearson correlation r = 0.4920 and P = 0.15), as further detailed in Extended Data Fig. 4b. Second, using antigenicity without considering the future dominance of viruses also yields suboptimal results. The baseline ‘average antigenicity’ method, which simply calculates the average HI value for a set of expert-selected viruses, demonstrates inferior correlation (Extended Data Fig. 4c). By contrast, our predicted coverage score achieves the highest correlation with vaccine effectiveness, emphasizing the need to model both antigenicity and future dominance.
Furthermore, ranking vaccines by their predicted coverage score allows us to distinguish between past vaccines with higher (≥40) and lower (<40) effectiveness (Fig. 3b). Vaccines with higher effectiveness exhibit significantly greater predicted coverage scores, as confirmed by a one-sided independent t-test (P = 0.026). Here, we set 40% as the threshold for vaccine effectiveness, as it is generally considered a lower bound of effectiveness when influenza vaccine strains are antigenically similar to circulating influenza viruses3.
Correlation with reduction of disease burden
Beyond effectiveness, we further analyze the association between predicted coverage score and the reduction of disease burden to showcase the potential impact of our approach. The CDC estimates the reduction of disease burden using the number of medical visits (Fig. 3c) and influenza illnesses (Extended Data Fig. 3d) averted by vaccination in the United States46,64. We combined the predicted coverage scores from two subtypes, weighted by the ratio of infections for each subtype, because the reduction in disease burden data does not differentiate between subtypes. The year 2020 was excluded due to the lack of available information. As shown in Fig. 3c and Extended Data Fig. 3d, our predicted coverage scores show positive alignment with the number of averted symptomatic illnesses and medical visits (Pearson correlations of 0.6858 and 0.6993, respectively, P < 0.05).
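The subtype weighting described above amounts to a weighted mean of the two predicted coverage scores. The snippet below is a minimal sketch with hypothetical names and made-up numbers; how the infection ratio is estimated is not specified here.

```python
from typing import Dict


def combined_coverage(scores: Dict[str, float],
                      infections: Dict[str, float]) -> float:
    """Average the per-subtype scores, weighted by each subtype's infections."""
    total = sum(infections[subtype] for subtype in scores)
    if total <= 0:
        raise ValueError("infection weights must sum to a positive value")
    return sum(scores[subtype] * infections[subtype] for subtype in scores) / total


# Made-up season in which A/H3N2 caused three times as many infections.
toy_scores = {"A/H3N2": 2.0, "A/H1N1": 4.0}
toy_infections = {"A/H3N2": 3.0, "A/H1N1": 1.0}
season_score = combined_coverage(toy_scores, toy_infections)
```

With these toy inputs the combined score is pulled towards the A/H3N2 value, reflecting that a poor match for the subtype causing most infections should weigh more heavily on the expected reduction in burden.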
Performance of the dominance predictor
We next evaluate our dominance prediction model in isolation. With the goal of optimizing the empirical coverage scores for vaccines, the dominance prediction model’s performance is evaluated based on its accuracy in predicting the empirical coverage scores, in combination with a fixed antigenicity model. To explore the effect of modeling changes in dominance for VaxSeer, we compare it with the following baselines that model static viral dominance and are trained on the same data. We first compare against a baseline that defines dominance based on the empirical frequencies calculated from the previous season (‘Last’). This baseline assumes that the distribution of variants does not change between seasons. We further conduct a comparison with three state-of-the-art machine learning models, LM40, CSCS39 and EVEscape42, which learn a static fitness landscape for proteins and have shown potential in predicting virus fitness and escapability39,41,42 (see Methods for details). By contrast, we define the evolution of dominance by a time-dependent ODE parameterized by language models44.
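The ‘Last’ baseline is simple enough to state in code. The sketch below computes empirical frequencies from the previous season's sequences and reuses them unchanged as the forecast; the function name and the short placeholder strings standing in for HA sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def last_season_dominance(previous_season: Iterable[str]) -> Dict[str, float]:
    """Empirical frequency of each distinct sequence in the previous season.

    Used as a static forecast: the next season is assumed to have the same
    distribution, so any sequence absent last season has dominance zero.
    """
    counts = Counter(previous_season)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences observed in the previous season")
    return {sequence: count / total for sequence, count in counts.items()}


# Placeholder strings stand in for HA protein sequences.
toy_previous_season = ["AAA", "AAA", "AAB", "AAC"]
toy_forecast = last_season_dominance(toy_previous_season)
```

The limitation is visible in the docstring: an emerging sequence that was rare or absent in the previous season receives little or no weight, however quickly it is growing.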
Our results show that our proposed dominance predictor achieves the highest correlation (Fig. 3d) and the lowest error (Extended Data Fig. 5a,b) with the empirical coverage score, in conjunction with our antigenicity predictor, highlighting the importance of using a temporal model. We also demonstrate the superior performance of our dominance predictor through its ability to identify future dominant sequences, as shown in Extended Data Fig. 5c. Extended Data Fig. 5d and Extended Data Table 3 also present a comparison of antigenicity models.
Extended Data Fig. 5. Additional evaluation results for antigenicity and dominance predictors.
(a, b) Root mean squared error (a, lower is better) and mean absolute error (b, lower is better) between the empirical coverage scores and predicted coverage scores based on different dominance predictors for A/H1N1 and A/H3N2 from 2012 to 2021. Given our antigenicity predictor, our dominance predictor achieves the best performance. The P values, shown in parentheses, are calculated using a one-sided Wilcoxon signed-rank test comparing our model’s errors with those of baseline models, adjusted for multiple comparisons using the Benjamini–Hochberg procedure (false discovery rate). (c) The frequency of occurrence of the top 20% of sequences. The top 20% most dominant sequences are identified by different dominance prediction methods. The frequencies of occurrence of these sequences in the subsequent influenza season are calculated based on sequence data available in GISAID. We limited our evaluation to A/H3N2 from 2014 to 2019 (n = 6 evaluations) due to data sparsity for other subtypes and years. The medians of the frequencies are depicted by the center lines within the boxes. The box spans from the first to the third quartile, and the whiskers cover 1.5 times the interquartile range. Outliers beyond the whiskers are represented by individual marks. The P values are calculated by the one-sided Wilcoxon signed-rank test, adjusted for multiple comparisons using the Benjamini–Hochberg procedure (false discovery rate). Our model outperforms other baselines with statistical significance. (d) Two-sided Spearman rank correlation (with P value) between the predicted coverage scores and empirical coverage scores for A/H3N2 and A/H1N1, from 2012 to 2021 seasons. The combination of our dominance predictor and antigenicity predictor achieves the best performance.
Extended Data Table 3.
MAE of HI predictions on the heldout test set
The MAE for predictions on a heldout test set (spanning 2012–2021) across two subtypes demonstrates that VaxSeer achieves the best performance (with ‘*’ indicating that lower values are better). To confirm the significance of this improvement, P values were computed using a one-sided Wilcoxon signed-rank test, adjusted for multiple comparisons using the Benjamini–Hochberg method (false discovery rate), comparing VaxSeer’s MAE against those of other baseline models.
Discussion
Vaccines are an important defense against infectious diseases. However, in the presence of continual evolution, it is challenging to forecast the future landscape of viral strains and assess the antigenicity of candidate vaccines at scale. Here we propose VaxSeer, an in silico framework that ranks vaccine strains based on their predicted coverage score—a prospective estimation of a vaccine’s antigenic match with future viruses. We consider future dominance of viruses in the coverage score, which is predicted by a dominance predictor. This predictor expresses the change in dominance over time using an ODE, with parameters estimated by protein language models, enabling the prediction of dynamic dominance based on the entire protein sequence. Due to the observational nature of public health data, we validated VaxSeer through computational surrogates (empirical coverage score) rather than actual vaccine effectiveness from population-based trials. However, the strong correlation between the empirical coverage scores and real-world effectiveness suggests that VaxSeer has the potential to help select vaccine strains with improved effectiveness. Our model can contribute to influenza vaccine selection in two ways. First, it provides a complementary perspective to existing antigenic measurements by specifically considering future viral distributions. Thereby, it can be used as an additional information source during the selection process. Second, as a cost-effective and computationally efficient in silico tool, our model enables rapid screening of large viral strain pools to identify the most promising candidates. These candidates can then be prioritized for further resource-intensive validation through traditional laboratory techniques such as two-way tests or clinical trials.
Although various factors contribute to the ultimate effectiveness of a vaccine, our study focuses on antigenic match. We do not address other factors related to vaccine production or hosts that impact vaccine selection, such as dosage18,65, adjuvant choices16, vaccine platform17, egg adaptations in the production process66, vaccination timing67 or host immune history13–15. Despite not accounting for these factors, our predicted coverage score still demonstrates a reasonable correlation with vaccine effectiveness, highlighting its utility in improving antigenic match for vaccine selection. Moreover, our approach can be integrated with methods that consider these additional factors to offer a more holistic assessment of vaccine effectiveness.
When estimating the antigenic match, the sequencing data used in our study may be subject to sampling biases, leading to challenges in accurately determining strain distributions. For instance, the geographic distribution of sequences is uneven across the globe, and sequencing strategies might change over time (see Methods for details). In recent years, there has been an increasing preference for sequencing directly from original clinical specimens rather than from egg-passaged or cell-passaged viruses (Extended Data Fig. 6a), because viruses may mutate during the passage process68. To evaluate the impact of these biases on empirical coverage scores, we examined two factors: geographical location and passage history. As shown in Extended Data Fig. 7, the empirical coverage scores derived from viral samples across different locations and passages remain consistent, suggesting that vaccine antigenic profiles are relatively robust to sampling biases in viral sequences. Nevertheless, acquiring more unbiased sequencing data is beneficial for improving the accuracy of our predictions.
Extended Data Fig. 6. Shifts in the distribution of viral passage histories in GISAID, and the frequency of new HA sequences occurring in each winter season for A/H3N2.
(a) The number of samples with different passage histories submitted to GISAID each year, showing an increasing preference for sequencing original specimens over passaged samples. (b) The frequency of samples with HA protein sequences not observed in previous years, calculated based on GISAID sequence data.
Extended Data Fig. 7. Consistency between the empirical coverage scores calculated from viruses collected in different geographical locations, and from virus subsets with varying passage annotations.
The two-sided Pearson correlation (R), two-sided Spearman rank correlation (ρ), and their corresponding P values, along with the number of evaluated vaccines (n), are provided. Only vaccines with sufficient HI test data, covering over 40% of circulating viral sequences in the corresponding geographical areas or with corresponding passage annotations, are included. (a-f) The empirical coverage scores across different geographical regions are highly correlated with the global empirical coverage scores. (g, h) In the ‘All’ setting, the dominance of viruses is calculated using all protein sequences in the GISAID dataset. In the ‘Original’ setting, only sequences collected from original specimens are used to calculate the dominance. In the ‘No eggs or cell lines’ setting, proteins from viruses passaged in eggs or cell lines (for example, MDCK/SIAT) are excluded. The empirical coverage scores calculated from all viral samples are highly consistent with those derived from subsets with different passage annotations.
In the present study, the antigenicity is estimated using HI assays conducted by one WHO CC (Francis Crick Institute) due to data availability. These data may be biased by the protocols adopted by this institute, and the HI assay itself has several inherent limitations. First, instead of measuring neutralization, it detects the binding of antibodies to HA proteins by observing the inhibition of hemagglutination. Thus, applying HI assays to recent H3 variants is problematic, as these strains exhibit reduced hemagglutination activity69. Second, the antibodies are obtained from postinfection ferret antisera70, which may not align with the immune responses in human populations13–15. Third, the passage histories of vaccines and viruses can impact HI test results68,71. Despite these limitations, HI assays remain useful for approximating antigenicity, as empirical coverage scores derived from them demonstrate a significant correlation with vaccine effectiveness. Furthermore, our antigenicity predictor is able to learn meaningful antigenic features from noisy HI data, as evidenced by its accurate coverage score predictions. The accuracy of our method can be further improved by incorporating antigenicity data from more precise assays, such as standardized HI tests across multiple laboratories, advanced neutralization assays such as microneutralization72 or high-content imaging-based neutralization test (HINT)73 and passage strategies minimizing antigenic mutations74.
In our experiments, we limit candidate vaccine strains to those circulating in previous seasons, as the antigenicity predictor is trained on a limited number of existing vaccine strains. However, because our antigenicity predictor relies solely on protein sequences, it can, in principle, compute coverage scores for any vaccine, including those that have not yet been isolated or do not exist in nature. By expanding the diversity of vaccines in the antigenicity dataset, VaxSeer can explore a much larger vaccine space, making it a valuable tool for virtual screening, functional optimization and de novo vaccine design. Nonetheless, further exploration is needed to assess the ability of our model to generalize to vaccines that are substantially different from those in our training data.
We showcase the feasibility of our method in influenza owing to the extensive efforts in data construction for this virus. This approach can be applied to other pandemic viruses provided that sequence and antigenicity data are available. However, the neural network architecture of our model requires sufficient training data to perform effectively, so it may face limitations when applied to emergent or rare viruses. For example, validating this approach for SARS-CoV-2 is more challenging due to the limited public availability of antigenicity data. Nevertheless, the overall concept of computationally forecasting antigenicity and dominance remains viable with alternative architectures.
Multiple directions can further improve this work. First, the current implementation of VaxSeer only considers the influenza virus’s HA protein, whereas studies have shown that other proteins, such as neuraminidase, may also influence viral fitness and vaccine antigenicity75–77. Thus, we expect that modeling larger portions of a viral genome will provide a more complete picture for vaccine selection. Second, we only evaluated the performance of VaxSeer over vaccines with sufficient HI test data from WHO CCs. There exist vaccine strains with higher predicted coverage scores, but we lacked the experimental resources to validate their empirical coverage scores. Third, we predict antigenicity based solely on HA protein sequences. Integrating posttranslational modifications of HA proteins, such as glycosylation, could enhance the accuracy of antigenicity predictions71. Finally, our current approach only computes the predicted coverage score over observed viral sequences, without considering novel sequences that may appear. In fact, during a given influenza season, over 40% of HA sequences were unseen in the previous year (Extended Data Fig. 6b). To account for these emergent sequences, we could leverage our dominance predictor as a generative model and sample viral sequences given by the predicted future distribution. This would enable us to compute the predicted coverage score over both current and potential viral strains. Unfortunately, we do not have sufficient experimental data to validate the antigenicity of these novel viruses.
In summary, this study showcases the potential of machine learning to assist humans in the discovery of more effective vaccines.
Methods
Dataset description
Dominance
The training corpora of the dominance prediction model were obtained from the GISAID. We downloaded 394,090 HA sequences submitted before 2 March 2023. We retained only HA amino acid sequences from human hosts and with almost full length (with a minimum length of 553 amino acids). Among the sequences, approximately 67.5% contain gender information, with a nearly equal distribution between male and female. The geographical distribution of the samples is as follows: 34.8% from North America, 29.2% from Europe, 20.0% from Asia, 7.1% from Oceania, 4.9% from South America and 4.1% from Africa. The distribution of passage histories for the samples is as follows: 44.0% from original specimens, 31.8% from cell lines or eggs and 24.1% from unknown sources. The distribution of host ages is as follows: 2.7% are younger than 2 years, 13.3% are 2–8 years, 7.8% are 9–17 years, 20.9% are 18–49 years, 7.2% are 50–64 years, 9.7% are 65 years or older and 38.4% do not have available age information. We used HA sequences collected worldwide, consistent with the WHO’s global mandate to recommend a vaccine strain for each subtype. Additionally, we incorporated sequences from all passage histories, as a large portion of viruses collected in earlier years originated from cell line passages (Extended Data Fig. 6a).
Starting from October 2003, we discretized every 6 months into one season. Two subtypes, A/H1N1 and A/H3N2, are considered separately. For each subtype, the seasons with fewer than 100 HA samples are discarded. After the preprocessing, 28,546 non-repeated HA sequences are obtained for A/H3N2, and 23,736 non-repeated HA sequences are obtained for A/H1N1. The sequences were further split into training and testing sets based on collection time. Specifically, to predict the dominance of sequences in the winter season of a particular year (for example, test set October 2018 to April 2019), we train on sequence data collected before February of that year (for example, before February 2018).
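The season discretization and the time-based split described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function names, the `(sequence, date)` record layout and the choice to treat 1 April as an exclusive upper bound of the winter test window are assumptions made for the example.

```python
from datetime import date

# First season begins in October 2003 (as stated in the text).
SEASON_ORIGIN = date(2003, 10, 1)


def season_index(collected: date) -> int:
    """Index of the 6-month season containing `collected` (0 = Oct 2003 to Mar 2004)."""
    months = (collected.year - SEASON_ORIGIN.year) * 12 + (
        collected.month - SEASON_ORIGIN.month
    )
    if months < 0:
        raise ValueError("collection date precedes October 2003")
    return months // 6


def split_for_winter(records, winter_start_year):
    """Split (sequence, collection_date) records for one winter season.

    Train: everything collected before 1 February of `winter_start_year`.
    Test:  October of `winter_start_year` up to (not including) 1 April of the
           following year; the exclusive bound is an assumption of this sketch.
    """
    cutoff = date(winter_start_year, 2, 1)
    test_lo = date(winter_start_year, 10, 1)
    test_hi = date(winter_start_year + 1, 4, 1)
    train = [r for r in records if r[1] < cutoff]
    test = [r for r in records if test_lo <= r[1] < test_hi]
    return train, test
```

Records collected between the February cutoff and the start of the winter season fall in neither set, which mirrors the gap between the time a recommendation is made and the season it targets.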
During training, we randomly split the sequences into training and validation sets with a 9:1 ratio. The best model checkpoint is selected based on the validation loss. The number of samples used for training the dominance prediction model is provided in Supplementary Table 2.
Antigenicity
The HI test data were extracted from the published reports that have been prepared for the WHO annual consultation on the composition of influenza vaccines from 2003 to February 2023. The HI test data comprise the names of virus and vaccine strains, along with the dilution of antibodies leading to hemagglutination inhibition. We retrieved the HA sequences corresponding to the strain names from the GISAID. Strains for which HA sequences could not be found were excluded from the dataset. Twenty-six percent of A/H3N2 virus strains (1,700 out of 6,541) and 20% of A/H1N1 virus strains (1,398 out of 7,068) were excluded. Eleven percent of A/H3N2 vaccine strains (18 out of 169) and 5% of A/H1N1 vaccine strains (4 out of 77) were excluded. If multiple HA sequences corresponded to one strain name, we enumerated all the possible HA sequences. If a pair of vaccine–virus sequences had multiple HI test values, their geometric mean was used. After processing, we obtained the HI test results for 70,631 distinct vaccine–virus HA protein pairs for A/H1N1 and 63,299 distinct pairs for A/H3N2. For A/H1N1, a total of 3,068 distinct virus sequences and 109 distinct vaccine sequences were included. For A/H3N2, a total of 2,731 distinct virus sequences and 255 distinct vaccine sequences were included. The HA protein sequences of the vaccines and the tested circulating viruses were aligned by MMseqs2 (ref. 79).
The HI assay determines the highest dilution of antibodies capable of inhibiting virus binding to red blood cells21. A higher dilution indicates stronger antibody binding to the viruses. Following previous work80, we quantify antigenic similarity h(v, x) by the relative (logarithmic) dilution:
$$h(v, x) = \log_2 \mathrm{HI}(v, x) - \log_2 \mathrm{HI}(v, v) \tag{1}$$
where HI(v, x) is the highest dilution factor (in folds) of serum containing antibodies from vaccine v that can inhibit viral strain x. HI(v, v) is the dilution against the reference virus v used to produce vaccine v. A higher h(v, x) indicates a higher antigenic similarity between the vaccine strain v and the viral strain x.
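As a worked illustration of the relative dilution, the sketch below assumes the logarithm is taken in base 2 (HI titers are conventionally twofold dilution series), so that h(v, x) = log2(HI(v, x) / HI(v, v)). The function names are hypothetical, and repeated measurements are pooled by their geometric mean as described in the dataset section.

```python
import math


def geometric_mean(titers):
    """Geometric mean of repeated HI titers for the same vaccine-virus pair."""
    if not titers:
        raise ValueError("at least one titer is required")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def antigenic_similarity(hi_vx, hi_vv):
    """Relative log dilution h(v, x) = log2(HI(v, x) / HI(v, v)).

    Base 2 is an assumption of this sketch. The value is 0 when the antiserum
    inhibits virus x as well as its own reference virus v, and negative when
    the titer against x is lower.
    """
    return math.log2(hi_vx / hi_vv)
```

For example, a homologous titer of 640 and a titer of 160 against a circulating virus give h = log2(160 / 640) = -2, that is, a fourfold (two-dilution) drop.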
When training each year’s HI predictor, we use only vaccine–virus pairs with sequences collected before 1 February. These pairs are further split into training, validation and test sets in an 8:1:1 ratio. The validation set is used to select the best model checkpoint, whereas the test set evaluates HI prediction error.
Evaluation
We provide details about the data in the evaluation tasks below.
- Vaccine effectiveness. To investigate the relationship between vaccine effectiveness and coverage scores, we built the test set including past recommended vaccines from 2012 to 2021. Vaccine effectiveness estimates of these past vaccines were obtained from three independent resources:
- CDC (United States): In the United States, influenza vaccine effectiveness is estimated through the US Influenza Vaccine Effectiveness Network. The network enrolls participants (aged at least 6 months) presenting with ARI (new or worsening cough) at outpatient facilities, including emergency departments, within 7 days of symptom onset4–11.
- SPSN (Canada): The SPSN estimates vaccine effectiveness based on patients aged at least 1 year of age who present within 7 days of onset of ILI to community-based sentinel practitioners in Canada56–63. ILI is defined as the acute onset of fever and cough, accompanied by at least one additional symptom that includes sore throat, myalgia, arthralgia or prostration.
- Empirical coverage score. The empirical coverage score is used in two evaluation tasks: assessing the relationship between the predicted coverage score and empirical coverage score and evaluating the antigenic match of our selected vaccines against the WHO’s recommended vaccines. We constructed a test set comprising candidate vaccines with available empirical coverage scores. For each year, we included vaccine strains whose protein sequences appeared at least five times in the past 3 years in the GISAID dataset and had HI test results covering at least 40% of the circulating viruses for that year. This results in 51 vaccine strains for A/H3N2 and 50 vaccine strains for A/H1N1.
- Reduction of disease burden. To examine the correlation between coverage score and the reduction in disease burden, we constructed a test set comprising past recommended vaccines used from 2012 to 2022, excluding 2020 due to unavailable data. Data on the number of medical visits and influenza illnesses averted by these vaccinations were obtained from the CDC. Because the data did not differentiate between subtypes, we combined the coverage scores for A/H1N1 and A/H3N2, resulting in a total of nine data points.
Definition of coverage score
Let $\mathcal{X}$ denote the set of circulating viruses, and let $v$ denote the vaccine in question. The mathematical definition of ‘coverage score (CS)’ for a vaccine $v$ during a future season $t$ is

$$\mathrm{CS}_t(v) = \sum_{x \in \mathcal{X}} p_t(x)\, h(v, x) \tag{2}$$

where $p_t(x)$ is the dominance (probability) of virus $x$ in season $t$, and $h(v, x)$ measures the antigenicity between vaccine $v$ and virus $x$. The probability is normalized over the set $\mathcal{X}$, meaning that $\sum_{x \in \mathcal{X}} p_t(x) = 1$. Large values of $p_t(x)$ indicate that virus $x$ has high dominance, and large values of $h(v, x)$ indicate that vaccine $v$ is effective against virus $x$. The vaccine candidates with the highest coverage scores are those recommended by our algorithm.
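The coverage score is a dominance-weighted average of antigenic similarity, which can be sketched in a few lines. The dictionaries and function names below are hypothetical; the sketch assumes every virus in the dominance table has a similarity value for the vaccine being scored and renormalizes the dominance weights so that they sum to 1.

```python
def coverage_score(dominance, similarity):
    """Dominance-weighted mean of h(v, x) for one vaccine v.

    dominance:  {virus_id: weight}, renormalized here to sum to 1.
    similarity: {virus_id: h(v, x)}; a missing virus raises KeyError.
    """
    total = sum(dominance.values())
    if total <= 0:
        raise ValueError("dominance weights must sum to a positive value")
    return sum(p / total * similarity[x] for x, p in dominance.items())


def rank_vaccines(dominance, similarity_by_vaccine):
    """Return (vaccine_id, score) pairs sorted from best to worst coverage."""
    scores = {
        v: coverage_score(dominance, h) for v, h in similarity_by_vaccine.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With dominance {x1: 0.6, x2: 0.3, x3: 0.1}, a vaccine with similarities (0, -1, -3) scores -0.6, whereas one with (-2, 0, 0) scores -1.2: matching the dominant virus matters more than matching the rare ones.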
Dominance predictor
Our dominance predictor aims to model the time-resolved distributions of protein sequences. Given an HA protein sequence and a particular time as input, our dominance predictor outputs the probability (dominance) of this HA protein sequence occurring at that particular time.
Given an amino acid sequence $x \in V^L$ with length $L$ (where $V$ is the set of all possible amino acids, typically $|V| = 20$), inspired by the SIR81 model, we define the change in frequency of occurrence (un-normalized dominance) of a viral protein $x$ (annotated as $n_t(x)$) by the following ODE:

$$\frac{\mathrm{d} n_t(x)}{\mathrm{d} t} = R(x)\, n_t(x) \tag{3}$$

in which $R(x)$ represents the rate of change in the dominance of sequence $x$, and $n_0(x)$ describes the initial condition. By solving this ODE, we can obtain the frequency $n_t(x)$ at time $t$ as

$$n_t(x) = n_0(x) \exp\big(R(x)\, t\big) \tag{4}$$

To obtain the dominance (probability) $p_t(x)$ of sequence $x$, we need to normalize $n_t(x)$ over the entire protein sequence space. However, directly calculating probabilities is impractical due to the vast number of amino acid combinations. Instead, we express the probability in terms of ‘autoregressive’ conditional probabilities44. That is, we propose to model the conditional probability

$$p_t(x_i \mid x_{<i}; \theta) = \frac{\exp\big(R(x_i \mid x_{<i}; \theta)\, t + b(x_i \mid x_{<i}; \theta)\big)}{\sum_{x' \in V} \exp\big(R(x' \mid x_{<i}; \theta)\, t + b(x' \mid x_{<i}; \theta)\big)} \tag{5}$$

in which $x_i$ is the $i$-th residue in the sequence $x$. Both $R(x_i \mid x_{<i}; \theta)$ and $b(x_i \mid x_{<i}; \theta)$ are parameterized by the Transformer-based language model44. The probability of the complete protein sequence $x$ can be represented as the product of conditional probabilities for each residue:

$$p_t(x; \theta) = \prod_{i=1}^{L} p_t(x_i \mid x_{<i}; \theta) \tag{6}$$

During the training process, we sample pairs of protein sequences and their collection times. The parameters $\theta$ of the language model are optimized based on the maximum likelihood estimation objective:

$$\theta^{*} = \arg\max_{\theta}\; \mathbb{E}_{(x, t)} \big[ \log p_t(x; \theta) \big] \tag{7}$$
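To make the time dependence concrete, the sketch below assumes the conditional probability is a softmax over R·t + b, where R and b are per-residue scores produced by two autoregressive models. Here `toy_rate` and `toy_init` are hypothetical stand-ins for those models, not the trained GPT-2 networks, and the function names are illustrative.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def conditional_probs(rate_scores, init_scores, t):
    """Softmax over R * t + b for one position (one score per amino acid)."""
    scores = [r * t + b for r, b in zip(rate_scores, init_scores)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def sequence_log_prob(seq, t, rate_fn, init_fn):
    """log p_t(x) as the sum of per-residue conditional log-probabilities."""
    log_p = 0.0
    for i, residue in enumerate(seq):
        prefix = seq[:i]
        probs = conditional_probs(rate_fn(prefix), init_fn(prefix), t)
        log_p += math.log(probs[AMINO_ACIDS.index(residue)])
    return log_p


def toy_rate(prefix):
    """Hypothetical rate model: alanine grows over time, all else is flat."""
    return [0.5 if aa == "A" else 0.0 for aa in AMINO_ACIDS]


def toy_init(prefix):
    """Hypothetical initial-condition model: uniform at t = 0."""
    return [0.0] * len(AMINO_ACIDS)
```

At t = 0 the toy model is uniform over the 20 residues; as t increases, residues with a positive rate gain probability mass at the expense of the others, which is the behavior the ODE is meant to capture. The training objective is then the average of `sequence_log_prob` over sampled (sequence, collection time) pairs.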
Architecture
We use a 12-layer GPT-2 model44 to parameterize R(xi∣x<i, θ) and another 12-layer GPT-2 model to parameterize b(xi∣x<i, θ). The dominance predictors are trained for 100 epochs with a batch size of 16 and a learning rate of 1 × 10⁻⁵.
Implementation details for dominance prediction baselines
The baseline annotated as ‘Last’ defines dominance based on the empirical frequencies calculated from the previous season. We incorporated sequence data collected during the 6-month period prior to 1 February.
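A minimal sketch of the ‘Last’ baseline follows, assuming records are `(sequence, collection_date)` tuples and that the 6-month window runs from 1 August of the previous year up to (not including) 1 February. The function name and record layout are hypothetical.

```python
from collections import Counter
from datetime import date


def last_season_dominance(records, year):
    """Empirical sequence frequencies in the 6 months before 1 February of `year`.

    The resulting distribution is used unchanged as the predicted dominance
    for the upcoming season (static assumption of the 'Last' baseline).
    """
    window_end = date(year, 2, 1)
    window_start = date(year - 1, 8, 1)
    counts = Counter(
        seq for seq, collected in records if window_start <= collected < window_end
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n / total for seq, n in counts.items()}
```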
CSCS39 adopts the mutation probability and functional dissimilarity calculated from a masked protein language model to estimate protein fitness and escapability from antibodies.
EVEscape42 scores for individual mutations are found by combining three sources of information: a deep generative model for fitness prediction, structural information about the HA protein to estimate antibody binding potential (weighted contact number for each residue position) and chemical distances in charge and hydrophobicity between mutated and wild-type residues. We used the EVEscape42 codebase to train models on the same dataset as ours. The structures of H3 and H1 proteins were obtained from the Protein Data Bank, using 1RVX for A/H1N1 and 4FNK for A/H3N2.
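LM models the dominance of sequences in a static manner. Similarly to equation (5), the likelihood of amino acid sequence $x$ is factorized autoregressively without using the collection time information:

$$p(x; \theta) = \prod_{i=1}^{L} \frac{\exp\big(F(x_i \mid x_{<i}; \theta)\big)}{\sum_{x' \in V} \exp\big(F(x' \mid x_{<i}; \theta)\big)} \tag{8}$$

in which $F(x_i \mid x_{<i}; \theta)$ denotes the logits output by a GPT-2 model (ref. 44).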
Antigenicity predictor
Our antigenicity predictor is constructed using a 12-layer MSA Transformer45, with a linear output layer to regress the HI values as defined in equation (1). The model is trained with a batch size of 32 and a learning rate of 1 × 10⁻⁵ for 150,000 steps.
In our ablation studies, we considered both experimentally derived and simple machine learning baselines. In the BLOSUM baseline, for vaccine–virus pairs collected before 1 February that have experimental HI results, we use those results directly. For vaccine–virus pairs without experimental HI results, we search the dataset for the most similar vaccine–virus pairs collected before 1 February that have measured HI values and use the average of those values as the estimated HI result. The similarity is calculated from the BLOSUM62 matrix82, and the similarity between two virus–vaccine pairs is the sum of the similarity between the virus sequences and the similarity between the vaccine sequences.
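The nearest-pair lookup of the BLOSUM baseline can be sketched as below. To keep the example self-contained, `toy_similarity` is a hypothetical stand-in (+1 per matching position, -1 otherwise, on equal-length aligned sequences) for a BLOSUM62-based score; in practice a real substitution-matrix score would be passed as `seq_similarity`.

```python
def toy_similarity(a, b):
    """Stand-in for a BLOSUM62 score: +1 per match, -1 per mismatch."""
    return sum(1 if p == q else -1 for p, q in zip(a, b))


def estimate_hi(query, measured, seq_similarity=toy_similarity):
    """Estimate h for query = (vaccine_seq, virus_seq).

    measured maps (vaccine_seq, virus_seq) pairs to experimental values.
    Measured pairs are returned as-is; otherwise the values of the most
    similar measured pairs (ties included) are averaged.
    """
    if query in measured:
        return measured[query]
    if not measured:
        raise ValueError("no measured pairs available")

    def pair_similarity(pair):
        # Pair similarity = vaccine-vaccine similarity + virus-virus similarity.
        return seq_similarity(query[0], pair[0]) + seq_similarity(query[1], pair[1])

    best = max(pair_similarity(pair) for pair in measured)
    nearest = [h for pair, h in measured.items() if pair_similarity(pair) == best]
    return sum(nearest) / len(nearest)
```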
In addition, we considered two machine learning baselines. LR+ (ref. 80) is a linear regression model whose input features are the amino acid substitutions between the virus and vaccine protein sequences as well as their amino acids at each position. Finally, CNN is a convolutional neural network that consists of two one-dimensional convolution layers with 64 channels and a kernel size of 15, interleaved with two one-dimensional max-pooling layers. The input is the concatenation of two one-hot representations of the HA protein sequences of the vaccine and virus. We use the hyperparameters suggested in ref. 83.
Evaluation metrics
We used Pearson correlation and/or Spearman rank correlation84 as the primary metrics to evaluate the relationships between coverage scores and vaccine effectiveness, between coverage scores and the reduction of disease burden and between predicted coverage scores and empirical coverage scores.
We used Pearson correlation when both data series being tested were normally distributed, as determined by the Shapiro–Wilk test (P > 0.01). When the data were not normally distributed, we used Spearman rank correlation. In the analysis of correlations between vaccine effectiveness and coverage scores, we also reported Spearman rank correlation to highlight their monotonic relationship. For evaluating the correlations between predicted coverage scores and empirical coverage scores, we additionally present results using mean absolute error (MAE) and root mean squared error (RMSE) for a more comprehensive assessment.
We used the empirical coverage scores to evaluate the antigenic match of different vaccine strains. To assess the statistical significance, we performed a one-sided Wilcoxon signed-rank test. We chose one-sided over two-sided because our goal was to assess the advantage of VaxSeer over past methods according to the empirical coverage scores rather than merely identifying any difference.
As a complementary evaluation for dominance predictor, we compared the frequency of occurrence of the top 20% of sequences predicted by different dominance prediction methods. To assess the statistical significance of our model’s higher frequency, we performed a one-sided Wilcoxon signed-rank test.
As a complementary evaluation for the antigenicity predictor, we compared the MAE of HI values predicted by different antigenicity prediction methods. To assess the statistical significance of our model’s lower error, we conducted a one-sided Wilcoxon signed-rank test, adjusted for multiple comparisons using the Benjamini–Hochberg procedure (false discovery rate).
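For reference, the Benjamini–Hochberg adjustment applied to the raw P values can be written in a few lines. This is a generic sketch of the standard step-up procedure rather than the authors' analysis script, and the function name is hypothetical; the raw P values themselves would come from the one-sided Wilcoxon signed-rank tests.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each P value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downward enforces monotonicity and caps
    the result at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For raw P values (0.01, 0.04, 0.03, 0.005), the adjusted values are (0.02, 0.04, 0.04, 0.02).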
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41591-025-03917-y.
Supplementary information
Candidate vaccine strains included in our evaluation set, with their GISAID accession IDs, predicted coverage scores, empirical coverage scores, total frequencies of circulating viruses with available HI data used in the empirical coverage calculation, year and subtype.
The number of samples in training and validation sets for each season and subtype. For dominance prediction, we report both the total number of samples (including repeating protein sequences) and the count of unique protein sequences (non-repeating) in parentheses. For antigenicity, we show the number of vaccine–virus sequence pairs used.
Acknowledgements
We acknowledge the researchers who generated and submitted influenza sequence data to the GISAID as well as the GISAID team for maintaining this invaluable public resource. We thank the WHO CCs, particularly the Worldwide Influenza Centre laboratory at the Francis Crick Institute, for their efforts in producing the HI data. We appreciate the contributions of the CDC, I-MOVE and the SPSN for their work in estimating vaccine effectiveness. The contributions of these organizations are essential to the feasibility of this study. We thank everyone in R.B.’s group for their feedback about the writing of the manuscript and their discussions about the project. This work was supported by the Defense Threat Reduction Agency (grant no. HDTRA12110013, to R.B.) and the Massachusetts Institute of Technology Jameel Clinic.
Author contributions
W.S., J.W. and R.B. conceived the study. W.S. designed the methodology, developed the software and performed the experiments and validation. W.S., J.W., M.W. and R.B. wrote the initial draft of the manuscript. M.W. and W.S. designed the visualizations. R.B. provided key resources and supervised the project. All authors reviewed, edited and approved the final version of the manuscript for submission.
Peer review
Peer review information
Nature Medicine thanks Leif Sander, Amalio Telenti and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lorenzo Righetto, in collaboration with the Nature Medicine team.
Data availability
The HA sequences and their metadata, including collection time and strain name, are from the GISAID (https://gisaid.org/). The accession IDs for the proteins used in this study are available in our GitHub repository at https://github.com/wxsh1213/vaxseer/blob/main/data/gisaid/ha_acc_ids.csv. The HI results are collected from the Worldwide Influenza Centre laboratory at the Francis Crick Institute and are available in their annual and interim reports (https://www.crick.ac.uk/research/platforms-and-facilities/worldwide-influenza-centre/annual-and-interim-reports). The human influenza vaccine composition is from https://gisaid.org/resources/human-influenza-vaccine-composition/. The vaccine effectiveness in United States is from the study of the CDC: https://www.cdc.gov/flu-vaccines-work/php/effectiveness-studies/. The estimated disease burden averted by vaccination can be found at https://www.cdc.gov/flu/vaccines-work/burden-averted.htm.
Code availability
All models and code used for data processing, training and evaluating VaxSeer are publicly available at https://github.com/wxsh1213/vaxseer.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Wenxian Shi, Email: wxsh@mit.edu.
Regina Barzilay, Email: regina@csail.mit.edu.
Extended data
Extended data is available for this paper at 10.1038/s41591-025-03917-y.
Supplementary information
The online version contains supplementary material available at 10.1038/s41591-025-03917-y.
References
- 1. Jackson, M. L. & Nelson, J. C. The test-negative design for estimating influenza vaccine effectiveness. Vaccine 31, 2165–2168 (2013).
- 2. Chung, J. R. et al. Late-season influenza vaccine effectiveness against medically attended outpatient illness, United States, December 2022–April 2023. Influenza Other Respir. Viruses 18, e13342 (2024).
- 3. Trombetta, C. M., Kistner, O., Montomoli, E., Viviani, S. & Marchi, S. Influenza viruses and vaccines: the role of vaccine effectiveness studies for evaluation of the benefits of influenza vaccines. Vaccines (Basel) 10, 714 (2022).
- 4. McLean, H. Q. et al. Influenza vaccine effectiveness in the United States during 2012–2013: variable protection by age and virus type. J. Infect. Dis. 211, 1529–1540 (2015).
- 5. Gaglani, M. et al. Influenza vaccine effectiveness against 2009 pandemic influenza A(H1N1) virus differed by vaccine type during 2013–2014 in the United States. J. Infect. Dis. 213, 1546–1556 (2016).
- 6. Zimmerman, R. K. et al. 2014–2015 influenza vaccine effectiveness in the United States by vaccine type. Clin. Infect. Dis. 63, 1564–1573 (2016).
- 7. Jackson, M. L. et al. Influenza vaccine effectiveness in the United States during the 2015–2016 season. N. Engl. J. Med. 377, 534–543 (2017).
- 8. Flannery, B. et al. Influenza vaccine effectiveness in the United States during the 2016–2017 season. Clin. Infect. Dis. 68, 1798–1806 (2019).
- 9. Rolfes, M. A. et al. Effects of influenza vaccination in the United States during the 2017–2018 influenza season. Clin. Infect. Dis. 69, 1845–1853 (2019).
- 10. Flannery, B. et al. Spread of antigenically drifted influenza A(H3N2) viruses and vaccine effectiveness in the United States during the 2018–2019 season. J. Infect. Dis. 221, 8–15 (2019).
- 11. Tenforde, M. W. et al. Effect of antigenic drift on influenza vaccine effectiveness in the United States—2019–2020. Clin. Infect. Dis. 73, e4244–e4250 (2020).
- 12. Price, A. M. et al. Influenza vaccine effectiveness against influenza A(H3N2)-related illness in the United States during the 2021–2022 influenza season. Clin. Infect. Dis. 76, 1358–1363 (2022).
- 13. Han, A. X., de Jong, S. P. & Russell, C. A. Co-evolution of immunity and seasonal influenza viruses. Nat. Rev. Microbiol. 21, 805–817 (2023).
- 14. Lewnard, J. A. & Cobey, S. Immune history and influenza vaccine effectiveness. Vaccines (Basel) 6, 28 (2018).
- 15. Meijers, M. et al. Concepts and methods for predicting viral evolution. In Influenza Virus: Methods and Protocols (eds Yamauchi, Y. & Amorim, M. J.) 253–290 (Humana Press, 2025).
- 16. Tregoning, J. S., Russell, R. F. & Kinnear, E. Adjuvanted influenza vaccines. Hum. Vaccin. Immunother. 14, 550–564 (2018).
- 17. Soema, P. C., Kompier, R., Amorij, J.-P. & Kersten, G. F. Current and next generation influenza vaccines: formulation and production strategies. Eur. J. Pharm. Biopharm. 94, 251–263 (2015).
- 18. DiazGranados, C. A. et al. Efficacy of high-dose versus standard-dose influenza vaccine in older adults. N. Engl. J. Med. 371, 635–645 (2014).
- 19. Morris, D. H. et al. Predictive modeling of influenza shows the promise of applied evolutionary biology. Trends Microbiol. 26, 102–118 (2018).
- 20. Shu, Y. & McCauley, J. GISAID: global initiative on sharing all influenza data—from vision to reality. Euro Surveill. 22, 30494 (2017).
- 21. Hirst, G. K. Studies of antigenic differences among strains of influenza A by means of red cell agglutination. J. Exp. Med. 78, 407–423 (1943).
- 22. Gerdil, C. The annual production cycle for influenza vaccine. Vaccine 21, 1776–1779 (2003).
- 23. Smith, D. J. et al. Mapping the antigenic and genetic evolution of influenza virus. Science 305, 371–376 (2004).
- 24. Gouma, S., Weirick, M. & Hensley, S. E. Antigenic assessment of the H3N2 component of the 2019–2020 Northern Hemisphere influenza vaccine. Nat. Commun. 11, 2445 (2020).
- 25. Barr, I. G. et al. WHO recommendations for the viruses used in the 2013–2014 Northern Hemisphere influenza vaccine: epidemiology, antigenic and genetic characteristics of influenza A(H1N1)pdm09, A(H3N2) and B influenza viruses collected from October 2012 to January 2013. Vaccine 32, 4713–4725 (2014).
- 26. Krammer, F. The human antibody response to influenza A virus infection and vaccination. Nat. Rev. Immunol. 19, 383–397 (2019).
- 27. Gamblin, S. J. & Skehel, J. J. Influenza hemagglutinin and neuraminidase membrane glycoproteins. J. Biol. Chem. 285, 28403–28409 (2010).
- 28. Bush, R. M., Bender, C. A., Subbarao, K., Cox, N. J. & Fitch, W. M. Predicting the evolution of human influenza A. Science 286, 1921–1925 (1999).
- 29. Gupta, V., Earl, D. J. & Deem, M. W. Quantifying influenza vaccine efficacy and antigenic distance. Vaccine 24, 3881–3888 (2006).
- 30. Steinbrück, L. & McHardy, A. C. Allele dynamics plots for the study of evolutionary dynamics in viral populations. Nucleic Acids Res. 39, e4 (2011).
- 31. Neher, R. A., Russell, C. A. & Shraiman, B. I. Predicting evolution from the shape of genealogical trees. eLife 3, e03568 (2014).
- 32. Łuksza, M. & Lässig, M. A predictive fitness model for influenza. Nature 507, 57–61 (2014).
- 33. Obermeyer, F. et al. Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness. Science 376, 1327–1332 (2022).
- 34. Doud, M. B., Lee, J. M. & Bloom, J. D. How single mutations affect viral escape from broad and narrow antibodies to H1 influenza hemagglutinin. Nat. Commun. 9, 1386 (2018).
- 35. Huddleston, J. et al. Integrating genotypes and phenotypes improves long-term forecasts of seasonal influenza A/H3N2 evolution. eLife 9, e60067 (2020).
- 36. Steinbrück, L., Klingen, T. & McHardy, A. Computational prediction of vaccine strains for human influenza A (H3N2) viruses. J. Virol. 88, 12123–12132 (2014).
- 37. Suzuki, Y. Selecting vaccine strains for H3N2 human influenza A virus. Meta Gene 4, 64–72 (2015).
- 38. Gong, L. I., Suchard, M. A. & Bloom, J. D. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife 2, e00631 (2013).
- 39. Hie, B., Zhong, E. D., Berger, B. & Bryson, B. Learning the language of viral evolution and escape. Science 371, 284–288 (2021).
- 40. Ferruz, N., Schmidt, S. & Höcker, B. ProtGPT2 is a deep unsupervised language model for protein design. Nat. Commun. 13, 4348 (2022).
- 41. Frazer, J. et al. Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95 (2021).
- 42. Thadani, N. N. et al. Learning from prepandemic data to forecast viral escape. Nature 622, 818–825 (2023).
- 43. Han, W. et al. Predicting the antigenic evolution of SARS-COV-2 with deep learning. Nat. Commun. 14, 3478 (2023).
- 44. Radford, A. et al. Language models are unsupervised multitask learners. OpenAI https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf (2019).
- 45. Rao, R. M. et al. MSA Transformer. In Proc. of the 38th International Conference on Machine Learning 8844–8856 (PMLR, 2021).
- 46. Kostova, D. et al. Influenza illness and hospitalizations averted by influenza vaccination in the United States, 2005–2011. PLoS ONE 8, e66312 (2013).
- 47. Tokars, J. I., Rolfes, M. A., Foppa, I. M. & Reed, C. An evaluation and update of methods for estimating the number of influenza cases averted by vaccination in the United States. Vaccine 36, 7331–7337 (2018).
- 48. Price, A. M. et al. Influenza vaccine effectiveness against influenza A(H3N2)-related illness in the United States during the 2021–2022 influenza season. Clin. Infect. Dis. 76, 1358–1363 (2023).
- 49. Doll, M. K., Pettigrew, S. M., Ma, J. & Verma, A. Effects of confounding bias in coronavirus disease 2019 (COVID-19) and influenza vaccine effectiveness test-negative designs due to correlated influenza and COVID-19 vaccination behaviors. Clin. Infect. Dis. 75, e564–e571 (2022).
- 50. Kissling, E. et al. I-MOVE multicentre case–control study 2010/11 to 2014/15: is there within-season waning of influenza type/subtype vaccine effectiveness with increasing time since vaccination? Euro Surveill. 21, 30201 (2016).
- 51. Kissling, E. et al. 2015/16 I-MOVE/I-MOVE+ multicentre case-control study in Europe: moderate vaccine effectiveness estimates against influenza A(H1N1)pdm09 and low estimates against lineage-mismatched influenza B among children. Influenza Other Respir. Viruses 12, 423–437 (2018).
- 52. Kissling, E. et al. Effectiveness of influenza vaccine against influenza A in Europe in seasons of different A(H1N1)pdm09 and the same A(H3N2) vaccine components (2016–17 and 2017–18). Vaccine X 3, 100042 (2019).
- 53. Kissling, E. et al. Interim 2018/19 influenza vaccine effectiveness: six European studies, October 2018 to January 2019. Euro Surveill. 24, 1900121 (2019).
- 54. Rose, A. et al. Interim 2019/20 influenza vaccine effectiveness: six European studies, September 2019 to January 2020. Euro Surveill. 25, 2000153 (2020).
- 55. Kissling, E. et al. Influenza vaccine effectiveness against influenza A subtypes in Europe: results from the 2021–2022 I-MOVE primary care multicentre study. Influenza Other Respir. Viruses 17, e13069 (2023).
- 56. Skowronski, D. et al. Interim estimates of influenza vaccine effectiveness in 2012/13 from Canada’s sentinel surveillance network, January 2013. Euro Surveill. 18, 20394 (2013).
- 57. Skowronski, D. et al. Interim estimates of 2013/14 vaccine effectiveness against influenza A(H1N1)pdm09 from Canada’s sentinel surveillance network, January 2014. Euro Surveill. 19, 20690 (2014).
- 58. Skowronski, D. et al. Interim estimates of 2014/15 vaccine effectiveness against influenza A(H3N2) from Canada’s sentinel physician surveillance network, January 2015. Euro Surveill. 20, 21022 (2015).
- 59. Chambers, C. et al. Interim estimates of 2015/16 vaccine effectiveness against influenza A(H1N1)pdm09, Canada, February 2016. Euro Surveill. 21, 30168 (2016).
- 60. Skowronski, D. M. et al. Interim estimates of 2016/17 vaccine effectiveness against influenza A(H3N2), Canada, January 2017. Euro Surveill. 22, 30460 (2017).
- 61. Skowronski, D. M. et al. Early season co-circulation of influenza A(H3N2) and B (Yamagata): interim estimates of 2017/18 vaccine effectiveness, Canada, January 2018. Euro Surveill. 23, 18–00035 (2018).
- 62. Skowronski, D. M. et al. Interim estimates of 2018/19 vaccine effectiveness against influenza A(H1N1)pdm09, Canada, January 2019. Euro Surveill. 24, 1900055 (2019).
- 63. Skowronski, D. M. et al. Interim estimates of 2019/20 vaccine effectiveness during early-season co-circulation of influenza A and B viruses, Canada, February 2020. Euro Surveill. 25, 2000103 (2020).
- 64. Reed, C. et al. Estimating influenza disease burden from population-based surveillance data in the United States. PLoS ONE 10, e0118369 (2015).
- 65. Palache, A. et al. Influenza vaccines: the effect of vaccine dose on antibody response in primed populations during the ongoing interpandemic period. A review of the literature. Vaccine 11, 892–908 (1993).
- 66. Gambaryan, A., Robertson, J. & Matrosovich, M. Effects of egg-adaptation on the receptor-binding properties of human influenza A and B viruses. Virology 258, 232–239 (1999).
- 67. Mylius, S. D., Hagenaars, T. J., Lugnér, A. K. & Wallinga, J. Optimal allocation of pandemic influenza vaccine depends on age, risk and timing. Vaccine 26, 3742–3749 (2008).
- 68. Skowronski, D. M. et al. Low 2012–13 influenza vaccine effectiveness associated with mutation in the egg-adapted H3N2 vaccine strain not antigenic drift in circulating viruses. PLoS ONE 9, e92153 (2014).
- 69. Van Baalen, C. et al. Detection of nonhemagglutinating influenza A(H3) viruses by enzyme-linked immunosorbent assay in quantitative influenza virus culture. J. Clin. Microbiol. 52, 1672–1677 (2014).
- 70. Fonville, J. M. et al. Antigenic maps of influenza A(H3N2) produced with human antisera obtained after primary infection. J. Infect. Dis. 213, 31–38 (2016).
- 71. Zost, S. J. et al. Contemporary H3N2 influenza viruses have a glycosylation site that alters binding of antibodies elicited by egg-adapted vaccine strains. Proc. Natl Acad. Sci. USA 114, 12578–12583 (2017).
- 72. WHO Global Influenza Surveillance Network. Manual for the laboratory diagnosis and virological surveillance of influenza. https://www.who.int/publications/i/item/manual-for-the-laboratory-diagnosis-and-virological-surveillance-of-influenza (World Health Organization, 2011).
- 73. Jorquera, P. A. et al. Insights into the antigenic advancement of influenza A(H3N2) viruses, 2011–2018. Sci. Rep. 9, 2676 (2019).
- 74. Peck, H. et al. Enhanced isolation of influenza viruses in qualified cells improves the probability of well-matched vaccines. NPJ Vaccines 6, 149 (2021).
- 75. Matrosovich, M. N., Matrosovich, T. Y., Gray, T., Roberts, N. A. & Klenk, H.-D. Neuraminidase is important for the initiation of influenza virus infection in human airway epithelium. J. Virol. 78, 12665–12667 (2004).
- 76. Sylte, M. J. & Suarez, D. L. Influenza neuraminidase as a vaccine antigen. In Vaccines for Pandemic Influenza (eds Compans, R. W. & Orenstein, W. A.) 227–242 (Springer, 2009).
- 77. Monto, A. S. et al. Antibody to influenza virus neuraminidase: an independent correlate of protection. J. Infect. Dis. 212, 1191–1199 (2015).
- 78. Neher, R. A. & Bedford, T. nextflu: real-time tracking of seasonal influenza virus evolution in humans. Bioinformatics 31, 3546–3548 (2015).
- 79. Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
- 80. Neher, R. A., Bedford, T., Daniels, R. S., Russell, C. A. & Shraiman, B. I. Prediction, dynamics, and visualization of antigenic phenotypes of seasonal influenza viruses. Proc. Natl Acad. Sci. USA 113, E1701–E1709 (2016).
- 81. Kermack, W. O. & McKendrick, A. G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. Ser. A 115, 700–721 (1927).
- 82. Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915–10919 (1992).
- 83. Xia, Y.-L. et al. A deep learning approach for predicting antigenic variation of influenza A H3N2. Comput. Math. Methods Med. 2021, 9997669 (2021).
- 84. Schober, P., Boer, C. & Schwarte, L. A. Correlation coefficients: appropriate use and interpretation. Anesth. Analg. 126, 1763–1768 (2018).
- 85. Centers for Disease Control and Prevention. Influenza Hospitalization Surveillance Network (FluSurv-NET). https://www.cdc.gov/fluview/overview/influenza-hospitalization-surveillance.html (2023).
- 86. Aksamentov, I., Roemer, C., Hodcroft, E. B. & Neher, R. A. Nextclade: clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 6, 3773 (2021).
- 87. Kim, S. et al. Influenza vaccine effectiveness against A(H3N2) during the delayed 2021/22 epidemic in Canada. Euro Surveill. 27, 2200720 (2022).
Supplementary Materials
Candidate vaccine strains included in our evaluation set, with their GISAID accession IDs, predicted coverage scores, empirical coverage scores, total frequencies of circulating viruses with available HI data used in the empirical coverage calculation, year and subtype.
The number of samples in training and validation sets for each season and subtype. For dominance prediction, we report both the total number of samples (including repeating protein sequences) and the count of unique protein sequences (non-repeating) in parentheses. For antigenicity, we show the number of vaccine–virus sequence pairs used.