Protein Engineering, Design and Selection. 2010 Nov 30;24(3):291–299. doi: 10.1093/protein/gzq105

A novel sequence-based antigenic distance measure for H1N1, with application to vaccine effectiveness and the selection of vaccine strains

Keyao Pan 1, Krystina C Subieta 3, Michael W Deem 1,2,4
PMCID: PMC3038458  PMID: 21123189

Abstract

H1N1 influenza causes substantial seasonal illness and was the subtype of the 2009 influenza pandemic. Precise measures of antigenic distance between the vaccine and circulating virus strains help researchers design influenza vaccines with high vaccine effectiveness. We here introduce a sequence-based method to predict vaccine effectiveness in humans. Historical epidemiological data show that this sequence-based method is as predictive of vaccine effectiveness as hemagglutination inhibition assay data from ferret animal model studies. Interestingly, the expected vaccine effectiveness is greater against H1N1 than H3N2, suggesting a stronger immune response against H1N1 than H3N2. The evolution rate of hemagglutinin in H1N1 is also shown to be greater than that in H3N2, presumably due to greater immune selection pressure.

Keywords: antigenic distance, antigenic drift, influenza, pepitope, vaccine effectiveness

Introduction

The annual trivalent vaccine for influenza contains one H3N2 strain, one H1N1 strain and one influenza B strain. This vaccine is currently the primary tool to prevent influenza infection and to control influenza epidemics. Due to the fast evolution of the influenza virus, the components of the influenza vaccine are changed for many flu seasons. Even though the vaccine is usually redesigned to match closely the newly evolved influenza virus strains, there occasionally has been a suboptimal match between vaccine and virus. Partly for this reason, vaccine effectiveness has varied in different years. The desire to have a vaccine with high effectiveness makes the prediction of the circulating influenza strain for the next influenza season a key step in vaccine design. A goal of the World Health Organization (WHO) is to recommend vaccine strains for the next flu season that will have the smallest antigenic distances to the dominant circulating strains in the next flu season, which often means using the dominant circulating strains in the current flu season as a reference.

A variety of distance measures have been developed to evaluate the degree of match between the vaccine strain and the dominant circulating strain. These calculations focus primarily on the hemagglutinin protein (HA) of influenza, since HA is the dominant antigen for protective human antibodies and exhibits the highest evolutionary rate among all the influenza genes (Rambaut et al., 2008). A widely used definition of antigenic distance is calculated from hemagglutination inhibition (HI) data from ferret animal model studies. To compare a pair of strains, a 2-by-2 HI titer matrix is built, and the antigenic distance is extracted from this matrix. This distance can be further refined by a dimensional projection technique termed antigenic cartography (Smith et al., 2004). The mathematical basis of antigenic cartography is the dimension reduction of the shape space, in which each point represents an influenza virus strain and the distance between a pair of points represents the antigenic distance between the corresponding strains. Note that antigenic cartography does not itself generate the distance data; rather, it assesses the distance between a given vaccine strain and the dominant circulating strain by globally considering all the strains and the antigenic distances among them. In the original antigenic cartography literature (Smith et al., 2004), HI data were the input from which the algorithm obtained the final distances. Antigenic distances can also be defined computationally from the amino acid sequences of the strains: pepitope, the fraction of substituted amino acids in the dominant HA epitope bound by antibody, is one such sequence-based antigenic distance measure (Gupta et al., 2006; Deem and Pan, 2009; Pan and Deem, 2009). The amino acid sequences are downloaded from databases and processed to obtain these distance measures.
The pepitope sequence-based method has been shown to be an effective antigenic distance measure between two strains of H3N2 (Deem and Lee, 2003; Gupta et al., 2006; Pan and Deem, 2009). To be clear, antigenic distance is a quantity that should define the difference between viral strains as determined by the human immune system. Ferret HI data are not the only measure of antigenic distance, nor even the best one.

The vaccine effectiveness, which varies from year to year, correlates with the antigenic distance between the vaccine strain and the dominant circulating strain. Thus, the vaccine effectiveness can be predicted by calculating the antigenic distance. Such an a priori estimate of the vaccine effectiveness helps health authorities determine the appropriate strain for the vaccine component for the coming flu season. For H3N2 influenza, the pepitope method offers a prediction of vaccine effectiveness that has a higher correlation coefficient with vaccine effectiveness in humans than do distances derived by other methods (Gupta et al., 2006; Pan and Deem, 2009). In this paper, we develop the pepitope method for H1N1 influenza. In the section Materials and methods, we describe the epidemiological data used to calculate vaccine effectiveness and the animal model or sequence data used to calculate antigenic distance. In the section Results, we show the correlation of antigenic distance with vaccine effectiveness. We discuss the results in the section Discussion.

Materials and Methods

Identities of vaccine strains and dominant circulating strains

The vaccine strain selection by WHO in each year follows a standard procedure. The vaccine strains are reviewed every year and are usually changed every 2 to 3 years. We used the H1N1 vaccine strains and H1N1 dominant circulating strains in the epidemiological literature that provided vaccine effectiveness data used in this study.

Estimation of vaccine effectiveness

The H1N1 vaccine effectiveness is calculated from the influenza-like illness (ILI) rates of unvaccinated (u) and vaccinated (v) people reported in the epidemiological literature. Vaccine effectiveness (VE) is defined as

$$\mathrm{VE} = \frac{u - v}{u} \tag{1}$$

To calculate vaccine effectiveness and its standard error, we let Nu and Nv denote the numbers of subjects in the unvaccinated and vaccinated groups, and nu and nv denote the numbers of illnesses in the unvaccinated and vaccinated groups, respectively. The values and the standard errors of u, v, and vaccine effectiveness are

$$u = \frac{n_u}{N_u} \tag{2}$$
$$\sigma_u = \sqrt{\frac{u(1-u)}{N_u}} \tag{3}$$
$$v = \frac{n_v}{N_v} \tag{4}$$
$$\sigma_v = \sqrt{\frac{v(1-v)}{N_v}} \tag{5}$$
$$\mathrm{VE} = \frac{u - v}{u} \tag{6}$$
$$\sigma_{\mathrm{VE}} = \frac{v}{u}\sqrt{\left(\frac{\sigma_u}{u}\right)^2 + \left(\frac{\sigma_v}{v}\right)^2} \tag{7}$$

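If the vaccine effectiveness is averaged from N studies, the standard error of the average is $\sigma_{\mathrm{VE}} = \frac{1}{N}\sqrt{\sum_{i=1}^{N}\sigma_{\mathrm{VE}_i}^2}$, where $\sigma_{\mathrm{VE}_i}$ is the standard error of the ith study.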
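The calculation described above can be sketched in Python. This is a minimal illustration, not the authors' code: it assumes binomial standard errors for the ILI rates, first-order error propagation for the ratio v/u, and an unweighted mean over studies with errors combined in quadrature. The function names are hypothetical. Under these assumptions the sketch reproduces the 1982–83 and 1995–96 (b) entries of Table II.

```python
import math

def rate_and_se(n_ill, n_total):
    """ILI rate and its binomial standard error (assumed error model)."""
    rate = n_ill / n_total
    se = math.sqrt(rate * (1.0 - rate) / n_total)
    return rate, se

def vaccine_effectiveness(n_u, N_u, n_v, N_v):
    """Return (VE, standard error) with VE = (u - v) / u.

    Requires at least one illness in each group, otherwise the
    relative errors below are undefined.
    """
    u, se_u = rate_and_se(n_u, N_u)
    v, se_v = rate_and_se(n_v, N_v)
    ve = (u - v) / u
    # First-order propagation for the ratio v/u, groups assumed independent.
    se_ve = (v / u) * math.sqrt((se_u / u) ** 2 + (se_v / v) ** 2)
    return ve, se_ve

def average_effectiveness(studies):
    """Unweighted mean VE over studies; standard errors combined in quadrature."""
    results = [vaccine_effectiveness(*s) for s in studies]
    n = len(results)
    mean_ve = sum(ve for ve, _ in results) / n
    se_mean = math.sqrt(sum(se ** 2 for _, se in results)) / n
    return mean_ve, se_mean

# 1982-83 season in Table II: n_u = 48, N_u = 118, n_v = 31, N_v = 121
ve, se = vaccine_effectiveness(48, 118, 31, 121)
print(f"1982-83: VE = {100 * ve:.1f} +/- {100 * se:.1f} %")

# 1995-96 (b) in Table II: two studies reported for the same season
ve_avg, se_avg = average_effectiveness([(99, 652, 57, 684), (176, 652, 149, 684)])
print(f"1995-96 (b): VE = {100 * ve_avg:.1f} +/- {100 * se_avg:.1f} %")
```

The two printed values can be compared with the corresponding rows of Table II (37.0 ± 12.0 and 32.2 ± 5.8).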

Compared to H3N2, subtype H1N1 viruses were dominant in fewer years. Based on the proportions of samples of H3N2, H1N1 and influenza B collected in each year during 1977–2009, widespread H1N1 circulation was observed in approximately 10 seasons. Epidemiological studies on vaccine effectiveness were absent for some years when H1N1 circulated. Additionally, we used the criteria listed below to filter all available literature.

To ensure that the vaccine effectiveness we collected from the literature is for H1N1, the seasons and the geographic regions of the epidemiological studies in the literature were compared with the influenza activity information in WHO Weekly Epidemiological Records to confirm that those regions were dominated by H1N1 in those seasons. Subjects were restricted to healthy adults aged 18–64 years to avoid the effects of an underdeveloped immune system in children or of immunosenescence in older adults. If more than one measure of vaccine effectiveness was collected for the same season, the measures were averaged to reduce the statistical noise.

In order to minimize the effect on vaccine effectiveness from co-circulating subtypes such as H3N2, only the epidemiological data collected in the regions and in the flu seasons in which the H1N1 subtype was dominant were applied to calculate the vaccine effectiveness in this study. The seasons in which the H1N1 subtype was dominant were reported by the literature on H1N1 vaccine effectiveness. The studies cited in Table II for the calculation of vaccine effectiveness gave the subtype of the predominant epidemic virus as well as of the virus sampled from the subjects with ILI. In addition, the dominance of H1N1 subtype is also available in the Centers for Disease Control (CDC) Morbidity and Mortality Weekly Reports and the WHO Weekly Epidemiological Record. For the data in Table II, the dominance of H1N1 subtype was shown in these references.

Table II.

Summary of results.

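| Season | Vaccine strain | Dominant circulating strain [a] | Vaccine effectiveness (%) | nu | Nu | nv | Nv | Dominant epitope | pepitope | pall-epitope | psequence | d1 | d2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1982–83 | A/Brazil/11/78 | A/England/333/80 | 37.0 ± 12.0 [1] | 48 | 118 | 31 | 121 [1] | A | 0.083 | 0.0311 | 0.0184 | 0 [10] | 1.41 [10] |
| 1983–84 | A/Brazil/11/78 | A/Victoria/7/83 | 38.1 ± 10.3 [1–3] | 30 | 60 | 21 | 67 [1] | C | 0.121 | 0.0497 | 0.0337 | 1.13 [11–13] | 13.66 [11,13] |
| | | | | 55 | 298 | 46 | 300 [2] | | | | | | |
| 1986–87 (a) | A/Taiwan/1/86 | A/Taiwan/1/86 | 64.8 ± 14.3 [3,4] | 11 | 217 | 13 | 723 [4] | | 0 | 0 | 0 | 0 | 1 |
| 1986–87 (b) | A/Chile/1/83 | A/Taiwan/1/86 | 18.5 ± 12.1 [5] | 92 | 878 | 75 | 878 [5] | B | 0.318 | 0.0807 | 0.0399 | 4 [12,14–18] | 24.48 [14,16–18] |
| 1988–89 | A/Taiwan/1/86 | A/Taiwan/1/86 | 43.1 ± 10.0 [3,5] | 119 | 1125 | 89 | 1126 [5] | | 0 | 0 | 0 | 0 | 1 |
| 1995–96 (a) | A/Texas/36/91 | A/Texas/36/91 | 60.0 ± 27.8 [6] | 6 | 12 | 2 | 10 [6] | | 0 | 0 | 0 | 0 | 1 |
| 1995–96 (b)* | A/Singapore/6/86 | A/Texas/36/91 | 32.2 ± 5.8 [7] | 99 | 652 | 57 | 684 [7] | A | 0.125 | 0.0559 | 0.0307 | 0.86 [14,19,20] | 2.43 [14,20] |
| | | | | 176 | 652 | 149 | 684 [7] | | | | | | |
| 2006–07 | A/New Caledonia/20/99 | A/New Caledonia/20/99 | 40.5 ± 2.5 [8] | 1085 | 230729 | 1221 | 436600 [8] | | 0 | 0 | 0 | 0 | 1 |
| 2007–08* | A/Solomon Islands/3/2006 | A/Solomon Islands/3/2006 | 62.8 ± 12.6 [9] | 94 | 262 | 8 | 60 [9] | | 0 | 0 | 0 | 0 | 1 |

Bracketed numbers mark the source references cited in the table. A row without a season gives the nu, Nu, nv and Nv values of a second study for the season above it.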

Nine pairs of vaccine strains and dominant circulating strains in seven flu seasons in the Northern hemisphere were collected from literature. The quantities nu, Nu, nv, Nv, pepitope, pall-epitope, psequence, d1, and d2 are defined in the section Materials and methods. Only those seasons when H1N1 virus was dominant in at least one country or region where vaccine effectiveness data were available were considered. Two different vaccines have occasionally been adopted in different geographic regions for the same season, in which case two sets of data were added in this table. *signifies that co-circulating H3N2 was also found in the same country or region in that season; however, the interference to the final result from H3N2 is expected to be small, and so the sets of data with a single asterisk were preserved.

[a] Multiple strains are circulating in each season, while each strain has a specific proportion in the virus population in a certain region and season. The strain with the greatest proportion is defined as the dominant circulating strain, which is listed in this table. The dominant circulating strains in this table were chosen based on the literature on vaccine effectiveness, which also gave the region where the effectiveness data were collected.

The vaccine effectiveness values collected from various flu seasons and regions are reported with standard errors. Biases in the vaccine effectiveness arise from the complexity of its measurement, including the character of the human population studied, such as age, immune history, and health condition; the influence of co-circulating H3N2 influenza strains; the type of vaccine distributed, such as live attenuated virus vaccine, inactivated split-virus vaccine produced by virion disassembly, or subunit vaccine containing only HA and neuraminidase; the method of epidemiological measurement of influenza infection, such as virus detection, confirmed symptomatic influenza, or ILI; the design of the experiment, such as natural infection or experimental challenge study; and the progression of the epidemic in the population under study. These biases are thus inevitable with current technology. Here, we applied the following methods to minimize biases in the vaccine effectiveness data. Subjects in the studies were confined to healthy adults aged 18–64 years to exclude the effect of the weaker immune systems of children and older adults, because variation in the capability of the immune system is a determinant of the vaccine effectiveness for a given pair of vaccine strain and dominant circulating strain. Only epidemiological studies in the season and the region in which the H1N1 subtype was dominant were used to obtain the vaccine effectiveness data. The vaccine in all of the studies used was an inactivated vaccine; other types, such as cold-adapted nasal spray vaccine, were excluded. All of the studies used ILI as the criterion for the epidemiological measurement of infection. Not all studies designed the experiment as a challenge study. We assume that the epidemic propagates in the population in a similar way in each season.
These criteria were used to filter the available references and to obtain vaccine effectiveness data with minimum bias. The standard errors of the data are presented here. These criteria reduced the number of usable references for each season. Our meta-analysis considered 50 peer-reviewed papers, all that we could find in the literature. We list the ones that satisfy our selection criteria for each of the years, typically 1–3 per year.

Antigenic distance measured by sequence data

Figure 1 shows the HA1 domain with five epitopes of the H1 subtype HA. These five H1 epitopes, which refine a previous definition of H1 epitopes (Caton et al., 1982), are recognized by host antibodies. They were identified by mapping the well-defined epitopes in H3 HA (Wiley et al., 1981; Macken et al., 2001) to H1 HA and by using sequence entropy to find additional sites under selection (Deem and Pan, 2009).

Fig. 1.

HA1 domain of the H1 HA in the ribbon format (PDB code: 1RU7). Epitope A (blue), B (red), C (cyan), D (yellow), and E (red) are space filling. These five H1 epitopes are the analogs of the well-defined H3 epitopes (Deem and Pan, 2009).

The antigenic distance between the vaccine strain and the dominant circulating strain is the input for the vaccine effectiveness prediction. The fraction of mutated amino acids in the epitope region of HA, or the P-value, is an antigenic distance measure that quantifies the similarity between two strains (Gupta et al., 2006). One P-value is calculated for each H1 epitope:

$$P\text{-value} = \frac{\text{number of mutated amino acids in the epitope}}{\text{number of amino acids in the epitope}} \tag{8}$$

The pepitope is defined as the maximum of the five P-values for the five epitopes, and the dominant epitope is defined as the corresponding epitope. This definition, which is an assumption, has led for H3N2 to vaccine effectiveness predictions that correlate with those observed (Gupta et al., 2006).

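Another sequence-based antigenic distance measure uses the fraction of mutated amino acids in all five epitopes:

$$p_{\text{all-epitope}} = \frac{\text{number of mutated amino acids in all five epitopes}}{\text{number of amino acids in all five epitopes}} \tag{9}$$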

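As an alternative to pepitope and pall-epitope, psequence is also used, with the definition

$$p_{\text{sequence}} = \frac{\text{number of mutated amino acids in the HA1 domain}}{\text{number of amino acids in the HA1 domain}} \tag{10}$$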
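The three sequence-based measures can be sketched for a pair of aligned HA1 sequences as follows. This is an illustrative sketch only: the function names are hypothetical, and the epitope position lists in the example are toy values, not the H1 epitope definitions of Deem and Pan (2009), which would have to be supplied by the user.

```python
def fraction_mutated(seq_a, seq_b, positions):
    """Fraction of the given 0-based alignment positions at which two sequences differ."""
    positions = list(positions)
    differing = sum(1 for i in positions if seq_a[i] != seq_b[i])
    return differing / len(positions)

def sequence_distances(vaccine, circulating, epitopes):
    """Return (dominant epitope, p_epitope, p_all_epitope, p_sequence).

    `epitopes` maps an epitope name to its list of alignment positions.
    If the two strains are identical, every P-value is 0 and the reported
    dominant epitope is arbitrary.
    """
    if len(vaccine) != len(circulating):
        raise ValueError("sequences must be aligned to the same length")
    # One P-value per epitope; p_epitope is the largest of them.
    per_epitope = {name: fraction_mutated(vaccine, circulating, pos)
                   for name, pos in epitopes.items()}
    dominant = max(per_epitope, key=per_epitope.get)
    # p_all_epitope pools the positions of all epitopes.
    all_positions = sorted(set().union(*epitopes.values()))
    p_all = fraction_mutated(vaccine, circulating, all_positions)
    # p_sequence uses every position of the aligned sequences.
    p_seq = fraction_mutated(vaccine, circulating, range(len(vaccine)))
    return dominant, per_epitope[dominant], p_all, p_seq

# Toy example with made-up sequences and made-up epitope positions.
toy_epitopes = {"A": [0, 1, 2, 3], "B": [4, 5, 6]}
dominant, p_epitope, p_all_epitope, p_sequence = sequence_distances(
    "ACDEFGHIKLMN", "ACDQFGHIKLMW", toy_epitopes)
print(dominant, p_epitope, p_all_epitope, p_sequence)
```

In the toy example one of the four positions in epitope A differs and none in epitope B, so epitope A is dominant with a P-value of 1/4, while the pooled epitope fraction is 1/7 and the whole-sequence fraction is 2/12.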

Antigenic distance measured by HI

The animal model method for determining the distance between the vaccine strain and the dominant circulating strain employs the HI assay, which yields an HI table (Table I). Here Hij, with i, j = 1, 2, are the four HI titers measuring the capability of antibody j to inhibit HA i. Note that, in practice, health authorities including WHO and CDC provide HI tables with at least eight antisera to evaluate the antigenic distance between candidate vaccine strains and the dominant circulating strain. Such an HI table is mathematically equivalent to several 2 × 2 HI tables, each of which defines the antigenic distance between one pair of strains in the original HI table. For each pair of strains, we picked the four entries of the original HI table determined by the identities of these two strains and the two corresponding antisera. The 2 × 2 HI tables in this manuscript are used to elaborate the formulae for d1 and d2. In this context, Strain 1 is the vaccine strain and Strain 2 is the dominant circulating strain. Two distance measures have been derived from the four HI titers in the HI table (Smith et al., 1999; Lee and Chen, 2004):

$$d_1 = \log_2\frac{H_{11}}{H_{21}} \tag{11}$$
$$d_2 = \sqrt{\frac{H_{11}H_{22}}{H_{12}H_{21}}} \tag{12}$$

Note that antigenic cartography is carried out on the asymmetrical distance, d1 (Smith et al., 2004). When the vaccine strain and the dominant circulating strain in one season were not identical, we searched the literature for the HI tables with these two strains. The d1 and d2 values were averaged if multiple HI tables were found for one season.

Table I.

HI table with two strains and four HI titers.

| | Ferret antisera against Strain 1 | Ferret antisera against Strain 2 |
|---|---|---|
| Strain 1 | H11 | H12 |
| Strain 2 | H21 | H22 |
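As an illustration of how two distances can be extracted from a 2 × 2 HI table, the sketch below assumes the commonly used forms d1 = log2(H11/H21) and d2 = sqrt(H11 H22 / (H12 H21)). These forms are an assumption made here; they are chosen because they give d1 = 0 and d2 = 1 for identical strains, as in Table II. The function name and the titers in the example are hypothetical.

```python
import math

def hi_distances(h11, h12, h21, h22):
    """Distances from a 2 x 2 HI table, where h_ij is the titer of
    antiserum j against strain i (strain 1 = vaccine strain).

    Assumed forms (not taken from the text above):
      d1 = log2(h11 / h21)               asymmetric, uses vaccine antiserum only
      d2 = sqrt(h11 * h22 / (h12 * h21)) symmetric in the two strains
    """
    d1 = math.log2(h11 / h21)
    d2 = math.sqrt((h11 * h22) / (h12 * h21))
    return d1, d2

# Identical strains: all four titers are equal.
d1_same, d2_same = hi_distances(1280, 1280, 1280, 1280)
print(d1_same, d2_same)

# Hypothetical titers where only the second antiserum distinguishes the strains.
d1_hyp, d2_hyp = hi_distances(1280, 640, 1280, 1280)
print(d1_hyp, round(d2_hyp, 2))
```

The second example shows how two strains can have d1 = 0 while d2 differs from 1, because d1 uses only the antiserum raised against the vaccine strain.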

Results

We performed a meta-analysis of the identities of the vaccine strains and dominant circulating strains, the vaccine effectiveness, and the antigenic distances between vaccine strains and dominant circulating strains measured with the HI assay using ferret antisera. For each season dominated by H1N1, epidemiological statistics for a given region reported in the literature were used to determine the values of nu, Nu, nv, Nv, and the mean and standard error of the vaccine effectiveness. HI assay data in the literature were also used to determine the antigenic distances d1 and d2 between the vaccine strain and the dominant circulating strain. Results of the meta-analysis are listed in Table II. The sequence-based antigenic distances pepitope, pall-epitope, and psequence were calculated from the sequences of the vaccine strain and dominant circulating strain by equations 8, 9 and 10, respectively. Values of pepitope, pall-epitope, and psequence in each season dominated by H1N1 are also listed in Table II.

While the number of data points is limited, a least-squares fit shows a linear relationship between vaccine effectiveness and pepitope. Similar to the case for H3N2 influenza (Gupta et al., 2006), pepitope strongly correlates with H1N1 vaccine effectiveness, with R2=0.68. The fitted model predicts a vaccine effectiveness of 52.7% when pepitope=0, and vaccine effectiveness is greater than 0 when pepitope<0.442. In Fig. 2, the fitted trend line is within one standard error of all data points with pepitope > 0, validating the ability of the pepitope model to predict the vaccine effectiveness from only the sequences of the vaccine strain and the dominant circulating strain.
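The fit can be checked with a short sketch. It assumes an ordinary, unweighted least-squares line through the nine (pepitope, vaccine effectiveness) pairs of Table II; the weighting is an assumption, since the text states only that least squares was used. The helper name is hypothetical.

```python
def linear_fit(x, y):
    """Unweighted least-squares line y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2

# Table II, in row order; vaccine effectiveness expressed as a fraction.
p_epitope = [0.083, 0.121, 0.0, 0.318, 0.0, 0.0, 0.125, 0.0, 0.0]
effectiveness = [0.370, 0.381, 0.648, 0.185, 0.431, 0.600, 0.322, 0.405, 0.628]

slope, intercept, r2 = linear_fit(p_epitope, effectiveness)
# p_epitope at which the fitted line crosses zero effectiveness
p_zero = -intercept / slope
print(f"slope = {slope:.2f}, intercept = {intercept:.3f}, "
      f"R^2 = {r2:.2f}, zero crossing at p_epitope = {p_zero:.3f}")
```

Under this assumption the printed slope, intercept, R2 and zero crossing can be compared with the values quoted in the text and in the caption of Fig. 2.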

Fig. 2.

Vaccine effectiveness for ILI correlates with pepitope, R2=0.68 (solid line). Data from Table II. The trend line quantifies vaccine effectiveness as a decreasing linear function of pepitope. Vaccine effectiveness=–1.19 pepitope+0.53. Also shown is the vaccine effectiveness to H3N2 (dashed line) (Gupta et al. 2006).

Although statistical errors exist in the observed vaccine effectiveness, the collected vaccine effectiveness data reject the null hypothesis that the vaccine effectiveness is independent of pepitope. The nine pairs of vaccine strains and dominant circulating strains in Table II have five different antigenic distances between vaccine strain and dominant circulating strain as defined by pepitope. The nine pairs of strains were thus categorized into groups 1–5 with pepitope equal to 0, 0.083, 0.121, 0.125, and 0.318, respectively, and the average vaccine effectiveness and standard error were calculated for each group. The vaccine effectiveness differences between some of these groups were significant, for example between groups 1 and 4 (P=0.0079) and between groups 1 and 5 (P=0.0054). Moreover, statistical analysis shows that the introduction of pepitope is valuable in the selection process of vaccine strains. The slope of the fitted line is significantly smaller than 0 (P=0.0027). Hence the linear model is able to predict the vaccine effectiveness from knowledge of pepitope. In other words, the non-zero slope of vaccine effectiveness as a function of pepitope is significant at the level of 0.27%.

Two other sequence-based antigenic distance measures that are alternatives to pepitope are pall-epitope and psequence. Unlike pepitope, which focuses upon the mutations in the dominant antibody binding region, pall-epitope calculates the fraction of mutated amino acids in all five epitopes, and psequence calculates the fraction of mutated amino acids in the whole HA1 domain of HA. The psequence measure is also one of the optional distance measures in phylogenetic software. In Fig. 3, the correlation between H1N1 vaccine effectiveness and pall-epitope has R2=0.70. In Fig. 4, the correlation between H1N1 vaccine effectiveness and psequence has R2=0.66. The vaccine effectiveness of 54% predicted when pall-epitope=0 in Fig. 3 and when psequence=0 in Fig. 4 is almost the same as the 53% predicted by the pepitope method. By contrast, pall-epitope and psequence for H3N2 have less impressive correlations with H3N2 vaccine effectiveness (Gupta et al., 2006; Sun et al., 2006), and pall-epitope and psequence are not as effective as pepitope as antigenic distance measures and vaccine effectiveness predictors for H3N2.

Fig. 3.

Vaccine effectiveness for ILI correlates with pall-epitope with R2=0.70. Data from Table II. The trend line quantifies vaccine effectiveness as a decreasing linear function of pall-epitope. Vaccine effectiveness=–4.16 pall-epitope+0.54.

Fig. 4.

Vaccine effectiveness for ILI correlates with psequence with R2=0.66. Data from Table II. The trend line quantifies vaccine effectiveness as a decreasing linear function of psequence. Vaccine effectiveness=−7.37 psequence+0.54.

The HI assay and derived distance measures d1 and d2 are still the most widely used measures by researchers and health authorities to identify newly collected circulating strains. These methods are used to recommend the vaccine strain for the coming flu season (Cox et al., 2003, 2007; WHO Collaborating Center for Surveillance and Control of Influenza, 2008), to draw the antigenic map (Smith et al., 2004), and to support the phylogenetic data (Cox et al., 2003). Figures 5 and 6 describe the correlation between vaccine effectiveness and antigenic distances d1 and d2 from the HI assay. A correlation is found in both figures. In the season 1995–96 in Israel, the vaccine strain is A/Singapore/6/86 (H1N1) and the dominant circulating strain is A/Texas/36/91 (H1N1), between which the averaged d1 is 0.86. Since the vaccine effectiveness is only 32.2%, its discrepancy to the corresponding effectiveness 42.5% in the trend line is much larger than 1 standard error of vaccine effectiveness. Similarly, the same pair of vaccine strain and dominant circulating strain introduces a data point further from the trend line if d2 is used as the distance measure. We also notice that two strains could be antigenically identical as measured with HI assay but antigenically distinct as measured with pepitope. As shown in Table II, in the season 1982–1983, the H1N1 vaccine strain A/Brazil/11/78 and dominant circulating strain A/England/333/80 presented the antigenic distance measured with HI assay d1=0 and the sequence-based antigenic distance measure pepitope=0.083. The H3N2 vaccine strain and dominant circulating strain showed identical d1 and d2 values but distinct pepitope values in the seasons 1996–1997 and 2004–2005 (Gupta et al., 2006). Note that if pepitope is incorporated into the linear models shown in Figs 5 and 6, the R2 value is increased. We fit a linear model vaccine effectiveness=α + β1pepitope + β2d1 + β3d2 + ε in which ϵ is an error term. 
The fitted model is vaccine effectiveness = 0.54 − 2.179 pepitope + 0.068 d1 + 0.003 d2, with R2=0.72.

Fig. 5.

The correlation with R2=0.53 between vaccine effectiveness for ILI and d1, the antigenic distance defined by HI assay using ferret antisera. Data from Table II. The d1 values were averaged if multiple HI assay experimental data were found. The trend line quantifies vaccine effectiveness as a decreasing linear function of d1. Vaccine effectiveness=–0.085 d1+0.50.

Fig. 6.

The correlation with R2=0.46 between vaccine effectiveness for ILI and d2, the antigenic distance defined by HI assay using ferret antisera. Data from Table II. The d2 values were averaged if multiple HI assay experimental data were found. The trend line quantifies vaccine effectiveness as a decreasing linear function of d2. Vaccine effectiveness=–0.013 d2+0.51.

Discussion

Verification of the pepitope model

Originally the pepitope model was implemented for the H3N2 virus, where pepitope correlates with H3N2 vaccine effectiveness with a significantly larger R2 than do pall-epitope and psequence (Gupta et al., 2006; Sun et al., 2006). In the case of H1N1, the advantage of pepitope over pall-epitope and psequence is not as remarkable as for H3N2. We speculate that antibodies against the H3N2 virus may bind to a small fixed region on the surface of H3 HA while antibodies against the H1N1 virus may have multiple binding regions available. In other words, we speculate that the dominant epitope in H3 HA may contribute substantially to the escape of the H3N2 virus from host antibodies, while escape mutations may occur in the dominant epitope as well as perhaps the subdominant epitopes of H1 HA. Our speculation is based on the fact that the epitope region in H1N1 contains more amino acid positions than does that in H3N2 (Deem and Pan, 2009) and on the apparently less well-defined nature of the H1N1 epitopes.

Two recent epidemiological studies (Centers for Disease Control and Prevention (CDC), 2009a; Skowronski et al., 2010) present further support of the pepitope model. Before the emergence of the H1N1 pandemic flu in April 2009, the 2008–2009 flu season was dominated by subtype H1N1 seasonal flu. Both the dominant circulating strain and the vaccine strain in the 2008–2009 season were A/Brisbane/57/2007 (H1N1) (Centers for Disease Control and Prevention (CDC), 2009d). The observed vaccine effectiveness against seasonal flu was 44% (95% confidence interval, CI: 33–59%) (Skowronski et al., 2010). The pepitope model predicts the vaccine effectiveness as 53%, which falls into the 95% CI of the reported vaccine effectiveness.

After April 2009, a new peak of influenza activity emerged. The dominant circulating strain in this period was the pandemic H1N1 strain A/California/7/2009 (Centers for Disease Control and Prevention (CDC), 2009b,c). The reported effectiveness of the 2008–2009 seasonal flu vaccine against the H1N1 pandemic flu was −50 to 150% (Skowronski et al., 2010) and −10% (95% CI: −43 to 15%) (Centers for Disease Control and Prevention (CDC), 2009a). The value of pepitope between A/California/7/2009 and A/Brisbane/57/2007 is 0.77 with epitope B as the dominant epitope. The vaccine effectiveness forecast by the pepitope model is −39%, which agrees with the measured vaccine effectiveness values.
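As a worked check of the two predictions quoted in this section, the rounded trend line from the caption of Fig. 2, vaccine effectiveness = −1.19 pepitope + 0.53, can be evaluated at pepitope = 0 (vaccine and circulating strain identical, 2008–2009 seasonal flu) and at pepitope = 0.77 (A/Brisbane/57/2007 vaccine against A/California/7/2009):

```latex
\begin{align*}
\mathrm{VE}(p_{\text{epitope}} = 0)    &= -1.19 \times 0 + 0.53 = 0.53 \quad (53\%) \\
\mathrm{VE}(p_{\text{epitope}} = 0.77) &= -1.19 \times 0.77 + 0.53 = -0.9163 + 0.53 \approx -0.39 \quad (-39\%)
\end{align*}
```

A negative value means that the linear model is being extrapolated well beyond the range of pepitope covered by Table II, where the largest value is 0.318.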

Comparison of H3N2 and H1N1 vaccine effectiveness and evolution rates

The pepitope model has been previously applied to the prediction of H3N2 vaccine effectiveness (Gupta et al., 2006). The H3N2 vaccine effectiveness with pepitope=0 is 44.6%, and vaccine effectiveness is >0 for pepitope<0.184 (Gupta et al., 2006). Thus, as a function of pepitope, H1N1 vaccines tend to have higher effectiveness than H3N2 vaccines, as shown in Fig. 2 [compare with Fig. 2 of (Gupta et al., 2006)]. This observation suggests that the host immune system is more effective at recognizing and eliminating the H1N1 virus (pepitope=0), and that humoral cross-immunity is stronger for H1 HA (pepitope > 0). This observation also explains why an H3N2 epidemic is usually a more severe health threat than an H1N1 epidemic. We propose that, because H1N1 has a longer history of circulation in the human population, the human immune system may recognize H1N1 more effectively, and that under this stronger immune pressure the H1N1 virus may have reached a higher degree of adaptation to the human host. In the following discussion, we support this hypothesis with two facts. First, the H1N1 virus has a larger antigenic diversity than does the H3N2 virus. Second, the H1N1 virus shows a higher evolutionary rate on a per-dominant-season basis.

To compare the antigenic diversities of H1N1 and H3N2, we downloaded from the NCBI database on 13 August 2009 all the amino acid sequences of H3 HA collected in the 18 years with H3N2 dominant circulating strains (Gupta et al., 2006) and those of H1 HA collected in 7 years with H1N1 dominant circulating strains (Table II). Thus, 18 subsets of H3N2 sequences and 7 subsets of H1N1 sequences were formed. The centers of these subsets are the corresponding vaccine strains in the same season of the circulating virus. The radius of each subset is obtained by the calculation of pepitope. First, the strains with the top 5% pepitope antigenic distance measure to the center of each subset were selected, to focus on the extent of viral evolution. Second, the pepitope between these selected strains and the center were averaged in each year as the radius. Third, the radii were averaged over all the 18 years for H3N2 and over 7 years for H1N1. That is, the average radius of the top 5% was calculated in each year. As a result, the average H3N2 subset radius with the vaccine strains as the centers is 0.211. The average H1N1 radius is 0.520 with the vaccine strains as the centers. This difference between the H3N2 radius and the H1N1 radius is significant with the P-value 0.0118 using the Wilcoxon rank-sum test. Consequently, the H1N1 virus has a larger antigenic diversity in each season compared with the H3N2 virus, as shown in Fig. 7.
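The radius calculation described above can be sketched as follows. The sketch takes, for each season, a list of pepitope distances from every collected strain to that season's vaccine strain. How the top 5% is rounded for small subsets is not stated in the text; the rule used here (round up, at least one strain) is an assumption, and the function names and example data are hypothetical.

```python
import math

def subset_radius(distances, top_fraction=0.05):
    """Mean of the largest `top_fraction` of distances to the subset centre."""
    if not distances:
        raise ValueError("need at least one distance")
    # Assumed rounding rule: round up, and always keep at least one strain.
    k = max(1, math.ceil(top_fraction * len(distances)))
    top = sorted(distances, reverse=True)[:k]
    return sum(top) / k

def average_radius(distances_by_season):
    """Average the per-season radii over all seasons of one subtype."""
    radii = [subset_radius(d) for d in distances_by_season.values()]
    return sum(radii) / len(radii)

# Made-up example: 100 strains in one season with distances 0.000 ... 0.099.
example = [i / 1000 for i in range(100)]
radius = subset_radius(example)
print(round(radius, 3))  # mean of 0.095 ... 0.099

# Two made-up seasons, averaged as in the third step of the procedure.
mean_radius = average_radius({"season 1": example, "season 2": [0.2, 0.1]})
print(round(mean_radius, 4))
```

Using only the most distant strains in each season focuses the measure on the extent of viral evolution rather than on the bulk of strains that remain close to the vaccine strain.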

Fig. 7.

The comparison between H3N2 (triangle up) and H1N1 (triangle down) in regard to the antigenic diversity, the evolutionary rate between 1980 and 2000 (left), the evolutionary rate between 2000 and 2007 (right), and the mutation rate on a short-time scale without fixation. The antigenic diversity is measured with pepitope, the unit of evolutionary rate is 10⁻³ nucleotide substitutions/site/year, and the unit of mutation rate is 10⁻⁶ nucleotide substitutions/site/day.

We also compared the evolutionary rates of H1N1 and H3N2, because the evolutionary rate of a virus is an index of the selection pressure acting on it. The virus undergoes lower immune pressure in a non-dominant season and higher immune pressure in a dominant season. In both H1 and H3 HA, the region outside the epitopes has a significantly lower evolutionary rate than the epitopes do (Ferguson et al., 2003; Deem and Pan, 2009). This observation indicates that, without immune pressure, the spontaneous evolutionary rates of both H1N1 and H3N2 are low. Therefore, a higher evolutionary rate of one virus subtype in a dominant season comes from higher immune pressure rather than from neutral evolution, and we reject the alternative scenario in which a higher evolutionary rate causes a virus subtype to be dominant in a season. The evolutionary rate per dominant season is thus a natural measure of virus evolution. Between 1983 and 1997, H3N2 was dominant in 8 of 15 years, and between 1977 and 2000, H1N1 was dominant in 5 of 24 years (Ferguson et al., 2003). Between 1980 and 2000, the HA1 domain of H3 HA had a higher annual evolutionary rate, 3.7 × 10⁻³ nucleotide substitution/site/year, than the HA1 domain of H1 HA, which had an annual evolutionary rate of 1.8 × 10⁻³ nucleotide substitution/site/year (Ferguson et al., 2003). Measured on a per dominant season basis, however, the HA1 domain of H1 HA evolved faster, at 8.6 × 10⁻³ nucleotide substitution/site/dominant season, than that of H3 HA, at 6.9 × 10⁻³ nucleotide substitution/site/dominant season. The difference is significant with a P-value of 0.0008. Similarly, between 2000 and 2007, the HA1 domain of H1 HA evolved faster in its dominant season, at 10.2 × 10⁻³ nucleotide substitution/site/dominant season, than that of H3 HA, at 7.4 × 10⁻³ nucleotide substitution/site/dominant season. The difference is significant with a P-value of 0.0005 (Zaraket et al., 2009). Here we have divided the annual evolutionary rate by the proportion of dominant years for both H1 and H3 HA. Even on a short time scale without fixation, H1 HA shows a comparable or higher mutation rate, 9.1 × 10⁻⁶ nucleotide substitution/site/day, than H3 HA, 4.2 × 10⁻⁶ nucleotide substitution/site/day (P = 0.26) (Nobusawa and Sato, 2006), probably caused by adaptation to the higher immune pressure, at least for some strains. To make this last point, we have assumed that the mutation rate of the HA gene is the same as that of the NS gene. We assume that the same polymerase operates on both genes, so the mutation rates are expected to be the same. The comparisons of evolutionary rates and mutation rates between H3N2 and H1N1 are summarized in Fig. 7.
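The per dominant season normalization described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the function name `rate_per_dominant_season` is a hypothetical helper, and the inputs are the 1980 to 2000 annual rates and the dominant-year counts quoted in the text.

```python
def rate_per_dominant_season(annual_rate, dominant_years, total_years):
    """Divide an annual evolutionary rate by the proportion of dominant years.

    annual_rate    : nucleotide substitution/site/year
    dominant_years : number of years in which the subtype was dominant
    total_years    : length of the observation window in years
    """
    if not 0 < dominant_years <= total_years:
        raise ValueError("dominant_years must be in (0, total_years]")
    proportion_dominant = dominant_years / total_years
    return annual_rate / proportion_dominant


# Values quoted in the text (Ferguson et al., 2003).
h3_rate = rate_per_dominant_season(3.7e-3, dominant_years=8, total_years=15)
h1_rate = rate_per_dominant_season(1.8e-3, dominant_years=5, total_years=24)

print(f"H3 HA1: {h3_rate * 1e3:.1f} x 10^-3 substitution/site/dominant season")
print(f"H1 HA1: {h1_rate * 1e3:.1f} x 10^-3 substitution/site/dominant season")
```

With these inputs the sketch gives about 6.9 × 10⁻³ for H3 and 8.6 × 10⁻³ for H1, matching the per dominant season rates stated above. It does not reproduce the significance test, which requires the standard errors of the underlying rates.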

The pepitope model as a supplement to HI assay

For both H1N1 (this paper) and H3N2 (Gupta et al., 2006), the HI assay correlates less well with vaccine effectiveness than does pepitope. Collecting HI assay data to measure antigenic distance is also more time consuming and more expensive than applying the pepitope model. Many hundreds of strains circulate and are collected in an average flu season, so an HI table with tens of thousands of entries would need to be built to assess the antigenic distance between each pair of strains. With high-throughput sequencing technology generating HA sequence data, such antigenic distances are easily measured with the sequence-based antigenic distance measure pepitope, which correlates to a greater degree with vaccine effectiveness than do the HI data.
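To illustrate why a sequence-based measure is cheap to compute, the sketch below evaluates pepitope for a pair of aligned sequences, assuming the definition used in the pepitope literature (Gupta et al., 2006): the fraction of substituted amino acids in the dominant epitope, that is, the epitope with the largest such fraction. The sequences, the epitope names `E1` to `E3` and their site positions are toy values invented for illustration; they are not the H1 epitope definitions used in this paper. The strain count of 300 is likewise only an illustrative stand-in for "many hundreds".

```python
import math


def p_epitope(vaccine_seq, circulating_seq, epitope_sites):
    """Return (p_epitope, dominant_epitope_name) for two aligned sequences.

    epitope_sites maps an epitope name to a list of 0-based positions.
    """
    if len(vaccine_seq) != len(circulating_seq):
        raise ValueError("sequences must be aligned to the same length")
    best_name, best_fraction = None, 0.0
    for name, sites in epitope_sites.items():
        substitutions = sum(vaccine_seq[i] != circulating_seq[i] for i in sites)
        fraction = substitutions / len(sites)
        if best_name is None or fraction > best_fraction:
            best_name, best_fraction = name, fraction
    return best_fraction, best_name


# Toy data (hypothetical): 10-residue sequences and three made-up epitopes.
vaccine = "MKAILVVLLY"
circulating = "MKTILVALLF"
toy_epitopes = {"E1": [0, 1, 2, 3], "E2": [4, 5, 6], "E3": [7, 8]}

value, dominant = p_epitope(vaccine, circulating, toy_epitopes)
print(f"p_epitope = {value:.3f} (dominant epitope: {dominant})")

# Number of distinct strain pairs for an illustrative 300 collected strains.
n_pairs = math.comb(300, 2)
print(f"pairs among 300 strains: {n_pairs}")
```

In this toy case epitope `E2` has one substitution among three sites, so pepitope is 1/3. For 300 strains there are 44850 distinct pairs, which is the scale of table that pairwise HI measurements would have to fill, whereas the sequence-based distance for every pair follows directly from the sequence data.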

The pepitope model was developed to provide researchers and health authorities with a new tool to quantify antigenic distance and design the vaccine. We do not suggest that pepitope should substitute for the current HI assay, but rather that pepitope serve as an additional assessment when selecting vaccine strains. Using pepitope to supplement HI assay data may allow researchers and health authorities to quantify more precisely the antigenic distance between dominant circulating strains and candidate vaccine strains. The adoption of the pepitope theory may also allow researchers to minimize the cost and the number of ferret experiments and to correct HI assay data in some situations.

Funding

Krystina C. Subieta's work at Rice University was funded by the HHMI Summer Undergraduate Research Internship Program in Bionanotechnology. This project was also partially supported by DARPA grant HR 00110510057. Funding to pay the Open Access publication charges for this article was provided by DARPA.

Supplementary Material

Supplementary data

Acknowledgements

Keyao Pan's research was supported by the Gulf Coast Consortia Nanobiology Training Program (NBTP).

References

  1. Belongia E., Kieke B., Coleman L., et al. J. Am. Med. Assoc. 2008;299:2381–2384.
  2. Brown I.H., Harris P.A., McCauley J.W., Alexander D.J. J. Gen. Virol. 1998;79:2947–2955. doi: 10.1099/0022-1317-79-12-2947.
  3. Caton A.J., Brownlee G.G., Yewdell J.W., Gerhard W. Cell. 1982;31:417–427. doi: 10.1016/0092-8674(82)90135-0.
  4. Centers for Disease Control and Prevention (CDC) MMWR Morb. Mortal. Wkly. Rep. 2009a;58:1241–1245.
  5. Centers for Disease Control and Prevention (CDC) MMWR Morb. Mortal. Wkly. Rep. 2009b;58:1009–1012.
  6. Centers for Disease Control and Prevention (CDC) MMWR Morb. Mortal. Wkly. Rep. 2009c;58:1236–1241.
  7. Centers for Disease Control and Prevention (CDC) MMWR Morb. Mortal. Wkly. Rep. 2009d;58:369–374.
  8. Chakraverty P., Cunningham P., Shen G.Z., Pereira M.S. J. Hyg. Camb. 1986;97:347–358. doi: 10.1017/s0022172400065438.
  9. Couch R.B., Quarles J.M., Cate T.R., Zahradnik J.M. In: Options for the Control of Influenza, UCLA Symposia on Molecular and Cellular Biology, Vol. 36. Kendal A.P., Patriarca P.A., editors. New York: Alan R. Liss; 1986. pp. 223–241.
  10. Couch R.B., Keitel W.A., Cate T.R., Quarles J.A., Taber L.A., Glezen W.P. In: Options for the Control of Influenza III. Brown L.E., Hampson A.W., Webster R.G., editors. Amsterdam: Elsevier Science B.V.; 1996. pp. 97–106.
  11. Cox N., Balish A., Brammer L., et al. Information for the Vaccines and Related Biological Products Advisory Committee, CBER, FDA. WHO Collaborating Center for Surveillance, Epidemiology and Control of Influenza; 2003. Atlanta, USA.
  12. Cox N., Balish A., Berman L., et al. Information for the Vaccines and Related Biological Products Advisory Committee, CBER, FDA. WHO Collaborating Center for Surveillance, Epidemiology and Control of Influenza; 2007. Atlanta, USA.
  13. Daniels R.S., Douglas A.R., Skehel J.J., Wiley D.C. Bull. World Health Organ. 1985;63:273–277.
  14. Deem M.W., Lee H.Y. Phys. Rev. Lett. 2003;91:068101. doi: 10.1103/PhysRevLett.91.068101.
  15. Deem M.W., Pan K. Protein Eng., Des. Sel. 2009;22:543–546. doi: 10.1093/protein/gzp027.
  16. Donatelli I., Campitelli L., Ruggieri A., Castrucci M.R., Calzoletti L., Oxford J.S. Eur. J. Epidemiol. 1993;9:241–250. doi: 10.1007/BF00146258.
  17. Edwards K.M., Dupont W.D., Westrich M.K., Plummer W.D., Palmer P.S., Wright P.F. J. Infect. Dis. 1994;169:68–76. doi: 10.1093/infdis/169.1.68.
  18. Ferguson N.M., Galvani A.P., Bush R.M. Nature. 2003;422:428–433. doi: 10.1038/nature01509. Note the standard error of the evolution rate is misprinted, and we use the corrected value of 10⁻⁴.
  19. Grotto I., Mandel Y., Green M.S., Varsano N., Gdalevich M., Ashkenazi I. Clin. Infect. Dis. 1998;26:913–917. doi: 10.1086/513934.
  20. Gupta V., Earl D.J., Deem M.W. Vaccine. 2006;24:3881–3888. doi: 10.1016/j.vaccine.2006.01.010.
  21. Hay A.J., Gregory V., Douglas A.R., Lin Y.P. Phil. Trans. R. Soc. Lond. B. 2001;356:1861–1870. doi: 10.1098/rstb.2001.0999.
  22. Keitel W.A., Cate T.R., Couch R.B. Am. J. Epidemiol. 1988;127:353–364. doi: 10.1093/oxfordjournals.aje.a114809.
  23. Keitel W.A., Cate T.R., Couch R.B., Huggins L.L., Hess K.R. Vaccine. 1997;15:1114–1122. doi: 10.1016/s0264-410x(97)00003-0.
  24. Kendal A.P., Cox N.J., Harmon M.W. In: Applied Virology Research: Virus Variability, Epidemiology and Control. Kurstak E., Marusyk R.G., Murphy F.A., van Regenmortel M.H.V., editors. Springer; 1990. pp. 119–130.
  25. Lee M.S., Chen J.S. Emerg. Infect. Dis. 2004;10:1385–1390. doi: 10.3201/eid1008.040107.
  26. Macken C., Lu H., Goodman J., Boykin L. In: Options for the Control of Influenza IV. Osterhaus A.D.M.E., Cox N., Hampson A.W., editors. Elsevier; 2001. accession number ISDN38157 http://www.flu.lanl.gov/
  27. Nobusawa E., Sato K. J. Virol. 2006;80:3675–3678. doi: 10.1128/JVI.80.7.3675-3678.2006.
  28. Pan K., Deem M.W. Vaccine. 2009;27:5033–5034. doi: 10.1016/j.vaccine.2009.05.608.
  29. Rambaut A., Pybus O.G., Nelson M.I., Viboud C., Taubenberger J.K., Holmes E.C. Nature. 2008;453:615–619. doi: 10.1038/nature06945.
  30. Rimmelzwaan G.F., de Jong J.C., Bestebroer T.M., van Loon A.M., Claas E.C.J., Fouchier R.A.M., Osterhaus A.D.M. Virology. 2001;282:301–306. doi: 10.1006/viro.2000.0810.
  31. Skowronski D.M., De Serres G., Crowcroft N., Janjua N., Boulianne N., Hottes T.S., Rosella L.C. Int. J. Infect. Dis. 2010;S114:e321–e322.
  32. Smith D.J., Forrest S., Ackley D.H., Perelson A.S. Proc. Natl Acad. Sci. USA. 1999;96:14001–14006. doi: 10.1073/pnas.96.24.14001.
  33. Smith D.J., Lapedes A.S., de Jong J.C., Bestebroer T.M., Rimmelzwaan G.F., Osterhaus A.D.M.E., Fouchier R.A.M. Science. 2004;305:371–376. doi: 10.1126/science.1097211.
  34. Sun J., Earl D.J., Deem M.W. Mod. Phys. Lett. B. 2006;20:63–95.
  35. Treanor J.J., Kotloff K., Betts R.F., Belshe R., Newman F., Iacuzio D., Wittes J., Bryant M. Vaccine. 1999;18:899–906. doi: 10.1016/s0264-410x(99)00334-5.
  36. Wang Z., Tobler S., Roayaei J., Eick A. J. Am. Med. Assoc. 2009;301:945–953. doi: 10.1001/jama.2009.265.
  37. WHO. Wkly. Epidemiol. Rec. 1984;59:53–60.
  38. WHO. Wkly. Epidemiol. Rec. 1986;61:237–244.
  39. WHO. Wkly. Epidemiol. Rec. 1992;67:57–64.
  40. WHO Collaborating Center for Surveillance, Epidemiology and Control of Influenza. Preliminary Information for the Vaccines and Related Biological Products Advisory Committee. CBER, FDA; 2008.
  41. Wiley D.C., Wilson I.A., Skehel J.J. Nature. 1981;289:373–378. doi: 10.1038/289373a0.
  42. Zaraket H., Saito R., Sato I., Suzuki Y., Li D.J., Dapat C., Caperig-Dapat I., Oguma T., Sasaki A., Suzuki H. Arch. Virol. 2009;154:285–295. doi: 10.1007/s00705-009-0309-9.

Articles from Protein Engineering, Design and Selection are provided here courtesy of Oxford University Press
