Predict a ranked list of candidate epitope positions on the HA antigen for a query mAb.

Input:
• The binding profile of the query mAb across a set of HA antigens from a given group.
• Binding profiles for a set of mAbs with known binding region (head or stalk) across the set of HA strains.
• A multiple sequence alignment (MSA) of the set of HA antigen sequences.
• Given parameters: a binding threshold, a parameter for KNN classification, and the seed epitope patch length.

(1) Classify the binding region (head vs. stalk) of the query mAb using KNN classification over the binding profiles of the mAbs with known binding region.
(2) Binarize the query binding profile using the given binding threshold.
(3) Define the set of binding HA strains and the set of non-binding HA strains.
(4) Compute a position-specific score over the MSA for all positions within the predicted binding region. The score is defined in terms of:
  • the number of pairs of sequences of binding strains in the MSA;
  • the number of binding sequences in the MSA times the number of non-binding sequences in the MSA;
  • a modified BLOSUM62 distance measure (see STAR Methods) between the amino acids at a given position in two sequences of the MSA.
(5) Rank all positions in the predicted HA binding region by score and select the top positions, up to the seed epitope patch length, as the seed epitope patch.
(6) Rank all remaining non-seed positions in the HA binding region by the geometric mean distance to the center to obtain the final ranked list of all HA binding region positions.

Output: ranked list of candidate epitope positions within the HA binding region.
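
The six steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The exact scoring formula is not reproduced here, so the sketch assumes the score is the mean binder/non-binder distance minus the mean binder/binder distance at each position. It also substitutes a simple 0/1 mismatch for the modified BLOSUM62 distance, treats a strain as binding when its value is at or above the threshold, and takes the geometric mean of Euclidean distances from a position to the seed positions using per-position 3D coordinates. All function names, argument names, and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations, product


def knn_region(query, ref_profiles, ref_labels, k):
    """Step 1: majority vote among the k reference mAbs nearest to the query profile."""
    order = sorted(range(len(ref_profiles)),
                   key=lambda i: math.dist(query, ref_profiles[i]))
    return Counter(ref_labels[i] for i in order[:k]).most_common(1)[0][0]


def mismatch(a, b):
    """Placeholder distance (assumption): NOT the modified BLOSUM62 measure."""
    return 0.0 if a == b else 1.0


def position_score(column, binders, non_binders, dist=mismatch):
    """Step 4 (assumed form): mean between-group minus mean within-binder distance."""
    within = [dist(column[i], column[j]) for i, j in combinations(binders, 2)]
    between = [dist(column[i], column[j]) for i, j in product(binders, non_binders)]
    mean_within = sum(within) / len(within) if within else 0.0
    mean_between = sum(between) / len(between) if between else 0.0
    return mean_between - mean_within


def rank_positions(profile, msa, region_positions, coords, threshold, patch_len):
    """Steps 2-6: binarize, split strains, score, pick the seed patch, rank the rest."""
    # Steps 2-3: binarize the profile and split strain indices (>= is an assumption).
    binders = [i for i, v in enumerate(profile) if v >= threshold]
    non_binders = [i for i, v in enumerate(profile) if v < threshold]
    # Step 4: score every alignment column inside the predicted binding region.
    scores = {p: position_score([seq[p] for seq in msa], binders, non_binders)
              for p in region_positions}
    # Step 5: the highest-scoring positions form the seed epitope patch.
    by_score = sorted(region_positions, key=lambda p: -scores[p])
    seed, rest = by_score[:patch_len], by_score[patch_len:]

    # Step 6: geometric mean of distances to the seed positions
    # (assumes every non-seed position has coordinates distinct from the seed).
    def geo_mean_dist(p):
        logs = [math.log(math.dist(coords[p], coords[s])) for s in seed]
        return math.exp(sum(logs) / len(logs))

    return seed + sorted(rest, key=geo_mean_dist)


# Toy data (hypothetical): 4 strains, 3 aligned positions.
msa = ["AKT", "AKS", "ARS", "ARS"]
profile = [0.9, 0.8, 0.1, 0.2]
refs = [[1.0, 0.9, 0.0, 0.1], [0.0, 0.1, 1.0, 0.9], [0.9, 1.0, 0.2, 0.0]]
labels = ["head", "stalk", "head"]
coords = {0: (9.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0)}

region = knn_region(profile, refs, labels, k=1)
ranking = rank_positions(profile, msa, [0, 1, 2], coords, threshold=0.5, patch_len=1)
print(region, ranking)
```

In the toy data, position 1 separates binders (K) from non-binders (R) perfectly and becomes the seed. Position 2 then outranks position 0 because it lies closer to the seed, even though its score is lower. In practice `region_positions` would be restricted to the alignment columns of the predicted head or stalk region.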