2023 Aug 22;3(8):100566. doi: 10.1016/j.crmeth.2023.100566
 Predict for $\mathrm{mAb}_i$: ranked list of candidate epitope positions on the HA antigen.
 Input:
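 • $P(\mathrm{mAb}_i) = P(\mathrm{mAb}_{i1}, \mathrm{mAb}_{i2}, \ldots, \mathrm{mAb}_{iN})$: the binding profile of mAb $i$ across a set of $N$ HA antigens ($j = 1, \ldots, N$) from a given group.
 • $P(\mathrm{mAb}_{\mathrm{set}})$: binding profiles for a set of mAbs with a known binding region (head or stalk) across the set of $N$ HA strains.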
 • A: multiple sequence alignment of the set of N HA antigen sequences.
 • $t$, $k$, $Q$: given parameters, where $t$ is the binding threshold, $k$ is the number of neighbors used in KNN classification, and $Q$ is the seed epitope patch length.
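 (1) Classify the binding region (head vs. stalk) of $\mathrm{mAb}_i$ using KNN classification over $P(\mathrm{mAb}_{\mathrm{set}})$.
 (2) Binarize $P(\mathrm{mAb}_i)$ using the given binding threshold $t$:
 $$P_{\mathrm{binarized}}(\mathrm{mAb}_{ij}) = \begin{cases} 1 & \text{if } P(\mathrm{mAb}_{ij}) > t \\ 0 & \text{otherwise.} \end{cases}$$
 (3) Define the set of binding HA strains $\mathrm{HA}_{\mathrm{bound}} = \{\mathrm{HA}_j \mid P_{\mathrm{binarized}}(\mathrm{mAb}_{ij}) = 1\}$ and the set of non-binding strains $\mathrm{HA}_{\mathrm{unbound}} = \{\mathrm{HA}_j \mid P_{\mathrm{binarized}}(\mathrm{mAb}_{ij}) = 0\}$.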
 (4) Compute a position-specific score, $S(a)$, for every position $a$ of $A$ within the predicted binding region using the following formula:
 $$S(a) = 1 - \frac{S_w(a)}{S_b(a)},$$
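 where
 $$S_w(a) = \frac{1}{N_w} \sum_{m,n \in \mathrm{HA}_{\mathrm{bound}}} D_{\mathrm{BLOSUM62}}(a_n, a_m) \times \frac{P(\mathrm{mAb}_{in}) + P(\mathrm{mAb}_{im})}{2},$$
 and
 $$S_b(a) = \frac{1}{N_b} \sum_{n \in \mathrm{HA}_{\mathrm{bound}},\, m \in \mathrm{HA}_{\mathrm{unbound}}} D_{\mathrm{BLOSUM62}}(a_n, a_m) \times \frac{P(\mathrm{mAb}_{in}) + P(\mathrm{mAb}_{im})}{2}.$$
 $N_w$: the number of pairs of sequences of binding strains in $\mathrm{HA}_{\mathrm{bound}}$.
 $N_b$: the number of binding sequences in $\mathrm{HA}_{\mathrm{bound}}$ times the number of non-binding sequences in $\mathrm{HA}_{\mathrm{unbound}}$.
 $D_{\mathrm{BLOSUM62}}(a_X, a_Y)$: a modified BLOSUM62 distance measure (see STAR Methods) between the amino acids at position $a$ in sequence $X$ and sequence $Y$ of $A$.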
 (5) Rank all positions in the HA binding region by $S(a)$ and select the top $Q$ positions as the seed epitope patch, SEP.
 (6) Rank the remaining (non-SEP) positions in the HA binding region by their geometric mean distance to the SEP center to obtain the final ranked list of all positions in the HA binding region.
 Output: ranked list of candidate epitope positions within the HA binding region.
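
 Steps (1) to (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names (`knn_region`, `position_scores`, `seed_patch`) are hypothetical; KNN uses Euclidean distance with a majority vote; the score is read as one minus the ratio of the within-group term to the between-group term; positions are ranked in descending score; and the pair sum for the within-group term runs over unordered pairs of distinct strains. The `mismatch` function is a 0/1 placeholder, not the modified BLOSUM62 distance described in STAR Methods, which would be passed in as `dist`. Step (6) is omitted because it needs positional information that is not among the listed inputs.

```python
import numpy as np
from itertools import combinations, product


def knn_region(profile, known_profiles, known_regions, k):
    """Step 1 (assumed Euclidean metric): majority vote among the k nearest profiles."""
    dists = np.linalg.norm(np.asarray(known_profiles) - np.asarray(profile), axis=1)
    nearest = np.argsort(dists)[:k]
    votes = [known_regions[i] for i in nearest]
    return max(set(votes), key=votes.count)  # ties are broken arbitrarily


def mismatch(x, y):
    """Placeholder distance (assumption): 0 for identical residues, 1 otherwise."""
    return 0.0 if x == y else 1.0


def weighted_mean_distance(column, profile, pairs, dist):
    """Mean over strain pairs (n, m) of dist(a_n, a_m) * (P_n + P_m) / 2."""
    pairs = list(pairs)
    if not pairs:
        return float("nan")
    total = sum(dist(column[n], column[m]) * (profile[n] + profile[m]) / 2
                for n, m in pairs)
    return total / len(pairs)


def position_scores(alignment, profile, t, positions, dist=mismatch):
    """Steps 2 to 4: binarize at t, split strains, score each alignment column."""
    profile = np.asarray(profile, dtype=float)
    bound = [j for j, p in enumerate(profile) if p > t]
    unbound = [j for j, p in enumerate(profile) if p <= t]
    scores = {}
    for a in positions:
        column = [seq[a] for seq in alignment]
        s_w = weighted_mean_distance(column, profile, combinations(bound, 2), dist)
        s_b = weighted_mean_distance(column, profile, product(bound, unbound), dist)
        # Assumed reading of the score; undefined (NaN) when S_b is 0 or NaN.
        scores[a] = 1.0 - s_w / s_b if s_b > 0 else float("nan")
    return scores


def seed_patch(scores, q):
    """Step 5 (assumed descending order): top-q positions with a defined score."""
    ranked = sorted((a for a in scores if not np.isnan(scores[a])),
                    key=lambda a: scores[a], reverse=True)
    return ranked[:q]


# Toy data (invented for illustration): 4 strains, 3 aligned positions.
alignment = ["AKT", "AKS", "ARG", "ARS"]
profile = [0.9, 0.8, 0.1, 0.2]
scores = position_scores(alignment, profile, t=0.5, positions=range(3))
print(seed_patch(scores, q=1))
```

 In the toy alignment, the second column separates the two bound strains (K) from the two unbound strains (R), so it is the position returned as the seed patch. Fully conserved columns receive an undefined score under this reading and are excluded from the ranking; how such positions should be handled in practice is not specified in the box above.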