Mapping antibody footprints using binding profiles

2023 Aug 22;3(8):100566. doi: 10.1016/j.crmeth.2023.100566

Predict for $\mathrm{mAb}_{i}$: ranked list of candidate epitope positions on the HA antigen.
Input:
• $P(\mathrm{mAb}_{i}) = P(\mathrm{mAb}_{i}^{1}, \mathrm{mAb}_{i}^{2}, \dots, \mathrm{mAb}_{i}^{N})$: binding profile of mAb $i$ across a set of $j = 1 \dots N$ HA antigens from a given group.
• $P(\mathrm{mAb}_{\mathrm{set}})$: binding profiles for a set of mAbs with known binding region (head or stalk) across the set of $N$ HA strains.
• $A$: multiple sequence alignment of the set of $N$ HA antigen sequences.
• $t, k, Q$: given parameters, where $t$ is the binding threshold, $k$ is the parameter for KNN classification, and $Q$ is the seed epitope patch length.
(1) Classify the binding region (head vs. stalk) of $\mathrm{mAb}_{i}$ using KNN classification over $P(\mathrm{mAb}_{\mathrm{set}})$.
(2) Binarize $P(\mathrm{mAb}_{i})$ using a given binding threshold $t$:

$$P_{\mathrm{binarized}}(\mathrm{mAb}_{i}^{j}) = \begin{cases} 1 & \text{if } P(\mathrm{mAb}_{i}^{j}) > t \\ 0 & \text{otherwise.} \end{cases}$$
(3) Define the set of binding HA strains $\mathrm{HA}_{\mathrm{bound}} = \{\mathrm{HA}^{j} \mid P_{\mathrm{binarized}}(\mathrm{mAb}_{i}^{j}) = 1\}$ and the set of non-binding strains $\mathrm{HA}_{\mathrm{unbound}} = \{\mathrm{HA}^{j} \mid P_{\mathrm{binarized}}(\mathrm{mAb}_{i}^{j}) = 0\}$.
(4) Compute the position-specific score $S(a)$ over $A$ for every position $a$ within the predicted binding region, using the following formula:

$$S(a) = 1 - \frac{S_{w}(a)}{S_{b}(a)},$$

where

$$S_{w}(a) = \frac{1}{N_{w}} \sum_{m, n \in \mathrm{HA}_{\mathrm{bound}}} D_{\mathrm{BLOSUM62}}(a^{n}, a^{m}) \times \frac{P(\mathrm{mAb}_{i}^{n}) + P(\mathrm{mAb}_{i}^{m})}{2}$$

and

$$S_{b}(a) = \frac{1}{N_{b}} \sum_{n \in \mathrm{HA}_{\mathrm{bound}},\, m \in \mathrm{HA}_{\mathrm{unbound}}} D_{\mathrm{BLOSUM62}}(a^{n}, a^{m}) \times \frac{P(\mathrm{mAb}_{i}^{n}) + P(\mathrm{mAb}_{i}^{m})}{2}.$$

$N_{w}$: the number of pairs of sequences of binding strains in $\mathrm{HA}_{\mathrm{bound}}$.
$N_{b}$: the number of binding sequences in $\mathrm{HA}_{\mathrm{bound}}$ times the number of non-binding sequences in $\mathrm{HA}_{\mathrm{unbound}}$.
$D_{\mathrm{BLOSUM62}}(a^{X}, a^{Y})$: a modified BLOSUM62 distance measure (see STAR Methods) between the amino acids at position $a$ of sequence $X$ and sequence $Y$ in $A$.
(5) Rank all positions on the HA region and select the top $Q$ positions as the seed epitope patch, $SEP$.
(6) Rank all remaining non-$SEP$ positions on the HA binding region by their geometric mean distance to the $SEP$ center to obtain the final ranked list of all HA binding region positions.
Output: ranked list of candidate epitope positions within the HA binding region.
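Steps (1) through (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names (`classify_region`, `position_score`, `seed_epitope_patch`) are hypothetical, the KNN step uses squared Euclidean distance between profiles, within-group pairs are counted as unordered pairs, positions with an undefined score are ranked last, and a simple 0/1 residue mismatch stands in for the modified BLOSUM62 distance, which is defined in STAR Methods and not reproduced here. Step (6) is omitted because it needs structural coordinates.

```python
from collections import Counter
from itertools import combinations, product


def mismatch_distance(x, y):
    # Stand-in for the modified BLOSUM62 distance (an assumption):
    # 0 for identical residues, 1 otherwise.
    return 0.0 if x == y else 1.0


def classify_region(profile, labeled_profiles, k):
    # Step 1: majority vote among the k nearest labeled profiles.
    # Squared Euclidean distance between profiles is an assumption.
    def sq_dist(p, q):
        return sum((u - v) ** 2 for u, v in zip(p, q))

    nearest = sorted(labeled_profiles, key=lambda item: sq_dist(profile, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def position_score(column, profile, t, dist=mismatch_distance):
    # Steps 2-3: binarize the profile at threshold t and split the strains.
    bound = [j for j, p in enumerate(profile) if p > t]
    unbound = [j for j, p in enumerate(profile) if p <= t]

    def weighted(n, m):
        # Residue distance weighted by the mean binding of the two strains.
        return dist(column[n], column[m]) * (profile[n] + profile[m]) / 2

    within = list(combinations(bound, 2))    # N_w pairs (unordered: an assumption)
    between = list(product(bound, unbound))  # N_b = |bound| * |unbound| pairs
    if not within or not between:
        return float("-inf")  # score undefined; ranked last (an assumption)
    s_w = sum(weighted(n, m) for n, m in within) / len(within)
    s_b = sum(weighted(n, m) for n, m in between) / len(between)
    if s_b == 0:
        return float("-inf")  # avoid division by zero (an assumption)
    return 1 - s_w / s_b


def seed_epitope_patch(alignment, profile, region_positions, t, q):
    # Steps 4-5: score every position in the region, highest score first,
    # and keep the top q positions as the seed epitope patch.
    scores = {
        a: position_score([seq[a] for seq in alignment], profile, t)
        for a in region_positions
    }
    ranked = sorted(region_positions, key=lambda a: scores[a], reverse=True)
    return ranked[:q], scores


# Toy data: four aligned sequences of length 3 and one binding profile.
alignment = ["AKT", "AKS", "ART", "ARS"]
profile = [0.9, 0.8, 0.1, 0.2]
sep, scores = seed_epitope_patch(alignment, profile, [0, 1, 2], t=0.5, q=1)
print(sep, scores)
```

In this toy input, column 1 is the only position where the residue is shared among binding strains and differs in every non-binding strain, so it receives the highest score and forms the one-position seed patch. Swapping `mismatch_distance` for a real substitution-matrix distance changes only the `dist` argument.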