Computational and Structural Biotechnology Journal. 2025 Jun 27;27:2915–2923. doi: 10.1016/j.csbj.2025.06.026

AI-based antibody design targeting recent H5N1 avian influenza strains

Nicholas Santolla a, Colby T Ford a,b,c,d
PMCID: PMC12270607  PMID: 40677246

Abstract

In 2025 alone, H5N1 avian influenza has been responsible for thousands of infections across various animal species, including avian and mammalian livestock such as chickens and cows, and it poses a threat to human health due to avian-to-mammalian transmission. There have been 70 human cases of H5N1 influenza in the United States since April 2024 and, as shown in recent studies, our current antibody defenses are waning. Thus, it is imperative to discover new therapeutics in the fight against more recent strains of the virus.

In this study, we present the Frankies framework for automated antibody diffusion and assessment. This pipeline was used to automate the generation of 30 novel anti-HA1 Fv antibody fragment sequences, fold them into 3-dimensional structures, and then dock against a recent H5N1 HA1 antigen structure for binding evaluation. Here we show the utility of artificial intelligence in the discovery of novel antibodies against specific H5N1 strains of interest, which bind similarly to known therapeutic and elicited antibodies.

Keywords: Avian influenza, Antibodies, Protein modeling, Docking, AI

1. Introduction

Highly pathogenic avian influenza A (H5N1) remains a persistent global health threat due to its zoonotic potential and historically high mortality rate in humans. Despite sporadic outbreaks and ongoing disease surveillance by the World Health Organization, United States Department of Agriculture, and US Centers for Disease Control, the continual antigenic evolution of the hemagglutinin (HA) glycoprotein, particularly within the HA1 subunit, presents significant challenges to the development of broadly neutralizing therapeutics. The HA1 domain is critical for receptor binding and immune recognition, making it a prime target for antibody-based intervention [1].

To address the urgent need for rapid, scalable design of high-affinity antibodies against emerging H5N1 variants, we introduce Frankies, a computational pipeline that integrates state-of-the-art generative artificial intelligence (AI) modeling, protein structure prediction, and docking simulations. This pipeline begins by generating novel Fv antibody fragment sequences via conditional protein diffusion using EvoDiff, leveraging a curated set of reference sequences from the Therapeutic Structural Antibody Database (Thera-SAbDab). The resulting sequences are folded into 3-dimensional structures using structure prediction models such as AlphaFold3 or ESM3, which have demonstrated near-experimental accuracy in modeling antibody-antigen interfaces [2], [3]. Finally, we employ HADDOCK3 to perform rigid and flexible docking of the modeled Fv regions to the HA1 target, enabling the assessment of binding orientation, stability, and epitope accessibility.

By systematically combining data-guided AI generation with physics-based modeling, Frankies represents a promising step toward automated antibody discovery and optimization. In this study, we apply the Frankies pipeline to a recent H5N1 HA1 antigen and characterize a set of de novo antibody candidates with favorable structural and biophysical properties. This work lays the foundation for future in vitro empirical validation and paves the way for accelerated response to future influenza outbreaks.

2. Results

2.1. Pipeline performance

The protein generation pipeline successfully produced 30 novel anti-HA1 Fv antibody fragment sequences, with a 100% success rate in AbNumber sequence validation. In our systematic tests, the reliability of Fv antibody sequence generation was consistently greater than 90%. The complete set of 30 experimental runs took 11 hours and 34 minutes to complete in serial, averaging 21.13 minutes per run. Because each run is independent, the runs could be distributed across compute resources, with each pipeline run taking ∼20 minutes of walltime. Individual component processing times were approximately 5 minutes for EvoDiff operations, less than 1 minute for structure prediction with ESM3 (or around 20 minutes with AlphaFold3), 16 minutes for HADDOCK3 computations, and less than 1 minute for generation of the output report, which is shown in Fig. 1.

Fig. 1. Frankies output dashboard report. Generated with Quarto.

2.2. Generated Fv sequences

The 30 diffused Fv heavy and light chain sequences are listed in Table 1. Note that all diffused sequences are a defined length, with heavy and light chains having 132 and 125 amino acids, respectively. This is controlled by the diffusion step, which defines a maximum sequence length to be conditionally generated. The top 10 candidates, by lowest HADDOCK score, are shown in Fig. 2. Note the epitope diversity given the random surface residue detection for unbiased binding site selection.

Table 1.

The 30 Fv heavy and light chain sequences diffused and evaluated in this study.

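Columns: Sequence ID, Heavy Chain (132 amino acids, wrapped into two space-separated segments), Light Chain (125 amino acids, wrapped into two space-separated segments)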
kinetic-template EVQLARSGAEPTMPGETVKLSCKTSGYNASDTSFYIGAARQTGGKGLEWMGHISPTNGNPIYYSEK IQARLTLTADTTTETTYIQLLAFKSEDSAMFYAARHRTGHYYGYGSYWPLNGGDIWGGGTLVTVSA ESVLTQSPGSLIISVGERATISCKASQDLVNDTGHSFPHWRYLGKPGTAPKLLGYGASNRAS GAPGRFNGSGSGTDFSLTISRTASELKPKDVATYYCQQYNATPPTYRSQTYGGGTRAEIKSQP
partial-lagoon QVGLVQSGPEVKTPGESVKISCTAGGYSFSSGLYWIDWVRERHGQGLEWMGMIHPSTSENTKYNPS FQSRVTISVNNSTNTAKEELSSLKAEDTATYLAARSAAVDVYTRGAYGKADRFEAWGQGALATVSS ELVLTQSPLSTSVSPGERATLSCRASQSLVYGNGYNNLAAVTRQMPAEATRLLISGGSTRAT GVGSRLSGSGSGTDYTLTINSQTKQLQSEDFAAYYAMRYNNWPTRLIKQTFGGGTKLEIKDFP
glowing-avocet QARLVESGAEVTKTGQSTKVSAKMSANTYSSADYTVSAVHQYPGKALFWIGHIYYPNGYYRDNAGS FQGRVTISTDASKSTTYLTLSSLKSEATAIYRCQYRRRYYKVGVSSYVPLDWYDNLGQGTLVTVSS DVVMTQSPESLAVNPGERISISCKGSQSLIASDGQNYVHWRYKAKPGQSPKILAYSASNRAT GVPARISGSGPGTGFTLAISSISNAIQAEGLTTNRCQRELETAIVRSFRPFAGGTVLEIKALK
approximate-entrepreneur EVQLLQSGAGAKKPGATVTISCVVSGYSYSSYYSGIDKVRQRPGHGKDCVGGIYPRSGYYTHYTEK FQGRVTYPSGKQTNNAYLQLNSVTTEDTAVYYCARPGLESFYGVGVNWSHNNSMSIGQGTLVTVSE GIRITQSRASLSVSAGEPASIQARVSQSLIKGDISNYLEYYYQRKPGQPPKLFIYSASTRAT GVPARLPGSGSGTDFTLTISSRDGFVGSEDFGIYYCQRVENTPPQLGQPTFDQGTKLEIKGYK
recursive-basin EVQLVEIGPEDKKPGTTLKLSCVPSGVSFHSTNYAVSGVKQSPGQGPEAMGNTYPSSGCDTDYCQK FQVRATITTRRSTSTVYLELNSLKPDDPAVYYCARVSHNTEKTPGSYFHPGDEELWGQGTLVTALS DIVLTDSPSGLSASTGERPTISKRVSNSLSDFDRSTTLAWFFQARPGRSEKALIRAASSRAS GVPVRFSGSGSMTSFAFTISSAAAALEAEHPATYYCQQSRDDPAARASAAFGGGTRVEIAPLA
free-hearth EVQEVDEGGEVVKPGKSARISCKNSGYGFSSQSYAVSWIREAEGKGLAAMTVISPTGAAGIHRNEK VQGRVTISKDHSSNTVYLEMNSLKSEGTAIYAAARNGQYNFVVVGSYWGYGLFDSWGEGALCTVSS ETVLTQSPFTLSVSPGEPITVSCRSPQNLVDSNVYNNINWEYQKKPGQAPPRLIYVNSNGAS GAPDELKGSGSGTDFVLTIRRCPKEIKDEEVVVYYCQKSIVSPVEWVVATFGQNTRVEIKKKV
smoky-latitude EVRLVQSGREVVKVGESLKISCKASEYNFSSTNYYTGWVRKPPGQVRMWISYIYHTTTEGTNYSAV VKPYAGVCYGKSINTVTLHMNSVKASDTAVYYCANTIRYDVYSVSPTWDHDWVDSWGQGTLVTVPQ DSQLTQSQSSWSLSVGDFVHGTCKTSQSLSQPDVYSYTHWSGRRHPAQTPKLLIYLASTRGS GVGTHISGSGSGTDYTLTISSFGSTTRAGGFATRFCQQSRADTTRAGRQTDGSGTKVGTKGSR
internal-rundown QAGLVQSGAELKKTGSSLRVSCKSSWYTYSSSYYAIHIIREAPSKGLEWVSRINSRHGYYTTYAPS IQGRVTFSTDKSTSTIYMPLSSLSSEDTAVYFCAPHASEHYVGSGTSWLHDWAESSGQGTTVTASN DIVITQSPLTCSVSLGETTSVSRRTSQNLIDNNKYHYFAWLYQQKPGQAPTLLIYAGSYRAS AVSDRPSRSGSGIDYTLTISSKSTIVECDGVAVRYCQRSNSTPTRLAALDFAAGTKPEIKNPS
critical-tin QVFLVQSGAELTNPGASVKVSCKTSGYSFSTTSYGMSWIRRAGGQGLEAIGWISHRSGYRTNASPK FQGRVTINTGASTSTVYTQLRSLKPEDTTVYYSARDGAHSFVAAGSAWGLDWGGYAGEGTIVTVSA DIVITTSPMTRSVSVGEAASISCCRSQSCIDGNGYNYMNWIYRQKSGQAPRELIYGASKVAT GFPARFSASGSGTDFTLTISYTAVNVEPGGVGSYYCQRARSTPSKRAYQTFGAGTRVEIKLAN
gilded-stud VAQTLQSGPELMQPGASVKISCTDSGYTISSTSYAFSWARGSGGKSLEWVGWIHWHTGVGTQYADS FQGRATSDRDKSKNTASAQFNSARSEHSGVAYGASDRTTTTYGLGRPVVVGWADSWNQGTLATASS DLVLTQSPASVSVTPGTSASISCQSSQSQVDSNDYNYANRAYQQMPGQAPTLMIRSASYRPS GVPSRISSSGSGTSASLTISRVNANLQEENEANYRCQTSSVSGNRICSPVFNSGTKLEIKGPP
concave-glove EVQLQESGAGLVKTSESGSISCTTSGGGGPSGGYWMSWGRQGPGGGLEWSGRIYGVSGDGTNGRGS LKERGTLSPDTSTNTASLGMSSVTASDTALYYGARGAMGGPVGGGSYGGLNGGDGQGQGTLVTVSS DYVLTGSPASASVSPGESPTISCRATQTFVDGDGTKYVAWAAQAKPGQAPKLLISLDSNRPT GVPSRFSGSGSGTDYSLTITGAKNTLQNEDVADYYCQQVRSSPPARGSPSYAALTKLDAKNPS
boolean-burbot EVELCESGAEVEKPGSSVKVTCKVTGYAFSSTSYAISKVVQAGNPSLASIGELSPSSGDYTRYNEK VTAKVTLTADKSTNTTYLELTPLTSEGTAIYICTRRARYDRSGVGSDYVGDWQDPAGQGTLATVSS ETVLTQSPGTVTVSPGERATMSAKVSISTRMSVSTNYLNWAYEQKPGQAPRLLIHGASNRAS GVSARLSGSGSGSLFSRTISSNEDEVEAFQLAIYYCDQNTSDPECLPRDTYGGGTKLEIKEVP
magenta-food EAQLQESGAELNKTGASAKVSCTHSGYSLSDTSYYINGAKQAPDKGPFALGGLYASSRYGDDTAQS TKSLVHVTRDRTKNTTSLELSSLKAEGTGIYYALGWGSYGKAGLGCSGLDGYFAYWAQSTLVTASS EIVLTRSPATVTVSPLQRATVSCRNSNSNVDSDGYSYLHWYYQQKPGQAPKLAIGSASNRVS GVPSRFSGSGSLTDYALTISSDAAALQAGDAADYACGAAANDTPGRGSATFGPGTRVTIKGQL
exponential-cymbal EVELVQSGPETVKPDKSVKVSCKTEAYSFSTPSHYVSAARSSTGQGLEWMPGIYASSGYKTDYAEP VQSRVTKTVDKTTTTAYTELSSLTAKDTAVYYIARDGTYDRYAGGHYGHHNWEDYWGQATMATVSH EGVMTQSPATLRLSEGERVTISCTYSQNNISINSYNDIGWTYIQKPGQPPETLIYLSSVRAT GIQDHFSGSGARTDYALTITRATAAMQPEDLAVYYCQQSNEDPPNGGPTTFGGGSRVEIKGQP
soft-spook QAERVRSGEELKQPADSVKISCKTSGNTFSSSHGEMNWVKHAPCQGREWLGYTLARSGYGTHYSPK FVGRTTITAGKTSSTTKMQLSSLMSEGSAVYRCARVSTTSCNGLPSYYPHGGADVWGQGTTVTVSS ESVITQSPSSQPASPGELLTISCQASPINIVNKSYNHIACEYQQMPGQVSKLLTAGASIRPS VVPSRHSGSSSGTLYTLTISFIASILCSEDFAVYVCQNFCSLKACWGSVAGGGETKVEIKGQL
creative-halftone EAQLVFSGAELTQPGNSLAISAKSSEDSIYSVNYVVSWVREAPGQGHLIMGGIHPVPNTGTKYGQV FQGRVTITADNSTNTAYVKSTSFPSDDTAVYYCTRHTFCGGVNLGSGYLQTAFDYWGQGTAVIVSS GIVMTQSPATLSASPGETATISCKGSNSISDNAGPNYLAWVYQQKPAPPPKLLIYSASNRAN GDPEGFSGSGSAPGVSLTSSSVPKIVEEGDAAARYCQQTNVVPAKWETKTFVPGIKLEIAGQG
brilliant-charge TVQLRQSGSEAKRPVESLKVSAKASSVSFSSGAYYASDIRQAPGNTLEWMGAANAANSNDTAYNQS FQGRVTINRDKTITTAYLQLNNLTAEDTDCFYCATDASCDFITNGPYYFNDWADTWGQGTMVVVFS EIVTTQSGSTMSVPLGEHATISCRGSESPVSSYESNLGAWSYQQKPAKAPKRLIYRYSNRPS GVPSRFSGSFSGTDVTLDISGWGSSLQSEDVAIRYAQQFSNLPSTFGLTTFGQGTKVVIKDCS
strong-bear EVQRQQSGAEVTKPGGSLKVSCKTSGYTFSSTTAAVSWVKQPHGTGLEWTGWLYHESGDGTNYAES VRGRVTVSYGKSTSTASLQMSSLRSEGTHVYYSARPGTGDWWGVGWGWGGNWFDSWAQGTTVTGSS EAPLTQSGLSLSASSGNRATHTCRTSQAKVQNSIYIYIHWGYQQKPAKSPQLLIYGASSRGT GVPSRFSGSGSGTDYTLTISSNSLTLQPEDYATYFCQQSNVSPNNYESQTHEQGTKEEIQDQT
minty-cylinder QVGFVEWAGGVKIPSASKKLSCKASVGSVSSTNYGISAVRQAGAEGLKAVGWISGMGGTYTDYSES LKGVVTISAAKSTTTTFIELSSLRPSSTTVRYCAPPASQDRVGSGSPGGPGWFKPWGEGTLITVSS DLEMTQSPLSLAVSLGESISIPCRTSQSLVDSDKYNFPDLLYAQKPGISPRLLIYTGSSRAT GSPDRISASGSGTDFTLTITKQGDGVEAERIATYYCQQPRNTPRRINSQAFAQSTKLEKKAKA
bitter-folder EVRLMESGAVVKQPGQSLKVSAKDSGYAFRNTSYSISWPRGAPGQGLEWMGYIYPNSGDGTNRSQS VQGLVTISTNKSISTASLQLSSLKAEDTPVYATARHDGYHWFAYTCHWMHGAADHWGQITLVTESS ETVLTQSAATLSVTPGEGASLSCKALQSLVHNNGYNFIAAFYNQKPGQSPKRLIRGGANVGS GIPSRYNASGSGTDTSQTITSDHSALQSEGVQVYYCEQYTTTPKSPTSKTFPGGTKVEILPQV
contemporary-wine KVERTQRGAEVKKPDKSLKISCAASGYSASDTSHYINWVQQAPGKALEWIGIIYPSSGDRTKYAEA FQGRVTITRDGSKNTAYARCNSVTPEGTAVRYCARHGSQTRFAIGSYWPVDQEGFWRQGTFVTVCS GIVLTQSPPSLSAPVGESATASARGSQSEVDADGYNYLQADYQQKPGQAGQLLIYGVSNRES DVPARLSNAGAGTGYTTTISSAAVWIQSADFGVYFCQQANNTPSGRVSTRFAGGAATLPKGKT
quadratic-format EVDTTQSLASAKMLGESVRISCKASGYTFTKPYYTYQWVKQTKAEILYWVGVTDPANSDVINYQPK EQGRVTLGVRKSTSTNWMRRRSLRSEDTNVYYCRRVRTYHYVNNGGGWVDNWFHNFGEGTMVTVSS EIVLTQSPASIALSTGERATISCRANHPFIHSDGSNYLDWVRQQKPGQSPTRHIYGASYHET DIPDWFSGSGTGTDFTLTIRRSTSVVEAEDTGVYYCQQFSVSPPDWEASNYGDGTRVEIPGVH
avocado-bumper EAERVESGAEVKKPGASTKISCKAAGYSFSSTSYWMHWVRQMPGEGLEWMGRIYPSKSCGSNRSMK CQGRVTLSTDTSTNTASLQLRSLTPSDTATYRAARQAFHGWVGIGSTWPDDWADVWAQMTLVTVSS EIVNTQSPGTLSVSPGERATITCKASESTIAGNSYPYIGVNYLKKPGQAPKFLIYSASNRIS GIPSKFNESWSGHDFALTISNPPQIIQSFDFADYYCQHINSSPPRYQSLTFGAETKVEIKTQP
cold-electricity EVFLLQAGPFLTHTGSSLKVTCKNSGNSFTTGSYTIKAVRQSGGTATFWIGSIIPSNGYGTNTAKT IKGRATISADTSTNTAYMELSSLASEGSALYSCARDAQNSWVGRGWYYGLNGFGMAGQGTTVTVSS ESVMTGSEASLSVCPGESATISCRTTQSLIYSDGTNYLHWTTQQKTGQSPKLCIYSHSKRAS GVSGRTSGSGFRTDATLTISSHSYSTTAEDVSTYYDQQALNPPAHHGSSTYGQGTRLEIKNAP
flat-gutter EVDLNQSGAETKITGQSIKVSCKTSGVSFPEADYATPLTRQHHGKALEWMGNTNYGTGYTTNYGPK IQVRVTLNSGKSTSTAYLPKKSLKAEYTTIYYCVRDGHQTNVESTGQGQIGYFNYWGEGTLVTVSN GIAMTGSGSTISVSPGERATISCRASQSTVDKSVSNYVHWVFQQLPATSTKRIINGSSNRES DVPSRTSGSKSGHDPTLTISRRSSDLEPEDVAVYYCQSYTSTPSELVSQTYGQATKAEITGQD
symmetric-pad QAQLVQSGAGVTKGAASVKLSCKTSGYSISSYSYGVSAVRQAPGQGPEWVGGISPMSGPYTHYAQS VQARLTLTVDKSTSTAYLELTASNPEDTATYYAARNARGTRVGVGPHYLLDWHDYWGAGTLVTVST DIVMHQSPTTLSVNVYEPATISCKTSNTLANGDGSNYVVWYYQQKAGQSPKRLIAGISTRAT GVEHKFSGNASGTDITLTISSTHTAVEPEDFAVHYDQQYRNWLKKLISPTFGGGTKGNRPSKV
antique-structure QAQLEQSGVEVVKPGSSVKVSSKTSGYWASTTSHWISWVRRSPAKGFEWMGGIQPGSGNYTNFNEK YQGRATITAGKSSNTAYTQLTSLTAEGTTTYYCARNNTHDTYGSGSSYPLDYFDVWGQATTITVSS EVVLTQSPGTTSLPPGERATLSIHASHHLVDSDGSTYVSWVYQEKSGQATRRTIYGASNRAS GIVGRFSGSGSATGYTLTIRRADVSVESEQSAVFFAQQFSSTPQKWGSVTFGHVTRLEIKGSP
cream-callback GNFLVESGAGATKPAPSLSVSCKVSGESFSSGSYGISWARQAGGPGLEWMGGIIPSSGEFINRGPS FQGKATITAGRSTTTAFFELSSLTSEDTAVYYCMRPRRFDFYGLTSYQPLGWHGYWGQGTLATVSS DAVMTQSPPTLPVSVGESASISKKAAESVVSSDAYNYLNWAYEENPGQSPEMLIWAGTNRES GIPDRFSGSGSGTGFTLSISRVRSATEAGAVAVTYAMGSIAHPKPWGTKTFGQGTKVEIKGQD
inventive-amarone QAQIVQSGPGLVKTGTSVKVSAKTTGYNFSNKNYIVSWVREVPGRGLEAMGRIYGRDGDYTDRAEK VVGKVTISTDKDKNTWYLQMSSLKAEDTAVSYAARNDLVCYGGGGRYGLHNAYDNAGQGTLVTVSS DIVTTQTSGKLSISLGERVTINYKTSQSYVDGSGYNYTHHAYEQKDGKYPKLLIYGGSNRES GVPDRDSGSNAGTDVTLTISEVVMVVQSDDKINRYCSQSTDYTLYLDAVTFLQGTTYEIKYNP
messy-discriminator EVRLVQAGPEVKQQKESAKLSCKTFGLSVSSTHYGNNWAHGAPGNGPEAIGHILPMNGYGIHYCPK VQGNSTISTDKTTSTAYMDLSSATSEDTAIYYCTVPATKLTYGTACGWGLSYFDPWAQGTLATVSS EIVITQSPITLPVSPGEPASITCRASQSVLHSDGYNYLDWGVKQKPGQAPQHLIALASRRAS GVGARFSGSGSGHDFTLKIRAYNAIVQSEGVGVYYCQAANQTPQGFGQQTFGGGTKLEIKNDP
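
The stated chain lengths can be checked directly against the rows of Table 1. The following is a minimal sketch, assuming each row holds a sequence ID followed by four whitespace-separated segments (two for the heavy chain, then two for the light chain); the helper name `parse_fv_row` is hypothetical and is not part of the Frankies codebase.

```python
HEAVY_LEN, LIGHT_LEN = 132, 125  # chain lengths stated in Section 2.2


def parse_fv_row(row: str) -> dict:
    """Split one flattened Table 1 row into its ID, heavy chain, and light chain."""
    seq_id, *segments = row.split()
    if len(segments) != 4:
        raise ValueError(f"expected 4 sequence segments, got {len(segments)}")
    # Assumption: segments 1-2 are the heavy chain, segments 3-4 the light chain.
    heavy = segments[0] + segments[1]
    light = segments[2] + segments[3]
    return {"id": seq_id, "heavy": heavy, "light": light}


# First row of Table 1, copied verbatim.
row = (
    "kinetic-template "
    "EVQLARSGAEPTMPGETVKLSCKTSGYNASDTSFYIGAARQTGGKGLEWMGHISPTNGNPIYYSEK "
    "IQARLTLTADTTTETTYIQLLAFKSEDSAMFYAARHRTGHYYGYGSYWPLNGGDIWGGGTLVTVSA "
    "ESVLTQSPGSLIISVGERATISCKASQDLVNDTGHSFPHWRYLGKPGTAPKLLGYGASNRAS "
    "GAPGRFNGSGSGTDFSLTISRTASELKPKDVATYYCQQYNATPPTYRSQTYGGGTRAEIKSQP"
)
fv = parse_fv_row(row)
lengths_ok = len(fv["heavy"]) == HEAVY_LEN and len(fv["light"]) == LIGHT_LEN
print(fv["id"], len(fv["heavy"]), len(fv["light"]), lengths_ok)
```

The same function can be mapped over all 30 rows to confirm that every diffused chain has the fixed length imposed by the diffusion step.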

Fig. 2. Top 10 diffused Fv candidates by HADDOCK score. EPI3009174 HA1 is shown as a white surface model and the diffused Fv structure is shown in a colorized cartoon style.

2.2.1. Epitope variation

In the Frankies pipeline, users have the option of defining active residues on the antigen to guide the docking process. In this study, however, we allowed for the diffused antibodies to bind to any surface residue on the antigen structure, reducing the bias in the binding process. This produced an interesting pattern of particular residues commonly forming polar contacts with the antibody CDR loops. See Fig. 3.

Fig. 3. Structure heatmap showing the prevalence of polar contacts made between the diffused antibodies and the HA1 antigen. Annotated residues indicate those that are interfaced in ≥20% of cases across the 30 diffused antibodies in this study.
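
The ≥20% annotation threshold used in Fig. 3 can be illustrated with a small sketch. It assumes that the polar contacts of each docked complex have already been reduced to a set of antigen residue numbers; the function names and the toy residue numbers below are hypothetical.

```python
from collections import Counter


def contact_prevalence(contacts_per_antibody: list[set[int]]) -> dict[int, float]:
    """Fraction of antibodies that make a polar contact with each antigen residue."""
    n = len(contacts_per_antibody)
    counts = Counter(res for contacts in contacts_per_antibody for res in contacts)
    return {res: count / n for res, count in counts.items()}


def frequent_residues(prevalence: dict[int, float], threshold: float = 0.20) -> list[int]:
    """Residues contacted in at least `threshold` of the complexes."""
    return sorted(res for res, frac in prevalence.items() if frac >= threshold)


# Toy data: 10 antibodies with hypothetical antigen residue numbers.
toy_contacts = [
    {1101, 1150}, {1101}, {1101, 1150}, {1101, 1203}, {1101},
    set(), set(), set(), set(), set(),
]
prevalence = contact_prevalence(toy_contacts)
frequent = frequent_residues(prevalence)
print(frequent)
```

In this toy case residue 1203 is contacted by only one of ten antibodies and therefore falls below the threshold.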

2.2.2. Structure biochemistry

Further evaluating the biochemistry of the diffused Fv structures shows consistent desirable traits across various metrics.

Evaluating protein stability, desirable solvation metrics were predicted using FoldX. As shown in Fig. 4A, the average total energy predicted by FoldX was -121.9 kcal/mol, ranging from -144.8 to -60.7 kcal/mol. Also, favorable polar solvation and hydrophobic solvation were predicted, indicating desired polar interactions with the solvent along with proper burial of hydrophobic residues in the Fv structures. Furthermore, the predicted hydrophobic solvation indicates a low propensity for aggregation of the proteins in vivo [4]. See Figs. 4B and 4C.

Fig. 4. Boxplots depicting the distribution in solvation metrics. A) Total energy: more negative values indicate better overall stability. B) Polar solvation: lower values indicate favorable polar interactions with solvent. C) Hydrophobic solvation: lower values indicate better burial of hydrophobic residues (favorable for folding), whereas high values might suggest exposed hydrophobic surfaces (i.e., risk of aggregation or low solubility). Arrows indicate the “better” direction of each metric.

Regarding the humanness of the diffused heavy and light chain Fv sequences, we see a bimodal distribution where about half of the chain sequences are predicted to be human (whereas the remaining are of hybrid- to mostly murine-level of composition), shown in Fig. 5.

Fig. 5. Density plots showing distribution in the predicted humanness of the diffused Fv heavy and light chain sequences.

Note that none of the sequences in this study have been humanized. Rather, the diffusion process simply generated sequences on a spectrum between human and murine, based on the input set of reference antibodies, preferring either extreme rather than creating hybrid sequences. Thus, the humanness of these diffused antibodies can be improved using standard humanization tools and processes. The average humanness probability for the heavy and light chains of the top 10 performing antibodies, shown in Fig. 2, was 44.7% and 62.2%, respectively.

2.3. Sequence analyses

Note the sequence diversity depicted in the logos in Figs. 6a and 6b. While natural heavy chain sequences often start with EVQ or QVQ, there is additional sequence diversity in position 3 with the introduction of arginine (R), phenylalanine (F), glycine (G), and aspartate (D) amino acids, which are atypical here. However, the conditional sampling shows a consistent selection of glycine at the 6th position and VTSS at the end of the VH sequence, which are very common in natural antibodies. For the light chains, the common starting sequences of EIV and DIV were seen in the diffused sequences along with expected sampling of methionine (M) or leucine (L) at position 4, and EIK toward the end of the VL sequence (positions 120-122) [5]. Note that all of these sequences were successfully numbered using the Chothia numbering scheme [6].

Fig. 6. Position-level amino acid probabilities of the diffused Fv chain sequences. Created with WebLogo [7].

Comparing the sequences of the diffused and reference antibodies shows a 47.24% (± 0.06%) identity in the heavy chains and a 52.37% (± 0.08%) identity in the light chains. This is within an expected range given that the generated sequences were diffused by sampling an input distribution and that all of the sequences folded into the desired Fv antibody shape. Sequence identity distributions and more detailed pairwise identity comparisons are shown in Supplementary Fig. 1.

2.3.1. Folding performance

Of the 30 diffused pairs of Fv sequences presented in this study, all folded with an average Predicted Local Distance Difference Test (pLDDT) confidence >0.5. As shown in Fig. 7, most sequences folded with a mean and median pLDDT >0.75, indicating a highly confident structure prediction.

Fig. 7. Scatterplot of mean versus median pLDDT across the 30 diffused Fv structures folded with ESM3. Marginal boxplots show the distribution of the mean and median pLDDT values.

This is expected given that each diffused Fv sequence was numbered through the Chothia numbering system as a quality control step in the Frankies pipeline before advancing through to the evaluation steps.

2.4. Comparison to existing antibodies

The Frankies pipeline consistently generated Fv sequences and structures that bound well to the HA1 epitope.

2.4.1. Binding performance

For comparison, the 11 antibodies from Ford et al. 2025 that were evaluated against H5N1 isolate EPI3009174 were selected as a reference set. Their binding performance was then compared to that of the 30 Fv structures generated in this study, which were docked against the same isolate antigen structure (Fig. 8).

Fig. 8. Comparison of the docking metrics between reference and diffused antibodies. Pairwise comparisons are shown as p-values from the Wilcoxon signed-rank test. For all metrics except buried surface area and desolvation energy, lower is likely indicative of better binding, indicated by arrows.

The Van der Waals energy is significantly more negative (better) in the diffused antibodies, indicating stronger non-bonded atom-atom interactions. This could imply tighter packing or improved interface complementarity in the diffused antibodies.

In contrast, the reference antibody set displayed better electrostatic energy and buried surface area, though there are multiple examples of diffused antibodies that are within the same range. A larger buried surface often correlates with stronger binding and greater stability. In this case, the diffusion process may be generating antibodies with slightly reduced interfacial engagement.

However, the desolvation energy, total score (electrostatic + Van der Waals energies), and HADDOCK score were quite similar overall and showed no statistical difference according to the Wilcoxon signed-rank test at the α=0.05 level.

Thus, these metrics show that this pipeline was able to consistently generate antibody sequences with similar performance to therapeutically-derived or elicited antibodies.

3. Methods

The Frankies pipeline was designed as an automated workflow for producing antibody candidates and predicting their binding affinity at scale. This pipeline is written in Snakemake [8], [9], which provides a reproducible framework for running all of the steps mentioned below.

Inside the Snakemake pipeline, various steps are run as Python or shell scripts, while other complex tools are executed using Docker containers. The modularity of the pipeline is designed such that each run produces a single antibody candidate, tagged with a randomly generated experiment name, along with an evaluation of its performance against a given target. Thus, the Frankies pipeline can be run n times, without modifying the initial configuration, to produce n unique Fv candidates. The overall workflow is shown in Fig. 9.
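
As an illustration of how n runs can each receive a distinct tag, the sketch below draws adjective-noun names in the style of the sequence IDs in Table 1. How Frankies actually generates its experiment names is not described here, so the word lists (taken from a few of the IDs in Table 1) and the functions `experiment_name` and `unique_names` are assumptions for illustration only.

```python
import random

# Word lists drawn from a handful of the sequence IDs listed in Table 1.
ADJECTIVES = ["kinetic", "glowing", "recursive", "smoky", "gilded"]
NOUNS = ["template", "avocet", "basin", "latitude", "stud"]


def experiment_name(rng: random.Random) -> str:
    """Return one adjective-noun tag, e.g. 'smoky-basin'."""
    return f"{rng.choice(ADJECTIVES)}-{rng.choice(NOUNS)}"


def unique_names(n: int, seed: int = 0) -> list[str]:
    """Draw n distinct experiment names, one per pipeline run."""
    if n > len(ADJECTIVES) * len(NOUNS):
        raise ValueError("not enough adjective-noun combinations for n runs")
    rng = random.Random(seed)
    names: list[str] = []
    seen: set[str] = set()
    while len(names) < n:
        name = experiment_name(rng)
        if name not in seen:  # redraw on collision so every run has its own tag
            seen.add(name)
            names.append(name)
    return names


run_names = unique_names(10)
print(run_names)
```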

Fig. 9. Frankies pipeline workflow steps.

3.1. Reference dataset preparation

To guide conditional sequence generation, we curated a reference set of HA1-targeting Fv sequences from the Therapeutic Structural Antibody Database (Thera-SAbDab) combined with additional antibodies used in Ford et al. 2025 listed in Table 2.

Table 2.

Reference HA1-neutralizing antibodies from Thera-SAbDab and those used in Ford et al. 2025.

Antibody ID | PDB ID | Year (Reference) | H/L Chain Subgroups | Source Information
5dur00F4 | 5dur | 2015 [8], [10] | II/I | Human Memory B-Cell, Recovered from H5N1 Infection
12H5 | 7fah | 2022 [11] | I/IV | Mouse, Immunised with three H1N1 strains, Humanised
13D4 | 6a0z | 2018 [12] | I/I | Mouse, Immunised with five H5N1 strains, Humanised
3C11 | 6iuv | 2019 [13] | I/II | Human Memory B-cell, Infected by H5N1 viruses
65C6 | 5dum | 2015 [8], [10] | I/III | Human, Infected by H5N1 viruses
AVFluIgG01 | 6iut | 2019 [13] | II/I | Human, Infected by H5N1 viruses
AVFluIgG03 | 5dup | 2015 [8], [10] | III/I | Human, Infected by H5N1 viruses
FLD194 | 5a3i | 2015 [14] | II/I | Human Memory B-cell, Recovered from H5N1 infection
FLD21.140 | 6a67 | 2018 [15] | ?/I | Human, Recovered from H5N1 Infection
H5M9 | 4mhh | 2013 [16] | I/IV | Mouse, Immunised with H5N1, Humanised
H5.3 | 4xrc | 2015 [17] | II/? | Human, Immunised with one H5N1 strain
Firivumab/ CT-P22/ CT120 | None | 2014 [18] | I/III | Human, derived from the human immunoglobulin repertoire with heavy chain from IgHV1-69 and light chain from IGKV3-15 gene segments.
Gedivumab/ MHAA4549A/ RG7745 | 4kvn | 2016 [19] | III/III | Human, cloned from a single plasmablast, derived from an influenza virus-vaccinated donor.
Navivumab/ CT-P23 | 4r8w | 2015 [20] | I/III | Human, produced recombinantly in Chinese Hamster Ovary (CHO) cells
Sonavibart/ HY-P990944/ VIR-2482/ MEDI8852 | 5jw3 | 2024 [21] | II/I | Human, Derived from a vaccinated human plasmablast; expressed recombinantly in CHO mammalian cells

Sequences were trimmed to retain only the Fv portion of the antibody chains and were then aligned using MUSCLE v3.8.425 [22]. This produced separate input .a3m alignment files for heavy chain and light chain sequences.

3.2. Conditional sequence generation

We used EvoDiff's MSA_OA_DM_MAXSUB model for conditional sequence generation [23]. The conditioned sequence generation sampled from the curated reference sequence files using the ‘MaxHamming’ distance to produce novel, HA1-targeted Fv chain sequences. EvoDiff operates as a protein diffusion model trained on general sequence databases and can be guided by user-provided templates or alignments of desired reference sequences in .a3m format.

Thirty Fv candidate sequences were generated and filtered for naturalness using AbNumber, a Python wrapper for ANARCI [24]. AbNumber attempted to number each diffused sequence with the Chothia numbering scheme [6]; if numbering failed, the diffusion process restarted to generate another sequence. This was done to help ensure that the diffused sequences were as antibody-like as possible, lending to more confident subsequent folding and improving the prospects for future protein synthesis.
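
The generate-then-validate loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Frankies implementation: `generate_until_valid` is a hypothetical helper, and both the generator and the validator are toy stand-ins. In a real setting the validator might attempt to construct an AbNumber `Chain` with the Chothia scheme and return `False` when parsing fails.

```python
import random
from typing import Callable, Tuple


def generate_until_valid(
    generate: Callable[[], str],
    is_valid: Callable[[str], bool],
    max_attempts: int = 20,
) -> Tuple[str, int]:
    """Call `generate` until `is_valid` accepts the result; return it with the attempt count."""
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        if is_valid(candidate):
            return candidate, attempt
    raise RuntimeError(f"no valid sequence after {max_attempts} attempts")


# Toy stand-ins (assumptions): random residues instead of EvoDiff sampling,
# and a placeholder check instead of Chothia numbering with AbNumber.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_rng = random.Random(42)


def toy_generate(length: int = 132) -> str:
    return "".join(_rng.choice(AMINO_ACIDS) for _ in range(length))


def toy_is_valid(seq: str) -> bool:
    # Placeholder rule only; it does not test whether a sequence is antibody-like.
    return seq.count("C") >= 2


sequence, attempts = generate_until_valid(toy_generate, toy_is_valid)
print(len(sequence), attempts)
```

Capping the number of attempts keeps a run from looping indefinitely if the generator rarely produces a sequence that passes validation.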

Sequences were analyzed with the ‘Stability’ command of FoldX v5.1 [25] to predict their overall stability, including polar and hydrophobic solvation. This provided a variety of energy metrics for each diffused Fv structure, all of which are reported in the supplementary GitHub link.

Using Humatch's ‘classify’ functionality, each sequence's “humanness” was also evaluated, which reports a predicted human probability percentage for a given input heavy and light chain sequence pair [26].

Also, sequence identity of each diffused sequence was compared to the set of reference sequences using the Biostrings v2.66.0 library in R v4.2.2 [27].
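
For readers who want to reproduce a comparable identity figure outside of R, the sketch below computes percent identity between pre-aligned sequences in Python. It is not the Biostrings calculation used in this study: the identity definition (matches divided by the number of columns in which at least one sequence has a residue) and the function names are assumptions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_identity_to_references(query: str, references: list[str]) -> float:
    """Average percent identity of one query against a set of aligned references."""
    return sum(percent_identity(query, ref) for ref in references) / len(references)


# Toy aligned fragments (hypothetical, for illustration only).
identity = percent_identity("EVQLV", "QVQLV")
mean_identity = mean_identity_to_references("EVQLV", ["QVQLV", "EVQLV"])
print(identity, mean_identity)
```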

3.3. Structure prediction

Diffused heavy and light chain sequences were folded using either AlphaFold3 [2] or ESM3 [3].

AlphaFold3 may be desirable for groups wishing to perform purely local folding. While AlphaFold3 provides highly accurate structure predictions, it requires the user to download and store >600 GB of reference databases and model weights to run. In addition, the multimer prediction of the heavy and light sequences together takes approximately 15 minutes on a GPU.

Alternatively, Evolutionary Scale's ESM package and esm3-medium-multimer-2024-09 model allow for API-based multimer structure prediction without any reference databases. This package returns the predicted structure in a few seconds.

Both AlphaFold3 and ESM3 consistently produce reliable antibody structures and thus we provide support for both models.

Complementarity-determining region (CDR) loops were detected using ANARCI and residues belonging to the CDR loop structures were selected as “active residues” in the subsequent docking process.

3.4. HA1 antigen preparation

A reference target antigen structure—HA1 subunit of H5N1 hemagglutinin—was obtained from the Protein Data Bank (PDB ID: 2VIR). This structure was used to test and validate the Frankies pipeline.

Then, the HA1 structure of a more recent isolate, EPI3009174, was folded and used for subsequent novel antigen docking. Isolate EPI3009174 was collected from a 9-year-old Cambodian male patient who passed away in 2024. This isolate was previously analyzed in Ford et al. 2025 against 11 reference antibodies.

For docking, the residue numbers in the antigen structure were incremented +1000, to avoid overlapping numbers with the antibody structure, and all residues were assigned to chain B.
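
The renumbering step can be sketched on a single PDB record. This is a minimal illustration that relies on the fixed-column PDB format (chain ID in column 22, residue number in columns 23-26); the function name `renumber_atom_line` and the example coordinates are hypothetical.

```python
def renumber_atom_line(line: str, offset: int = 1000, chain: str = "B") -> str:
    """Shift the residue number and set the chain ID of one PDB ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")):
        return line  # leave non-coordinate records untouched
    resnum = int(line[22:26]) + offset
    if resnum > 9999:
        raise ValueError("residue number no longer fits the 4-column PDB field")
    # Columns are 1-based in the PDB specification, so column 22 is index 21.
    return f"{line[:21]}{chain}{resnum:>4}{line[26:]}"


example = "ATOM      1  N   ASP A  11      11.104  13.207  10.000  1.00 20.00           N"
renumbered = renumber_atom_line(example)
print(renumbered)
```

Applying the function to every line of the antigen file yields residue numbers that cannot collide with those of the Fv structure, provided the original numbers are below 9000.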

Surface residues are automatically detected on the antigen structure and a 25% random subset of the surface residues was selected as “active residues”.
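
Selecting a random 25% of the surface residues can be sketched as below. Surface detection itself is not shown; the input is assumed to be a list of residue numbers that were already classified as surface-exposed, and the rounding rule and the function name `pick_active_residues` are assumptions.

```python
import random
from typing import Optional, Sequence


def pick_active_residues(
    surface_residues: Sequence[int],
    fraction: float = 0.25,
    seed: Optional[int] = None,
) -> list[int]:
    """Randomly choose a fraction of the surface residues to act as active residues."""
    if not surface_residues:
        return []
    rng = random.Random(seed)
    # Assumption: round to the nearest integer and keep at least one residue.
    k = max(1, round(len(surface_residues) * fraction))
    return sorted(rng.sample(list(surface_residues), k))


# Hypothetical surface residues, numbered as if already offset by +1000.
surface = list(range(1001, 1041))
active = pick_active_residues(surface, seed=1)
print(active)
```

Fixing the seed makes a given run reproducible, while leaving it unset gives each run a different subset and therefore a different region of the antigen surface to explore.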

3.5. Antibody-antigen docking

Fv structures were docked to the HA1 antigen using HADDOCK3 [8], [28], [29] following the methodologies described in our previous studies [30], [31], [32]. Docking configuration files were generated using an antibody-antigen docking template from the Bonvin Lab [33]. These configuration files, along with the other required docking files, were generated in a docking preparatory step in the Frankies pipeline.

Docking was performed in a multi-stage protocol including rigid-body energy minimization, semi-flexible refinement, and explicit (water-based) solvent modeling. Docked complexes were scored using HADDOCK's built-in scoring function and further evaluated using the other biochemical/biophysical binding metrics listed below. The template for the HADDOCK protocol is available in the GitHub repository.

  • Van der Waals intermolecular energy (vdw) in kcal/mol

  • Electrostatic intermolecular energy (elec) in kcal/mol

  • Desolvation energy (desolv) in kcal/mol

  • Restraints violation energy (air) in arbitrary units

  • Buried surface area (bsa) in Å²

  • Total energy (total): 1.0 × vdw + 1.0 × elec in kcal/mol

  • HADDOCK score: 1.0 × vdw + 0.2 × elec + 1.0 × desolv + 0.1 × air
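
The two weighted sums in the list above can be written out directly. The sketch below uses the weights exactly as listed; the component values in the example are hypothetical and do not come from any complex in this study.

```python
def total_energy(vdw: float, elec: float) -> float:
    """Total energy: 1.0 * vdw + 1.0 * elec (kcal/mol)."""
    return 1.0 * vdw + 1.0 * elec


def haddock_score(vdw: float, elec: float, desolv: float, air: float) -> float:
    """HADDOCK score: 1.0 * vdw + 0.2 * elec + 1.0 * desolv + 0.1 * air."""
    return 1.0 * vdw + 0.2 * elec + 1.0 * desolv + 0.1 * air


# Hypothetical component energies for one docked complex.
components = {"vdw": -50.0, "elec": -200.0, "desolv": 5.0, "air": 100.0}
total = total_energy(components["vdw"], components["elec"])
score = haddock_score(**components)
print(total, score)
```

Because the electrostatic term is down-weighted to 0.2 in the HADDOCK score but not in the total energy, two complexes can rank differently under the two metrics.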

3.6. Candidate ranking, reporting, and visualization

The “best” complex is the docked conformation with the lowest (best) HADDOCK score within the best-scoring cluster of complexes.
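
A possible reading of this selection rule is sketched below. How the best cluster is ranked is not specified here, so ranking clusters by the mean score of their top four models is an assumption, as are the function name `best_complex` and the toy scores.

```python
from collections import defaultdict


def best_complex(models: list[dict], top_n: int = 4) -> dict:
    """Pick the lowest-scoring model from the best-ranked cluster."""
    clusters = defaultdict(list)
    for model in models:
        clusters[model["cluster"]].append(model)

    def cluster_rank(members: list[dict]) -> float:
        # Assumption: a cluster is ranked by the mean of its top_n lowest scores.
        top = sorted(m["score"] for m in members)[:top_n]
        return sum(top) / len(top)

    best_cluster = min(clusters.values(), key=cluster_rank)
    return min(best_cluster, key=lambda m: m["score"])


# Toy docking output with hypothetical HADDOCK scores.
models = [
    {"name": "m1", "cluster": 1, "score": -80.0},
    {"name": "m2", "cluster": 1, "score": -70.0},
    {"name": "m3", "cluster": 2, "score": -95.0},
    {"name": "m4", "cluster": 2, "score": -20.0},
]
best = best_complex(models)
print(best["name"])
```

In this toy case the single lowest score belongs to `m3`, but its cluster ranks worse on average, so the selection returns `m1`; choosing within the best cluster favors poses that are supported by several similar models.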

This best cluster and the best scoring complexes are reported in a rendered Quarto dashboard as the final step of the Frankies pipeline [34]. This report shows the distribution of the various binding metrics of all complexes in the best cluster. This also renders the 3D structure of the best model as an interactive object using Py3Dmol, a Python wrapper for 3Dmol.js [35]. A screenshot of the dashboard is shown in Fig. 1.

4. Discussion

As H5N1 continues to evolve, it is imperative that therapeutic advances continue to better target modern influenza clades of interest (e.g., 2.3.4.4b [36]) and therefore reduce the risk of mortality and morbidity in humans. As of late 2024, there are over 50 licensed H5 vaccine candidates [37], though many of these are based on significantly older strains. For example, in the United States, the 3 licensed vaccines are from 2007, 2013 and 2020 [38].

Some vaccine candidates are currently being developed using mRNA technologies on newer strains in the U.S. by Moderna (mRNA-1018) [39] and Arcturus Therapeutics (ARCT-2304) [40]. These were shown to be effective in animal trials and are currently in Phase I/II clinical trials. Also, therapeutic antibodies are being developed that target H5 HA1 from clade 2.3.4.4b. Multiple promising candidates were generated through hybridoma technologies, as reported in a recent preprint by Alzua et al. 2025.

AI-based antibody discovery is already well underway, with a few candidates having moved into clinical testing, including an anti-TL1A antibody by Absci (ABS-101) [41], [42], [43], [44]. Such previous studies generated antibody sequences or structures using large language or diffusion models such as RFdiffusion [45], AntiBARTy [46], and AbDiffuser [47] and then optimized the candidates with a lab-in-the-loop iterative process.

The Frankies pipeline presented in this study offers a streamlined and modular approach for the design of de novo antibody candidates against rapidly evolving viral targets such as H5N1 influenza hemagglutinin. By integrating generative protein diffusion, structure prediction, and flexible molecular docking, Frankies enables end-to-end discovery of Fv candidates with structural and biophysical characteristics that support high-affinity binding to the desired antigen.

A key innovation in Frankies lies in its conditional generative architecture. By seeding EvoDiff with therapeutically validated antibody sequences from Thera-SAbDab, we impose domain-specific constraints that preserve critical structural motifs while enabling conditional exploration of novel sequence space. This balances diversity with developability, reducing the likelihood of generating non-functional or unstable designs. Furthermore, our use of ANARCI and other quality control steps helps to ensure that only the most promising candidates advance through the pipeline.

The application of Frankies to the H5N1 HA1 domain highlights the feasibility of using AI-driven approaches for pandemic preparedness. HA1 remains a challenging target due to its rapid antigenic drift and glycan shielding. Nonetheless, several Frankies-designed Fv candidates exhibited strong binding affinity scores and favorable interface properties, suggesting potential for neutralization. Importantly, these designs were generated in silico within minutes, underscoring the value of generative pipelines for rapid therapeutic prototyping.

Despite its strengths, Frankies has some limitations. The current conditional sequence generation can be improved in the future by implementing structure-aware diffusion (i.e., diffusing the CDR loops while a reference antibody is bound at the epitope site on the antigen). Also, better flexibility in CDR loop lengths is necessary.

While our predictions showcase the consistency of the Fv sequence generation, experimental validation will be crucial to confirm the binding specificity, affinity, and neutralization potency of the proposed candidates. In addition, the generated Fv sequences will need to be expanded to include the other parts of the antibody (the rest of the Fab region, the hinge, and the Fc region).

Currently, the pipeline uses Docker to run the diffusion, folding, and docking steps, enabling users to run these processes without complex dependency installation in their local environment. While this is useful for cloud-based scalability, we will also implement the ability to choose the containerization engine. For example, Apptainer and Singularity are more common in on-premises high-performance computing (HPC) environments.
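One way a selectable containerization engine could be exposed is sketched below. This is a hypothetical helper, not code from the Frankies repository: the function name, the `/work` mount point, and the image name in the usage note are assumptions. It only builds the argument list, using the standard `docker run --rm -v` and `apptainer exec --bind` forms, and leaves execution to the caller.

```python
from typing import Sequence


def build_container_command(
    engine: str, image: str, workdir: str, command: Sequence[str]
) -> list[str]:
    """Build an argument list that runs `command` inside `image`,
    with `workdir` on the host mounted at /work in the container."""
    mount = f"{workdir}:/work"
    if engine == "docker":
        prefix = ["docker", "run", "--rm", "-v", mount, image]
    elif engine in ("apptainer", "singularity"):
        # Apptainer/Singularity can pull Docker images through the docker:// URI scheme.
        prefix = [engine, "exec", "--bind", mount, f"docker://{image}"]
    else:
        raise ValueError(f"Unsupported container engine: {engine}")
    return prefix + list(command)
```

For example, `build_container_command("apptainer", "example/haddock3:latest", "/scratch/run1", ["haddock3", "/work/config.cfg"])` returns a list that can be passed to `subprocess.run`; the image name here is a placeholder. Keeping the engine as a single string parameter means the rest of a workflow does not need to know which runtime is available on a given cluster.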

The current docking approach does not account for glycosylation, which plays a significant role in HA1 surface shielding and antibody accessibility. While HADDOCK3 provides valuable binding predictions, it lacks the full thermodynamic and kinetic accuracy of more computationally intensive free energy calculations. Thus, future improvements to the pipeline will include the integration of a molecular dynamics step, using a tool such as OpenMM [48], to model the trajectory of the antibody-antigen complex and to predict the stability of the interaction.

Future directions include fine-tuning EvoDiff on antibody–antigen co-evolution datasets and benchmarking against existing broadly neutralizing antibodies targeting H5N1. Additionally, the modular nature of Frankies allows for straightforward extension to other antigens, including other influenza subtypes, entirely different viral families, or even targets in other diseases (such as in oncology, as shown in Ford 2024 [30]).

In conclusion, Frankies demonstrates how recent advances in protein generative modeling and structure prediction can be combined into an automated and scalable pipeline for therapeutic antibody design. As AI tools continue to mature, we anticipate that pipelines like Frankies will become central components of the next-generation biosecurity and drug discovery infrastructure for pandemic preparedness and beyond.

5. Contributors

Authors NS and CTF developed the Frankies pipeline. Author CTF performed data curation of the anti-HA1 antibody Fv sequences and their multiple sequence alignment. Author NS performed the docking experiments and statistical analyses. CTF and NS generated all visualization figures. NS and CTF performed the formal analysis of the structure and docking predictions of the antibodies. All authors wrote the original draft of the manuscript. All authors read and approved the final version of the manuscript.

Funding

No external funding was used for this study.

CRediT authorship contribution statement

Nicholas Santolla: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Investigation, Formal analysis. Colby T. Ford: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization.

Declaration of Competing Interest

Author Colby T. Ford is the owner of Tuple, LLC, a biotechnology consulting firm. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgements

We gratefully acknowledge all GISAID data contributors (i.e., the authors and their originating laboratories) responsible for obtaining the specimens, and their submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based.

We acknowledge the following entities at the University of North Carolina at Charlotte: Academic Affairs, The Office of Research, The Center for Computational Intelligence to Predict Health and Environmental Risks (CIPHER), The Department of Bioinformatics and Genomics, The College of Computing and Informatics, and the University Research Computing group. We gratefully acknowledge the support of the Belk Family.

Footnotes

Appendix A

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.csbj.2025.06.026.

Appendix A. Supplementary material

The following is the Supplementary material related to this article.

MMC

Sequence identity comparisons between the diffused and reference antibody sequences. A) Violin/box plots showing the distribution of the diffused sequences' identities by chain. B) Heatmap showing the pairwise identity comparisons.

mmc1.pdf (115.6KB, pdf)

Data availability

All code, data, results, and additional analyses are openly available on GitHub at: https://github.com/Santollan/Frankies. This repository includes the open-source logic for running the Frankies pipeline, all sequences and folded structures for the reference antibodies and H5N1 isolates used in this study, the analysis scripts, the docking metrics, and the experimental outputs for the 30 diffused Fv candidates.

References

  • 1.Ison Michael G., Marrazzo Jeanne. The emerging threat of H5N1 to human health. N Engl J Med. 2025;392(9):916–918. doi: 10.1056/NEJMe2416323.
  • 2.Abramson Josh, Adler Jonas, Dunger Jack, Evans Richard, Green Tim, Pritzel Alexander, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. Jun 2024;630(8016):493–500. doi: 10.1038/s41586-024-07487-w.
  • 3.Hayes Thomas, Rao Roshan, Akin Halil, Sofroniew Nicholas J., Oktay Deniz, Lin Zeming, et al. Simulating 500 million years of evolution with a language model. Science. 2025;387(6736):850–858. doi: 10.1126/science.ads0018.
  • 4.Kumar Avishek, Singh Nitin Kumar, Ghosh Deepshikha, Radhakrishna Mithun. Understanding the role of hydrophobic patches in protein disaggregation. Phys Chem Chem Phys. 2021;23:12620–12629. doi: 10.1039/D1CP00954K.
  • 5.Chiu Mark L., Goulet Dennis R., Teplyakov Alexey, Gilliland Gary L. Antibody structure and function: the basis for engineering therapeutics. Antibodies. 2019;8(4) doi: 10.3390/antib8040055.
  • 6.Chothia Cyrus, Lesk Arthur M., Tramontano Anna, Levitt Michael, Smith-Gill Sandra J., Air Gillian, et al. Conformations of immunoglobulin hypervariable regions. Nature. Dec 1989;342(6252):877–883. doi: 10.1038/342877a0.
  • 7.Crooks Gavin E., Hon Gary, Chandonia John-Marc, Brenner Steven E. WebLogo: a sequence logo generator. Genome Res. Jun 2004;14(6):1188–1190. doi: 10.1101/gr.849004.
  • 8.Ford Colby T., Yasa Shirish, Obeid Khaled, Jaimes Rafael III, Tomezsko Phillip J., Guirales-Medrano Sayal, et al. Large-scale computational modelling of H5 influenza variants against HA1-neutralising antibodies. eBioMedicine. Apr 2025;114 doi: 10.1016/j.ebiom.2025.105632.
  • 9.Köster Johannes, Rahmann Sven. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics. 2018;34(20) doi: 10.1093/bioinformatics/bty350. [DOI] [PubMed] [Google Scholar]
  • 10.Zuo Teng, Sun Jianfeng, Wang Guiqin, Jiang Liwei, Zuo Yanan, Li Danyang, et al. Comprehensive analysis of antibody recognition in convalescent humans from highly pathogenic avian influenza H5N1 infection. Nat Commun. Dec 2015;6(1):8855. doi: 10.1038/ncomms9855.
  • 11.Li Tingting, Chen Junyu, Zheng Qingbing, Xue Wenhui, Zhang Limin, Rong Rui, et al. Identification of a cross-neutralizing antibody that targets the receptor binding site of H1N1 and H5N1 influenza viruses. Nat Commun. Sep 2022;13(1):5182. doi: 10.1038/s41467-022-32926-5.
  • 12.Lin Qingshan, Li Tingting, Chen Yixin, Lau Siu-Ying, Wei Minxi, Zhang Yuyun, et al. Structural basis for the broad, antibody-mediated neutralization of H5N1 influenza virus. J Virol. 2018;92(17) doi: 10.1128/jvi.00547-18.
  • 13.Wang Pengfei, Zuo Yanan, Sun Jianfeng, Zuo Teng, Zhang Senyan, Guo Shichun, et al. Structural and functional definition of a vulnerable site on the hemagglutinin of highly pathogenic avian influenza A virus H5N1. J Biol Chem. Mar 2019;294(12):4290–4303. doi: 10.1074/jbc.RA118.007008.
  • 14.Xiong Xiaoli, Corti Davide, Liu Junfeng, Pinna Debora, Foglierini Mathilde, Calder Lesley J., et al. Structures of complexes formed by H5 influenza hemagglutinin with a potent broadly neutralizing human monoclonal antibody. Proc Natl Acad Sci. 2015;112(30):9430–9435. doi: 10.1073/pnas.1510816112.
  • 15.Zuo Yanan, Wang Pengfei, Sun Jianfeng, Guo Shichun, Wang Guiqin, Zuo Teng, et al. Complementary recognition of the receptor-binding site of highly pathogenic H5N1 influenza viruses by two human neutralizing antibodies. J Biol Chem. Oct 2018;293(42):16503–16517. doi: 10.1074/jbc.RA118.004604.
  • 16.Zhu Xueyong, Guo Yong-Hui, Jiang Tao, Wang Ya-Di, Chan Kwok-Hung, Li Xiao-Feng, et al. A unique and conserved neutralization epitope in H5N1 influenza viruses identified by an antibody against the A/goose/Guangdong/1/96 hemagglutinin. J Virol. 2013;87(23):12619–12635. doi: 10.1128/jvi.01577-13.
  • 17.Winarski Katie L., Thornburg Natalie J., Yu Yingchun, Sapparapu Gopal, Crowe James E., Spiller Benjamin W. Vaccine-elicited antibody that neutralizes H5N1 influenza and variants binds the receptor site and polymorphic sites. Proc Natl Acad Sci. 2015;112(30):9346–9351. doi: 10.1073/pnas.1502762112.
  • 18.Yi Kye Sook, Choi Jung-ah, Kim Pankyeom, Ryu Dong-Kyun, Yang Eunji, Son Dain, et al. Broader neutralization of CT-P27 against influenza A subtypes by combining two human monoclonal antibodies. PLoS ONE. 2020;15(7):1–14. doi: 10.1371/journal.pone.0236172.
  • 19.Lim Jeremy J., Deng Rong, Derby Michael A., Larouche Richard, Horn Priscilla, Anderson Malia, et al. Two phase 1, randomized, double-blind, placebo-controlled, single-ascending-dose studies to investigate the safety, tolerability, and pharmacokinetics of an anti-influenza A virus monoclonal antibody, MHAA4549A, in healthy volunteers. Antimicrob Agents Chemother. 2016;60(9):5437–5444. doi: 10.1128/aac.00607-16.
  • 20.Wu Ying, Cho MyungSam, Shore David, Song Manki, Choi JungAh, Jiang Tao, et al. A potent broad-spectrum protective human monoclonal antibody crosslinking two haemagglutinin monomers of influenza A virus. Nat Commun. Jul 2015;6(1):7708. doi: 10.1038/ncomms8708.
  • 21.Kallewaard Nicole L., Corti Davide, Collins Patrick J., Neu Ursula, McAuliffe Josephine M., Benjamin Ebony, et al. Structure and function analysis of an antibody recognizing all influenza A subtypes. Cell. Jul 2016;166(3):596–608. doi: 10.1016/j.cell.2016.05.073.
  • 22.Edgar Robert C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. doi: 10.1093/nar/gkh340.
  • 23.Alamdari Sarah, Thakkar Nitya, van den Berg Rianne, Tenenholtz Neil, Strome Robert, Moses Alan M., et al. Protein generation with evolutionary diffusion: sequence is all you need. bioRxiv. 2024 doi: 10.1101/2023.09.11.556673.
  • 24.Dunbar James, Deane Charlotte M. ANARCI: antigen receptor numbering and receptor classification. Bioinformatics. 2015;32(2):298–300. doi: 10.1093/bioinformatics/btv552.
  • 25.Delgado Javier, Radusky Leandro G., Cianferoni Damiano, Serrano Luis. FoldX 5.0: working with RNA, small molecules and a new graphical interface. Bioinformatics. 2019;35(20):4168–4169. doi: 10.1093/bioinformatics/btz184.
  • 26.Chinery Lewis, Jeliazkov Jeliazko R., Deane Charlotte M. Humatch - fast, gene-specific joint humanisation of antibody heavy and light chains. bioRxiv. 2024 doi: 10.1101/2024.09.16.613210.
  • 27.Pagès H., Aboyoun P., Gentleman R., DebRoy S. 2022. Biostrings: efficient manipulation of biological strings. R package version 2.66.0.
  • 28.van Zundert G.C.P., Rodrigues J.P.G.L.M., Trellet M., Schmitz C., Kastritis P.L., Karaca E., et al. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428(4):720–725. doi: 10.1016/j.jmb.2015.09.014.
  • 29.Teixeira João M.C., Vargas Honorato Rodrigo, Giulini Marco, Bonvin Alexandre, SarahAlidoost, Reys Victor, et al. haddocking/haddock3: v3.0.0-beta.5. January 2024. https://doi.org/10.5281/zenodo.10527751
  • 30.Ford Colby T. PD-1 targeted antibody discovery using AI protein diffusion. Technol Cancer Res Treat. 2024;23 doi: 10.1177/15330338241275947. PMID: 39228166.
  • 31.Tomezsko Phillip J., Ford Colby T., Meyer Avery E., Michaleas Adam M., Jaimes Rafael. Human cytokine and coronavirus nucleocapsid protein interactivity using large-scale virtual screens. Front Bioinform. 2024;4 doi: 10.3389/fbinf.2024.1397968.
  • 32.Ford Colby T., Yasa Shirish, Jacob Machado Denis, White Richard Allen, Janies Daniel A. Predicting changes in neutralizing antibody activity for SARS-CoV-2 XBB.1.5 using in silico protein modeling. Front Virol. 2023;3 doi: 10.3389/fviro.2023.1172027.
  • 33.Bonvin Lab. HADDOCK3 antibody-antigen tutorial. 2024. https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/
  • 34.Allaire J.J., Teague Charles, Scheidegger Carlos, Xie Yihui, Dervieux Christophe, Woodhull Gordon. Quarto. November 2024. https://doi.org/10.5281/zenodo.5960048
  • 35.Rego Nicholas, Koes David. 3Dmol.js: molecular visualization with WebGL. Bioinformatics. 2014;31(8):1322–1324. doi: 10.1093/bioinformatics/btu829.
  • 36.Janies Daniel, Ocaña Kary, Guirales-Medrano Sayal, Obeid Khaled, Alexander Rachel, Ford Colby T. Analyses of phylogenetics, natural selection, and protein structure of clade 2.3.4.4b H5N1 influenza A reveal that recent viral lineages have evolved promiscuity in host range and improved replication in mammals in North America. bioRxiv. 2025 doi: 10.1101/2025.03.15.641219.
  • 37.Taaffe Jessica, Zhong Shuyi, Goldin Shoshanna, Rawlings Kate S., Cowling Benjamin J., Zhang Wenqing. An overview of influenza H5 vaccines. Lancet Respir Med. Apr 2025;13(4):e20–e21. doi: 10.1016/S2213-2600(25)00052-9.
  • 38.U.S. Food and Drug Administration. FDA briefing document: vaccines and related biological products advisory committee meeting October 10, 2024 – highly pathogenic avian influenza (H5) virus vaccines. 2024. Accessed: 2025-04-24.
  • 39.Moderna Inc. 2025. Moderna announces updates on pandemic influenza program. Accessed: 2025-04-24.
  • 40.Arcturus Therapeutics Holdings Inc. 2023. Arcturus Therapeutics receives U.S. FDA fast track designation for ARCT-810, mRNA therapeutic candidate for ornithine transcarbamylase deficiency. Accessed: 2025-04-24.
  • 41.Garazi Peña Alzua, Nicolás León André, Yellin Temima, Bhavsar Disha, Loganathan Madhumathi, Bushfield Kaitlyn, et al. Human monoclonal antibodies that target clade 2.3.4.4b H5N1 hemagglutinin. bioRxiv. 2025 doi: 10.1101/2025.02.21.639446.
  • 42.Absci Corporation. February 2024. Absci initiates IND-enabling studies for ABS-101, a potential best-in-class anti-TL1A antibody de novo designed and optimized using generative AI. Accessed: 2025-04-30.
  • 43.Santuari Luca, Bachmann Salvy Marianne, Xenarios Ioannis, Arpat Bulak. AI-accelerated therapeutic antibody development: practical insights. Front Drug Discov. 2024;4 doi: 10.3389/fddsv.2024.1447867.
  • 44.He Xin-heng, Li Jun-rui, Xu James, Shan Hong, Shen Shi-yi, Gao Si-han, et al. AI-driven antibody design with generative diffusion models: current insights and future directions. Acta Pharmacol Sin. Mar 2025;46(3):565–574. doi: 10.1038/s41401-024-01380-y.
  • 45.Watson Joseph L., Juergens David, Bennett Nathaniel R., Trippe Brian L., Yim Jason, Eisenach Helen E., et al. De novo design of protein structure and function with RFdiffusion. Nature. Aug 2023;620(7976):1089–1100. doi: 10.1038/s41586-023-06415-8.
  • 46.Venderley Jordan. 2023. AntiBARTy diffusion for property guided antibody design.
  • 47.Martinkus Karolis, Ludwiczak Jan, Cho Kyunghyun, Liang Wei-Ching, Lafrance-Vanasse Julien, Hotzel Isidro, et al. 2024. AbDiffuser: full-atom generation of in vitro functioning antibodies.
  • 48.Eastman Peter, Galvelis Raimondas, Peláez Raúl P., Abreu Charlles R.A., Farr Stephen E., Gallicchio Emilio, et al. OpenMM 8: molecular dynamics simulation with machine learning potentials. J Phys Chem B. Jan 2024;128(1):109–116. doi: 10.1021/acs.jpcb.3c06662.

Articles from Computational and Structural Biotechnology Journal are provided here courtesy of AAAS Science Partner Journal Program
