Abstract
In 2025 alone, H5N1 avian influenza has been responsible for thousands of infections across various animal species, including avian and mammalian livestock such as chickens and cows, and poses a threat to human health due to avian-to-mammalian transmission. There have been 70 human cases of H5N1 influenza in the United States since April 2024 and, as shown in recent studies, our current antibody defenses are waning. Thus, it is imperative to discover new therapeutics in the fight against more recent strains of the virus.
In this study, we present the Frankies framework for automated antibody diffusion and assessment. This pipeline was used to automate the generation of 30 novel anti-HA1 Fv antibody fragment sequences, fold them into 3-dimensional structures, and then dock them against a recent H5N1 HA1 antigen structure for binding evaluation. Here we show the utility of artificial intelligence in the discovery of novel antibodies against specific H5N1 strains of interest, which bind similarly to known therapeutic and elicited antibodies.
Keywords: Avian influenza, Antibodies, Protein modeling, Docking, AI
1. Introduction
Highly pathogenic avian influenza A (H5N1) remains a persistent global health threat due to its zoonotic potential and historically high mortality rate in humans. Despite sporadic outbreaks and ongoing disease surveillance by the World Health Organization, United States Department of Agriculture, and US Centers for Disease Control, the continual antigenic evolution of the hemagglutinin (HA) glycoprotein, particularly within the HA1 subunit, presents significant challenges to the development of broadly neutralizing therapeutics. The HA1 domain is critical for receptor binding and immune recognition, making it a prime target for antibody-based intervention [1].
To address the urgent need for rapid, scalable design of high-affinity antibodies against emerging H5N1 variants, we introduce Frankies, a computational pipeline that integrates state-of-the-art generative artificial intelligence (AI) modeling, protein structure prediction, and docking simulations. This pipeline begins by generating novel Fv antibody fragment sequences via conditional protein diffusion using EvoDiff, leveraging a curated set of reference sequences from the Therapeutic Structural Antibody Database (Thera-SAbDab). The resulting sequences are then folded into 3-dimensional structures using structure prediction models such as AlphaFold3 or ESM3, which have demonstrated near-experimental accuracy in modeling antibody-antigen interfaces [2], [3]. Finally, we employ HADDOCK3 to perform rigid and flexible docking of the modeled Fv regions to the HA1 target, enabling the assessment of binding orientation, stability, and epitope accessibility.
By systematically combining data-guided AI generation with physics-based modeling, Frankies represents a promising step toward automated antibody discovery and optimization. In this study, we apply the Frankies pipeline to a recent H5N1 HA1 antigen and characterize a set of de novo antibody candidates with favorable structural and biophysical properties. This work lays the foundation for future in vitro empirical validation and paves the way for accelerated response to future influenza outbreaks.
2. Results
2.1. Pipeline performance
The protein generation pipeline successfully produced 30 novel anti-HA1 Fv antibody fragment sequences, with a 100% success rate in AbNumber sequence validation. In our systematic tests, the reliability of Fv antibody sequence generation was consistently greater than 90%. The complete set of 30 experimental runs took 11 hours and 34 minutes to complete in serial, averaging 21.13 minutes per run. In a distributed computing context, this walltime would scale linearly with ∼20 minutes per pipeline run. Individual component processing times were distributed as follows: approximately 5 minutes for EvoDiff operations, less than 1 minute for structure predictions with ESM3 (or around 20 minutes with AlphaFold3), 16 minutes for HADDOCK3 computations, and less than 1 minute for the output report generation, shown in Fig. 1.
Fig. 1.

Frankies output dashboard report. Generated with Quarto.
2.2. Generated Fv sequences
The 30 diffused Fv heavy and light chain sequences are listed in Table 1. Note that all diffused sequences are a defined length, with heavy and light chains having 132 and 125 amino acids, respectively. This is controlled by the diffusion step, which defines a maximum sequence length to be conditionally generated. The top 10 candidates, by lowest HADDOCK score, are shown in Fig. 2. Note the epitope diversity given the random surface residue detection for unbiased binding site selection.
Table 1.
The 30 Fv heavy and light chain sequences diffused and evaluated in this study.
| Sequence ID | Heavy Chain | Light Chain |
|---|---|---|
| kinetic-template | EVQLARSGAEPTMPGETVKLSCKTSGYNASDTSFYIGAARQTGGKGLEWMGHISPTNGNPIYYSEK IQARLTLTADTTTETTYIQLLAFKSEDSAMFYAARHRTGHYYGYGSYWPLNGGDIWGGGTLVTVSA | ESVLTQSPGSLIISVGERATISCKASQDLVNDTGHSFPHWRYLGKPGTAPKLLGYGASNRAS GAPGRFNGSGSGTDFSLTISRTASELKPKDVATYYCQQYNATPPTYRSQTYGGGTRAEIKSQP |
| partial-lagoon | QVGLVQSGPEVKTPGESVKISCTAGGYSFSSGLYWIDWVRERHGQGLEWMGMIHPSTSENTKYNPS FQSRVTISVNNSTNTAKEELSSLKAEDTATYLAARSAAVDVYTRGAYGKADRFEAWGQGALATVSS | ELVLTQSPLSTSVSPGERATLSCRASQSLVYGNGYNNLAAVTRQMPAEATRLLISGGSTRAT GVGSRLSGSGSGTDYTLTINSQTKQLQSEDFAAYYAMRYNNWPTRLIKQTFGGGTKLEIKDFP |
| glowing-avocet | QARLVESGAEVTKTGQSTKVSAKMSANTYSSADYTVSAVHQYPGKALFWIGHIYYPNGYYRDNAGS FQGRVTISTDASKSTTYLTLSSLKSEATAIYRCQYRRRYYKVGVSSYVPLDWYDNLGQGTLVTVSS | DVVMTQSPESLAVNPGERISISCKGSQSLIASDGQNYVHWRYKAKPGQSPKILAYSASNRAT GVPARISGSGPGTGFTLAISSISNAIQAEGLTTNRCQRELETAIVRSFRPFAGGTVLEIKALK |
| approximate-entrepreneur | EVQLLQSGAGAKKPGATVTISCVVSGYSYSSYYSGIDKVRQRPGHGKDCVGGIYPRSGYYTHYTEK FQGRVTYPSGKQTNNAYLQLNSVTTEDTAVYYCARPGLESFYGVGVNWSHNNSMSIGQGTLVTVSE | GIRITQSRASLSVSAGEPASIQARVSQSLIKGDISNYLEYYYQRKPGQPPKLFIYSASTRAT GVPARLPGSGSGTDFTLTISSRDGFVGSEDFGIYYCQRVENTPPQLGQPTFDQGTKLEIKGYK |
| recursive-basin | EVQLVEIGPEDKKPGTTLKLSCVPSGVSFHSTNYAVSGVKQSPGQGPEAMGNTYPSSGCDTDYCQK FQVRATITTRRSTSTVYLELNSLKPDDPAVYYCARVSHNTEKTPGSYFHPGDEELWGQGTLVTALS | DIVLTDSPSGLSASTGERPTISKRVSNSLSDFDRSTTLAWFFQARPGRSEKALIRAASSRAS GVPVRFSGSGSMTSFAFTISSAAAALEAEHPATYYCQQSRDDPAARASAAFGGGTRVEIAPLA |
| free-hearth | EVQEVDEGGEVVKPGKSARISCKNSGYGFSSQSYAVSWIREAEGKGLAAMTVISPTGAAGIHRNEK VQGRVTISKDHSSNTVYLEMNSLKSEGTAIYAAARNGQYNFVVVGSYWGYGLFDSWGEGALCTVSS | ETVLTQSPFTLSVSPGEPITVSCRSPQNLVDSNVYNNINWEYQKKPGQAPPRLIYVNSNGAS GAPDELKGSGSGTDFVLTIRRCPKEIKDEEVVVYYCQKSIVSPVEWVVATFGQNTRVEIKKKV |
| smoky-latitude | EVRLVQSGREVVKVGESLKISCKASEYNFSSTNYYTGWVRKPPGQVRMWISYIYHTTTEGTNYSAV VKPYAGVCYGKSINTVTLHMNSVKASDTAVYYCANTIRYDVYSVSPTWDHDWVDSWGQGTLVTVPQ | DSQLTQSQSSWSLSVGDFVHGTCKTSQSLSQPDVYSYTHWSGRRHPAQTPKLLIYLASTRGS GVGTHISGSGSGTDYTLTISSFGSTTRAGGFATRFCQQSRADTTRAGRQTDGSGTKVGTKGSR |
| internal-rundown | QAGLVQSGAELKKTGSSLRVSCKSSWYTYSSSYYAIHIIREAPSKGLEWVSRINSRHGYYTTYAPS IQGRVTFSTDKSTSTIYMPLSSLSSEDTAVYFCAPHASEHYVGSGTSWLHDWAESSGQGTTVTASN | DIVITQSPLTCSVSLGETTSVSRRTSQNLIDNNKYHYFAWLYQQKPGQAPTLLIYAGSYRAS AVSDRPSRSGSGIDYTLTISSKSTIVECDGVAVRYCQRSNSTPTRLAALDFAAGTKPEIKNPS |
| critical-tin | QVFLVQSGAELTNPGASVKVSCKTSGYSFSTTSYGMSWIRRAGGQGLEAIGWISHRSGYRTNASPK FQGRVTINTGASTSTVYTQLRSLKPEDTTVYYSARDGAHSFVAAGSAWGLDWGGYAGEGTIVTVSA | DIVITTSPMTRSVSVGEAASISCCRSQSCIDGNGYNYMNWIYRQKSGQAPRELIYGASKVAT GFPARFSASGSGTDFTLTISYTAVNVEPGGVGSYYCQRARSTPSKRAYQTFGAGTRVEIKLAN |
| gilded-stud | VAQTLQSGPELMQPGASVKISCTDSGYTISSTSYAFSWARGSGGKSLEWVGWIHWHTGVGTQYADS FQGRATSDRDKSKNTASAQFNSARSEHSGVAYGASDRTTTTYGLGRPVVVGWADSWNQGTLATASS | DLVLTQSPASVSVTPGTSASISCQSSQSQVDSNDYNYANRAYQQMPGQAPTLMIRSASYRPS GVPSRISSSGSGTSASLTISRVNANLQEENEANYRCQTSSVSGNRICSPVFNSGTKLEIKGPP |
| concave-glove | EVQLQESGAGLVKTSESGSISCTTSGGGGPSGGYWMSWGRQGPGGGLEWSGRIYGVSGDGTNGRGS LKERGTLSPDTSTNTASLGMSSVTASDTALYYGARGAMGGPVGGGSYGGLNGGDGQGQGTLVTVSS | DYVLTGSPASASVSPGESPTISCRATQTFVDGDGTKYVAWAAQAKPGQAPKLLISLDSNRPT GVPSRFSGSGSGTDYSLTITGAKNTLQNEDVADYYCQQVRSSPPARGSPSYAALTKLDAKNPS |
| boolean-burbot | EVELCESGAEVEKPGSSVKVTCKVTGYAFSSTSYAISKVVQAGNPSLASIGELSPSSGDYTRYNEK VTAKVTLTADKSTNTTYLELTPLTSEGTAIYICTRRARYDRSGVGSDYVGDWQDPAGQGTLATVSS | ETVLTQSPGTVTVSPGERATMSAKVSISTRMSVSTNYLNWAYEQKPGQAPRLLIHGASNRAS GVSARLSGSGSGSLFSRTISSNEDEVEAFQLAIYYCDQNTSDPECLPRDTYGGGTKLEIKEVP |
| magenta-food | EAQLQESGAELNKTGASAKVSCTHSGYSLSDTSYYINGAKQAPDKGPFALGGLYASSRYGDDTAQS TKSLVHVTRDRTKNTTSLELSSLKAEGTGIYYALGWGSYGKAGLGCSGLDGYFAYWAQSTLVTASS | EIVLTRSPATVTVSPLQRATVSCRNSNSNVDSDGYSYLHWYYQQKPGQAPKLAIGSASNRVS GVPSRFSGSGSLTDYALTISSDAAALQAGDAADYACGAAANDTPGRGSATFGPGTRVTIKGQL |
| exponential-cymbal | EVELVQSGPETVKPDKSVKVSCKTEAYSFSTPSHYVSAARSSTGQGLEWMPGIYASSGYKTDYAEP VQSRVTKTVDKTTTTAYTELSSLTAKDTAVYYIARDGTYDRYAGGHYGHHNWEDYWGQATMATVSH | EGVMTQSPATLRLSEGERVTISCTYSQNNISINSYNDIGWTYIQKPGQPPETLIYLSSVRAT GIQDHFSGSGARTDYALTITRATAAMQPEDLAVYYCQQSNEDPPNGGPTTFGGGSRVEIKGQP |
| soft-spook | QAERVRSGEELKQPADSVKISCKTSGNTFSSSHGEMNWVKHAPCQGREWLGYTLARSGYGTHYSPK FVGRTTITAGKTSSTTKMQLSSLMSEGSAVYRCARVSTTSCNGLPSYYPHGGADVWGQGTTVTVSS | ESVITQSPSSQPASPGELLTISCQASPINIVNKSYNHIACEYQQMPGQVSKLLTAGASIRPS VVPSRHSGSSSGTLYTLTISFIASILCSEDFAVYVCQNFCSLKACWGSVAGGGETKVEIKGQL |
| creative-halftone | EAQLVFSGAELTQPGNSLAISAKSSEDSIYSVNYVVSWVREAPGQGHLIMGGIHPVPNTGTKYGQV FQGRVTITADNSTNTAYVKSTSFPSDDTAVYYCTRHTFCGGVNLGSGYLQTAFDYWGQGTAVIVSS | GIVMTQSPATLSASPGETATISCKGSNSISDNAGPNYLAWVYQQKPAPPPKLLIYSASNRAN GDPEGFSGSGSAPGVSLTSSSVPKIVEEGDAAARYCQQTNVVPAKWETKTFVPGIKLEIAGQG |
| brilliant-charge | TVQLRQSGSEAKRPVESLKVSAKASSVSFSSGAYYASDIRQAPGNTLEWMGAANAANSNDTAYNQS FQGRVTINRDKTITTAYLQLNNLTAEDTDCFYCATDASCDFITNGPYYFNDWADTWGQGTMVVVFS | EIVTTQSGSTMSVPLGEHATISCRGSESPVSSYESNLGAWSYQQKPAKAPKRLIYRYSNRPS GVPSRFSGSFSGTDVTLDISGWGSSLQSEDVAIRYAQQFSNLPSTFGLTTFGQGTKVVIKDCS |
| strong-bear | EVQRQQSGAEVTKPGGSLKVSCKTSGYTFSSTTAAVSWVKQPHGTGLEWTGWLYHESGDGTNYAES VRGRVTVSYGKSTSTASLQMSSLRSEGTHVYYSARPGTGDWWGVGWGWGGNWFDSWAQGTTVTGSS | EAPLTQSGLSLSASSGNRATHTCRTSQAKVQNSIYIYIHWGYQQKPAKSPQLLIYGASSRGT GVPSRFSGSGSGTDYTLTISSNSLTLQPEDYATYFCQQSNVSPNNYESQTHEQGTKEEIQDQT |
| minty-cylinder | QVGFVEWAGGVKIPSASKKLSCKASVGSVSSTNYGISAVRQAGAEGLKAVGWISGMGGTYTDYSES LKGVVTISAAKSTTTTFIELSSLRPSSTTVRYCAPPASQDRVGSGSPGGPGWFKPWGEGTLITVSS | DLEMTQSPLSLAVSLGESISIPCRTSQSLVDSDKYNFPDLLYAQKPGISPRLLIYTGSSRAT GSPDRISASGSGTDFTLTITKQGDGVEAERIATYYCQQPRNTPRRINSQAFAQSTKLEKKAKA |
| bitter-folder | EVRLMESGAVVKQPGQSLKVSAKDSGYAFRNTSYSISWPRGAPGQGLEWMGYIYPNSGDGTNRSQS VQGLVTISTNKSISTASLQLSSLKAEDTPVYATARHDGYHWFAYTCHWMHGAADHWGQITLVTESS | ETVLTQSAATLSVTPGEGASLSCKALQSLVHNNGYNFIAAFYNQKPGQSPKRLIRGGANVGS GIPSRYNASGSGTDTSQTITSDHSALQSEGVQVYYCEQYTTTPKSPTSKTFPGGTKVEILPQV |
| contemporary-wine | KVERTQRGAEVKKPDKSLKISCAASGYSASDTSHYINWVQQAPGKALEWIGIIYPSSGDRTKYAEA FQGRVTITRDGSKNTAYARCNSVTPEGTAVRYCARHGSQTRFAIGSYWPVDQEGFWRQGTFVTVCS | GIVLTQSPPSLSAPVGESATASARGSQSEVDADGYNYLQADYQQKPGQAGQLLIYGVSNRES DVPARLSNAGAGTGYTTTISSAAVWIQSADFGVYFCQQANNTPSGRVSTRFAGGAATLPKGKT |
| quadratic-format | EVDTTQSLASAKMLGESVRISCKASGYTFTKPYYTYQWVKQTKAEILYWVGVTDPANSDVINYQPK EQGRVTLGVRKSTSTNWMRRRSLRSEDTNVYYCRRVRTYHYVNNGGGWVDNWFHNFGEGTMVTVSS | EIVLTQSPASIALSTGERATISCRANHPFIHSDGSNYLDWVRQQKPGQSPTRHIYGASYHET DIPDWFSGSGTGTDFTLTIRRSTSVVEAEDTGVYYCQQFSVSPPDWEASNYGDGTRVEIPGVH |
| avocado-bumper | EAERVESGAEVKKPGASTKISCKAAGYSFSSTSYWMHWVRQMPGEGLEWMGRIYPSKSCGSNRSMK CQGRVTLSTDTSTNTASLQLRSLTPSDTATYRAARQAFHGWVGIGSTWPDDWADVWAQMTLVTVSS | EIVNTQSPGTLSVSPGERATITCKASESTIAGNSYPYIGVNYLKKPGQAPKFLIYSASNRIS GIPSKFNESWSGHDFALTISNPPQIIQSFDFADYYCQHINSSPPRYQSLTFGAETKVEIKTQP |
| cold-electricity | EVFLLQAGPFLTHTGSSLKVTCKNSGNSFTTGSYTIKAVRQSGGTATFWIGSIIPSNGYGTNTAKT IKGRATISADTSTNTAYMELSSLASEGSALYSCARDAQNSWVGRGWYYGLNGFGMAGQGTTVTVSS | ESVMTGSEASLSVCPGESATISCRTTQSLIYSDGTNYLHWTTQQKTGQSPKLCIYSHSKRAS GVSGRTSGSGFRTDATLTISSHSYSTTAEDVSTYYDQQALNPPAHHGSSTYGQGTRLEIKNAP |
| flat-gutter | EVDLNQSGAETKITGQSIKVSCKTSGVSFPEADYATPLTRQHHGKALEWMGNTNYGTGYTTNYGPK IQVRVTLNSGKSTSTAYLPKKSLKAEYTTIYYCVRDGHQTNVESTGQGQIGYFNYWGEGTLVTVSN | GIAMTGSGSTISVSPGERATISCRASQSTVDKSVSNYVHWVFQQLPATSTKRIINGSSNRES DVPSRTSGSKSGHDPTLTISRRSSDLEPEDVAVYYCQSYTSTPSELVSQTYGQATKAEITGQD |
| symmetric-pad | QAQLVQSGAGVTKGAASVKLSCKTSGYSISSYSYGVSAVRQAPGQGPEWVGGISPMSGPYTHYAQS VQARLTLTVDKSTSTAYLELTASNPEDTATYYAARNARGTRVGVGPHYLLDWHDYWGAGTLVTVST | DIVMHQSPTTLSVNVYEPATISCKTSNTLANGDGSNYVVWYYQQKAGQSPKRLIAGISTRAT GVEHKFSGNASGTDITLTISSTHTAVEPEDFAVHYDQQYRNWLKKLISPTFGGGTKGNRPSKV |
| antique-structure | QAQLEQSGVEVVKPGSSVKVSSKTSGYWASTTSHWISWVRRSPAKGFEWMGGIQPGSGNYTNFNEK YQGRATITAGKSSNTAYTQLTSLTAEGTTTYYCARNNTHDTYGSGSSYPLDYFDVWGQATTITVSS | EVVLTQSPGTTSLPPGERATLSIHASHHLVDSDGSTYVSWVYQEKSGQATRRTIYGASNRAS GIVGRFSGSGSATGYTLTIRRADVSVESEQSAVFFAQQFSSTPQKWGSVTFGHVTRLEIKGSP |
| cream-callback | GNFLVESGAGATKPAPSLSVSCKVSGESFSSGSYGISWARQAGGPGLEWMGGIIPSSGEFINRGPS FQGKATITAGRSTTTAFFELSSLTSEDTAVYYCMRPRRFDFYGLTSYQPLGWHGYWGQGTLATVSS | DAVMTQSPPTLPVSVGESASISKKAAESVVSSDAYNYLNWAYEENPGQSPEMLIWAGTNRES GIPDRFSGSGSGTGFTLSISRVRSATEAGAVAVTYAMGSIAHPKPWGTKTFGQGTKVEIKGQD |
| inventive-amarone | QAQIVQSGPGLVKTGTSVKVSAKTTGYNFSNKNYIVSWVREVPGRGLEAMGRIYGRDGDYTDRAEK VVGKVTISTDKDKNTWYLQMSSLKAEDTAVSYAARNDLVCYGGGGRYGLHNAYDNAGQGTLVTVSS | DIVTTQTSGKLSISLGERVTINYKTSQSYVDGSGYNYTHHAYEQKDGKYPKLLIYGGSNRES GVPDRDSGSNAGTDVTLTISEVVMVVQSDDKINRYCSQSTDYTLYLDAVTFLQGTTYEIKYNP |
| messy-discriminator | EVRLVQAGPEVKQQKESAKLSCKTFGLSVSSTHYGNNWAHGAPGNGPEAIGHILPMNGYGIHYCPK VQGNSTISTDKTTSTAYMDLSSATSEDTAIYYCTVPATKLTYGTACGWGLSYFDPWAQGTLATVSS | EIVITQSPITLPVSPGEPASITCRASQSVLHSDGYNYLDWGVKQKPGQAPQHLIALASRRAS GVGARFSGSGSGHDFTLKIRAYNAIVQSEGVGVYYCQAANQTPQGFGQQTFGGGTKLEIKNDP |
Fig. 2.
Top 10 diffused Fv candidates by HADDOCK score. EPI3009174 HA1 is shown as a white surface model and the diffused Fv structure is shown in a colorized cartoon style.
2.2.1. Epitope variation
In the Frankies pipeline, users have the option of defining active residues on the antigen to guide the docking process. In this study, however, we allowed for the diffused antibodies to bind to any surface residue on the antigen structure, reducing the bias in the binding process. This produced an interesting pattern of particular residues commonly forming polar contacts with the antibody CDR loops. See Fig. 3.
Fig. 3.
Structure heatmap showing the prevalence of polar contacts made between the diffused antibodies and the HA1 antigen. Annotated residues indicate those that are interfaced in ≥20% of cases across the 30 diffused antibodies in this study.
2.2.2. Structure biochemistry
Further evaluating the biochemistry of the diffused Fv structures shows consistent desirable traits across various metrics.
In terms of protein stability, FoldX predicted desirable solvation metrics. As shown in Fig. 4A, the average total energy predicted by FoldX was -121.9 kcal/mol, ranging from -144.8 to -60.7 kcal/mol. Favorable polar and hydrophobic solvation were also predicted, indicating desired polar interactions with the solvent along with proper burial of hydrophobic residues in the Fv structures. Furthermore, the predicted hydrophobic solvation indicates a low propensity for aggregation of the proteins in vivo [4]. See Figs. 4B and 4C.
Fig. 4.
Boxplots depicting the distribution in solvation metrics. A) Total energy - More negative values indicate better overall stability. B) Polar solvation - Lower values indicate favorable polar interactions with solvent. C) Hydrophobic solvation - lower values indicate better burial of hydrophobic residues (favorable for folding) and high values might suggest exposed hydrophobic surfaces (i.e., risk of aggregation or low solubility). Arrows indicate the “better” direction of each metric.
Regarding the humanness of the diffused heavy and light chain Fv sequences, we see a bimodal distribution where about half of the chain sequences are predicted to be human (whereas the remaining are of hybrid- to mostly murine-level of composition), shown in Fig. 5.
Fig. 5.
Density plots showing distribution in the predicted humanness of the diffused Fv heavy and light chain sequences.
Note that none of the sequences in this study have been humanized. Rather, the diffusion process simply generated sequences on a spectrum between human and murine, based on the input set of reference antibodies, preferring either extreme rather than creating hybrid sequences. Thus, the humanness of these diffused antibodies can be improved using standard humanization tools and processes. The average humanness probability for the heavy and light chains of the top 10 performing antibodies, shown in Fig. 2, was 44.7% and 62.2%, respectively.
2.3. Sequence analyses
Note the sequence diversity depicted in the logos in Figs. 6a and 6b. While natural heavy chain sequences often start with EVQ or QVQ, there is additional sequence diversity in position 3 with the introduction of arginine (R), phenylalanine (F), glycine (G), and aspartate (D) amino acids, which are atypical here. However, the conditional sampling shows a consistent selection of glycine at the 6th position and VTSS at the end of the VH sequence, which are very common in natural antibodies. For the light chains, the common starting sequences of EIV and DIV were seen in the diffused sequences along with expected sampling of methionine (M) or leucine (L) at position 4, and EIK toward the end of the VL sequence (positions 120-122) [5]. Note that all of these sequences were successfully numbered using the Chothia numbering scheme [6].
Fig. 6.
Position-level amino acid probabilities of the diffused Fv chain sequences. Created with WebLogo [7].
Comparing the sequences of the diffused and reference antibodies shows a 47.24% (± 0.06%) identity in the heavy chains and a 52.37% (± 0.08%) identity in the light chains. This is within an expected range given that the generated sequences were diffused by sampling an input distribution and that all of the sequences folded into the desired Fv antibody shape. Sequence identity distributions and more detailed pairwise identity comparisons are shown in Supplementary Fig. 1.
2.3.1. Folding performance
Of the 30 diffused pairs of Fv sequences presented in this study, all folded with an average Predicted Local Distance Difference Test (pLDDT) confidence >0.5. As shown in Fig. 7, most sequences folded with a mean and median pLDDT >0.75, indicating a highly confident structure prediction.
Fig. 7.
Scatterplot of mean versus median pLDDT across the 30 diffused Fv structures folded with ESM3. Marginal boxplots show the distribution of the mean and median pLDDT values.
This is expected given that each diffused Fv sequence was numbered through the Chothia numbering system as a quality control step in the Frankies pipeline before advancing through to the evaluation steps.
2.4. Comparison to existing antibodies
The Frankies pipeline consistently generated Fv sequences and structures that bound well to the HA1 epitope.
2.4.1. Binding performance
For comparison, the 11 antibodies used in Ford et al. 2025 that were bound to H5N1 isolate EPI3009174 were selected as a reference set. Their binding performance was then compared to that of the 30 Fv structures generated in this study, which were docked against the same isolate antigen structure, as shown in Fig. 8.
Fig. 8.
Comparison of the docking metrics between reference and diffused antibodies. Pairwise comparisons are shown as p-values from the Wilcoxon signed-rank test. For all metrics except buried surface area and desolvation energy, lower is likely indicative of better binding, indicated by arrows.
The Van der Waals energy is significantly better (more negative, statistically) in the diffused antibodies, indicating stronger non-bonded atom-atom interactions. This could imply tighter packing or improved interface complementarity in the diffused antibodies.
In contrast, the reference antibody set displayed better electrostatic energy and buried surface area, though there are multiple examples of diffused antibodies that are within the same range. A larger buried surface often correlates with stronger binding and greater stability. In this case, the diffusion process may be generating antibodies with slightly reduced interfacial engagement.
However, the desolvation energy, total score (electrostatic + Van der Waals energies), and HADDOCK score were quite similar overall and showed no statistically significant difference according to the Wilcoxon signed-rank test at the chosen significance level.
Thus, these metrics show that this pipeline was able to consistently generate antibody sequences with similar performance to therapeutically-derived or elicited antibodies.
3. Methods
The Frankies pipeline was designed as an automated workflow for producing antibody candidates and predicting their binding affinity at scale. This pipeline is written in Snakemake [8], [9], which provides a reproducible framework for running all of the steps mentioned below.
Inside the Snakemake pipeline, various steps are run as Python or Shell scripts, while other complex tools are executed using Docker Containers. The modularity of the pipeline is designed such that each run produces a single antibody candidate, tagged with a randomly generated experiment name, along with an evaluation of its performance against a given target. Thus, the Frankies pipeline can be run n times, without modifying the initial configuration, to produce n unique Fv candidates. The overall workflow is shown in Fig. 9.
Fig. 9.
Frankies pipeline workflow steps.
3.1. Reference dataset preparation
To guide conditional sequence generation, we curated a reference set of HA1-targeting Fv sequences from the Therapeutic Structural Antibody Database (Thera-SAbDab) combined with additional antibodies used in Ford et al. 2025 listed in Table 2.
Table 2.
Reference HA1-neutralizing antibodies from Thera-SAbDab and those used in Ford et al. 2025.
| Antibody ID | PDB ID | Year | | Source Information |
|---|---|---|---|---|
| 5dur00F4 | 5dur | 2015 [8], [10] | II/I | Human Memory B-Cell, Recovered from H5N1 Infection |
| 12H5 | 7fah | 2022 [11] | I/IV | Mouse, Immunised with three H1N1 strains, Humanised |
| 13D4 | 6a0z | 2018 [12] | I/I | Mouse, Immunised with five H5N1 strains, Humanised |
| 3C11 | 6iuv | 2019 [13] | I/II | Human Memory B-cell, Infected by H5N1 viruses |
| 65C6 | 5dum | 2015 [8], [10] | I/III | Human, Infected by H5N1 viruses |
| AVFluIgG01 | 6iut | 2019 [13] | II/I | Human, Infected by H5N1 viruses |
| AVFluIgG03 | 5dup | 2015 [8], [10] | III/I | Human, Infected by H5N1 viruses |
| FLD194 | 5a3i | 2015 [14] | II/I | Human Memory B-cell, Recovered from H5N1 infection |
| FLD21.140 | 6a67 | 2018 [15] | ?/I | Human, Recovered from H5N1 Infection |
| H5M9 | 4mhh | 2013 [16] | I/IV | Mouse, Immunised with H5N1, Humanised |
| H5.3 | 4xrc | 2015 [17] | II/? | Human, Immunised with one H5N1 strain |
| Firivumab/ CT-P22/ CT120 | None | 2014 [18] | I/III | Human, derived from the human immunoglobulin repertoire with heavy chain from IgHV1-69 and light chain from IGKV3-15 gene segments. |
| Gedivumab/ MHAA4549A/ RG7745 | 4kvn | 2016 [19] | III/III | Human, cloned from a single plasmablast, derived from an influenza virus-vaccinated donor. |
| Navivumab/ CT-P23 | 4r8w | 2015 [20] | I/III | Human, produced recombinantly in Chinese Hamster Ovary (CHO) cells |
| Sonavibart/ HY-P990944/ VIR-2482/ MEDI8852 | 5jw3 | 2024 [21] | II/I | Human, Derived from a vaccinated human plasmablast, expressed recombinantly in CHO mammalian cells |
Sequences were trimmed to retain only the Fv portion of the antibody chains and were then aligned using MUSCLE v3.8.425 [22]. This produced separate input .a3m alignment files for heavy chain and light chain sequences.
3.2. Conditional sequence generation
We used EvoDiff's MSA_OA_DM_MAXSUB model for conditional sequence generation [23]. The conditioned sequence generation sampled from the curated reference sequence files using the ‘MaxHamming’ distance to produce novel, HA1-targeted Fv chain sequences. EvoDiff operates as a protein diffusion model trained on general sequence databases and can be guided by user-provided templates or alignments of desired reference sequences in .a3m format.
Thirty Fv candidate sequences were generated and filtered for naturalness using AbNumber, a Python wrapper for ANARCI [24]. This step attempted to number each diffused sequence with the Chothia numbering system [6] and, if numbering failed, the diffusion process would restart and generate another sequence. This was performed to help ensure that the diffused sequences were as antibody-like as possible, leading to more confident subsequent folding and improved prospects for future protein synthesis.
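The generate-then-validate loop described above can be sketched as follows. This is a minimal illustration, not the pipeline's code: `generate_until_valid`, the `max_attempts` cap, and the stand-in generator and validator are all hypothetical. In Frankies the generator would be the EvoDiff sampling step and the validator would be the AbNumber/ANARCI Chothia numbering check.

```python
from typing import Callable


def generate_until_valid(
    generate: Callable[[], str],
    is_valid: Callable[[str], bool],
    max_attempts: int = 10,
) -> tuple[str, int]:
    """Call `generate` until `is_valid` accepts the result.

    Returns the accepted sequence and the number of attempts used.
    The attempt cap is an assumption added to keep the loop bounded.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        if is_valid(candidate):
            return candidate, attempt
    raise RuntimeError(f"no valid sequence after {max_attempts} attempts")


# Stand-in generator and validator, for illustration only.
_draws = iter(["XXXX", "EVQLVESGGG"])
sequence, attempts = generate_until_valid(
    generate=lambda: next(_draws),
    is_valid=lambda s: s.startswith(("EVQ", "QVQ")),
)
print(sequence, attempts)
```

Passing the generator and validator as callables keeps the retry logic independent of any particular diffusion model or numbering tool.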
The diffused Fv structures were analyzed to predict their overall stability, including polar and hydrophobic solvation, using the ‘Stability’ command in FoldX v5.1 [25]. This provided a variety of energy metrics for each diffused Fv structure, all of which are reported in the supplementary GitHub link.
Using Humatch's ‘classify’ functionality, each sequence's “humanness” was also evaluated, which reports a predicted human probability percentage for a given input heavy and light chain sequence pair [26].
Also, sequence identity of each diffused sequence was compared to the set of reference sequences using the Biostrings v2.66.0 library in R v4.2.2 [27].
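As a rough illustration of what a pairwise identity figure measures, the sketch below computes percent identity between two pre-aligned sequences. It is not the Biostrings routine used in the study; the function name and the choice to skip gapped columns are assumptions, and other identity definitions (for example, dividing by the full alignment length) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments, for illustration only.
identity = percent_identity("EVQLVESG", "QVQLV-SG")
print(round(identity, 2))
```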
3.3. Structure prediction
Diffused heavy and light chain sequences were folded using either AlphaFold3 [2] or ESM3 [3].
AlphaFold3 may be desirable for groups wishing to perform purely local folding. While AlphaFold3 provides highly accurate structure predictions, it requires the user to download and store >600 GB of reference databases and model weights to run. In addition, the multimer prediction of the heavy and light sequences together takes approximately 15 minutes on a GPU.
Alternatively, Evolutionary Scale's ESM package and esm3-medium-multimer-2024-09 model allow for API-based multimer structure prediction without any reference databases. This package returns the predicted structure in a few seconds.
Both AlphaFold3 and ESM3 consistently produce reliable antibody structures and thus we provide support for both models.
Complementarity-determining region (CDR) loops were detected using ANARCI and residues belonging to the CDR loop structures were selected as “active residues” in the subsequent docking process.
3.4. HA1 antigen preparation
A reference target antigen structure—HA1 subunit of H5N1 hemagglutinin—was obtained from the Protein Data Bank (PDB ID: 2VIR). This structure was used to test and validate the Frankies pipeline.
Then, the HA1 structure of a more recent isolate, EPI3009174, was folded and used for subsequent novel antigen docking. Isolate EPI3009174 was collected from a 9-year-old Cambodian male patient who passed away in 2024. This isolate was previously analyzed in Ford et al. 2025 against 11 reference antibodies.
For docking, the residue numbers in the antigen structure were incremented +1000, to avoid overlapping numbers with the antibody structure, and all residues were assigned to chain B.
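A minimal sketch of this renumbering step is shown below, assuming the antigen is stored in standard fixed-column PDB format (chain ID in column 22, residue number in columns 23-26). The function name and the sample record are hypothetical, and the pipeline's own implementation may differ.

```python
def shift_residue_numbers(pdb_text: str, offset: int = 1000, chain_id: str = "B") -> str:
    """Add `offset` to every residue number and assign all atoms to `chain_id`."""
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            new_number = int(line[22:26]) + offset
            if new_number > 9999:
                raise ValueError("residue number exceeds the 4-column PDB field")
            # Columns 1-21 are kept, then chain ID, then the right-aligned number.
            line = f"{line[:21]}{chain_id}{new_number:>4d}{line[26:]}"
        out.append(line)
    return "\n".join(out)


# Build one illustrative ATOM record with explicit field widths.
sample = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
    "ATOM", 1, " N", "", "ASP", "A", 11, "", 26.981, 53.977, 40.085
)
shifted = shift_residue_numbers(sample)
print(shifted)
```

Only the chain and residue-number fields change, so coordinates and atom names stay in their original columns.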
Surface residues are automatically detected on the antigen structure and a 25% random subset of the surface residues was selected as “active residues”.
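The random selection of active residues can be sketched as below. Surface detection itself is not shown; the input list, the function name, and the `seed` parameter are assumptions for illustration. A fixed seed is used here only to make the example repeatable.

```python
import random
from typing import Optional, Sequence


def sample_active_residues(
    surface_residues: Sequence[int],
    fraction: float = 0.25,
    seed: Optional[int] = None,
) -> list[int]:
    """Return a sorted random subset covering `fraction` of the surface residues."""
    if not surface_residues:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(surface_residues) * fraction))
    return sorted(rng.sample(list(surface_residues), k))


# Hypothetical surface residue numbers (already offset by +1000).
surface = list(range(1001, 1041))
active = sample_active_residues(surface, fraction=0.25, seed=0)
print(len(active), "of", len(surface), "surface residues marked active")
```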
3.5. Antibody-antigen docking
Fv structures were docked to the HA1 antigen using HADDOCK3 [8], [28], [29] following the methodologies shown in our previous studies [30], [31], [32]. Docking configuration files were generated using an antibody-antigen docking template from the Bonvin Lab [33]. These configuration files, along with the other required docking files, were generated in a docking preparatory step in the Frankies pipeline.
Docking was performed in a multi-stage protocol including rigid-body energy minimization, semi-flexible refinement, and explicit (water-based) solvent modeling. Docked complexes were scored using HADDOCK's built-in scoring function and further evaluated using the other biochemical/biophysical binding metrics listed below. The template for the HADDOCK protocol is available in the GitHub repository.
- Van der Waals intermolecular energy (vdw) in kcal/mol
- Electrostatic intermolecular energy (elec) in kcal/mol
- Desolvation energy (desolv) in kcal/mol
- Restraints violation energy (air) in arbitrary units
- Buried surface area (bsa) in Å²
- Total energy (total) in kcal/mol
- HADDOCK score
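The HADDOCK score is a weighted sum of the energy terms listed above. The sketch below uses the weights commonly documented as defaults for HADDOCK's final refinement stage (1.0 vdw, 0.2 elec, 1.0 desolv, 0.1 air). These weights are an assumption here, not taken from the Frankies configuration, and the input energies are made-up values for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreWeights:
    # Assumed default weights for the final refinement stage.
    vdw: float = 1.0
    elec: float = 0.2
    desolv: float = 1.0
    air: float = 0.1


def haddock_score(
    vdw: float,
    elec: float,
    desolv: float,
    air: float,
    weights: ScoreWeights = ScoreWeights(),
) -> float:
    """Weighted sum of the docking energy terms (lower is better)."""
    return (
        weights.vdw * vdw
        + weights.elec * elec
        + weights.desolv * desolv
        + weights.air * air
    )


# Illustrative energies only; these are not results from the study.
score = haddock_score(vdw=-50.0, elec=-200.0, desolv=5.0, air=100.0)
print(score)
```

Keeping the weights in a small dataclass makes it easy to substitute stage-specific weights if the docking template overrides the defaults.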
3.6. Candidate ranking, reporting, and visualization
The “best” complex is selected from the output of complexes based on the docking conformation with the best (lowest) HADDOCK score of the best cluster of complexes.
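One way to implement this selection is sketched below. It assumes that the best cluster is the one with the lowest mean HADDOCK score over its top-scoring members (four is used here, following common HADDOCK practice); the function name, the tuple layout, and the example scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def select_best_complex(models, top_n: int = 4):
    """Pick the lowest-scoring model from the best-ranked cluster.

    `models` is an iterable of (model_id, cluster_id, haddock_score) tuples.
    Clusters are ranked by the mean score of their `top_n` best members.
    """
    by_cluster = defaultdict(list)
    for model_id, cluster_id, score in models:
        by_cluster[cluster_id].append((score, model_id))
    if not by_cluster:
        raise ValueError("no models provided")

    def cluster_rank(cluster_id):
        best_members = sorted(by_cluster[cluster_id])[:top_n]
        return mean(score for score, _ in best_members)

    best_cluster = min(by_cluster, key=cluster_rank)
    best_score, best_model = min(by_cluster[best_cluster])
    return best_cluster, best_model, best_score


# Made-up scores: cluster 2 holds the single lowest score but ranks worse overall.
models = [
    ("m1", 1, -80.0),
    ("m2", 1, -70.0),
    ("m3", 2, -95.0),
    ("m4", 2, -20.0),
    ("m5", 2, -10.0),
]
result = select_best_complex(models)
print(result)
```

In this example the chosen model is not the globally lowest-scoring one, which shows how selecting within the best cluster differs from taking the overall minimum.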
This best cluster and the best scoring complexes are reported in a rendered Quarto dashboard as the final step of the Frankies pipeline [34]. This report shows the distribution of the various binding metrics of all complexes in the best cluster. This also renders the 3D structure of the best model as an interactive object using Py3Dmol, a Python wrapper for 3Dmol.js [35]. A screenshot of the dashboard is shown in Fig. 1.
4. Discussion
As H5N1 continues to evolve, it is imperative that therapeutic advances continue to better target modern influenza clades of interest (e.g., 2.3.4.4b [36]) and therefore reduce the risk of mortality and morbidity in humans. As of late 2024, there are over 50 licensed H5 vaccine candidates [37], though many of these are based on significantly older strains. For example, in the United States, the 3 licensed vaccines are from 2007, 2013 and 2020 [38].
Some vaccine candidates are currently being developed using mRNA technologies on newer strains in the U.S. by Moderna (mRNA-1018) [39] and Arcturus Therapeutics (ARCT-2304) [40]. These were shown to be effective in animal trials and are currently in Phase I/II clinical trials. Also, therapeutic antibodies are being developed that target H5 HA1 from clade 2.3.4.4b. Multiple promising candidates were generated through hybridoma technologies, as reported in a recent preprint by Alzua et al. 2025.
AI-based antibody discovery is already well underway, with a few candidates having moved into clinical testing, including an anti-TL1A antibody by Absci (ABS-101) [41], [42], [43], [44]. Such previous studies generated antibody sequences or structures using large language or diffusion models such as RFdiffusion [45], AntiBARTy [46], and Abdiffuser [47] and then optimized the candidates with a lab-in-the-loop iterative process.
The Frankies pipeline presented in this study offers a streamlined and modular approach for the design of de novo antibody candidates against rapidly evolving viral targets such as H5N1 influenza hemagglutinin. By integrating generative protein diffusion, structure prediction, and flexible molecular docking, Frankies enables end-to-end discovery of Fv candidates with structural and biophysical characteristics that support high-affinity binding to the desired antigen.
A key innovation in Frankies lies in its conditional generative architecture. By seeding EvoDiff with therapeutically validated antibody sequences from Thera-SAbDab, we impose domain-specific constraints that preserve critical structural motifs while enabling conditional exploration of novel sequence space. This balances diversity with developability, reducing the likelihood of generating non-functional or unstable designs. Furthermore, our use of ANARCI and other quality control steps helps to ensure that only the most promising candidates advance through the pipeline.
The application of Frankies to the H5N1 HA1 domain highlights the feasibility of using AI-driven approaches for pandemic preparedness. HA1 remains a challenging target due to its rapid antigenic drift and glycan shielding. Nonetheless, several Frankies-designed Fv candidates exhibited strong binding affinity scores and favorable interface properties, suggesting potential for neutralization. Importantly, these designs were generated in silico within minutes, underscoring the value of generative pipelines for rapid therapeutic prototyping.
Despite its strengths, Frankies has some limitations. The current conditional sequence generation can be improved in the future by implementing structure-aware diffusion (i.e., diffusing the CDR loops while a reference antibody is bound at the epitope site on the antigen). Greater flexibility in CDR loop lengths is also needed.
While our predictions showcase the consistency of the Fv sequence generation, experimental validation will be crucial to confirm the binding specificity, affinity, and neutralization potency of the proposed candidates. In addition, the generated Fv sequences will need to be expanded to include the other parts of the antibody (the rest of the Fab region, the hinge, and the Fc region).
Today, the pipeline utilizes Docker to run the diffusion, folding, and docking steps, enabling users to run these processes without complex dependency installation in their local environment. While this is useful for cloud-based scalability, we will also implement the ability to choose the containerization engine. For example, Apptainer/Singularity are more common in on-premises high-performance computing (HPC) environments.
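One way such an engine choice could be exposed is sketched below. This is a hedged illustration under stated assumptions, not the pipeline's code: the function names, the `/data` mount point, and the image name `example/frankies-step:latest` are hypothetical. It relies only on the standard `docker run --rm -v` form and the `apptainer exec --bind` form with a `docker://` image URI (the `singularity` command accepts the same form).

```python
import shutil
from typing import List, Optional, Sequence

# Engines in a hypothetical order of preference.
ENGINES = ("docker", "apptainer", "singularity")


def detect_engine(preferred: Sequence[str] = ENGINES) -> Optional[str]:
    """Return the first engine whose executable is found on PATH, else None."""
    for engine in preferred:
        if shutil.which(engine):
            return engine
    return None


def build_command(
    engine: str,
    image: str,
    host_dir: str,
    step_cmd: Sequence[str],
    mount_point: str = "/data",
) -> List[str]:
    """Build the argument list that runs one pipeline step inside a container."""
    bind = f"{host_dir}:{mount_point}"
    if engine == "docker":
        return ["docker", "run", "--rm", "-v", bind, image, *step_cmd]
    if engine in ("apptainer", "singularity"):
        # Both can run an image pulled from a Docker registry via docker://.
        return [engine, "exec", "--bind", bind, f"docker://{image}", *step_cmd]
    raise ValueError(f"unsupported container engine: {engine}")


if __name__ == "__main__":
    engine = detect_engine() or "docker"
    cmd = build_command(
        engine,
        image="example/frankies-step:latest",  # hypothetical image name
        host_dir="/tmp/frankies_run",
        step_cmd=["python", "run_step.py"],  # hypothetical step entry point
    )
    print(" ".join(cmd))
```

Returning the command as a list rather than a shell string lets it be passed directly to `subprocess.run` without shell quoting, which keeps the same call site usable on a cloud virtual machine and on an HPC login node.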
The current docking approach does not account for glycosylation, which plays a significant role in HA1 surface shielding and antibody accessibility. While HADDOCK3 provides valuable binding predictions, it lacks the full thermodynamic and kinetic accuracy of more computationally intensive free energy calculations. Thus, future improvements to the pipeline will include the integration of a molecular dynamics step, such as OpenMM [48], to model the trajectory of the antibody-antigen complex and to predict the stability of the interaction.
Future directions include fine-tuning EvoDiff on antibody–antigen co-evolution datasets and benchmarking against existing broadly neutralizing antibodies targeting H5N1. Additionally, the modular nature of Frankies allows for straightforward extension to other antigens, including other influenza subtypes, entirely different viral families, or even targets in other diseases (such as in oncology [30]).
In conclusion, Frankies demonstrates how recent advances in protein generative modeling and structure prediction can be combined into an automated and scalable pipeline for therapeutic antibody design. As AI tools continue to mature, we anticipate that pipelines like Frankies will become central components of the next-generation biosecurity and drug discovery infrastructure for pandemic preparedness and beyond.
5. Contributors
Authors NS and CTF developed the Frankies pipeline. Author CTF performed data curation of the anti-HA1 antibody Fv sequences and their multiple sequence alignment. Author NS performed the docking experiments and statistical analyses. CTF and NS generated all visualization figures. NS and CTF performed the formal analysis of the structure and docking predictions of the antibodies. All authors wrote the original draft of the manuscript. All authors read and approved the final version of the manuscript.
Funding
No external funding was used for this study.
CRediT authorship contribution statement
Nicholas Santolla: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Investigation, Formal analysis. Colby T. Ford: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization.
Declaration of Competing Interest
Author Colby T. Ford is the owner of Tuple, LLC, a biotechnology consulting firm. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgements
We gratefully acknowledge all GISAID data contributors (i.e., the authors and their originating laboratories) responsible for obtaining the specimens, and their submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based.
We acknowledge the following entities at the University of North Carolina at Charlotte: Academic Affairs, The Office of Research, The Center for Computational Intelligence to Predict Health and Environmental Risks (CIPHER), The Department of Bioinformatics and Genomics, The College of Computing and Informatics, and the University Research Computing group. We gratefully acknowledge the support of the Belk Family.
Footnotes
Supplementary material related to this article can be found online at https://doi.org/10.1016/j.csbj.2025.06.026.
Appendix A. Supplementary material
The following is the Supplementary material related to this article.
Sequence identity comparisons between the diffused and reference antibody sequences. A) Violin/box plots showing the distribution of the diffused sequences' identities by chain. B) Heatmap showing the pairwise identity comparisons.
Data availability
All code, data, results, and additional analyses are openly available on GitHub at: https://github.com/Santollan/Frankies. The repository includes the open-source logic for running the Frankies pipeline, all sequences and folded structures for the reference antibodies and H5N1 isolates used in this study, analysis scripts, docking metrics, and the experimental outputs for the 30 diffused Fv candidates.
References
- 1. Ison Michael G., Marrazzo Jeanne. The emerging threat of H5N1 to human health. N Engl J Med. 2025;392(9):916–918. doi: 10.1056/NEJMe2416323.
- 2. Abramson Josh, Adler Jonas, Dunger Jack, Evans Richard, Green Tim, Pritzel Alexander, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. Jun 2024;630(8016):493–500. doi: 10.1038/s41586-024-07487-w.
- 3. Hayes Thomas, Rao Roshan, Akin Halil, Sofroniew Nicholas J., Oktay Deniz, Lin Zeming, et al. Simulating 500 million years of evolution with a language model. Science. 2025;387(6736):850–858. doi: 10.1126/science.ads0018.
- 4. Kumar Avishek, Singh Nitin Kumar, Ghosh Deepshikha, Radhakrishna Mithun. Understanding the role of hydrophobic patches in protein disaggregation. Phys Chem Chem Phys. 2021;23:12620–12629. doi: 10.1039/D1CP00954K.
- 5. Chiu Mark L., Goulet Dennis R., Teplyakov Alexey, Gilliland Gary L. Antibody structure and function: the basis for engineering therapeutics. Antibodies. 2019;8(4) doi: 10.3390/antib8040055.
- 6. Chothia Cyrus, Lesk Arthur M., Tramontano Anna, Levitt Michael, Smith-Gill Sandra J., Air Gillian, et al. Conformations of immunoglobulin hypervariable regions. Nature. Dec 1989;342(6252):877–883. doi: 10.1038/342877a0.
- 7. Crooks Gavin E., Hon Gary, Chandonia John-Marc, Brenner Steven E. WebLogo: a sequence logo generator. Genome Res. June 2004;14(6):1188–1190. doi: 10.1101/gr.849004.
- 8. Ford Colby T., Yasa Shirish, Obeid Khaled, Jaimes Rafael, III., Tomezsko Phillip J., Guirales-Medrano Sayal, et al. Large-scale computational modelling of H5 influenza variants against HA1-neutralising antibodies. eBioMedicine. Apr 2025;114 doi: 10.1016/j.ebiom.2025.105632.
- 9. Köster Johannes, Rahmann Sven. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics. 2018;34(20) doi: 10.1093/bioinformatics/bty350.
- 10. Zuo Teng, Sun Jianfeng, Wang Guiqin, Jiang Liwei, Zuo Yanan, Li Danyang, et al. Comprehensive analysis of antibody recognition in convalescent humans from highly pathogenic avian influenza H5N1 infection. Nat Commun. Dec 2015;6(1):8855. doi: 10.1038/ncomms9855.
- 11. Li Tingting, Chen Junyu, Zheng Qingbing, Xue Wenhui, Zhang Limin, Rong Rui, et al. Identification of a cross-neutralizing antibody that targets the receptor binding site of H1N1 and H5N1 influenza viruses. Nat Commun. Sep 2022;13(1):5182. doi: 10.1038/s41467-022-32926-5.
- 12. Lin Qingshan, Li Tingting, Chen Yixin, Lau Siu-Ying, Wei Minxi, Zhang Yuyun, et al. Structural basis for the broad, antibody-mediated neutralization of H5N1 influenza virus. J Virol. 2018;92(17) doi: 10.1128/jvi.00547-18.
- 13. Wang Pengfei, Zuo Yanan, Sun Jianfeng, Zuo Teng, Zhang Senyan, Guo Shichun, et al. Structural and functional definition of a vulnerable site on the hemagglutinin of highly pathogenic avian influenza A virus H5N1. J Biol Chem. Mar 2019;294(12):4290–4303. doi: 10.1074/jbc.RA118.007008.
- 14. Xiong Xiaoli, Corti Davide, Liu Junfeng, Pinna Debora, Foglierini Mathilde, Calder Lesley J., et al. Structures of complexes formed by H5 influenza hemagglutinin with a potent broadly neutralizing human monoclonal antibody. Proc Natl Acad Sci. 2015;112(30):9430–9435. doi: 10.1073/pnas.1510816112.
- 15. Zuo Yanan, Wang Pengfei, Sun Jianfeng, Guo Shichun, Wang Guiqin, Zuo Teng, et al. Complementary recognition of the receptor-binding site of highly pathogenic H5N1 influenza viruses by two human neutralizing antibodies. J Biol Chem. Oct 2018;293(42):16503–16517. doi: 10.1074/jbc.RA118.004604.
- 16. Zhu Xueyong, Guo Yong-Hui, Jiang Tao, Wang Ya-Di, Chan Kwok-Hung, Li Xiao-Feng, et al. A unique and conserved neutralization epitope in H5N1 influenza viruses identified by an antibody against the A/goose/Guangdong/1/96 hemagglutinin. J Virol. 2013;87(23):12619–12635. doi: 10.1128/jvi.01577-13.
- 17. Winarski Katie L., Thornburg Natalie J., Yu Yingchun, Sapparapu Gopal, Crowe James E., Spiller Benjamin W. Vaccine-elicited antibody that neutralizes H5N1 influenza and variants binds the receptor site and polymorphic sites. Proc Natl Acad Sci. 2015;112(30):9346–9351. doi: 10.1073/pnas.1502762112.
- 18. Yi Kye Sook, Choi Jung-ah, Kim Pankyeom, Ryu Dong-Kyun, Yang Eunji, Son Dain, et al. Broader neutralization of CT-P27 against influenza A subtypes by combining two human monoclonal antibodies. PLoS ONE. 2020;15(7):1–14. doi: 10.1371/journal.pone.0236172.
- 19. Lim Jeremy J., Deng Rong, Derby Michael A., Larouche Richard, Horn Priscilla, Anderson Malia, et al. Two phase 1, randomized, double-blind, placebo-controlled, single-ascending-dose studies to investigate the safety, tolerability, and pharmacokinetics of an anti-influenza A virus monoclonal antibody, MHAA4549A, in healthy volunteers. Antimicrob Agents Chemother. 2016;60(9):5437–5444. doi: 10.1128/aac.00607-16.
- 20. Wu Ying, Cho MyungSam, Shore David, Song Manki, Choi JungAh, Jiang Tao, et al. A potent broad-spectrum protective human monoclonal antibody crosslinking two haemagglutinin monomers of influenza A virus. Nat Commun. Jul 2015;6(1):7708. doi: 10.1038/ncomms8708.
- 21. Kallewaard Nicole L., Corti Davide, Collins Patrick J., Neu Ursula, McAuliffe Josephine M., Benjamin Ebony, et al. Structure and function analysis of an antibody recognizing all influenza A subtypes. Cell. Jul 2016;166(3):596–608. doi: 10.1016/j.cell.2016.05.073.
- 22. Edgar Robert C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. doi: 10.1093/nar/gkh340.
- 23. Alamdari Sarah, Thakkar Nitya, van den Berg Rianne, Tenenholtz Neil, Strome Robert, Moses Alan M., et al. Protein generation with evolutionary diffusion: sequence is all you need. bioRxiv. 2024 doi: 10.1101/2023.09.11.556673.
- 24. Dunbar James, Deane Charlotte M. ANARCI: antigen receptor numbering and receptor classification. Bioinformatics. 2015;32(2):298–300. doi: 10.1093/bioinformatics/btv552.
- 25. Delgado Javier, Radusky Leandro G., Cianferoni Damiano, Serrano Luis. FoldX 5.0: working with RNA, small molecules and a new graphical interface. Bioinformatics. 2019;35(20):4168–4169. doi: 10.1093/bioinformatics/btz184.
- 26. Chinery Lewis, Jeliazkov Jeliazko R., Deane Charlotte M. Humatch - fast, gene-specific joint humanisation of antibody heavy and light chains. bioRxiv. 2024 doi: 10.1101/2024.09.16.613210.
- 27. Pagès H., Aboyoun P., Gentleman R., DebRoy S. 2022. Biostrings: efficient manipulation of biological strings. R package version 2.66.0.
- 28. van Zundert G.C.P., Rodrigues J.P.G.L.M., Trellet M., Schmitz C., Kastritis P.L., Karaca E., et al. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428(4):720–725. doi: 10.1016/j.jmb.2015.09.014.
- 29. Teixeira João M.C., Vargas Honorato Rodrigo, Giulini Marco, Bonvin Alexandre, SarahAlidoost, Reys Victor, et al. haddocking/haddock3: v3.0.0-beta.5. January 2024. https://doi.org/10.5281/zenodo.10527751
- 30. Ford Colby T. PD-1 targeted antibody discovery using AI protein diffusion. Technol Cancer Res Treat. 2024;23 doi: 10.1177/15330338241275947. PMID: 39228166.
- 31. Tomezsko Phillip J., Ford Colby T., Meyer Avery E., Michaleas Adam M., Jaimes Rafael. Human cytokine and coronavirus nucleocapsid protein interactivity using large-scale virtual screens. Front Bioinform. 2024;4 doi: 10.3389/fbinf.2024.1397968.
- 32. Ford Colby T., Yasa Shirish, Jacob Machado Denis, White Richard Allen, Janies Daniel A. Predicting changes in neutralizing antibody activity for SARS-CoV-2 XBB.1.5 using in silico protein modeling. Front Virol. 2023;3 doi: 10.3389/fviro.2023.1172027.
- 33. Bonvin Lab. HADDOCK3 antibody-antigen tutorial. 2024. https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/
- 34. Allaire J.J., Teague Charles, Scheidegger Carlos, Xie Yihui, Dervieux Christophe, Woodhull Gordon. Quarto. November 2024. https://doi.org/10.5281/zenodo.5960048
- 35. Rego Nicholas, Koes David. 3Dmol.js: molecular visualization with WebGL. Bioinformatics. 2014;31(8):1322–1324. doi: 10.1093/bioinformatics/btu829.
- 36. Janies Daniel, Ocaña Kary, Guirales-Medrano Sayal, Obeid Khaled, Alexander Rachel, Ford Colby T. Analyses of phylogenetics, natural selection, and protein structure of clade 2.3.4.4b H5N1 influenza A reveal that recent viral lineages have evolved promiscuity in host range and improved replication in mammals in North America. bioRxiv. 2025 doi: 10.1101/2025.03.15.641219.
- 37. Taaffe Jessica, Zhong Shuyi, Goldin Shoshanna, Rawlings Kate S., Cowling Benjamin J., Zhang Wenqing. An overview of influenza H5 vaccines. Lancet Respir Med. Apr 2025;13(4):e20–e21. doi: 10.1016/S2213-2600(25)00052-9.
- 38. U.S. Food and Drug Administration. FDA briefing document: vaccines and related biological products advisory committee meeting October 10, 2024 – highly pathogenic avian influenza (H5) virus vaccines. 2024. Accessed: 2025-04-24.
- 39. Moderna Inc. 2025. Moderna announces updates on pandemic influenza program. Accessed: 2025-04-24.
- 40. Arcturus Therapeutics Holdings Inc. 2023. Arcturus Therapeutics receives U.S. FDA fast track designation for ARCT-810, mRNA therapeutic candidate for ornithine transcarbamylase deficiency. Accessed: 2025-04-24.
- 41. Garazi Peña Alzua, Nicolás León André, Yellin Temima, Bhavsar Disha, Loganathan Madhumathi, Bushfield Kaitlyn, et al. Human monoclonal antibodies that target clade 2.3.4.4b H5N1 hemagglutinin. bioRxiv. 2025 doi: 10.1101/2025.02.21.639446.
- 42. Absci Corporation. February 2024. Absci initiates IND-enabling studies for ABS-101, a potential best-in-class anti-TL1A antibody de novo designed and optimized using generative AI. Accessed: 2025-04-30.
- 43. Santuari Luca, Bachmann Salvy Marianne, Xenarios Ioannis, Arpat Bulak. AI-accelerated therapeutic antibody development: practical insights. Front Drug Discov. 2024;4 doi: 10.3389/fddsv.2024.1447867.
- 44. He Xin-heng, Li Jun-rui, Xu James, Shan Hong, Shen Shi-yi, Gao Si-han, et al. AI-driven antibody design with generative diffusion models: current insights and future directions. Acta Pharmacol Sin. Mar 2025;46(3):565–574. doi: 10.1038/s41401-024-01380-y.
- 45. Watson Joseph L., Juergens David, Bennett Nathaniel R., Trippe Brian L., Yim Jason, Eisenach Helen E., et al. De novo design of protein structure and function with RFdiffusion. Nature. Aug 2023;620(7976):1089–1100. doi: 10.1038/s41586-023-06415-8.
- 46. Venderley Jordan. 2023. AntiBARTy diffusion for property guided antibody design.
- 47. Martinkus Karolis, Ludwiczak Jan, Cho Kyunghyun, Liang Wei-Ching, Lafrance-Vanasse Julien, Hotzel Isidro, et al. 2024. AbDiffuser: full-atom generation of in vitro functioning antibodies.
- 48. Eastman Peter, Galvelis Raimondas, Peláez Raúl P., Abreu Charlles R.A., Farr Stephen E., Gallicchio Emilio, et al. OpenMM 8: molecular dynamics simulation with machine learning potentials. J Phys Chem B. Jan 2024;128(1):109–116. doi: 10.1021/acs.jpcb.3c06662.








