Abstract
Insecticidal bacterial proteins play key roles in insect-bacteria interactions and have been used as biopesticides. Here, we identify two insecticidal proteins in Paeniclostridium ghonii, designated PG-toxin 1 (PG1) and PG-toxin 2 (PG2), which are homologs of botulinum neurotoxins (BoNTs). Unlike BoNTs, PG1 and PG2 contain two separate proteins: One is the protease light chain (LC), and the other is the heavy chain containing the translocation domain and the receptor binding domain. Crystal and cryo–electron microscopy structures show a conserved BoNT-like architecture but without an interchain disulfide bond. Functional characterizations establish that the LCs of PG1 and PG2 cleave insect synaptosomal–associated protein 25 (SNAP25), but not human or rat SNAP25, and microinjection of PG1 and PG2 caused paralysis and death in Drosophila and Aedes mosquitoes. These findings identified unique two-component BoNT-like insecticidal proteins, revealing insights into the evolution of the BoNT family of toxins, and broadening our understanding of bacteria that can be used for biopest controls.
Two-component BoNT-like toxins from Paeniclostridium ghonii specifically cleave insect SNAP25 and kill flies and mosquitoes.
INTRODUCTION
With more than 2.5 million species, insects occupy all terrestrial ecological systems and have major impacts on human health through vector-borne diseases and as major competitors for food and fiber (1–3). They destroy 18 to 20% of annual crop production worldwide, contributing to famine and chronic hunger in resource-poor countries. Biopest control, using natural or engineered insect pathogens and virulence factors, offers many advantages over chemical pesticides (4). Well-studied entomopathogens such as Bacillus thuringiensis, Photorhabdus, and Xenorhabdus, have been widely used as biopesticides (5, 6). These pathogens rely on insecticidal proteins. For instance, the insecticidal activity of B. thuringiensis is due to a large family of pore-forming proteins known as crystalline proteins (Cry) that damage insect midguts (7). Cry proteins have also been used to generate insect-resistant transgenic plants. However, resistance to these Cry proteins is rising in pest populations, raising the need to find additional insecticidal proteins and biopest control approaches (5, 6).
Clostridial neurotoxins (CNTs), which include seven serotypes of botulinum neurotoxins (BoNT/A-G) and tetanus neurotoxin (TeNT), are the most potent toxins targeting vertebrate animals, and are responsible for causing botulism and tetanus in animals and humans, respectively (8–11). BoNTs have also become a billion-dollar biotherapeutic for treating many neurologic disorders as well as for reducing wrinkles. These toxins are initially expressed as a single 150-kDa protein, which then requires posttranslational activation by proteolytic cleavage to separate it into two chains: the N-terminal light chain (LC, ~50 kDa) and the C-terminal heavy chain (HC, ~100 kDa), which remain connected via a disulfide bond formed between the C terminus of the LC and the N terminus of the HC.
The LC is a zinc-dependent metalloprotease containing the conserved motif “HExxH” (12). It specifically cleaves host SNARE (soluble N-ethylmaleimide–sensitive factor attachment protein receptor) proteins in neurons, including vesicle-associated membrane protein 1/2/3 (VAMP1/2/3) by BoNT/B, D, F, G, and TeNT, synaptosomal-associated protein 25 (SNAP25) by BoNT/A, E, and C, and syntaxin 1 by BoNT/C (8, 9, 13). Cleaving any one of these three SNARE proteins can block neurotransmitter release, thus causing paralysis. The HC contains two functional domains: an N-terminal translocation domain (HN) responsible for transporting the LC across endosomal membranes into the neuronal cell cytosol, and a C-terminal receptor binding domain (HC) that targets neurons in humans and vertebrate animals.
BoNTs are produced mainly by four lineages of clostridia collectively termed Clostridium botulinum strains, as well as isolated strains of Clostridium novyi, Clostridium baratii, Clostridium sporogenes, and Clostridium butyricum (11). BoNTs cause food-borne botulism when they enter humans or vertebrate animals via the oral route due to ingestion of toxin-containing foods. BoNTs are always coexpressed within a gene cluster containing another ~150-kDa nontoxic nonhemagglutinin (NTNH) protein, which forms a complex with BoNTs and protects these toxins from proteases in the digestive tract (14). TeNT does not cause disease via the oral route and lacks the associated NTNH protein (15). In addition to the NTNH protein, the BoNT gene clusters can contain one of two sets of accessory proteins: One set expresses proteins known as hemagglutinin 17 (HA17), HA33, and HA70, which can form a large complex with BoNT-NTNH to facilitate the absorption of toxins across the gut epithelium (16, 17); the other set expresses proteins designated OrfX1, OrfX2, OrfX3, and P47, which also enhance the toxicity of BoNT-NTNH in the gut (18–22).
With the growing availability of genomes in public databases, a number of BoNT-like homologs have been identified bioinformatically in recent years, suggesting that BoNTs may have evolved as a sublineage of a broader BoNT-like toxin family (8, 23–26). The first BoNT homolog was reported in 2015 by Mansfield et al. (27), in the genome of the bacterium Weissella oryzae. This toxin, named “BoNT/Wo,” was reported to cleave VAMP (28), although this specificity remains to be validated. In 2017, Zhang et al. (29) identified BoNT/X in a C. botulinum strain, which cleaves multiple VAMPs including noncanonical VAMP family members such as VAMP4 and Ykt6 that cannot be cleaved by other BoNTs. A year later, another BoNT-like toxin, named BoNT/En, was found in Enterococcus faecium, which can cleave both SNAP25 and VAMP2 (30, 31). The hosts targeted by BoNT/Wo, X, and En remain to be established (32, 33).
The most recent addition to the BoNT-like toxin family tree has come from studies of the insecticidal bacterium, Paraclostridium bifermentans, which produces a BoNT-like protein, named PMP1, with insecticidal activity toward mosquitoes by cleaving mosquito syntaxin 1 (34). PMP1 clustered within the divergent lineage of BoNTs including BoNT/X and BoNT/En, despite having only 34 to 36% identity to these toxins. The identification of PMP1 as an insecticidal BoNT-like protein aligns with earlier reports suggesting numerous links between BoNTs and insecticidal pathogens, such as the presence of homologs of OrfX and HA genes in insecticidal gene clusters, the detection of partial homologs of BoNTs encoded within the genomes of entomopathogenic fungi, and the identification of BoNT-like gene fragments in insect gut metagenomes (24).
All known BoNTs and BoNT-like toxins are produced as a single protein, and they are always coexpressed with NTNH. Here, we identified two BoNT-like toxins in the bacterium Paeniclostridium ghonii. These two toxins, named PG1 and PG2, are uniquely assembled from two separate proteins, and there are no associated NTNHs. Structural studies demonstrated that these toxins are bona fide members of the BoNT-like toxin family. Functional studies revealed that they selectively cleave insect SNAP25 at unique sites but not human or rat SNAP25. Consistently, microinjection of these toxins induced paralysis and death in Drosophila and mosquito models. P. ghonii has not previously been linked to insects. Our discovery of PG1 and PG2 and their specificity toward insects presents an exciting case that previously uncharacterized entomopathogenic bacteria and insecticidal proteins exist in nature with the potential to be developed for biopest controls.
RESULTS
Identification of BoNT-like genes
We performed a Basic Local Alignment Search Tool (BLAST)–based search for CNT homologs in the National Center for Biotechnology Information (NCBI) database, restricting our search to organisms outside of the Clostridium genus. Using BoNT/A1 (accession # WP_021136346.1) as a query, we identified two separate proteins (next to each other) in the genome of P. ghonii (strain NCTR 3900): One matches the LC of BoNT/A1 (labeled as “M91 family zinc metallopeptidase,” WP_250673407.1) with 28.0% amino acid sequence identity; the other matches the HC of BoNT/A1 (labeled as “hypothetical protein,” WP_250673408.1) with 26.4% amino acid sequence identity (Fig. 1A and fig. S1A). We designated them “LC/PG1” and “HC/PG1,” respectively. We further searched the Joint Genome Institute (JGI) genome database and identified one more set in the genome of another P. ghonii strain (DSM15049), one with 37.7% identity to the LC/PG1 and the other with 61.1% identity to HC/PG1 (Fig. 1A and fig. S1A), designated “LC/PG2” and “HC/PG2,” respectively.
Fig. 1. Identification and bioinformatical analysis of PG1 and PG2 in P. ghonii.
(A) Genomic neighborhoods surrounding PG1 (pg1-LC and pg1-HC) and PG2 (pg2-LC and pg2-HC) genes. The gene clusters containing representative BoNT and BoNT-like toxins are shown for comparison, including PMP1 (pmp1), BoNT/En (bont), BoNT/X, BoNT/A4 (a subtype of BoNT/A with its gene cluster containing orfX), and TeNT (tetX). Other neighboring genes encode ABC transporter ATP-binding protein (abc), radical SAM domain containing protein (radsam), ABC transporter ATP-binding protein/permease (abc-permease), filamentation induced by cyclic adenosine 3′,5′-monophosphate protein (fic), partitioning protein A (parA), insecticidal Cry8Ea1 toxin (cry8Ea1), RNA polymerase sigma factor D (rpoD), N-acetylmuramoyl-l-alanine amidase (amiD), metallophosphatase family protein (mpp), BhlA/UviB family holin-like peptide (labeled “1”), and aureocin-like type II bacteriocins (labeled “2” and “3”). Some duplicated labels between PG1 and PG2 were omitted to show adjacent labels more clearly (mpp in PG1 and parA, rpoD, and amiD in PG2). (B) Schematic illustration of three-domain arrangement of PG1 and PG2 and their comparison to BoNT/A and PMP1. LC, light chain; HC, heavy chain; HN, translocation domain; HC, receptor binding domain. The conserved catalytic motif (HExxH) and the key motif in the translocation domain (PxxG) are marked. (C) Phylogenetic analysis of the LC (left) and HC (right) of BoNTs, BoNT-like toxins, PG1, and PG2. In both trees, PG-LC and PG-HC cluster with the BoNT/X/En/PMP1 lineage. Bootstrap values are indicated for all major clades. F7 and F5 are two subtypes of BoNT/F. HA/FA is the reported chimeric toxin BoNT/HA. (D) Phylogenetic clustering of P. ghonii genomes with the top 64 closest genomes in the NCBI database based on ANI. ANI values are shown on the right side of the tree and are averaged for the two P. ghonii strains. Strains encoding toxins are marked with a black circle.
Both LC/PG1 and LC/PG2 contain the conserved HExxH motif required for zinc-endopeptidase activity, and more specifically an HELVH motif identical to that in BoNT/X (Fig. 1B and fig. S1B). Within the translocation domain, a PxxG motif was previously identified as highly conserved in many single-chain bacterial toxins (35–37). This motif is conserved in HC/PG1 and HC/PG2 (Fig. 1B and fig. S1B). The putative receptor binding region (HC) of HC/PG1 and HC/PG2 lacks the canonical SxWY motif known to mediate ganglioside-binding in BoNTs (38). In the mosquitocidal PMP1, a second SxWY motif (“motif 2”) is present immediately upstream of the canonical SxWY motif (34). HC/PG1 and HC/PG2 have a similar motif (SDVY) in the same place as the PMP1 motif 2 (fig. S1B).
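The motif checks described above can be illustrated with a short pattern scan. The following is a minimal sketch, not the procedure used in this study: the function name `find_motifs` and the toy sequence are hypothetical, and the "x" positions of each motif are simply translated into regular-expression wildcards.

```python
import re

# "x" in the motif notation means any residue, written "." in a regex.
# A lookahead group is used so that overlapping matches are also reported.
MOTIFS = {
    "HExxH": re.compile(r"(?=(HE..H))"),
    "PxxG": re.compile(r"(?=(P..G))"),
    "SxWY": re.compile(r"(?=(S.WY))"),
}


def find_motifs(sequence):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]
    return hits


# Hypothetical toy sequence for illustration only; it is NOT PG1 or PG2.
toy_sequence = "MKTAHELVHAAGGPLKGTTSDVYQQ"
result = find_motifs(toy_sequence)
for motif_name, motif_hits in result.items():
    print(motif_name, motif_hits)
```

In this toy case the SDVY stretch is not reported as an SxWY hit because the third position is V rather than W, which mirrors the distinction drawn in the text between the canonical motif and the SDVY variant.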
PG1 and PG2 cluster with BoNT/X/En/PMP1
To explore their evolutionary relationships with BoNT proteins, we aligned and built maximum-likelihood phylogenies for LC/PG and HC/PG in comparison with BoNTs and BoNT-like toxins (Fig. 1C). As expected, PG1 and PG2 clustered as neighbors, indicating their common origin. Overall, PG1 and PG2 clustered with BoNT/X/En/PMP1 (Fig. 1C): The LC/PG clustered as a sister lineage to X/En/PMP1 with moderate (84%) clade support, and the HC/PG clustered specifically with PMP1 with strong (99%) clade support (Fig. 1C). Among BoNT/X/En/PMP1, PMP1 shares the highest identity to PG1 (29% for LC, 32.4% for HC) and PG2 (27% for LC, 34% for HC) (fig. S1A).
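For readers less familiar with how pairwise identity percentages such as those quoted above are derived, the sketch below computes identity from two pre-aligned sequences. It is an illustration under stated assumptions: the function name `percent_identity` and the toy alignment are hypothetical, and identity is taken over columns where neither sequence has a gap. Alignment tools differ in the denominator they use, so values from this sketch need not match those reported by BLAST.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from the PG1/PG2 data.
toy_a = "HELVH-AG"
toy_b = "HEIVHQAG"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```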
Gene neighborhood analysis
We next investigated the genomic neighborhoods surrounding PG1 and PG2 (Fig. 1A). Overall, the PG1 locus is similar to the PG2 locus (Fig. 1A). The LC/PG gene is immediately upstream of the HC/PG gene. The intervening region between the LC and HC is noncoding with numerous stop codons in all reading frames (fig. S1C), confirming that LC and HC are encoded by two separate genes. This is a major difference from known BoNTs, TeNT, BoNT-like toxins, and BoNT homologs, which all have their LC and HC encoded within a single gene (Fig. 1, A and B).
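The reading-frame argument above can be sketched in a few lines: if every forward frame of the intergenic region contains at least one stop codon, no continuous open reading frame can span from the LC gene into the HC gene. This is a minimal illustration with hypothetical function names and toy sequences; it considers only the three forward frames and the standard stop codons, which is an assumption of the sketch rather than a description of the analysis in fig. S1C.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stops_per_frame(dna):
    """Count stop codons in each of the three forward reading frames (1, 2, 3)."""
    dna = dna.upper()
    counts = {}
    for frame in range(3):
        # Step through complete codons only; a trailing partial codon is ignored.
        codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
        counts[frame + 1] = sum(1 for codon in codons if codon in STOP_CODONS)
    return counts


def is_closed_in_all_frames(dna):
    """True if every forward frame contains at least one stop codon."""
    return all(count > 0 for count in stops_per_frame(dna).values())


# Hypothetical toy sequences, not the actual PG1 or PG2 intergenic regions.
closed_example = "TAAATAGATGAA"
open_example = "ATGGCCGCAGGT"
print(stops_per_frame(closed_example), is_closed_in_all_frames(closed_example))
print(stops_per_frame(open_example), is_closed_in_all_frames(open_example))
```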
Another major difference from BoNT and BoNT-like toxin clusters is the absence of ntnh, orfX, and ha homologs among the genes surrounding PG1 and PG2 (Fig. 1A). Instead, there is a gene encoding a predicted Cry delta endotoxin near PG1 and PG2 (accession # WP_250673405.1; Fig. 1A). The Cry toxin prediction was confirmed by the Conserved Domain Database, with matches to all three domains (N, M, and C-terminal region). BLAST analysis of this protein showed that it shares ~40% amino acid identity with a Cry8Ea1 family delta endotoxin (accession # WP_172452114.1) from B. thuringiensis; Cry8Ea1 family endotoxins are well-established insecticidal proteins. This Cry gene is encoded in the opposite direction to the LC/PG genes, suggesting independent regulation of the Cry gene and the PG genes.
The PG locus encodes several genes shared with those found in other CNT gene clusters and homologs. These include predicted amiD (N-acetylmuramoyl-l-alanine amidase gene) and parA genes, which are also present in the BoNT/En locus of E. faecium, and abc [ABC transporter adenosine 5′-triphosphate (ATP)–binding protein] and fic (filamentation induced by cyclic adenosine 3′,5′-monophosphate protein) genes that are found in the tent locus of Clostridium tetani. Three nearby genes are also shared with the PMP1 locus of P. bifermentans: an mpp gene encoding a putative metallophosphatase family protein, an abc-permease (ABC transporter ATP-binding protein/permease), and a radsam gene encoding a radical SAM domain–containing protein. Downstream of the mpp gene in both P. ghonii genomes is a cluster of genes encoding a BhlA/UviB family holin-like peptide (labeled “1”) and two genes encoding aureocin-like type II bacteriocins (labeled “2” and “3”) (Fig. 1A). These data are consistent with PG1, PG2, PMP1, and BoNT/En forming a distinct lineage of virulence genes within a larger family of BoNT-like proteins.
pg1 and pg2 are plasmid-borne genes
In the initially identified assemblies, the scaffold from DSM15049 containing the PG2 genes was more complete [64,948 base pairs (bp)] than the toxin-encoding contig from NCTR 3900 (29,173 bp), in which we identified genes encoding plasmid replication initiation proteins. BLAST analysis of these proteins confirmed a close relationship to orthologous genes in a PMP1-encoding megaplasmid in P. bifermentans (34). The contigs carrying the PG cluster in DSM15049 and NCTR 3900 were both predicted as plasmids with 73.28 and 74.66% of the vote, respectively, in excess of the 40 to 60% threshold where incorrect classifications typically fall (rfPlasmid) (39). Similarly, another near-Clostridium species with a known BoNT-like sequence on a reported circularized plasmid, P. bifermentans parabai, was predicted as a plasmid with 79.16% of the vote. It was thus likely that the pg1 and pg2 genes were both encoded in plasmids.
To verify the plasmid localization of PG2, we obtained P. ghonii Prévot 1938 [DSM15049/American Type Culture Collection (ATCC) 25757/VPI 4897] from DSM, then cultured and sequenced the strain and produced a closed genome for it. The resulting assembly consisted of a ~3.3–mega–base pair chromosome and 135, 126, 60, and 14–kilo base pair (kbp) plasmids. pg2 was confirmed to be localized to the largest of these plasmids. In addition, an updated assembly for strain NCTR 3900 became available as of August 2024, incorporating long-read sequencing, which indicated that PG1 (LZ906_015900) localizes to a 137-kbp plasmid, pPGH-1 (CP161001.1). These plasmids are highly syntenous with 86% shared coverage and 96% nucleotide identity. These results indicate that both strains have related mega-plasmids that carry divergent PG toxins, potentially paralleling observations in group I C. botulinum where common plasmid families have highly divergent BoNT serotypes.
Phylogenetic analysis of P. ghonii
P. ghonii was first identified in the 1930s and was originally named Clostridium ghonii. It was later reclassified as a member of the genus Paeniclostridium (40). To examine the phylogenomic similarities of P. ghonii to other available sequenced organisms, we retrieved genomes of all available Paeniclostridium species, as well as [Eubacterium] tenue and members of the genus Paraclostridium, given their previously identified relationships to P. ghonii based on 16S analysis (40). On the basis of genome-wide average nucleotide identity (ANI) clustering, the two strains of P. ghonii cluster as a sister lineage to a clade of Paraclostridium that includes P. bifermentans and P. benzoelyticum (Fig. 1D). P. ghonii displays ANI values of ~87.4 to 87.6% to these Paraclostridium species and lower ANI values to [Eubacterium] tenue (83.4%) and Paeniclostridium sordellii (82.7%). The relatively close relationship between P. ghonii and P. bifermentans is noteworthy as PMP1 was identified in a P. bifermentans strain (34).
LC/PG1 and LC/PG2 can cleave insect SNAP25
The LCs of BoNTs cleave vertebrate neuronal SNARE proteins. To evaluate whether LC/PG1 and LC/PG2 also cleave SNARE proteins, genes encoding LC/PG1 or LC/PG2 were synthesized and expressed in human embryonic kidney (HEK) 293 cells together with cotransfection of a collection of plasmids encoding representative SNARE proteins, including rat (Rattus norvegicus) versions of VAMP1, 2, 3, 4, 5, 7, 8, Ykt6, SNAP25, syntaxin 1A, syntaxin 1B, and syntaxin 2, 3, 4; fly (Drosophila melanogaster) versions of SNAP25, SNAP24, SNAP29, nSyb, and syntaxin 1A; and yeast (Saccharomyces cerevisiae) VAMP2 homolog SNC-2P (figs. S2 and S3) (13). It has been well established that mutating two conserved residues (e.g., R363 and Y366 in BoNT/A) can abolish BoNT LC activity (41). Therefore, we created the equivalent catalytically inactive mutant forms of LC/PG1 (R333 and Y336, ciLC/PG1) and LC/PG2 (R332 and Y335, ciLC/PG2), which served as negative controls (figs. S2 and S3).
Among the SNARE proteins screened in this assay, only fly (D. melanogaster) SNAP25 appeared to be cleaved by PG1: It showed a band of smaller molecular weight when coexpressed with LC/PG1 in comparison with SNAP25 alone or coexpressed with ciLC/PG1 (fig. S2). Coexpression of LC/PG2 also cleaved Drosophila SNAP25 (fig. S3), and partial cleavage of rat SNAP25 and fly SNAP24 was observed (fig. S3). Other SNARE proteins were not affected by coexpression of LC/PG1 or LC/PG2.
To further evaluate the specificity of LC/PG1 and LC/PG2, we expanded the assay to include SNAP25 from major insect species such as mosquito (Anopheles stephensi), moth (Plutella xylostella), and beetle (Dendroctonus ponderosae), as well as other arthropods such as tick (Ixodes scapularis) and shrimp (Penaeus vannamei) (Fig. 2A). We also analyzed LCs of BoNT/A (LC/A), BoNT/E (LC/E), and BoNT/En (LC/En) in parallel (Fig. 2A). Coexpression of LC/PG1 generated a smaller fragment for fly, mosquito, moth, and beetle SNAP25, but did not affect shrimp, tick, or rat SNAP25, nor did it affect fly SNAP24 (Fig. 2A), suggesting that LC/PG1 is highly specific toward insect SNAP25. Coexpression of LC/PG2 resulted in complete cleavage of fly, mosquito, and moth SNAP25 (Fig. 2A), as well as partial cleavage of rat, beetle, and shrimp SNAP25, and fly SNAP24 (Fig. 2A), suggesting that LC/PG2 prefers insect SNAP25, but it is less stringent than LC/PG1. Coexpression in HEK293 cells usually produces relatively high levels of LC with long incubation time; therefore, any preferred substrate should be completely cleaved under our assay conditions. Thus, rat, beetle, and shrimp SNAP25 and fly SNAP24 are not likely preferred substrates for LC/PG2.
Fig. 2. LC/PG1 and LC/PG2 cleave insect SNAP25, but not human SNAP25.
(A) HA-tagged SNAP25 from the indicated species were coexpressed with the LCs of PG1, PG2, BoNT/A, E, or En in 293T cells via transient transfection. Cell lysates were analyzed by immunoblot detecting SNAP25 using an anti-HA antibody. Tubulin was detected as a loading control. (B) In vitro cleavage assays were carried out with the ratio of 5:1 (recombinantly purified SNAP25:LC). LC/PG1 and PG2 cleaved fly SNAP25 but did not cleave human SNAP25. (C and D) Fly SNAP25 was incubated with LC/PG1 or LC/PG2, respectively (PG1-pos and PG2-pos). As negative controls, equivalent samples were pretreated with trifluoroacetic acid and heat-inactivated before incubation (PG1-neg and PG2-neg). Peptides in the supernatants were extracted from each sample and subsequently analyzed with LC-MS/MS to determine molecular weight. Left panels show eluted peptide peaks from the high-performance column over retention time (RT, x axis). The mass/charge ratios (m/z) for the cleaved peptides are listed from z = 1 to 5. Right panels show MS/MS spectra of the deduced cleavage products from the left panels. The sequences are listed above the right panel, labeled with b and y fragmented ions for each positive sample of LC/PG1 (C) and LC/PG2 (D), respectively. (E) Sequence alignment of the indicated SNAP25 and fly SNAP24, with the cleavage sites of BoNT/A, E, PG1 and PG2 marked in red. (F) LC/PG1 did not cleave SNAP25 (E197K), while LC/PG2 did not cleave SNAP25 (R191E). R191E mutant also showed partial resistance to LC/PG1. (G) Crystal structure of LC/PG1 was resolved to 1.80-Å resolution (left). Zinc atom (red) and coordinated residues are shown. The right panel shows an overlay of LC/PG1 with LC/A [Protein Data Bank (PDB) ID: 4EJ5] (43) (LC/PG1, cyan color; LC/A, magenta color).
As controls, LC/A and LC/E cleaved rat SNAP25 completely but showed no or minor degrees of cleavage for all other SNARE proteins (Fig. 2A), demonstrating that LC/A and LC/E prefer mammalian SNAP25 and do not recognize insect SNAP25. LC/En cleaves SNAP25 at the N-terminal region (30); thus, the cleavage band cannot be detected as the detection tag is fused to the N termini of these SNARE proteins (Fig. 2A). LC/En cleaved all tested SNAP25 as well as fly SNAP24, suggesting that LC/En is less specific and capable of cleaving a broad range of SNAP25 (Fig. 2A).
To further characterize the protease activity of LC/PG1 and LC/PG2, we next expressed and purified LC/PG1, LC/PG2, human SNAP25, and fly SNAP25 proteins recombinantly from Escherichia coli and carried out in vitro incubation assays. SDS–polyacrylamide gel electrophoresis (SDS-PAGE) gel analysis showed that LC/PG1 and LC/PG2 both cleaved fly SNAP25, resulting in smaller fragments than the original full-length SNAP25 (Fig. 2B). In contrast, human SNAP25 was not cleaved by either LC/PG1 or LC/PG2 (Fig. 2B), further demonstrating that both LCs are specific toward insect SNAP25.
LC/PG1 and LC/PG2 cleave distinct sites at SNAP25 C-terminal region
To determine the cleavage sites, we next carried out liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis of samples containing fly SNAP25 incubated with LC/PG1 (Fig. 2C). A single dominant peak was detected, with a monoisotopic molecular weight of 1817.0384 (Z = 1), which corresponded to the peptide sequence E197-K212 of fly SNAP25 (Fig. 2C). Similarly, incubation of fly SNAP25 with LC/PG2 also generated a single dominant peak, with a monoisotopic molecular weight of 2332.2724 (Z = 1), corresponding to K192-K212 of fly SNAP25 (Fig. 2D).
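The relationship between the singly charged value and the higher charge states listed in Fig. 2, C and D, can be sketched as follows. This is an illustration only, and it rests on an explicit assumption: the values quoted with Z = 1 are treated as the m/z of the singly protonated ion, [M+H]+. The constant and function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; approximate mass of a proton


def neutral_mass_from_singly_charged(mz_z1):
    """Neutral monoisotopic mass, ASSUMING the input is the [M+H]+ m/z value."""
    return mz_z1 - PROTON_MASS


def mz_for_charge(neutral_mass, charge):
    """m/z of the ion carrying `charge` protons: (M + z * m_proton) / z."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_series(mz_z1, max_charge=5):
    """Expected m/z values for charge states 1 through max_charge."""
    neutral = neutral_mass_from_singly_charged(mz_z1)
    return {z: mz_for_charge(neutral, z) for z in range(1, max_charge + 1)}


# Values quoted in the text for the LC/PG1 and LC/PG2 cleavage products.
series_pg1 = charge_series(1817.0384)
series_pg2 = charge_series(2332.2724)
for z, mz in series_pg1.items():
    print(f"PG1 product, z = {z}: m/z = {mz:.4f}")
```

If the reported values were instead neutral masses, the first step would be skipped and every entry in the series would shift accordingly.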
To further confirm whether these peptides reflect the cleavage sites, we next expressed and purified mosquito SNAP25 recombinantly in E. coli and incubated it with either LC/PG1 or LC/PG2. Both cleaved mosquito SNAP25 in vitro (fig. S4A). LC-MS/MS analysis detected one dominant peak for each sample, corresponding to A197-K212 and K192-K212 of mosquito SNAP25, respectively (fig. S4, B and C). Together, these data demonstrate that LC/PG1 and LC/PG2 each cleave SNAP25 once: between N196 and E197 for fly SNAP25 (A197 for mosquito SNAP25) by LC/PG1 and between R191 and K192 by LC/PG2 (Fig. 2E).
These two cleavage sites on SNAP25 are located between the known cleavage sites for LC/A and LC/E (Fig. 2E). At these two cleavage sites, N196 (for PG1) and K192 (for PG2) are conserved in all SNAP25 we tested, as well as SNAP24 (Fig. 2E). Residue variations occur at the position E197 (P1′) for PG1: with a T in shrimp, V in tick, K in humans and rat SNAP25, and an N in fly SNAP24 (Fig. 2E). Residue A at this position, found in mosquito SNAP25, apparently is tolerated by LC/PG1 (Fig. 2E). Variations for PG2 cleavage site occur at R191 (P1 position): with an L in beetle, A in shrimp, Q in tick, and E in human and rat SNAP25, as well as an A in fly SNAP24 (Fig. 2E).
To further validate the importance of these residue differences, we generated and purified recombinant mutant fly SNAP25 containing R191E or E197K. As expected, the E197K mutant was not cleaved by LC/PG1, and the R191E mutant was resistant to LC/PG2 (Fig. 2F). R191E also showed reduced cleavage by LC/PG1 in comparison with wild-type (WT) SNAP25, whereas E197K showed similar degrees of cleavage by LC/PG2 compared with WT SNAP25 (Fig. 2F and fig. S5). These findings further suggest that LC/PG1 has more stringent requirements on amino acid sequences near its cleavage site than LC/PG2.
Crystal structure of LC/PG1
We next sought to resolve the structure of PG LCs and were able to obtain good crystals for LC/PG1 purified from E. coli. The purified recombinant LC/PG1 consisted of residues 1 to 392 with an N-terminal poly-His tag. Crystals of LC/PG1 diffracted to a resolution of 1.80 Å (Fig. 2G and table S1), with a single molecule per asymmetric unit. No electron density was observed for the C-terminal residues 378 to 392 or for a disordered loop between residues 54 and 69. Both of these areas were in solvent-accessible positions of the structure (Fig. 2G). The active site is in a shallow pocket of LC/PG1 and contains the catalytic zinc ion, coordinated by residues of the HExxH motif (H202, E203, and H206) together with E241. The glutamate at position 203 normally interacts with Zn2+ via a water-mediated interaction in other CNT LCs; however, here, an unknown density was present at the water position and modeled as an acetate anion. Nonetheless, the active site is structurally very conserved and most likely supports the same enzymatic activity as other LCs where this water molecule normally provides the nucleophile base required for proteolysis. In addition, the conserved Y336 is positioned in the active site and expected to stabilize catalytic intermediates during the reaction (fig. S6) (42).
The overall fold of LC/PG1 in the crystal structure is similar to other BoNT LCs despite sharing less than 30% sequence identity. The root mean square deviation (RMSD) values are 3.5 and 4.1 Å against LC/A and LC of BoNT/X (LC/X) for 352 residues, respectively (Fig. 2G and fig. S6) (43, 44), with the main structural variations observed in the length and position of the flexible loop regions. The main secondary structure elements are conserved, demonstrating that LC/PG1 is a bona fide member of the BoNT-like superfamily. The high RMSD against LC/A and LC/X (3.5 and 4.1 Å) agrees with PG1 being an interesting outlier in the BoNT toxin family (typically RMSD between 2.3 and 2.9 Å) with unique features (44). The structure of LC/PG1 reveals a slightly more open catalytic pocket compared to other LCs (fig. S6, D to F). However, it remains challenging to draw definitive conclusions from this observation, as LC/En, for example, has a more restricted site access yet exhibits a broader substrate range. Overall, structural comparisons across the LCs show that while the active site itself is highly conserved, differences in the surrounding loops and exosites and their flexibility are likely responsible for the substrate selectivity of the LCs (fig. S6, G to I).
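As a reminder of what the RMSD values above measure, the sketch below computes the root mean square deviation between two sets of paired atomic coordinates. It is a simplified illustration: the function name `rmsd` and the toy coordinates are hypothetical, and the coordinates are assumed to be already superposed and paired one to one. Structural comparison software additionally performs the residue pairing and optimal superposition, which are not shown here.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired, pre-superposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total_sq = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total_sq += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    # Mean of the squared distances, then the square root.
    return math.sqrt(total_sq / len(coords_a))


# Hypothetical toy coordinates in angstroms, not taken from any deposited structure.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
toy_rmsd = rmsd(toy_a, toy_b)
print(f"RMSD = {toy_rmsd:.3f} A")
```

Because the deviation is averaged over paired atoms, a few divergent loops can raise the value substantially even when the core secondary structure elements overlay well.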
LC and HC form a complex
As LC and HC of PG1 and PG2 are encoded by two separate genes, we next investigated whether they can form a complex when coexpressed in E. coli using a vector with two separate promoters and ribosome binding sites (pRSFduet). The LC is expressed with an N-terminal StrepII tag, and the HC is fused with a C-terminal His6 tag (Fig. 3A). Tandem affinity purification was carried out first for the His6 tag and then for the StrepII tag. SDS-PAGE analysis revealed that both LC and HC of PG1 and PG2 were copurified with the His6 tag followed by the StrepII tag (Fig. 3A), indicating that coexpressed LC and HC formed a complex in E. coli. We note that there was an additional band above the LC/PG2 after the His6 tag purification, and it was not present after the second purification step with the StrepII tag (Fig. 3A), suggesting that this band is likely a degradation product of the HC. Compared to HC/PG1, it appeared that HC/PG2 is more susceptible to degradation in E. coli. The complex structure was resistant to a wide range of pH (5.6 to 10.0) without dissociation into separate LC and HC, indicating that tight intermolecular interactions might be involved (fig. S7).
Fig. 3. Cryo-EM structure of PG1 complex.
(A) Strep-II–tagged LC and His-tagged HC were coexpressed in E. coli using a pRSF-Duet vector. They were subjected to tandem purification, first using Ni–nitrilotriacetic acid column, followed with Strep tag column. Purified proteins were analyzed by SDS-PAGE and Coomassie Blue staining. LC and HC were copurified together for both PG1 and PG2. (B) Top: Representative cryo-EM 2D averages. Bottom: 3D reconstruction density map and atomic model of PG1 complex. Overlay of our built LC (cyan) and HC (blue) atom model into the cryo-EM density map shows local agreement of the refined model with the map. (C) The structure of PG1 complex was resolved to 3.3 Å by cryo-EM, and the front, back, and top views were presented. The three domains of PG1 complex are labeled in different colors: LC (cyan), HN (green), and HC (yellow). The N-terminal belt region of HN is marked in magenta. A schematic illustration of PG1 domains is shown above the structures, with the residue numbers labeled. (D) Left: An overlay of LC/PG1 with LC of BoNT/A (PDB ID: 3V0C) (14). Right: An overlay of HN/PG1 with the HN of BoNT/A. PG1 in cyan and BoNT/A in gray. (E) Schematic diagrams of PG1 and BoNT/A structures show different orientations of belt region with respect to the rest of the molecule in the front (left) and back view (right). In contrast to BoNT/A, the belt region of PG1 only holds and interacts with LC of PG1 from the posterior side. (F) An overlay of HC/PG1 with HC of BoNT/A. PG1 in cyan and BoNT/A in gray. The HC is composed of two subdomains, HCN and HCC.
Cryo-EM structure of PG1 complex
To understand how LC and HC form a complex, we carried out cryo–electron microscopy (cryo-EM) studies focusing on the PG1 complex purified from E. coli. Negative staining EM showed homogeneous particles with uniform size after two-dimensional (2D) classifications of the toxin complex particles, indicating relatively homogenous complex formation (fig. S8). We then conducted single-particle cryo-EM analysis with the collection of 11,692 micrographs of PG1 complex. Details of the cryo-EM data collection are provided in table S2 and a comprehensive processing pipeline was followed as shown in fig. S9. After blob-based particle picking and 2D classification, the 2D class averages showed well-defined secondary structural elements (Fig. 3B). Subsequently, the refined 2D classes from the selected particles were used to generate initial 3D models. After multiple rounds of 3D classification and 3D refinement, a total of 364,136 particles was used to reconstruct the final 3D density map of the PG1 complex resolved to an overall resolution of 3.3 Å per gold-standard Fourier shell correlation 0.143 criterion, which allowed us to build the PG1 complex atomic model with side chain–level accuracy (Fig. 3B, fig. S9, and table S2).
Overall, both LC and HC are clearly resolved, and their structures are homologous to known BoNTs (Fig. 3, B and C). The PG1 complex showed three typical BoNT domains including the LC, HN, and HC, with LC and HC flanking each side of HN (Fig. 3C). The LC is largely the same as the crystal structure of the LC alone (Fig. 2G). The HN of PG1 consists of a central pair of α helices, each ~105 Å in length, which are highly homologous to the same region in BoNTs (Fig. 3D) (14, 45). There is a ~68-residue N-terminal loop region in the HC that interacts with one side of the LC (Fig. 3, C and D). This loop is homologous to the previously defined belt region at the N terminus of BoNT-HC, which usually wraps around BoNT-LC, with the putative function of serving as a pseudo-substrate inhibitor and a chaperone facilitating translocation of the LC across endosomal membranes (46). The position of this belt region is a major difference between PG1 and BoNTs: It usually circles the LC in BoNTs, but it is located on one side of the LC in PG1 (Fig. 3E).
The overall structure of the HC of PG1 (HC/PG1) also closely resembles that of the BoNT/A and other BoNT HC domains (Fig. 3F), with RMSDs of 2.2, 1.1, and 1.1 Å against HC/A (390 residues), the HC of BoNT/X (382 residues), and the HC of PMP1 (373 residues), respectively (Fig. 3F and fig. S10) (32, 39, 45). HC/PG1 consists of two subdomains, HCN and HCC, linked by a short helix. The HCN of PG1 adopts a jelly roll barrel motif, while its HCC forms a β-trefoil fold, similar to the typical structure found in other BoNT family members. Since HC/PG1 lacks the typical SxWY ganglioside-binding motif (GBM) present in most BoNTs (fig. S1), it is unlikely to bind to carbohydrates in the same way. However, a shallow hydrophobic pocket consisting of residues Y1193, Y1195, and W1197 is exposed on the surface of HC/PG1, at a site adjacent to the GBM in BoNT/A (fig. S10), suggestive of a potential receptor-binding site. The HCs of PMP1 and BoNT/X also contain unusual hydrophobic patches of aromatic residues across their surface that were proposed to have a role in targeting insect cells through a mechanism of cell recognition (32). In addition, HC/PG1 displays a surface-exposed leucine-rich loop, which is reminiscent of a feature observed in several BoNTs and shown to mediate the toxins' interactions with the neuronal membrane (47). However, analysis of the surface of HC/PG1 shows a more varied overall electrostatic potential, which suggests a different cell-binding mechanism that remains to be determined.
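For readers unfamiliar with the RMSD values quoted above, the following is a minimal sketch of the calculation, assuming the two structures have already been superposed and that equivalent atoms (for example, Cα atoms of aligned residues) are supplied in the same order. The superposition step itself is omitted, and the function name is hypothetical; this is not the software used for the comparisons reported here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as input) between paired 3D points.

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples for
    equivalent atoms in two structures that are ALREADY superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```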
LC-HC interfaces
The structure of PG1 complex reveals that LC-HC interactions are mainly mediated by the belt-LC and additional LC-HC interfaces (table S3). The key interface is between the C terminus of the LC and the belt of the HC (Fig. 4A), with one β strand from the LC (residues 383 to 388) and two β strands from the HC (residues 5 to 10 and 63 to 66) forming a three-stranded antiparallel β sheet reminiscent of a “seat belt buckle” structure (Fig. 4A, left). In comparison, BoNT/A shows a similar three-stranded β sheet structure arrangement at this location, composed of one strand from the LC and two from the N-terminal region of the HC (Fig. 4A, right). The major difference is that there is a disulfide bond formed between residues C430 (on the LC strand) and C454 (on one of the HC strands) in BoNT/A (Fig. 4A), whereas there is no such disulfide bond in PG1 (Fig. 4A).
Fig. 4. LC-HC interfaces in PG1.
(A) Left: The close-up view of PG1 LC-belt region intermolecular β sheet (seat belt buckle structure) formed by the C-terminal region of LC/PG1 (cyan) and the N-terminal region of HC/PG1 (magenta). Right: The close-up view of BoNT/A (PDB ID: 3V0C) LC-HC belt region interface as well as the circled disulfide bond linkage (yellow) formed by cysteine residue C430 and cysteine residue C454. The PG1 LC (cyan), HN (green), and HC (beige) as well as the belt region (magenta) are shown. Interactions within the β sheets are shown as dashed yellow lines. (B) The close-up view of the second intermolecular three-stranded β sheet interface between the LC/PG1 and the belt region. The LC/PG1 (cyan), HN (green), and HC (yellow) as well as the belt region (magenta) are shown. Interactions between LC/PG1 and the belt region are shown as dashed yellow lines. (C) The close-up view of the interactions between the tip of the belt region with the LC/PG1. (D) The close-up view of PG1 LC-HC interdomain interface of PG1 complex and interfacing residues in the front (left) and top (right) view. The PG1 LC (cyan), HN (green), and HC (yellow) as well as the belt region (magenta) are shown. The N306-T446 and E322-W439 interactions are shown as dashed yellow lines (left) and their interactions are circled and highlighted by red dashed lines viewed from the top (right). (E) Left: An overlay of the crystal structure of LC/PG1 (light gray) versus the cryo-EM structure of LC/PG1 (cyan) in the PG1 complex. Right: The close-up view of PG1 LC structural rearrangements induced by PG1 complex formation. The red arrows indicate a short β hairpin and the LC active site residues (H202, E203, and H206) and with E241 undergoing structural rearrangements upon complex formation.
To validate the importance of this seat belt buckle structure, we carried out pull-down assays using glutathione S-transferase (GST) tag fused LC/PG1, which can pull down HC/PG1 in solution (fig. S11A). Deletion of the belt region (residues 1 to 70) of the HC or deletion of the C-terminal region of the LC (residues 384 to 389) abolished pull-down of HC/PG1 (fig. S11, A and B). Furthermore, we generated and tested point mutations between LC and belt region (fig. S11C, including D18A, K48A, R55A, D61A, D63A, and W439A). Among them, D61A abolished formation of complex. D61 forms multiple salt bridges with D60 and R385 of LC/PG1. It also forms close interactions with side chains of I57 and A383, and backbone of S59. Since D61 is located at the intersection of two independent flexible N- and C-terminal loops of LC/PG1, D61A abolishes key interactions supporting stability of complex toxin, whereas other point mutations might be tolerated at pull-down assay levels. It is also possible that D61A may cause unexpected protein conformational changes that disrupt the interactions.
Besides this key seat belt buckle interface, the PG1 belt forms many additional interactions with the LC. The most notable one is a second intermolecular three-stranded β sheet interface, mediated by two β strands in PG1 LC (residues 221 to 227 and 234 to 238) and one β strand of the belt (residues Phe14 to Asp18; Fig. 4B). There is also a short helix on the belt in this region that interacts with a loop on the LC (e.g., between E253 on the LC and Y21 on the belt, Fig. 4B). These structural features and interactions are unique to PG1 and not found in any other BoNTs. In addition, there are many more interactions between the unstructured region of the belt with the LC (Fig. 4C and table S3).
Besides the belt region, there are also direct interactions between the LC and the HC region, mediated by N306 and E322 on the LC with T446 and W439 on the HC, respectively (Fig. 4D). This interface is located on the opposite side of belt-LC interfaces (Fig. 4D). Thus, LC-HC and LC-belt interactions appear to clamp the LC from two sides, which may further stabilize the PG1 complex.
To further understand the complex assembly mechanisms, we compared the crystal structure of the LC alone (Fig. 2G) with the LC in the PG1 complex (Fig. 4E). The overall structure is nearly identical, indicating that formation of the complex does not require large conformational alterations in the LC. However, there are two major differences: (i) The C terminus of the LC lacks observable electron density in the LC crystal structure (Fig. 2G), suggesting that it is flexible and unstructured, whereas it adopts a well-defined β strand conformation in the PG1 complex and extends between two β strands of the HC belt (Fig. 4, A and E); (ii) the region forming the second three-stranded β sheet interface identified in the PG1 complex (Fig. 4, B and E) is observed as a long, extended loop in the LC alone but undergoes a drastic shift in the complex, resulting in a notable extension of the β hairpin (Fig. 4E) that allows it to form a three-stranded β sheet interface with the HC.
In addition to these two major changes, there are also interesting residue shifts in the catalytic site. Although the HExxH motif remains stable when comparing between the LC alone and the PG1 complex, the latter shows that the helix holding E241 undergoes a conformational change with an inward kink that occupies the catalytic site and likely disrupts coordination of the zinc ion in the cryo-EM structure (43, 48). We cannot exclude the possibility that sample processing for cryo-EM may somehow have resulted in loss of the zinc ion, as PG1 and PG2 both clearly require coordinated metal to maintain their catalytic activity and metal chelation using EDTA blocked fly SNAP25 cleavage (fig. S12), similar to other BoNT LCs (49). The HC belt seamlessly conforms to the LC surface, and its presence on the LC has no discernible impact on the local structure of the LC (Fig. 4E). When PG1 and PG2 complexes were incubated with fly SNAP25, the resulting cleavage was comparable to incubation with LC/PG1 and LC/PG2, respectively (fig. S13).
PG1 and PG2 are toxic to fly and mosquito
We next investigated the biological activity of PG1 and PG2 complexes in vivo. To further validate that PG1 and PG2 are not toxic to mammals, we first evaluated their activity in mice by injecting PG1 or PG2 into the hind leg muscles in the well-established digit abduction score assay (50). Injection of picogram quantities of BoNTs causes paralysis of leg muscles. We found that injection of even 1 μg of PG1 or PG2 did not induce any muscle paralysis.
We then used D. melanogaster as an insect model. PG1 and PG2 were administered via microinjection into the thorax of adult flies (between the mesopleura and pteropleura; Fig. 5A). The injected flies were then placed in standard vials with fresh food, and their mobility and mortality were observed and recorded. Isolated LC and HC were analyzed as controls (Fig. 5, B and C). In addition, we created a mutant PG2 complex containing a point mutation that disrupts the HExxH motif (E206Q) in the LC, which served as a negative control (Fig. 5C). Both PG1 (Fig. 5B) and PG2 (Fig. 5C) paralyzed flies (loss of flight and climbing ability) and resulted in death of all flies (Fig. 5, B and C). As controls, injections of LC, HC, or PG2 (E206Q) did not affect fly survival rate (Fig. 5, B and C). Titrating PG1 and PG2 in this assay showed dose-dependent mortality with an estimated median lethal dose of ~198 femtomoles (fmol) for PG1 and ~187 attomoles (amol) for PG2 (Fig. 5, D to G). Thus, PG2 appears to be ~1000-fold more potent than PG1 in D. melanogaster. We also tested PG1 assembled in vitro, generated by incubation of separately expressed and purified LC/PG1 and HC/PG1. The assembled PG1 showed a similar level of insecticidal activity compared to the coexpressed PG1 complex, indicating that postexpression assembly is sufficient to generate a functional toxin unit (fig. S14).
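The ~1000-fold potency difference follows directly from the two median lethal doses once they are expressed in the same unit. The short sketch below reproduces that arithmetic from the values reported above; the variable names are illustrative only.

```python
# Median lethal doses reported for D. melanogaster microinjection.
AMOL_PER_FMOL = 1000                     # 1 fmol = 1000 amol

ld50_pg1_amol = 198 * AMOL_PER_FMOL      # ~198 fmol for PG1, converted to amol
ld50_pg2_amol = 187                      # ~187 amol for PG2

# A lower LD50 means higher potency, so the ratio is PG1 / PG2.
fold_potency = ld50_pg1_amol / ld50_pg2_amol
print(f"PG2 is ~{fold_potency:.0f}-fold more potent than PG1")
```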
Fig. 5. PG1 and PG2 are toxic to flies and mosquitoes.
(A) Schematic illustration of microinjection of toxins into adult D. melanogaster. (B) Survival curves of adult flies injected with 100 fmol of PG1. Injections of HC/PG1 and LC/PG1 were analyzed in parallel as controls. PG1 killed most flies within 48 hours. (C) Survival curves of adult flies injected with 1 fmol of PG2. Injections of HC/PG2, LC/PG2, and the inactivated PG2 mutant containing E206Q were analyzed in parallel as controls. PG2 killed all injected flies within 24 hours. (D and E) Survival curves for PG1-injected flies were recorded (D). The LD50 was ~198 fmol for PG1 (E). (F and G) Survival curves for PG2-injected flies were recorded (F). The LD50 was ~187 amol for PG2 (G). PG2 exhibited ~1000-fold higher potency than PG1. (H) Schematic illustration of PG1, PG2, and their mutants PG112, PG221, and PG2-SS. (I) SDS-PAGE analysis showed that the LC and HC were copurified in PG221 and PG112. (J) Flies were injected with PG1, PG2, PG221, and PG112 at the indicated doses (100, 1000, and 10,000 amol), and the percentage of surviving flies after 24 hours was recorded and plotted. The toxicity level was PG2 (most toxic) > PG221 > PG112 > PG1 (least toxic). (K) SDS-PAGE analysis showing PG2 and PG2-SS under reducing (+2-ME, mercaptoethanol) versus nonreducing (−2-ME) conditions. PG2-SS appears as a single band at ~140 kDa without the reducing agent, and it becomes two bands with the reducing agent, indicating that its LC and HC are linked by a disulfide bond. (L) PG2 and PG2-SS showed a similar level of toxicity when injected into flies.
To further evaluate the activity of PG1 and PG2, we also carried out microinjection into the thorax of adult yellow fever mosquitoes (Aedes aegypti), a major vector of human pathogens such as dengue and Zika viruses. Both PG1 and PG2 resulted in death of mosquitoes, and PG2 was more potent than PG1 (fig. S15).
The HC determines the potency difference between PG1 and PG2 in flies
We next sought to create chimeric toxins between PG1 and PG2 to understand the reason for the potency difference between PG1 and PG2 in Drosophila. Simply coexpressing LC/PG1 with HC/PG2 or LC/PG2 with HC/PG1 did not result in complex formation. As the structure of the PG1 complex showed that the N-terminal region of the HC (e.g., W439) interacts with LC/PG1, we decided to swap only the C-terminal part of the HC domain (HCC) to avoid disrupting complex formation. Two chimeras were thus created, one designated PG112 (containing PG2 HCC) and the other PG221 (containing PG1 HCC) (Fig. 5H). Both proteins were successfully coexpressed and purified as a complex (Fig. 5I). We then assessed their potency in Drosophila via microinjection. PG221 showed greatly reduced activity compared to PG2, with potency close to that of PG1 (Fig. 5J), whereas PG112 showed higher potency than PG1 (Fig. 5J), although it still did not reach the same level as PG2. These findings suggest that HCC/PG2 is a major factor contributing to the higher activity of PG2. It is thus likely that PG2 achieves better receptor recognition than PG1 in Drosophila.
A disulfide bond can be introduced in PG2
Lastly, we explored whether it is feasible to introduce a disulfide bond to link the LC and HC of PG toxins and whether this can further enhance the potency of the toxin. As PG2 is more potent in Drosophila and Aedes than PG1, we focused on PG2. On the basis of its homology to PG1, we designed two point mutations (R390C in the LC and N8C in the HC; Fig. 5H) and coexpressed the LC and HC in a dual promoter vector (Fig. 3A). A single protein was purified, which showed a molecular weight of ~140 kDa on the SDS-PAGE gel without reducing agents (Fig. 5K). This protein separated into two bands under reducing conditions (Fig. 5K), suggesting that a disulfide bond was successfully formed. We designated it PG2-SS. We then compared the toxicity of this mutant PG2-SS with that of WT PG2 via microinjection in adult Drosophila (Fig. 5L). The overall toxicity of PG2-SS is similar to that of WT PG2 (Fig. 5L).
DISCUSSION
BoNTs are the most potent toxins known and can incapacitate large animals and humans with nanogram amounts of toxin (10). Although foodborne botulism has become rare in human populations, it remains a major threat to wildlife and results in deaths of countless land animals, fish, and birds all over the world (11). The complex multiple-domain structure of BoNTs and their intricate mode of action are well adapted to target and disable animals, which depend on their advanced nervous systems to survive. With our discovery of PG1 and PG2, together with the previously reported insect-specific PMP1 (34), it becomes clear that the elegant and deadly design of BoNTs is not just reserved for humans and vertebrate animals: There is a parallel world of BoNT-like toxins evolved to specifically target insects in nature, as proposed previously by genomic analysis (26). Their presence previously escaped our attention and has only been revealed now with advances in sequencing and analyzing diverse microbial genomes.
The overall structure of PG1 is conserved compared with the general structure of BoNTs, which either results from convergent evolution or reflects an evolutionary relationship such as a common ancestor. Insects have existed far longer on earth than vertebrate animals and humans. Therefore, it is possible that BoNT-like toxins such as PG1 and PG2 may represent ancient forms. PG1 and PG2 have a major feature distinct from all BoNTs and BoNT-like toxins: Their LC and HC are produced as two separate proteins. The toxin complex is formed via noncovalent interactions without an interchain disulfide. This is remarkable as it has been a long-held dogma that this interchain disulfide is crucial for the toxicity of BoNTs and TeNT. Besides linking the two chains effectively, it has been shown that the intact disulfide bond is required for membrane translocation of BoNT LCs (51, 52). All potent single-chain bacterial toxins known to date either use a disulfide bond to connect the enzymatic domain (the “A” domain) and the translocation-receptor-binding regions (the “B” domain) or use self-cleavage events to separate the A and B domains in a well-established A-B toxin paradigm, suggesting that having both A and B domains within a single chain could be important for the potency of these toxins. There are bacterial toxins that are formed by two or more components, such as anthrax toxin and Shiga toxin, but these toxins do not have a 1:1 ratio between their A and B domains and in general need multiple copies of the B domain. PG toxins are a unique example of an AB toxin where the A and B domains are separate proteins and assemble at a 1:1 ratio.
The C terminus of the LC and the N terminus of the HC in PG1 form a three-stranded β sheet structure, and similar structural features are also present at the corresponding location in BoNTs. Our creation of engineered PG2-SS demonstrated that an interchain disulfide bond between the LC and HC of PG2 can be formed after mutating appropriate residues to cysteine within this β sheet structure. Since adding a disulfide bond is not detrimental to PG toxins, this β sheet structure could offer a path to evolve the interchain disulfide bond seen in BoNTs. Furthermore, the belt in PG toxins is located on one side of the LC and forms several interaction interfaces with the LC, whereas the BoNT belt is less structured and wraps around the LC. These findings support the hypothesis that BoNT LC and HC could originate from distinct ancestral genes, with the initial proteins interacting with each other via the N-terminal belt region in the HC, and later fused together as a single gene, possibly under the evolutionary pressure to gain enhanced efficacy needed for targeting and disabling hosts much larger than insects or for economical protein synthesis.
On the other hand, it remains possible that PG1 and PG2 represent a convergent evolution independent of BoNTs, and a two-component composition could gain the convenience of eliminating the need for proteolytic activation and for disulfide bond reduction after the LC reaches the cytosol, without sacrificing the potency. In this case, our work on engineered PG2-SS raises the possibility that BoNT could be engineered into a two-component toxin for production as isolated nontoxic LC and HC, and later assembled together, either through noncovalent interactions or with interchain disulfide bond formed in vitro. Such modified BoNTs could avoid the use of live bacteria expressing full-length toxins and also eliminate the requirement for proteolytic activation.
Phylogenetic analysis clusters PG toxins with the previously reported BoNT/X/En/PMP1 lineage. Within this branch of BoNT-like toxins, PMP1 and PG toxins are both capable of targeting insects. Compared to PG toxins, PMP1 is more similar to BoNTs as it is still a single-chain toxin with an interchain disulfide bond (34). Furthermore, PMP1 also resides in a typical BoNT-like gene cluster, with accompanying ntnh and orf genes (34), whereas PG toxins do not have any of these accessory proteins. PG1 and PG2 showed a high degree of selectivity in targeting SNARE proteins. In particular, PG1 cleaves insect SNAP25 but not human or rat SNAP25. Notably, PG1 also does not cleave shrimp SNAP25. This level of specificity is unusual, as both shrimp and insects are arthropods, and could be important for later development of BoNT-like toxins for biopest control. A major concern for using insecticidal proteins for biopest control is their potential toxicity to agriculturally important arthropod species such as shrimp, lobster, and crabs.
The specificity of a BoNT is determined by not only its substrate cleavage by its LC but also receptor recognition by its HC. Although no receptors have been identified for any BoNT-like toxins, PG2 showed a far higher toxicity than PG1 (and PMP1) in Drosophila models. We created a chimeric toxin composed of the LC-HN of PG1 and the HC of PG2. This PG112 showed higher potency than PG1 in Drosophila, demonstrating that these BoNT-like toxins can be engineered to enhance selectivity and potency targeting selected insect species.
A major limitation of our studies is that the natural host and route of entry for PG toxins remain unknown. BoNTs usually enter the body via the oral route and thus require NTNH and HA or OrfX proteins for protection and for facilitating absorption. PG toxins do not reside in a BoNT-like gene cluster and have no accompanying ntnh, ha, or orf genes. We found a Cry toxin gene located close to the PG toxin genes, raising the possibility that other factors may assist PG toxin absorption at the intestinal barrier. However, the lack of NTNH proteins raises the intriguing possibility that PG toxins may not enter insects via the oral route, which needs to be further investigated.
P. ghonii is notable as it was identified as a member of the microbiome in the colon of the Iceman (41, 53). One of the most closely related species is P. sordellii (previously known as Clostridium sordellii), a Gram-positive pathogen that causes severe oedemic, myonecrotic, or enterotoxic infections in humans and farm animals (54). P. ghonii has never been linked to insects before, and its ecological roles and any specific association with insect diseases remain unknown, yet the fact that it harbors potent insect-specific neurotoxins as well as classic insecticidal Cry proteins suggests that there is a rich repertoire of entomopathogens existing in nature. It is exciting that genomic analysis combined with a bottom-up approach from characterizing toxins offers a way to uncover these hidden players, which might play an important role in shaping our ecological systems, and which have potential for use in agricultural pest and insect vector control.
MATERIALS AND METHODS
Phylogenetic and genomic neighborhood analysis
An alignment was made using sequences from bontbase.org as well as PMP1 (WP_150887772.1) using MAFFT with the L-INS-i algorithm (55). The alignment was then split into the HC and LC regions corresponding to domain boundary definitions from Protein Data Bank (PDB) 3BTA. Conserved alignment blocks were then manually selected and used to produce two separate phylogenetic trees with IQ-TREE (54). For analysis based on average nucleotide identity (ANI), we chose the closely related species of P. ghonii to build the species phylogeny, including all species in the Paeniclostridium genus, as well as P. bifermentans, which was shown to contain PMP1 toxin, and Paraclostridium benzoelyticum, which was reported as the species with the highest ANI in the NCBI database. The assembly files in .fna format of all the genomes were downloaded from the NCBI (https://ncbi.nlm.nih.gov/assembly) and JGI databases (55). Then, the fastANI program was used to compute pairwise ANI and generate a distance matrix with identity values for the list of assemblies (56). The R package phangorn was used to build a phylogenetic tree using the neighbor-joining method based on the ANI distance matrix (57). The ANIs of each closely related genome to both P. ghonii genomes were averaged and are indicated in Fig. 1B. The genomic neighborhoods for ±20 kb surrounding the neurotoxin genes were obtained from the NCBI database in .gff format. For PG-toxin genes, we downloaded the complete scaffold to collect as much sequence information as possible. The .gff files were uploaded to AnnoView for gene neighborhood visualization (58).
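As an illustration of how pairwise ANI values can be turned into the symmetric matrix that a neighbor-joining method expects, a minimal sketch follows. It is not the analysis script used here. It assumes input rows of (query, reference, ANI percent), which correspond to the first three columns of the tab-delimited fastANI output; the conversion of identity to distance as 100 minus the mean ANI of the two directions, and the function name, are assumptions made for illustration.

```python
def ani_rows_to_distance_matrix(rows):
    """Build a symmetric distance matrix from pairwise ANI values.

    rows: list of (query, reference, ani_percent) tuples.
    Distance is taken as 100 - mean ANI over both directions (an assumption).
    Pairs with no reported ANI are left as None rather than guessed.
    """
    names = sorted({name for q, r, _ in rows for name in (q, r)})
    pooled = {}
    for q, r, ani in rows:
        if q != r:  # ignore self-comparisons
            pooled.setdefault(frozenset((q, r)), []).append(ani)

    dist = {a: {b: (0.0 if a == b else None) for b in names} for a in names}
    for pair, values in pooled.items():
        a, b = tuple(pair)
        d = 100.0 - sum(values) / len(values)
        dist[a][b] = dist[b][a] = d
    return names, dist
```

Averaging the two directions keeps the matrix symmetric even when the query-to-reference and reference-to-query ANI estimates differ slightly.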
Plasmid prediction and sequencing assembly
Contig level assemblies were obtained for NCTR 3900 (PG1) and DSM15049 (PG2). To investigate the source of the contigs in their entirety, contig level assemblies were passed through rfPlasmid (v0.0.16) using a Clostridium model and default settings.
Lyophilized P. ghonii Prévot 1938 (DSM15049/ATCC 25757/VPI 4897/JCI 1490) was obtained from DSM, resuspended, and cultured in Trypticase-Peptone-Glucose-Yeast Extract (TPGY) Broth (pH 7.3) at 37°C under anaerobic conditions. Genomic DNA was extracted as described in the Gram-positive protocol of the DNeasy Blood and Tissue Kit Handbook (Qiagen, Hilden, Germany). Genomic DNA was prepared for short-read sequencing using the Illumina DNA Prep kit and Nextera DNA CD Indexes and sequenced on an iSeq100 sequencer [iSeq i1 Reagent v2 (300-cycle, 150 × 2)] under the manufacturer’s specifications (Illumina, San Diego, CA, USA), yielding 808,873 paired reads ≥ Q30. Genomic DNA was prepared for long-read sequencing via the Rapid Barcoding Kit (SQK-RBK004, Oxford Nanopore) and sequenced on a MinION Mk1B sequencer with a Flongle flow cell (FLO-FLG114), resulting in 28,369 reads with a Q score ≥9 and an N50 of 6822 bp, with an estimated coverage of 33×.
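The read N50 quoted above is the length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal sketch of that definition is given below for clarity; it is not the tool used to generate the reported statistics, and the function name is hypothetical.

```python
def n50(read_lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running total reaches half of all bases; the length at that point is N50.
    """
    ordered = sorted(read_lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```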
Nanopore reads were error corrected, trimmed, and assembled de novo via Canu v 2.2 into a draft assembly (59). iSeq100-produced short reads were mapped against the draft assembly via bowtie2 (60), sorted and indexed via samtools (61), and used to error correct the long-read assembly via pilon (62).
Materials and constructs
The following mouse monoclonal antibodies were purchased from the indicated vendors: α-tubulin (Cell Signaling Technology, #3873, 1:2000); anti-HA (BioLegend, 16B12, 1:2000); anti-FLAG (Sigma-Aldrich, #F1804, 1:1000). The 293T (#CRL-3216) cells were obtained from ATCC, which were negative for mycoplasma contamination.
The cDNAs encoding LC/PG1 (residues 1 to 392, WP_250673407.1), HC/PG1 (residues 1 to 838, WP_250673408.1), LC/PG2 (residues 1 to 395, WP_307509754.1), and HC/PG2 (residues 1 to 846, WP_307509757.1) were synthesized by Twist Bioscience (South San Francisco, CA, USA). LC/PG1 and LC/PG2 were cloned into pcDNA3.1 vector with an HA-tag and a FLAG-tag on their N termini, respectively. The cDNAs encoding LC/A (residues 1 to 425, WP_011948511.1), LC/E (residues 1 to 425, ALT05366.1), and LC/En (residues 1 to 433, WP_086311652.1) were also synthesized and cloned into pcDNA3.1. All the examined mammalian SNARE constructs in pcDNA3.1 encoding VAMP1, VAMP2, VAMP3, VAMP4, VAMP5, VAMP7, VAMP8, Ykt6, SNAP25, Syx1A, Syx1B, Syx2, Syx3, and Syx4 fused with an N- or C-terminal HA tag were from our previous studies (14, 15), while cDNAs encoding fly-SNAP25 (D. melanogaster, residues 1 to 212, NP_001036641.1), fly-SNAP24 (residues 1 to 212, NP_524298.1), fly-SNAP29 (residues 1 to 284, NP_523831.1), fly-nSyb (residues 1 to 129, NP_001286278.1), mosquito-SNAP25 (A. stephensi, residues 1 to 212, XP_035899569.1), moth-SNAP25 (P. xylostella, residues 1 to 213, KAG7297990.1), beetle-SNAP25 (D. ponderosae, residues 1 to 211, XP_019769279.1), shrimp-SNAP25 (P. vannamei, residues 1 to 211, ROT68655.1), tick-SNAP25 (I. scapularis, residues 1 to 214, XP_040073329.1), and yeast-SNC-2P (S. cerevisiae, residues 1 to 115, AJU15247.1) were synthesized and cloned into pcDNA3.1 with an N-terminal HA tag. In addition, human-SNAP25 (1 to 206), fly-SNAP25 (1 to 212), and mosquito-SNAP25 (1 to 212) were also cloned into pET28 vector with a His6 tag on their N termini. HC/PG1 and HC/PG2 were cloned into pRSF-Duet-1 with a C-terminal His6 tag. For coexpression with the LC, each LC was also cloned into the second cloning site in the same vector with an N-terminal StrepII tag.
The chimeric PG112 construct is composed of genes encoding LC/PG1, HC/PG1 residues 1 to 633, and HC/PG2 residues Y635 to A846, while the PG221 construct is composed of genes encoding LC/PG2, HC/PG2 residues 1 to S634, and HC/PG1 residues Y634 to A838. LC/PG1 (or PG2) and their mutants were also cloned into pGEX4T-1 for expression as GST-tagged proteins. All mutants of LC, HC/PG1 (or PG2), and fly-SNAP25 were generated using primer pairs containing the desired mutations. To generate PG2-SS, the gene fragments encoding LC/PG2 containing R390C and HC/PG2 containing N8C were cloned into the same pRSF-Duet-1 vector.
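The residue boundaries of the two chimeric HCs can be checked with a small sketch that joins sequence segments using 1-based, inclusive residue numbering. The sequences below are placeholders of the stated lengths (838 residues for HC/PG1 and 846 for HC/PG2), not the real protein sequences, and the helper name is hypothetical.

```python
def residues(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    return sequence[start - 1:end]

# Placeholder sequences with the stated lengths only (NOT the real sequences).
hc_pg1 = "A" * 838
hc_pg2 = "G" * 846

# PG112 HC: HC/PG1 residues 1 to 633 followed by HC/PG2 residues 635 to 846.
hc_pg112 = residues(hc_pg1, 1, 633) + residues(hc_pg2, 635, 846)

# PG221 HC: HC/PG2 residues 1 to 634 followed by HC/PG1 residues 634 to 838.
hc_pg221 = residues(hc_pg2, 1, 634) + residues(hc_pg1, 634, 838)

print(len(hc_pg112), len(hc_pg221))
```

With these boundaries, each chimera keeps the N-terminal portion of its parental HC, which contacts the cognate LC, and exchanges only the C-terminal portion.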
Protein purification
E. coli BL21 (DE3) was used to express all the proteins and their mutants. In general, each transformant seed culture was grown in autoinduction medium (Formedium, AIMLB) for 4 hours at 37°C, and expression was then carried out at 22°C overnight. Bacterial pellets were sonicated in lysis buffer [50 mM tris (pH 7.5) and 150 mM NaCl], and supernatants were collected after centrifugation at 20,000g for 30 min at 4°C. Supernatants were loaded onto a 15-ml Ni–nitrilotriacetic acid agarose column and washed with more than 20 column volumes of wash buffer [50 mM tris (pH 7.5), 150 mM NaCl, and 10 mM imidazole]. Proteins were eluted with elution buffer [50 mM tris (pH 7.5), 150 mM NaCl, and 200 mM imidazole] and then desalted with a PD-10 column (GE, 17-0851-01). Proteins were concentrated using an ultracentrifugal filter (10 to 50 kDa MWCO, Millipore). For LC/HC complex toxin purification, the His tag–purified proteins were further purified using an AKTA Prime FPLC system (GE) equipped with a StrepTrap XT column (Cytiva), followed by a Superdex200 Increase 10/300 GL column preequilibrated in the same buffer used for lysis. The elution peak corresponding to the PG complex was collected, concentrated, and frozen in liquid nitrogen for storage at −80°C.
Cotransfection assay
Plasmids encoding the toxin LC (0.125 μg per well) and SNARE proteins (0.375 μg per well) were cotransfected into 50 to 70% confluent 293T cells in 24-well cell culture plates. PolyJet transfection reagent (SignaGen, MD) was used following the manufacturer’s instructions. Cell lysates were harvested 24 hours after transfection using radioimmunoprecipitation assay buffer [50 mM tris (pH 7.5), 1% NP-40, 150 mM NaCl, 0.5% sodium deoxycholate, and 0.1% SDS] containing EDTA-free protease inhibitor cocktail (APExBio, catalog no. K1008). Cell lysates were centrifuged at 16,000g for 10 min at 4°C, and then 6 μl of each lysate was loaded onto SDS-PAGE and subsequently analyzed by immunoblot.
Cleavage of recombinant fly-SNAP25 by LC/PG1 and LC/PG2
His6-tagged human-, fly- or mosquito-SNAP25s were expressed and purified from E. coli, incubated (10 μM SNAP25) with 2 μM LC/PG1 or LC/PG2 in 50 mM Hepes buffer (pH 7.4) containing 5 mM NaCl, 2 mM dithiothreitol, and 10 μM zinc acetate at 37°C for 90 min. Samples were analyzed by SDS-PAGE and Coomassie Blue staining.
Pull-down assay of HC/PG1 binding with LC/PG1
Thirty microliters of glutathione beads were incubated with 18 μg of GST-tagged LC/PG1 or LC/PG1-∆Cterm in tris-buffered saline (TBS) [50 mM tris (pH 7.5) and 150 mM NaCl]. After incubation for 1 hour at 4°C, the bead baits were washed three times with TBS. Then, the beads were incubated with 18 μg of HC/PG1, mutants or HC/PG2 in 300 μl of TBS for 12 hours at 4°C. After incubation, the beads were washed three times with TBS, and then, 30 μl of 1× SDS loading buffer was added and the samples were boiled. Each sample was analyzed by SDS-PAGE, followed by Coomassie Blue staining.
Cleavage site identification using LC-MS/MS
The sample preparation procedure for MS analysis was modified from a previous report (63). For the experimental group, LCs (2 μM) were incubated with SNAP25 (20 μM) at 37°C for 2 hours and then heat-inactivated at 95°C for 5 min. For the negative control group, LCs were heat-inactivated before incubation with SNAP25. An equal volume of 10% trifluoroacetic acid was added to all samples and extensively mixed for 1 min. After high-speed centrifugation (20,000g, 10 min), the supernatants containing cleaved small peptides were desalted using homemade C18 stage tips. The eluate was dried using a Vacufuge (Eppendorf) centrifuge. Dried peptide samples were reconstituted using 20 μl of 5% formic acid (FA)/5% acetonitrile (ACN) and analyzed by an EASY Nanoflow HPLC 1200 (Thermo Fisher Scientific) coupled with a Q-Exactive HF-X Orbitrap mass spectrometer (Thermo Fisher Scientific) in positive-ion mode. Mobile phases A and B were 0.1% FA and 80% ACN/0.1% FA, respectively. A 24-min gradient from 5 to 50% of mobile phase B (400 nl/min) was used to elute peptides. The spray voltage and capillary temperature were set at 1.6 kV and 300°C, respectively. The full MS scans ranging from mass/charge ratio (m/z) 350 to 1400 were acquired (resolution: 120,000). The automatic gain control (AGC) target and maximum injection time (IT) were set at 3 × 10⁶ ions and 100 ms, respectively. Higher-energy collisional dissociation (HCD) MS/MS scans were acquired in data-dependent mode over m/z 200 to 2000 (resolution: 30,000). The AGC target and maximum IT were set at 1 × 10⁵ ions and 55 ms, respectively. The top 15 most abundant precursor ions were selected for MS/MS scans (charge: 2 to 4; precursor isolation window: 2 m/z; dynamic exclusion: 15 s). Raw data were analyzed by MaxQuant and QualBrowser (Thermo Fisher Scientific). Exported MS results from QualBrowser and MS/MS results from MaxQuant were plotted using Prism (64).
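As a rough illustration of how a candidate cleavage product can be checked against these acquisition settings, the following Python sketch computes the monoisotopic mass of an unmodified peptide and its m/z at charge states 2 to 4, then reports which charge states fall inside the m/z 350 to 1400 full-scan window. The peptide sequence is a placeholder, not a SNAP25 fragment identified in this study, and the function names are illustrative; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the free N and C termini
PROTON = 1.007276   # mass of a proton


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the [M + zH]z+ ion."""
    return (monoisotopic_mass(sequence) + charge * PROTON) / charge


def observable_charges(sequence, charges=(2, 3, 4),
                       mz_min=350.0, mz_max=1400.0):
    """Charge states whose m/z falls inside the full-scan window."""
    return [z for z in charges if mz_min <= mz(sequence, z) <= mz_max]


if __name__ == "__main__":
    peptide = "PEPTIDE"  # placeholder sequence for illustration only
    print(f"M = {monoisotopic_mass(peptide):.4f} Da")
    for z in (2, 3, 4):
        print(f"z = {z}: m/z = {mz(peptide, z):.4f}")
    print("Within scan window:", observable_charges(peptide))
```

Short peptides may fall below the lower scan limit at higher charge states, which is worth keeping in mind when interpreting the absence of a small cleavage fragment.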
X-ray crystallography
The LC/PG1 gene was codon optimized for E. coli expression, synthesized, and cloned into a pET-28a(+) expression vector (GenScript, NJ, USA) with an N-terminal 6×His tag and Tobacco Etch Virus (TEV) protease recognition site. Expression was carried out in T7 Express Competent E. coli (New England Biolabs) cells grown in terrific broth medium at 37°C for ~3 hours and then induced with isopropyl-β-d-thiogalactopyranoside (1 mM final concentration) overnight at 16°C, after which cells were harvested and frozen at −80°C. Cells were lysed by sonication for 15 min on ice in 0.02 M tris (pH 8.0) with 0.2 M NaCl and 25 mM imidazole. Proteins were purified by immobilized metal ion affinity chromatography (IMAC) with a HisTrap FF column (Cytiva, Amersham, UK), TEV cleavage of the affinity tag followed by reverse IMAC, and size exclusion chromatography (Superdex200, Cytiva). The final sample was kept in 25 mM MES (pH 6.0) with 0.15 M NaCl and 5% glycerol at a concentration of 17.0 mg/ml.
Initial crystallization conditions were identified by screening on an automated platform (Mosquito Xtal3, SPTLabtech, UK) using a sitting-drop setup. Crystals of LC/PG1 were grown within 3 to 4 days with 0.1 μl of sample at 17.0 mg/ml mixed with 0.1 μl of reservoir solution consisting of 5% v/v Tacsimate, 0.1 M Hepes (pH 7.0), and 10.0% w/v polyethylene glycol monomethyl ether 5000 (condition F3 of the Index Screen, Hampton Research, USA). Crystals were transferred briefly into a cryoprotectant solution, consisting of the growth condition supplemented with 20% glycerol, before freezing in liquid nitrogen. Diffraction data were collected at station I04 of the Diamond Light Source (UK), equipped with an Eiger2 XE 16 M detector (Dectris, Switzerland). Complete datasets were collected from single crystals at 100 K. Raw data images were processed and scaled with DIALS (65) and AIMLESS (66) using the CCP4 (67) suite 7.0. Molecular replacement was performed in PHASER (68) with the coordinates of the AlphaFold prediction model of the protein to determine initial phases for structure solution. Working models were refined using REFMAC5 (69) and manually adjusted with COOT (70). Water molecules were added at positions where Fo-Fc electron density peaks exceeded 3σ and potential hydrogen bonds could be made. Validation was performed with MOLPROBITY (71). Crystallographic data statistics are summarized in table S1. The atomic coordinates and structure factors (PDB code 9GY5) have been deposited in the PDB (http://wwpdb.org).
Negative-staining EM
PG1 protein purified by size exclusion column was diluted to 0.05 mg/ml, and 3 μl was applied to glow-discharged grids obtained from Electron Microscopy Services. The sample was incubated on the grids for 30 s, and excess sample was blotted away using filter paper. Subsequently, the grids were stained with 3 μl of a 1.5% (w/v) uranyl formate solution for 30 s, and excess uranyl formate solution was blotted away using filter paper. The stained grids were dried for at least 30 min before imaging. Images were captured using the Thermo Fisher Scientific Tecnai T12 microscope located in the Molecular Electron Microscopy Suite (MEMS) at Harvard Medical School. For data processing, particle selection was performed using Relion LoG autopicking (72). After manual inspection of the autopicked particles, successive rounds of 2D classification were executed with Relion to eliminate low-quality particles.
Cryo-EM studies
For cryo-EM studies, 2 μl of concentrated PG1 (~2 mg/ml) was loaded onto freshly glow-discharged QuantiFoil R1.2/1.3 300-mesh holey carbon grids and blotted on a Vitrobot for 11 s at 100% humidity with Whatman no. 1 filter paper. The grids were then rapidly plunge-frozen in liquid ethane precooled by liquid nitrogen. A total of 11,692 micrographs were recorded using the Titan Krios microscope, which was equipped with a K3 detector (Gatan) at the Harvard Cryo-Electron Microscopy Center (HC2EM).
Motion correction and contrast transfer function estimation for all micrographs were carried out using cryoSPARC (73). Dose-fractionated movies were binned by 2, and particle picking was accomplished through Blob-autopicking. After manual verification of the autopicked particles, several rounds of 2D classification were conducted to discard poor-quality particles and the selected particles were reextracted unbinned. Subsequently, the refined 2D classes from the selected particles were used to generate initial 3D models. Multiple rounds of 3D classification and 3D refinement were carried out with cryoSPARC, and nonuniform refinement volume from the best 3D class with 364,136 particles produced a final volume at 3.3-Å resolution using C1 symmetry. A local refinement job with a mask was performed, yielding a final volume still at 3.3-Å resolution using C1 symmetry.
The sharpened final volume map served as the basis for model building, postprocessing, and interpretation using Phenix (ver. 1.20.1) (74), UCSF Chimera (ver. 1.15) (75), ChimeraX (v1.5 and v1.7) (76), COOT (ver. 0.9.8.1) (70), ISOLDE v1.5 (77), and PyMol (ver. 3.1) (78). The coordinates of the PGs have been deposited in the PDB (PDB entry pdb_00009b6m, PDB ID 9B6M), and the EM density map (EMD-44266) has been deposited in the Electron Microscopy Data Bank (EMDB entry ID EMD-44266, Deposition ID D_1000282285).
Mouse injection
PG1 and PG2 were diluted in 0.2% gelatin-phosphate buffer (pH 6.3), and 5 μl of PG1 and PG2 (1 μg each) was administered by intramuscular injection into the gastrocnemius muscle of the right hindlimb of female mice (CD-1 strain, 20 to 30 g, Envigo). Mice were monitored once per day for 4 days, and no paralysis or weight loss was observed. All mouse injection procedures were performed in accordance with the guidelines approved by the Institutional Animal Care and Use Committee at Boston Children’s Hospital (#00002367).
Fly injection
Oregon-R flies aged 3 to 7 days posteclosion were selected for injection experiments. Toxin solutions were prepared by diluting the toxins in phosphate-buffered saline (PBS) (pH 6.3) containing 0.25% gelatin to prevent adhesion to container surfaces. For each experimental group, 20 to 30 flies were anesthetized with CO2, and 100 nl of the toxin solution was injected into the thorax at a rate of 100 nl/s using a Nanoject III (Drummond) (79).
Injection needles were prepared by pulling glass capillaries using either a Flaming/Brown-type micropipette puller (settings: temperature, 680; pull, 50; velocity, 50; time, 200) or a Narishige PC-10 Dual-Stage Glass Micropipette Puller (settings: weight, 25 × 2; heat 71 for step 1, heat 61 for step 2). The pulled needles were optimized to a tip diameter of 0.05 to 0.1 mm. Before use, the needle tip was inspected under a microscope and adjusted by breaking the tip with thin forceps. The tip should be as thin as possible without bending during injection.
The injection needle was backfilled with mineral oil using a syringe, ensuring no bubbles were present. The needle was then loaded onto the Nanoject III microinjector. To prepare the microinjector, the plunger was extended by pressing the “empty” button until it reached its limit, signaled by an audible click. The needle was subsequently filled with the toxin solution. The residual mineral oil served to protect the plunger from direct contact with the toxin solution.
Injections were performed intrathoracically between the pteropleura and mesopleura. After injection, the flies were transferred to vials containing fresh fly medium, and phenotype assessment began 1 to 2 hours postinjection to ensure recovery from anesthesia. Paralysis phenotypes were defined as flies exhibiting weak climbing ability and loss of flight capability. For survival testing, injected flies were monitored for up to 2 days. For imaging experiments, flies were transferred to a 60-mm dish 4 hours postinjection. The mortality was scored after 24 hours.
Each experiment used 25 to 30 flies per injection condition and was repeated three times independently. Data were analyzed using GraphPad Prism.
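For readers who wish to summarize mortality across independent repeats outside of Prism, the following Python sketch computes percent mortality per repeat and then the mean and sample standard deviation. The counts in the example are invented for illustration and are not data from this study; the function name is likewise an assumption.

```python
from statistics import mean, stdev


def mortality_summary(replicates):
    """Summarize mortality over independent repeats.

    replicates is a list of (n_dead, n_injected) tuples, one per repeat.
    Returns (mean percent mortality, sample SD of percent mortality).
    """
    percents = []
    for n_dead, n_injected in replicates:
        if n_injected <= 0 or not 0 <= n_dead <= n_injected:
            raise ValueError("invalid counts in replicate")
        percents.append(100.0 * n_dead / n_injected)
    # Sample SD needs at least two repeats; report 0.0 for a single repeat.
    sd = stdev(percents) if len(percents) > 1 else 0.0
    return mean(percents), sd


if __name__ == "__main__":
    # Hypothetical counts for three independent repeats of 30 flies each.
    example = [(15, 30), (12, 30), (18, 30)]
    avg, sd = mortality_summary(example)
    print(f"Mortality at 24 hours: {avg:.1f}% (SD {sd:.1f}%, n = {len(example)})")
```

Computing the percentage within each repeat before averaging treats each independent experiment as one observation, which matches the three-repeat design described above.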
Mosquito injection
Adult 3-day-old eclosed A. aegypti were similarly injected using beveled glass capillaries. Toxin solutions were prepared by diluting the toxins in PBS (pH 7.0) containing blue dye for visualization of injection success.
For each experimental group, 25 to 30 mosquitoes were cold anesthetized and then placed on a CO2 pad for continued anesthesia. Toxin solutions (100 nl) were injected using a Nanoject II into the second thoracic segment just below the wing. Twelve adults with successful injections were carefully transferred into a 50-ml conical tube. The tube was hydrated, and sugar water was provided for mosquito recovery and observation. After 1 hour, the numbers of dead, alive, and paralyzed mosquitoes were noted. The mortality was scored after 24 hours.
Acknowledgments
We thank members of the Dong laboratory for technical assistance and suggestions. We thank the beamline scientists at Diamond Light Source (UK) for support with x-ray diffraction data collection. Cryo-EM data were collected at the Harvard Cryo-EM Center for Structural Biology at Harvard Medical School. We thank R. Walsh, M. Mayer, R. Nair, and S. Rawson at the Harvard Cryo-EM Center for assistance.
Funding:
This study was partially supported by grants from the National Institutes of Health (NIH) (R01NS080833, R01NS117626, R01AI170835, and R01AI189789 to M.D.). We acknowledge support from the NIH-funded Harvard Digestive Disease Center (P30DK034854) and Boston Children’s Hospital Intellectual and Developmental Disabilities Research Center (P30HD18655). M.D. holds the Investigator in the Pathogenesis of Infectious Disease award from the Burroughs Wellcome Fund. A.C.D. acknowledges funding support from the Natural Sciences and Engineering Research Council of Canada (NSERC), through a Discovery Grant (RGPIN-2019-04266) and Discovery Accelerator Supplement (RGPAS-2019-00004). A.C.D. also holds a University Research Chair at the University of Waterloo. This work was partially supported by the Novo Nordisk Foundation (NNF20OC0064789) and the Swedish Research Council (2022-03681) to P.S.
Author contributions:
A.C.D., P.S., and M.D. conceived the project. X.W. and G.M. made the initial discovery. X.W. did the bioinformatical analysis. P.-G.L. designed and carried out most of the experiments. L.Y. solved the cryo-EM structure of PG1 and analyzed negative staining EM with the help of S.P.K. and G.M. J.S. carried out microinjection assays in the fly model. G.M. and P.S. solved the crystal structure of LC/PG1 and analyzed the structures. T.W. contributed to the bioinformatics studies and performed whole-genome sequencing, assembly, and analysis of strain DSM15049. P.C. developed the mass spectrometry method and identified the cleavage site. Y.X. helped with microinjection assays. J.L. helped with protein purification. H.Z. and S.S.G. carried out microinjection assays in the mosquito model. S.P., A.C.D., P.S., and M.D. supervised the project. P.-G.L., L.Y., A.C.D., and M.D. wrote the manuscript with input from all coauthors.
Competing interests:
The authors declare that they have no competing interests.
Data and materials availability:
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. All biological materials are available upon request from the corresponding author (M.D.) pending scientific review and a completed material transfer agreement. Crystallographic data statistics are summarized in table S1. The atomic coordinates and structure factors (PDB code 9GY5) have been deposited in the Protein Data Bank (http://wwpdb.org). The coordinates of the PGs have been deposited in the Protein Data Bank (PDB entry pdb_00009b6m, PDB ID 9B6M), and the EM density map (EMD-44266) has been deposited in the Electron Microscopy Data Bank (EMDB entry ID EMD-44266, deposition ID D_1000282285).
Supplementary Materials
This PDF file includes:
Figs. S1 to S15
Tables S1 to S3
REFERENCES AND NOTES
- 1. Gratz N. G., Emerging and resurging vector-borne diseases. Annu. Rev. Entomol. 44, 51–75 (1999).
- 2. Rocklöv J., Dubrow R., Climate change: An enduring challenge for vector-borne disease prevention and control. Nat. Immunol. 21, 479–483 (2020).
- 3. Deutsch C. A., Tewksbury J. J., Tigchelaar M., Battisti D. S., Merrill S. C., Huey R. B., Naylor R. L., Increase in crop losses to insect pests in a warming climate. Science 361, 916–919 (2018).
- 4. Wilson A. L., Courtenay O., Kelly-Hope L. A., Scott T. W., Takken W., Torr S. J., Lindsay S. W., The importance of vector control for the control and elimination of vector-borne diseases. PLOS Negl. Trop. Dis. 14, e0007831 (2020).
- 5. Silva-Filha M., Romão T. P., Rezende T. M. T., Silva Carvalho K., Gouveia de Menezes H. S., do Nascimento N. A., Soberón M., Bravo A., Bacterial toxins active against mosquitoes: Mode of action and resistance. Toxins (Basel) 13, 523 (2021).
- 6. Charles J. F., Nielsen-LeRoux C., Mosquitocidal bacterial toxins: Diversity, mode of action and resistance phenomena. Mem. Inst. Oswaldo Cruz 95 (Suppl. 1), 201–206 (2000).
- 7. Bravo A., Gill S. S., Soberon M., Mode of action of Bacillus thuringiensis Cry and Cyt toxins and their potential for insect control. Toxicon 49, 423–435 (2007).
- 8. Dong M., Masuyer G., Stenmark P., Botulinum and tetanus neurotoxins. Annu. Rev. Biochem. 88, 811–837 (2019).
- 9. Pirazzini M., Rossetto O., Eleopra R., Montecucco C., Botulinum neurotoxins: Biology, pharmacology, and toxicology. Pharmacol. Rev. 69, 200–235 (2017).
- 10. Rossetto O., Montecucco C., Tables of toxicity of botulinum and tetanus neurotoxins. Toxins (Basel) 11, 686 (2019).
- 11. Rossetto O., Pirazzini M., Montecucco C., Botulinum neurotoxins: Genetic, structural and mechanistic insights. Nat. Rev. Microbiol. 12, 535–549 (2014).
- 12. Schiavo G., Benfenati F., Poulain B., Rossetto O., Polverino de Laureto P., DasGupta B. R., Montecucco C., Tetanus and botulinum-B neurotoxins block neurotransmitter release by proteolytic cleavage of synaptobrevin. Nature 359, 832–835 (1992).
- 13. Jahn R., Scheller R. H., SNAREs—Engines for membrane fusion. Nat. Rev. Mol. Cell Biol. 7, 631–643 (2006).
- 14. Gu S., Rumpel S., Zhou J., Strotmeier J., Bigalke H., Perry K., Shoemaker C. B., Rummel A., Jin R., Botulinum neurotoxin is shielded by NTNHA in an interlocked complex. Science 335, 977–981 (2012).
- 15. Masuyer G., Conrad J., Stenmark P., The structure of the tetanus toxin reveals pH-mediated domain dynamics. EMBO Rep. 18, 1306–1317 (2017).
- 16. Lee K., Zhong X., Gu S., Kruel A. M., Dorner M. B., Perry K., Rummel A., Dong M., Jin R., Molecular basis for disruption of E-cadherin adhesion by botulinum neurotoxin A complex. Science 344, 1405–1410 (2014).
- 17. Sugawara Y., Matsumura T., Takegahara Y., Jin Y., Tsukasaki Y., Takeichi M., Fujinaga Y., Botulinum hemagglutinin disrupts the intercellular epithelial barrier by directly binding E-cadherin. J. Cell Biol. 189, 691–700 (2010).
- 18. Gustafsson R., Berntsson R. P.-A., Martínez-Carranza M., El Tekle G., Odegrip R., Johnson E. A., Stenmark P., Crystal structures of OrfX2 and P47 from a botulinum neurotoxin OrfX-type gene cluster. FEBS Lett. 591, 3781–3792 (2017).
- 19. Gao L., Lam K. H., Liu S., Przykopanski A., Lubke J., Qi R., Kruger M., Nowakowska M. B., Selby K., Douillard F. P., Dorner M. B., Perry K., Lindstrom M., Dorner B. G., Rummel A., Jin R., Crystal structures of OrfX1, OrfX2 and the OrfX1-OrfX3 complex from the orfX gene cluster of botulinum neurotoxin E1. FEBS Lett. 597, 524–537 (2023).
- 20. Kosenina S., Stenmark P., Crystal structure of the OrfX1-OrfX3 complex from the PMP1 neurotoxin gene cluster. FEBS Lett. 597, 515–523 (2023).
- 21. Lam K. H., Qi R., Liu S., Kroh A., Yao G., Perry K., Rummel A., Jin R., The hypothetical protein P47 of Clostridium botulinum E1 strain Beluga has a structural topology similar to bactericidal/permeability-increasing protein. Toxicon 147, 19–26 (2018).
- 22. Gao L., Nowakowska M. B., Selby K., Przykopanski A., Chen B., Kruger M., Douillard F. P., Lam K. H., Chen P., Huang T., Minton N. P., Dorner M. B., Dorner B. G., Rummel A., Lindstrom M., Jin R., Botulinum neurotoxins exploit host digestive proteases to boost their oral toxicity via activating OrfXs/P47. Nat. Struct. Mol. Biol. 32, 864–875 (2025).
- 23. Doxey A. C., Mansfield M. J., Lobb B., Exploring the evolution of virulence factors through bioinformatic data mining. mSystems 4, e00162-19 (2019).
- 24. Doxey A. C., Mansfield M. J., Montecucco C., Discovery of novel bacterial toxins by genomics and computational biology. Toxicon 147, 2–12 (2018).
- 25. Dong M., Stenmark P., The structure and classification of botulinum toxins. Handb. Exp. Pharmacol. 263, 11–33 (2021).
- 26. Mansfield M. J., Doxey A. C., Genomic insights into the evolution and ecology of botulinum neurotoxins. Pathog. Dis. 76, fty040 (2018).
- 27. Mansfield M. J., Adams J. B., Doxey A. C., Botulinum neurotoxin homologs in non-Clostridium species. FEBS Lett. 589, 342–348 (2015).
- 28. Zornetta I., Azarnia Tehran D., Arrigoni G., Anniballi F., Bano L., Leka O., Zanotti G., Binz T., Montecucco C., The first non Clostridial botulinum-like toxin cleaves VAMP within the juxtamembrane domain. Sci. Rep. 6, 30257 (2016).
- 29. Zhang S., Masuyer G., Zhang J., Shen Y., Lundin D., Henriksson L., Miyashita S.-I., Martínez-Carranza M., Dong M., Stenmark P., Identification and characterization of a novel botulinum neurotoxin. Nat. Commun. 8, 14130 (2017).
- 30. Zhang S., Lebreton F., Mansfield M. J., Miyashita S. I., Zhang J., Schwartzman J. A., Tao L., Masuyer G., Martínez-Carranza M., Stenmark P., Gilmore M. S., Doxey A. C., Dong M., Identification of a botulinum neurotoxin-like toxin in a commensal strain of Enterococcus faecium. Cell Host Microbe 23, 169–176.e6 (2018).
- 31. Brunt J., Carter A. T., Stringer S. C., Peck M. W., Identification of a novel botulinum neurotoxin gene cluster in Enterococcus. FEBS Lett. 592, 310–317 (2018).
- 32. Martínez-Carranza M., Škerlová J., Lee P.-G., Zhang J., Krč A., Sirohiwal A., Burgin D., Elliott M., Philippe J., Donald S., Hornby F., Henriksson L., Masuyer G., Kaila V. R. I., Beard M., Dong M., Stenmark P., Activity of botulinum neurotoxin X and its structure when shielded by a non-toxic non-hemagglutinin protein. Commun. Chem. 7, 179 (2024).
- 33. Gregg B. M., Matsumura T., Wentz T. G., Tepp W. H., Bradshaw M., Stenmark P., Johnson E. A., Fujinaga Y., Pellett S., Botulinum neurotoxin X lacks potency in mice and in human neurons. mBio 15, e0310623 (2024).
- 34. Contreras E., Masuyer G., Qureshi N., Chawla S., Dhillon H. S., Lee H. L., Chen J., Stenmark P., Gill S. S., A neurotoxin that specifically targets Anopheles mosquitoes. Nat. Commun. 10, 2869 (2019).
- 35. Mansfield M. J., Wentz T. G., Zhang S., Lee E. J., Dong M., Sharma S. K., Doxey A. C., Bioinformatic discovery of a toxin family in Chryseobacterium piperi with sequence similarity to botulinum neurotoxins. Sci. Rep. 9, 1634 (2019).
- 36. Mansfield M. J., Sugiman-Marangos S. N., Melnyk R. A., Doxey A. C., Identification of a diphtheria toxin-like gene family beyond the Corynebacterium genus. FEBS Lett. 592, 2693–2705 (2018).
- 37. Orrell K. E., Mansfield M. J., Doxey A. C., Melnyk R. A., The C. difficile toxin B membrane translocation machinery is an evolutionarily conserved protein delivery apparatus. Nat. Commun. 11, 432 (2020).
- 38. Rummel A., Mahrhold S., Bigalke H., Binz T., The HCC-domain of botulinum neurotoxins A and B exhibits a singular ganglioside binding site displaying serotype specific carbohydrate interaction. Mol. Microbiol. 51, 631–643 (2004).
- 39. van der Graaf-van Bloois L., Wagenaar J. A., Zomer A. L., RFPlasmid: Predicting plasmid sequences from short-read assembly data using machine learning. Microb. Genom. 7, 000683 (2021).
- 40. Sasi Jyothsna T. S., Tushar L., Sasikala C., Ramana C. V., Paraclostridium benzoelyticum gen. nov., sp. nov., isolated from marine sediment and reclassification of Clostridium bifermentans as Paraclostridium bifermentans comb. nov. Proposal of a new genus Paeniclostridium gen. nov. to accommodate Clostridium sordellii and Clostridium ghonii. Int. J. Syst. Evol. Microbiol. 66, 1268–1274 (2016).
- 41. Binz T., Bade S., Rummel A., Kollewe A., Alves J., Arg362 and Tyr365 of the botulinum neurotoxin type a light chain are involved in transition state stabilization. Biochem. 41, 1717–1723 (2002).
- 42. Segelke B., Knapp M., Kadkhodayan S., Balhorn R., Rupp B., Crystal structure of Clostridium botulinum neurotoxin protease in a product-bound state: Evidence for noncanonical zinc protease activity. Proc. Natl. Acad. Sci. U.S.A. 101, 6888–6893 (2004).
- 43. Stura E. A., Le Roux L., Guitot K., Garcia S., Bregant S., Beau F., Vera L., Collet G., Ptchelkine D., Bakirci H., Dive V., Structural framework for covalent inhibition of Clostridium botulinum neurotoxin A by targeting Cys165. J. Biol. Chem. 287, 33607–33614 (2012).
- 44. Masuyer G., Zhang S., Barkho S., Shen Y., Henriksson L., Košenina S., Dong M., Stenmark P., Structural characterisation of the catalytic domain of botulinum neurotoxin X-high activity and unique substrate specificity. Sci. Rep. 8, 1–10 (2018).
- 45. Lacy D. B., Tepp W., Cohen A. C., DasGupta B. R., Stevens R. C., Crystal structure of botulinum neurotoxin type A and implications for toxicity. Nat. Struct. Biol. 5, 898–902 (1998).
- 46. Brunger A. T., Breidenbach M. A., Jin R., Fischer A., Santos J. S., Montal M., Botulinum neurotoxin heavy chain belt as an intramolecular chaperone for the light chain. PLOS Pathog. 3, 1191–1194 (2007).
- 47. Yin L., Masuyer G., Zhang S., Zhang J., Miyashita S.-I., Burgin D., Lovelock L., Coker S.-F., Fu T.-m., Stenmark P., Characterization of a membrane binding loop leads to engineering botulinum neurotoxin B with improved therapeutic efficacy. PLOS Biol. 18, e3000618 (2020).
- 48. Swaminathan S., Eswaramoorthy S., Structural analysis of the catalytic and binding sites of Clostridium botulinum neurotoxin B. Nat. Struct. Biol. 7, 693–699 (2000).
- 49. Simpson L. L., Maksymowych A. B., Hao S., The role of zinc binding in the biological activity of botulinum toxin. J. Biol. Chem. 276, 27034–27041 (2001).
- 50. Aoki K. R., A comparison of the safety margins of botulinum neurotoxin serotypes A, B, and F in mice. Toxicon 39, 1815–1820 (2001).
- 51. Fischer A., Montal M., Crucial role of the disulfide bridge between botulinum neurotoxin light and heavy chains in protease translocation across membranes. J. Biol. Chem. 282, 29604–29611 (2007).
- 52. Pirazzini M., Tehran D. A., Leka O., Zanetti G., Rossetto O., Montecucco C., On the translocation of botulinum and tetanus neurotoxins across the membrane of acidic intracellular compartments. Biochim. Biophys. Acta 1858, 467–474 (2016).
- 53. Cano R. J., Tiefenbrunner F., Ubaldi M., Del Cueto C., Luciani S., Cox T., Orkand P., Kunzel K. H., Rollo F., Sequence analysis of bacterial DNA in the colon and stomach of the Tyrolean Iceman. Am. J. Phys. Anthropol. 112, 297–309 (2000).
- 54. Vidor C. J., Bulach D., Awad M., Lyras D., Paeniclostridium sordellii and Clostridioides difficile encode similar and clinically relevant tetracycline resistance loci in diverse genomic locations. BMC Microbiol. 19, 53 (2019).
- 55. Katoh K., Standley D. M., MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
- 56. Nguyen L. T., Schmidt H. A., von Haeseler A., Minh B. Q., IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
- 57. Waterhouse A., Bertoni M., Bienert S., Studer G., Tauriello G., Gumienny R., Heer F. T., de Beer T. A. P., Rempfer C., Bordoli L., Lepore R., Schwede T., SWISS-MODEL: Homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296–W303 (2018).
- 58. Wei X., Tan H., Lobb B., Zhen W., Wu Z., Parks D. H., Neufeld J. D., Moreno-Hagelsieb G., Doxey A. C., AnnoView enables large-scale analysis, comparison, and visualization of microbial gene neighborhoods. Brief. Bioinform. 25, bbae229 (2024).
- 59. Koren S., Walenz B. P., Berlin K., Miller J. R., Bergman N. H., Phillippy A. M., Canu: Scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).
- 60. Langmead B., Salzberg S. L., Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
- 61.Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup , The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Walker B. J., Abeel T., Shea T., Priest M., Abouelliel A., Sakthikumar S., Cuomo C. A., Zeng Q., Wortman J., Young S. K., Earl A. M., Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLOS ONE 9, e112963 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Schiavo G., Santucci A., Dasgupta B. R., Mehta P. P., Jontes J., Benfenati F., Wilson M. C., Montecucco C., Botulinum neurotoxins serotypes A and E cleave SNAP-25 at distinct COOH-terminal peptide bonds. FEBS Lett. 335, 99–103 (1993). [DOI] [PubMed] [Google Scholar]
- 64.Cox J., Mann M., MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372 (2008). [DOI] [PubMed] [Google Scholar]
- 65.Winter G., Waterman D. G., Parkhurst J. M., Brewster A. S., Gildea R. J., Gerstel M., Fuentes-Montero L., Vollmar M., Michels-Clark T., Young I. D., Sauter N. K., Evans G., DIALS: Implementation and evaluation of a new integration package. Acta Crystallogr. D Struct. Biol. 74, 85–97 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Evans P. R., Murshudov G. N., How good are my data and what is the resolution? Acta Crystallogr. D Biol. Crystallogr. 69, 1204–1214 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Collaborative Computational Project, number 4 , The CCP4 suite: Programs for protein crystallography. Acta Crystallogr. D Biol. Crystallogr. 50 (Pt 5), 760–763 (1994). [DOI] [PubMed] [Google Scholar]
- 68.McCoy A. J., Grosse-Kunstleve R. W., Adams P. D., Winn M. D., Storoni L. C., Read R. J., Phaser crystallographic software. J. Appl. Cryst. 40, 658–674 (2007).
- 69.Murshudov G. N., Skubak P., Lebedev A. A., Pannu N. S., Steiner R. A., Nicholls R. A., Winn M. D., Long F., Vagin A. A., REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr. 67, 355–367 (2011).
- 70.Emsley P., Lohkamp B., Scott W. G., Cowtan K., Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
- 71.Williams C. J., Headd J. J., Moriarty N. W., Prisant M. G., Videau L. L., Deis L. N., Verma V., Keedy D. A., Hintze B. J., Chen V. B., Jain S., Lewis S. M., Arendall W. B. III, Snoeyink J., Adams P. D., Lovell S. C., Richardson J. S., Richardson D. C., MolProbity: More and better reference data for improved all-atom structure validation. Protein Sci. 27, 293–315 (2018).
- 72.Scheres S. H. W., RELION: Implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).
- 73.Punjani A., Rubinstein J. L., Fleet D. J., Brubaker M. A., cryoSPARC: Algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
- 74.Liebschner D., Afonine P. V., Baker M. L., Bunkóczi G., Chen V. B., Croll T. I., Hintze B., Hung L.-W., Jain S., McCoy A. J., Moriarty N. W., Oeffner R. D., Poon B. K., Prisant M. G., Read R. J., Richardson J. S., Richardson D. C., Sammito M. D., Sobolev O. V., Stockwell D. H., Terwilliger T. C., Urzhumtsev A. G., Videau L. L., Williams C. J., Adams P. D., Macromolecular structure determination using x-rays, neutrons and electrons: Recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
- 75.Huang C. C., Couch G. S., Pettersen E. F., Ferrin T. E., in Pacific Symposium on Biocomputing (World Scientific, 1996), vol. 1, p. 724.
- 76.Goddard T. D., Huang C. C., Meng E. C., Pettersen E. F., Couch G. S., Morris J. H., Ferrin T. E., UCSF ChimeraX: Meeting modern challenges in visualization and analysis. Protein Sci. 27, 14–25 (2018).
- 77.Croll T. I., ISOLDE: A physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr. D Struct. Biol. 74, 519–530 (2018).
- 78.Adams P. D., Afonine P. V., Bunkóczi G., Chen V. B., Davis I. W., Echols N., Headd J. J., Hung L.-W., Kapral G. J., Grosse-Kunstleve R. W., McCoy A. J., Moriarty N. W., Oeffner R., Read R. J., Richardson D. C., Richardson J. S., Terwilliger T. C., Zwart P. H., PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).
- 79.Merkling S. H., van Rij R. P., Analysis of resistance and tolerance to virus infection in Drosophila. Nat. Protoc. 10, 1084–1097 (2015).
Associated Data
Supplementary Materials
Figs. S1 to S15
Tables S1 to S3
Data Availability Statement
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. All biological materials are available upon request from the corresponding author (M.D.) pending scientific review and a completed material transfer agreement. Crystallographic data statistics are summarized in table S1. The atomic coordinates and structure factors (PDB code 9GY5) have been deposited in the Protein Data Bank (http://wwpdb.org). The coordinates of the PGs have been deposited in the Protein Data Bank (PDB ID 9B6M; PDB entry pdb_00009b6m), and the EM density map has been deposited in the Electron Microscopy Data Bank (EMDB entry ID EMD-44266; deposition ID D_1000282285).





