Author manuscript; available in PMC: 2024 May 24.
Published in final edited form as: J Mater Chem B. 2023 May 24;11(20):4377–4388. doi: 10.1039/d3tb00288h

Encapsulin Cargo Loading: Progress and Potential

Jesse A Jones a, Robert Benisch b, Tobias W Giessen a,b,*
PMCID: PMC10225969  NIHMSID: NIHMS1901861  PMID: 37158413

Abstract

Encapsulins are a recently discovered class of prokaryotic self-assembling icosahedral protein nanocompartments measuring between 24 and 42 nm in diameter, capable of selectively encapsulating dedicated cargo proteins in vivo. They have been classified into four families based on sequence identity and operon structure, and thousands of encapsulin systems have recently been computationally identified across a wide range of bacterial and archaeal phyla. Cargo encapsulation is mediated by the presence of specific targeting motifs found in all native cargo proteins that interact with the interior surface of the encapsulin shell during self-assembly. Short C-terminal targeting peptides (TPs) are well documented in Family 1 encapsulins, while more recently, larger N-terminal targeting domains (TDs) have been discovered in Family 2. The modular nature of TPs and their facile genetic fusion to non-native cargo proteins of interest has made cargo encapsulation, both in vivo and in vitro, readily exploitable and has therefore resulted in a range of rationally engineered nano-compartmentalization systems. This review summarizes current knowledge on cargo protein encapsulation within encapsulins and highlights select studies that utilize TP fusions to non-native cargo in creative and useful ways.

Graphical Abstract

Encapsulins are protein nanocompartments that selectively encapsulate cargo proteins via specific peptide targeting motifs. Genetic fusion of these motifs to non-native cargo proteins results in the facile engineering of rationally designed nano-compartmentalization systems.

Introduction

All cells employ different methods of regulating their metabolism in time and space.1 Doing so allows compartmentalization and dynamic spatial control of otherwise incompatible chemical reactions or metabolic pathways.1–3 As opposed to the lipid-based organelles used by eukaryotes, prokaryotes mainly employ various protein-based modalities to establish and maintain discrete subcellular compartmentalization.1–5 These strategies range from small homo-multimeric protein cages natively lacking sequestered protein cargo, to more complex, multi-component shell systems capable of housing multiple functional enzymes. A unifying theme for all prokaryotic protein compartments and cages is their ability to self-assemble and create sequestered spaces within cells. Through various biological and chemical engineering strategies, diverse nano-sized protein cages have been repurposed as programmable molecular containers for use in biocatalysis, biomedicine, and bionanotechnology.6, 7

Examples of prokaryotic protein cages include the small 8–12 nm ferritins, hollow shells composed of identical protein subunits in which the shell serves as both a diffusion barrier and a ferroxidase, allowing effective iron storage without the need to sequester proteinaceous cargo.8, 9 In comparison, the much larger 40–200 nm bacterial microcompartments (BMCs) consist of multi-component cages sequestering multiple enzymes that act together to yield a complex metabolic organelle-like compartment.4, 10 The more recently discovered encapsulin nanocompartments occupy the space between these two examples with respect to size, complexity, and ability to encapsulate cargo proteins, and will be the focus of this review.11–13

Encapsulins are icosahedral protein nanocages that range from 24 to 42 nm in size with varying triangulation numbers (T1, T3, or T4) formed via self-assembly of 60 to 240 subunits of the same shell protein exhibiting the HK97 (Hong Kong 97) phage-like fold (Fig. 1A–C).13–15 Notably, the eponymous feature of encapsulins is their ability to encapsulate specific cargo proteins during shell self-assembly using selective cargo loading mechanisms based on targeting domains (TDs) or targeting peptides (TPs) present at the N- or C-terminus of each cargo protein. This native, efficient, and modular cargo loading modality makes encapsulins excellent protein cargo carriers with potential broad applications as targeted drug delivery vehicles, vaccine platforms, and bionanoreactors, among others.14, 16–22 Recent genome datamining studies have led to the grouping of encapsulins into four separate families that vary in sequence, operon configuration, overall structure, and encapsulation mechanism.23–25 Family 1 encapsulins are the most extensively studied, with experimental information available for multiple systems based on their shell structures, associated cargo function, and respective cargo loading process. Similar studies pertaining to Family 2 encapsulins have recently begun to emerge, though these studies remain nascent in comparison to the data available for Family 1 encapsulins. Family 3 and Family 4 encapsulins remain putative and currently lack experimental validation.
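
The relationship between triangulation number and shell composition quoted above follows standard Caspar-Klug bookkeeping for icosahedral cages: 60T subunits arranged as 12 pentamers and 10(T − 1) hexamers. A minimal Python sketch reproduces the 60, 180, and 240 subunit counts for the T1, T3, and T4 shells. The function name is illustrative and not taken from any encapsulin software, and the function does not check that T is a geometrically allowed triangulation number.

```python
def shell_composition(t_number: int) -> dict:
    """Caspar-Klug counts for an icosahedral shell with triangulation number T."""
    if t_number < 1:
        raise ValueError("triangulation number must be a positive integer")
    return {
        "subunits": 60 * t_number,        # one shell protein per subunit
        "pentamers": 12,                  # always 12 five-fold vertices
        "hexamers": 10 * (t_number - 1),  # none for T=1
    }


if __name__ == "__main__":
    for t in (1, 3, 4):
        print(f"T={t}:", shell_composition(t))
```

As a consistency check, 12 pentamers and 30 hexamers for T4 give 12 × 5 + 30 × 6 = 240 subunits.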

Fig. 1.

Overview of encapsulin nanocompartment structure and assembly. (A) An encapsulin shell monomer from the Thermotoga maritima encapsulin system in ribbon representation (purple; PDB: 3DKT). Left: exterior view. Right: interior view (180° rotated). The interior binding site of the Family 1 encapsulin targeting peptide outlined in light blue. (B) Exterior views of the T=1 encapsulin from T. maritima (left; PDB: 3DKT), the T=3 encapsulin from Myxococcus xanthus (center; PDB: 4PT2), and the T=4 encapsulin from Quasibacillus thermotolerans (right; PDB: 6NJ8) highlighting the different sizes and assembly states of encapsulins. The number of pentameric and hexameric facets that make up the shell are shown at the bottom. (C) Schematic of the Q. thermotolerans encapsulin with a T=4 icosahedral cage overlay highlighting the respective five-fold (left), three-fold (center), and two-fold (right) symmetry axes and pores, with respective magnified views below. (D) Schematic diagram of Family 1 and Family 2 core operon layouts (top) featuring the cargo (pink), respective targeting moieties (turquoise), and encapsulin shell (purple); note Family 1 and 2 cargo genes are found up- and downstream of the encapsulin gene, respectively. For simplicity, only the upstream operon organization is shown. Figures created using ChimeraX (Goddard et al., 2018). PDB, protein data bank; TD, Family 2 targeting domain; TP, Family 1 targeting peptide.

As experimental details only exist for Family 1 and Family 2 encapsulins, this review will focus on the current understanding and use of encapsulin cargo loading of these two families. The native in vivo mechanisms involved in cargo loading as well as efforts undertaken to date to manipulate those mechanisms will be discussed. This review will also detail the practical application of encapsulin cargo loading as it pertains to recent bioengineering efforts as well as recent studies dissecting the mechanism of cargo encapsulation. Lastly, this review will discuss potential future challenges and directions, including prospective studies that may help further elucidate or manipulate encapsulin cargo loading, as well as the future potential that such rational manipulation of encapsulin cargo loading may hold for biocatalysis, biomedicine, and biomaterials research.

Family 1 Encapsulins - Targeting Peptides and Cargo Loading

As Family 1 encapsulins (Pfam ID: PF04454) were the first discovered family, most of the published encapsulin-based research has been focused on them. Originally identified in the 1990s as high-molecular-weight aggregates found in the supernatant of Brevibacterium linens M18, the first encapsulin was initially misidentified as a possible antibacterial bacteriocin dubbed linocin M18.26 Additional homologs were soon discovered in Mycobacterium species and Thermotoga maritima, some of which were thought to display proteolytic activity.27, 28 However, additional studies in the 2000s resulted in researchers being unable to replicate the previously perceived proteolytic effects while concurrent structural studies proved encapsulins to be homomultimeric, self-assembling capsid-like nanocompartments sequestering enzymes in their interior.13, 29–31

Family 1 encapsulins are mainly found in the bacterial phyla Actinobacteria, Proteobacteria, and Firmicutes, and are sorted into several operon types according to their native enzyme cargo.23 The operon arrangement for Family 1 encapsulins generally follows a layout composed of an upstream gene encoding the respective cargo protein followed by the gene encoding the encapsulin shell protein, with or without flanking co-regulated accessory proteins (Fig. 1D).23, 32 In order for cargo encapsulation to occur, a targeting peptide (TP), sometimes also referred to as a cargo-loading peptide (CLP), is strictly necessary and is found at the C-terminus of all cargo proteins (Fig. 1D). TPs are usually separated from the catalytically active folded domain of the cargo by a flexible linker with high glycine and proline content of ca. 10–50 residues in length (Fig. 2A). This arrangement likely minimizes steric clashes between adjacent cargo proteins within the shell, thus maximizing cargo loading capacity. A corollary of this feature is that cargo proteins are generally poorly resolved in encapsulin structures due to their high mobility caused by being flexibly tethered to the shell interior. Cargo protein loading is not necessary for shell assembly. Encapsulin shells generally self-assemble very efficiently even in the absence of any cargo. This implies a cargo loading mechanism where co-expression of cargo and shell – as ensured by a tight operon structure – allows efficient TP-shell interactions during shell self-assembly.

Fig. 2.

Family 1 cargo loading is mediated by specific TP-shell interactions. (A) Schematic representation of Family 1 cargo components, including the catalytic domain (pink), proline- and glycine-rich flexible linker (dash), and targeting peptide (turquoise). (B) Cutaway view of the T. maritima T1 encapsulin shell (PDB: 3DKT) with one shell protein subunit highlighted (yellow) and the encapsulin (purple) and GGDLGIRK TP of the FLP cargo (turquoise) shown in surface representation (left) and zoomed-in view of the conserved binding pocket (hydrophobic representation) with the resolved residues of the bound TP shown in stick representation (turquoise; right). (C) Zoomed-in view of the H. ochraceum T1 encapsulin TP-shell interaction (PDB: 7OE2) highlighting the binding pocket and the GSLGIGSLR TP sequence of the FLP cargo as found in the closed (left) and open (right) pentamer conformations of the shell. (D) Cutaway view of the M. xanthus T3 encapsulin shell (PDB: 7S2T) with one shell protein subunit highlighted (yellow) and the SHPLTVGSLRR TP (turquoise) of the EncB FLP cargo shown in surface representation (left). A zoomed-in view of the TP-shell interaction is shown on the right. (E) Similar overview of the M. xanthus T3 shell interaction with the PEKRLTVGSLRR TP of the EncC FLP cargo (PDB: 7S4Q). (F) Analogous overview of the Q. thermotolerans T4 shell interaction with the TVGSLIQ TP of the IMEF cargo (PDB: 6NJ8). (G) Consensus sequences for TPs from each of the major Family 1 cargo classes after alignment via Clustal Omega 1.2.3 with 20 residues centred on the consensus peak or, when limited by sequence length, using the last 20 C-terminal residues; visualized using GraphPad Prism v9.0.2; n, number of cargo sequences used. (H) Schematic of general binding mode for Family 1 TPs. Figures created using ChimeraX (Goddard et al., 2018). TP, targeting peptide; PDB, protein data bank; FLP, ferritin-like protein; IMEF, iron-mineralizing encapsulin-associated firmicute.

Several examples now exist in the literature providing reliable structural data illustrating TP-shell interactions (Fig. 2). Of these examples, the TPs of two ferritin-like protein (Flp) cargos bound to the interior surface of their respective T1 shells have been resolved – GGDLGIRK in the T. maritima system (Fig. 2B), and GSLGIGSLR in the Haliangium ochraceum system determined in both the “closed” and “open” pentameric conformations based on a shift in the encapsulin A-domain (Fig. 2C).13, 33 Further, two Flp TP-shell interactions were resolved for the T3 Myxococcus xanthus system showing the TPs to be SHPLTVGSLRR for the EncB cargo (Fig. 2D) and PEKRLTVGSLRR for the EncC cargo, both found to bind to all available binding sites in pentameric and hexameric shell facets (Fig. 2E).34 Lastly, the TP of an iron-mineralizing encapsulin-associated firmicute (IMEF) cargo protein bound to the hexamers of its native T4 shell from Quasibacillus thermotolerans was determined to be TVGSLIQ (Fig. 2F).14 Based on the cumulative structural data, the TP binding site has been determined to reside on the luminal surface of each Family 1 encapsulin shell protein subunit in a conserved cleft between the N-terminal helix and the P-domain (Fig. 1A and Fig. 2). TP lengths range from 7 to 12 residues that rigidly interact with the binding site. In many cargo proteins, additional C-terminal residues, usually less than 10, can be found after the rigidly interacting binding motif. However, they do not seem to be important for TP-shell interaction based on their absence in the structural data available at this time, though further research is warranted. Because the TP binding pocket completely resides within a single shell protein subunit and does not cross subunit boundaries, the maximal cargo loading is set by the total number of shell protein subunits – 60 for T1, 180 for T3, and 240 for T4 shells. 
However, cargo loading capacity is likely further determined by cargo protein size and oligomerization state. Bioinformatic analyses have provided further evidence that Family 1 TPs are often composed of 10 to 20 C-terminal residues containing GSL or double GSL motifs – with exceptions as exemplified by the T. maritima system – often with an immediately subsequent positively charged residue.23, 35 Based on structural information and the consensus TP sequences of the main cargo types (Fig. 2G), a general TP binding mode can be derived where two or three hydrophobic residues (isoleucine, leucine, or valine) spaced one or two residues apart – the spacers often containing glycines for flexibility – specifically interact with hydrophobic patches within the binding pocket. In many instances, positively charged residues (lysines or arginines) follow this motif and seem to interact less specifically with negatively charged surface patches of the shell protein (Fig. 2H).
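
The general binding mode described above can be illustrated with a simple pattern search over the structurally resolved TP sequences quoted in this section. The following Python sketch is an illustration only: the regular expression is a deliberately simplified reading of "two or three hydrophobic residues spaced one or two residues apart" and is an assumption of this example, not the consensus analysis underlying Fig. 2G, and the function name is hypothetical.

```python
import re

# Simplified, illustrative pattern (an assumption of this sketch):
# two or three I/L/V residues, each separated by one or two arbitrary residues.
HYDROPHOBIC_CORE = re.compile(r"[ILV](?:.{1,2}[ILV]){1,2}")

# Structurally resolved TP sequences as quoted in the text.
RESOLVED_TPS = {
    "T. maritima FLP": "GGDLGIRK",
    "H. ochraceum FLP": "GSLGIGSLR",
    "M. xanthus EncB": "SHPLTVGSLRR",
    "M. xanthus EncC": "PEKRLTVGSLRR",
    "Q. thermotolerans IMEF": "TVGSLIQ",
}


def describe_tp(tp: str):
    """Return (hydrophobic core, True if K/R directly follows it)."""
    match = HYDROPHOBIC_CORE.search(tp)
    if match is None:
        return None, False
    following = tp[match.end():match.end() + 1]  # empty string at the C-terminus
    return match.group(), following in ("K", "R")


if __name__ == "__main__":
    for name, tp in RESOLVED_TPS.items():
        core, charged = describe_tp(tp)
        print(f"{name}: core={core}, followed by K/R={charged}")
```

With this pattern, the T. maritima TP yields the core LGI followed by arginine, whereas the Q. thermotolerans IMEF TP yields VGSL with no positively charged residue directly after it, in line with the statement that the charged residues are frequent but not universal.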

In sum, cargo loading in Family 1 encapsulin systems results from a combination of mass action based on the relative expression levels of cargo and shell proteins, and specific TP-mediated protein-protein interactions with the final number of encapsulated cargo proteins being additionally determined by the relative size of the shell and cargo as well as the cargo oligomerization state.

Family 2 Encapsulins - Targeting Domains and Cargo Loading

Family 2 encapsulins (Pfam ID: PF19307) can be distinguished from other encapsulin families by sequence similarity, distinct insertion domains within the shell protein, as well as their operon structure.23, 36 In Family 2 encapsulin systems, most putative cargo proteins are encoded by genes found immediately downstream of the encapsulin gene. This family is further subdivided into Family 2A and Family 2B, differentiated by the absence (2A) or presence (2B) of an insertion domain, annotated as a cyclic nucleoside monophosphate (cNMP)-binding domain, within the E-loop of the encapsulin shell protein. In contrast to the short C-terminal TPs found in Family 1 encapsulin systems, Family 2 cargo proteins contain long (20–260 residues), unannotated, intrinsically disordered targeting domains (TDs) generally located at their N-terminus (Fig. 3A).23, 36, 37

Fig. 3.

Family 2 encapsulin systems utilize N-terminal targeting domains (TDs) to direct cargo to the interior of the shell. (A) Intrinsic disorder statistics plots generated using DISOPRED3 for four different Family 2 cargo types. Light blue background highlights the disordered regions while positions with relatively high sequence similarity, potentially representing conserved interaction motifs, are shown in yellow. Adapted with changes with open access permission from reference 23 via a Creative Commons license (https://creativecommons.org/licenses/by/4.0/). (B) SDS-PAGE gel of gel-filtration chromatography fractions containing the S. elongatus T1 encapsulin refolded in the presence of desulfurase cargo with and without its native TD. Adapted with changes with open access permission from reference 36 via a Creative Commons license. (C) Native PAGE gel showing Coomassie stain (top) and GFP signal (bottom) of purified S. elongatus encapsulin loaded with GFP reporter fused to different truncations of the N-terminal native TD of the system. Adapted with changes with open access permission from reference 36 via a Creative Commons license. (D) View from the shell interior along the 3-fold symmetry axis (black triangle) of the S. elongatus Family 2 encapsulin (pinks and purple) highlighting additional non-shell density attributed to the native TD (turquoise). Adapted with changes with open access permission from reference 36 via a Creative Commons license. (E) Sequence logos of conserved motifs found within different Family 2 cargo types. Adapted with changes with open access permission from reference 23 via a Creative Commons license. DISOPRED3 outputs were visualized using GraphPad Prism v9.0.2.

So far, little experimental evidence for Family 2 encapsulin systems has been published. However, the N-terminal targeting domain hypothesis has recently been confirmed for one cysteine desulfurase-encapsulating Family 2A system found in Synechococcus elongatus.36 Using in vitro assays, it was shown that the N-terminal 255 residue long domain found in the desulfurase cargo is necessary and sufficient for cargo encapsulation (Fig. 3B and Fig 3C). Furthermore, structural analysis of the cargo-loaded shell highlighted a resolvable, low-resolution density close to the 3-fold symmetry axis of the shell, suggesting a potential binding region for TDs on the shell interior (Fig. 3D). One caveat of this analysis is the fact that cargo-loading was carried out in vitro using a protein refolding procedure which could have resulted in a non-native mode of cargo encapsulation. Computational analysis of the N-terminal TD within the desulfurase cargo identified 20–30 residue long conserved motifs of high sequence identity (Fig. 3E), separated by long stretches of divergent, mostly hydrophobic residues.23, 36 To explore the contributions of each of the conserved motifs to cargo loading, different parts of the TD, containing different combinations of motifs, were N-terminally fused to a fluorescent reporter (GFP) followed by co-expression and purification. The results did not clearly identify a single motif or sub-region sufficient for maximal cargo loading. Instead, it seems that the full-length TD is needed to mediate optimal cargo encapsulation (Fig. 3C). This may have important mechanistic implications for the Family 2 cargo loading process which seems to be quite different compared to Family 1, relying on potentially multiple specific discontiguous interactions based on conserved sequence motifs separated by long flexible and hydrophobic linker regions which themselves might possess affinity for the interior of the encapsulin shell.

For other Family 2 cargo types besides desulfurases, recent bioinformatic analyses have shown similar motif-containing N-terminal domains annotated as mostly disordered.23 So far, one other putative cargo type, a 2-methylisoborneol (2-MIB) synthase with a similarly long, disordered N-terminal domain has been confirmed as a Family 2 cargo protein.38 However, no detailed structural or mechanistic analysis of this system is currently available in the literature.36, 38

Of additional note regarding Family 2 encapsulins is the putative existence of two-component shells, so far only bioinformatically predicted, where the Family 2 operon encodes two distinct encapsulin shell genes.23 As experimental data for these systems is currently lacking, it is not yet known how these encapsulins might assemble. However, if the gene products do assemble into functional encapsulin shells with two different types of subunits, there is a possibility that these systems can natively encapsulate defined stoichiometries of two distinct cargo proteins into the same two-component nanocompartment based on specific interactions of two distinct TDs with two distinct shell protein binding sites. If confirmed, such systems may hold significant potential for novel more complex bioengineering applications, beyond what is currently possible with engineered Family 1 systems.

Encapsulin Engineering and Applications

Protein-based nanocages have gained significant popularity for various engineering applications including for the delivery of therapeutics, as diagnostics, as small molecule and materials nanoreactors, and more.19, 20, 39–43 As opposed to lipid-based compartments, protein-based cages can be genetically engineered, generally self-assemble into defined 3D architectures, and can be easily chemically functionalized. Therefore, they represent excellent platforms for rational bioengineering. This has inspired a wide array of protein cage engineering using viral capsids, BMCs, ferritins, and de novo designed protein cages.4, 21, 42–50 Similar efforts at engineering Family 1 encapsulins have recently been undertaken making use of the key advantage of encapsulins, namely, their modular native in vivo cargo loading mechanism. With respect to non-native cargo encapsulation, a protein of interest can be easily genetically functionalized with the respective TP leading to efficient in vivo cargo encapsulation upon co-expression with the cognate encapsulin shell protein.48, 51 In addition, in vitro assembly of enzyme-loaded encapsulin nanoreactors is also possible based on in vitro disassembly of the encapsulin shell via exposure to extreme pH or denaturants followed by co-assembly of both shell and separately purified cargo after exchange into physiological buffer, yielding cargo-loaded protein cages.52

Potential benefits of encapsulating non-native cargo proteins are abundant and include improving the stability of cargo proteins under harsh conditions like elevated temperature, extreme pH, or exposure to proteases; controlling or improving catalysis; delivering a therapeutic or diagnostic payload; or a combination of the above. Below, we will first highlight efforts towards engineering TPs and modulating their targeting strength, followed by a discussion of select recent studies showcasing the progress made in employing encapsulins as bioengineering tools. Particular emphasis is placed on examples that improve cargo stability, add control over chemical reactions, or show therapeutic or diagnostic application potential (Table 1).

Table 1.

Recent achievements in engineering encapsulins.

Achievement | References
Use of targeting peptides to encapsulate non-native cargo | 19, 20, 22, 35, 36, 41, 48, 51–64
Improved cargo stability | 57–59, 65
Control of chemical reactions | 22, 53, 59, 60, 62, 66
Therapeutic or diagnostic development | 19, 20, 53, 58, 60, 67, 68

Targeting peptide engineering

So far, the prevailing strategy for encapsulating non-native cargo proteins inside encapsulins has been to genetically fuse the known native TP of a given system to the C-terminus of the cargo of interest. Relatively few reports have tried to change or optimize TP sequences for modulating cargo loading efficiency. To rationally and reliably mutate TPs for controlling cargo loading extent, or relative stoichiometry when encapsulating multiple cargos at once, TP-shell interactions need to be understood in detail. So far, only six TP-shell interactions have been structurally resolved, always of native TPs and their cognate encapsulin shell (Fig. 2). However, rational changes to TP sequences for altering TP-shell binding affinity will likely require further structural analyses of systematically mutated TPs. So far, only a few studies have systematically probed TP-shell interactions using experimental and computational approaches (Fig. 4).

Fig. 4.

Characterization of T1 and T3 encapsulin targeting peptides. (A) Operon design of TP-fused sfGFP and the corresponding T. maritima encapsulin used for heterologous co-expression and downstream cargo loading analysis. Different TP truncations are highlighted. (B) Comparison of normalized sfGFP fluorescence in purified encapsulins to investigate the influence of TP truncation on cargo loading, showing that the 15 C-terminal residues are sufficient for maximal cargo encapsulation. (A) and (B) adapted with permission from reference 35. Copyright 2016 American Chemical Society (ACS). (C) Schematic of computational flexible docking and experimental workflow used to predict and analyze the relative strength of TP-shell binding in single residue TP mutants. (D) Heat map of computational point mutations with the color gradient representing the Rosetta Energy score (blue, improved binding; red, worse binding). (E) Experimental analysis of cargo loading for the three TP mutants highlighted in green in panel (D), showing that most single residue substitutions lead to decreased cargo encapsulation. (C)-(E) adapted with changes with open access permission from reference 54 via a Creative Commons license (https://creativecommons.org/licenses/by/4.0/). sfGFP, super folder green fluorescent protein.

One study utilized the T1 encapsulin from T. maritima and its native TP fused to a fluorescent reporter (sfGFP) in order to assess the minimal TP needed to attain maximal cargo loading (Fig. 4A and B).35 Different truncations of the C-terminal 30 residues found in the native T. maritima cargo protein were appended to the reporter. Bulk fluorescence after purification was used as a readout of cargo loading extent. Results showed that the 15 C-terminal residues of the native cargo were sufficient for optimal cargo loading. These include the 8 rigidly bound residues that could be structurally resolved (Fig. 2). It is likely that only these residues are needed for binding. In a similar study focused on the T1 encapsulin from Mycobacterium smegmatis, different truncations of the 19 C-terminal residues found in the native cargo were appended to an eGFP reporter.53 The results indicated that the 12 C-terminal residues were required to attain maximal cargo loading levels. Even though no structural information for the M. smegmatis TP-shell interaction is available, the 12 C-terminal residues contain a double GSL motif confirming its importance for cargo encapsulation.

In another recent study, a combined computational and experimental approach was taken to probe the influence of single residue substitutions within the TPs of the T1 T. maritima and T3 M. xanthus encapsulin systems.54 Rosetta-based force-field modelling was employed to predict the influence of mutations within the two TPs (Fig. 4C). Select TP mutants were then experimentally characterized. Computational prediction and experiment were found to generally agree. This approach led to further interesting insights, including the fact that most mutations were computationally predicted to have a negative effect on binding strength, in particular, the mostly conserved GSL-like motifs (Fig. 4D and E). The cumulative results of these studies highlight the fact that TPs vary with respect to native specificity and binding strength, and that TP-shell binding is significantly influenced by specific hydrophobic and ionic interactions, as well as TP flexibility.
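
To give a sense of the search space covered by such single-residue scans, the following Python sketch enumerates every single substitution of a TP. It performs no energy scoring and is not the Rosetta protocol used in the study. The function name and the numbering of positions relative to the peptide (rather than to the full-length cargo) are assumptions made for illustration.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(peptide: str):
    """List (label, sequence) for every single-residue substitution of a peptide.

    Labels use peptide-relative numbering, e.g. 'L4A' means residue 4 (Leu) -> Ala.
    """
    variants = []
    for pos, native in enumerate(peptide):
        for aa in AMINO_ACIDS:
            if aa == native:
                continue  # skip the wild-type residue
            mutant = peptide[:pos] + aa + peptide[pos + 1:]
            variants.append((f"{native}{pos + 1}{aa}", mutant))
    return variants


if __name__ == "__main__":
    # Resolved T. maritima TP residues as quoted in the text.
    variants = single_substitutions("GGDLGIRK")
    print(len(variants), "single-residue variants")  # 8 positions x 19 alternatives
    print(variants[:3])
```

For the eight resolved T. maritima TP residues this gives 8 × 19 = 152 variants, which illustrates why computational pre-screening followed by experimental characterization of select mutants is a practical strategy.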

Improved cargo stability under harsh conditions

Improved stability of cargo proteins upon encapsulation is an often-observed phenomenon in encapsulin systems and encompasses thermal stability, increased catalytic lifetime of sequestered enzymes, and protease resistance (Fig. 5A). Increased thermal stability is generally found in encapsulins originating from thermotolerant or thermophilic organisms. For example, encapsulation of IMEF cargo inside its native Q. thermotolerans T4 shell increased its melting temperature by 10°C.14 This improved thermal stability is not only seen for native cargo proteins, but also when non-native cargo is encapsulated. Multiple non-native enzymes showed prolonged catalytic activity at elevated temperatures when encapsulated inside the T1 protein shell of the thermophile M. hassiacum. This included an encapsulated dye-decolorizing peroxidase (DyP) cargo which was active at 40°C for 25 hours while the free DyP lost activity at that temperature after 30 minutes.57 Additionally, the carbohydrate oxidase mChitO lost activity after 90 minutes at 50°C whereas when encapsulated, mChitO still retained 50% activity after four hours. The same enzymes in their encapsulated forms also exhibited prolonged resistance against digestion with proteinase K. Similarly, another recent study showed that protein cargo encapsulated within a T1 encapsulin from the acidophile Acidipropionibacterium acidipropionici was protected from pepsin digest at pH 3 as well as trypsin and chymotrypsin degradation at pH 7.5.65 In general, the encapsulin shell is able to protect cargo from proteolytic degradation due to the physical sequestration of the cargo within a stable protein shell which is itself often highly protease resistant. 
The often observed increased thermal stability of encapsulated cargo might be a result of high local cargo molarity, tethering, and confinement within the tight encapsulin shell that leads to a favorable molecular crowding effect, where protein-protein interactions are optimized to prevent complete cargo unfolding, multimer dissociation, or aggregation and encourage rapid protein re-folding or re-association when partial denaturation or dissociation occurs.57, 69, 70 A similar mechanism might be responsible for the increased catalytic lifetime reported for many encapsulated enzymes.30, 36 Increased stability of proteins of interest upon encapsulation could be exploited for numerous applications, such as producing encapsulated biocatalysts that exhibit long-term stability under harsh conditions or acting as robust delivery vehicles for biological therapeutic or diagnostic proteins.

Fig. 5.

Select engineering applications of encapsulins. (A) Cargo encapsulation within encapsulin protein shells can have a variety of beneficial effects, including increased cargo stability over time, increased thermal stability, and increased resistance against proteases. (B) Encapsulation of a ruthenium-based metal organocatalyst based on a covalently modifiable HaloTag yielding a system able to catalyze de-N-allylation of the shown pro-fluorophore both in vitro and in vivo. Adapted with changes with open access permission from reference 53 via a Creative Commons license (https://creativecommons.org/licenses/by/4.0/). (C) In vivo encapsulation of a light-controllable minimal singlet oxygen generator (mSOG) able to generate large amounts of toxic ROS via singlet oxygen species inside mammalian cancer cells resulting in cell death. Adapted with permission from reference 58. Copyright 2021 American Chemical Society (ACS). (D) In vitro assembly of gold nanoparticles encapsulated within an encapsulin protein shell using a synthetically modified TP. Adapted from reference 52 with permission from the Royal Society of Chemistry. Copyright 2018 The Royal Society of Chemistry.

Encapsulation of metal catalysts and control of chemical reactions

Encapsulins have recently been employed to create metal catalyst-loaded nanocages for biocatalysis applications. An encapsulin-based nanoreactor containing the organometallic ruthenium catalyst [CpRu(HQ)(allyl)]PF6 was assembled by encapsulating a TP-fused HaloTag protein in vivo; the resulting construct was then purified and covalently modified with the ruthenium catalyst in vitro via a specific chloroalkane spacer.53 The encapsulated catalyst was capable of de-N-allylation of a coumarin pro-fluorophore with a yield and kinetics similar to those of the free PEGylated catalyst (Fig. 5B). The nanoreactor was also shown to be active in cultured mammalian cells. This demonstrates that engineered encapsulins can be used to conduct bio-orthogonal transition-metal catalysis in live cells. It also demonstrates the possibility of using encapsulins to deliver catalysts that convert pro-drugs to their bioactive forms in situ. This is especially noteworthy because it opens up the possibility of delivering catalysts with low biocompatibility in an encapsulated, biocompatible, and active form. Other secondary tags besides HaloTag have recently been employed for similar purposes, including simple avidin tags. In general, this approach is promising for creating modular nanoreactors that contain potentially multiple co-localized transition metal catalysts and enzymes in the same compartment, leading to novel chemoenzymatic nanoreactor capabilities.71

Engineered encapsulins as therapeutics and diagnostics

Nanoscale encapsulation systems – including protein nanocages – have long been used as engineering platforms for nano-medicine applications.43, 72–76 Different platform types, e.g., lipid- vs protein-based compartments, offer different advantages and disadvantages, including accessible size range, stability, and ease of functionalization.77 Any delivery system for therapeutics or diagnostics is required to be biocompatible, large enough to carry a cargo of interest, and targetable to a specific biological site. Against these requirements, encapsulins in particular have recently shown significant promise.20, 39, 78, 79

A recent innovative study used non-UV light to control the production of toxic reactive oxygen species (ROS) inside cancer cells.58 A TP was genetically fused to a minimal singlet oxygen generator (mSOG), a flavoprotein that produces mainly singlet oxygen upon exposure to blue light. The TP-fused mSOG was then encapsulated inside the T. maritima T1 encapsulin shell via co-expression, yielding a nanocage-based platform for the delivery of photodynamic therapeutics (Fig. 5C). Encapsulated mSOG generated higher ROS levels than either free mSOG or the encapsulin control. This was attributed to an additive singlet oxygen-generating effect of the encapsulin shell when combined with mSOG, which in turn is likely due to the observed non-specific adsorption of endogenous flavin molecules such as flavin mononucleotide (FMN), flavin adenine dinucleotide (FAD), and riboflavin to the encapsulin from T. maritima.52, 56 Furthermore, encapsulated mSOG led to increased cell death in a lung cancer cell culture model. This was attributed to the fact that encapsulated mSOG was taken up by cells while free mSOG was not internalized, consistent with previous reports showing that free mSOG is incapable of penetrating tumor cells.80 This system provides a novel, highly controllable method for the light-triggered generation of toxic ROS without the use of potentially harmful UV light or the need for additional small-molecule substrates. As ROS generation can be used as both a therapeutic modality and an effective bioimaging signal, the platform represents a theranostic encapsulin-based delivery system.

This mSOG encapsulin platform was further engineered to display a Designed Ankyrin Repeat Protein (DARPin), an antibody mimic, on the exterior of the shell through genetic fusion of the DARPin to the surface-exposed C-terminus of the encapsulin shell protein.19 This enabled the targeted delivery of mSOG to Human Epidermal growth factor Receptor 2 (HER2)-positive breast cancer cells and the subsequent light-triggered generation of toxic ROS, leading to apoptosis. The platform self-assembles in a single step when expressed in Escherichia coli and offers specific targeting, photodynamic therapeutic ROS generation, and potential modular functionalization with different DARPin molecules selected to bind other targets.

Another innovative example of using encapsulins as reporters or diagnostic platforms was based on the in vivo encapsulation of the enzyme tyrosinase, which is able to polymerize melanin inside the encapsulin protein shell.60 The sequestered and concentrated melanin could then be used for imaging and tracking purposes due to its strong near-infrared absorption. Further, cells expressing this system did not show the growth defect caused by melanin toxicity that is usually observed for non-encapsulated melanin.

Production of inorganic materials using encapsulins

Encapsulins have recently been used to create inorganic-organic biomaterials platforms. To introduce inorganic cargo into encapsulin shells, a recent study showed that TP-functionalized inorganic nanoparticles could be encapsulated by the T. maritima T1 encapsulin by following an in vitro co-assembly protocol.52 Gold nanoparticles (AuNPs) were surface-modified with TP-containing synthetic peptides. These peptides consisted of an N-terminal cysteine residue for direct binding to the gold surface, followed by four residues ending in a glutamate able to interact electrostatically with the positive charge of an (11-mercaptoundecyl)-N,N,N-trimethylammonium bromide (MUTAB) shell around the AuNP, then a short four-residue hinge motif ending in a flexible double glycine, and finally the seven-residue native T. maritima TP. This allowed AuNPs to be encapsulated following denaturation of the encapsulin shell protein at pH 2, mixing with the TP-functionalized AuNPs, and refolding and reassembly of the encapsulin shell at pH 7 (Fig. 5D). The process was highly efficient, with 99% AuNP encapsulation as confirmed by TEM. This approach may be useful for creating protein-coated inorganic nanoparticles for antimicrobial and anti-cancer applications.

Engineered encapsulins as catalytic enzyme nanoreactors

Engineered encapsulin-based nanoreactors aim to emulate the advantages observed for naturally occurring protein organelles and compartments.81–83 These include the ability to co-localize multiple enzymes, which can improve intermediate channeling and pathway flux; increasing the local molarities of enzymes, substrates, and intermediates; preventing unwanted side reactions and leakage of toxic or reactive intermediates; and generally improving the stability and performance of a sequestered enzymatic process.24, 84 Recent progress in utilizing encapsulins as enzyme nanoreactors includes the encapsulation of the decarboxylase Aro10p in yeast to produce and protect an intermediate of the high-value opioid precursor norcoclaurine.59 When TP-fused Aro10p was encapsulated inside the Myxococcus xanthus T3 encapsulin shell, the Aro10p-generated reactive intermediate 4-hydroxyphenylacetaldehyde (4-HPPA) was protected from endogenous detoxification enzymes and could undergo spontaneous reaction with dopamine, yielding norcoclaurine, whereas overexpression of free Aro10p did not lead to detectable levels of the opioid precursor.

Another recent example of employing encapsulins as biocatalytic enzyme nanoreactors is the use of the DyP-peroxidase-loaded M. hassiacum T1 encapsulin together with free eugenol oxidase in an enzyme cascade, yielding lignin-like crosslinked reaction products.57 The main challenge for all protein-based nanoreactors encapsulating non-native enzymes is to overcome the often observed decrease in catalytic activity upon enzyme encapsulation.57 This is generally due to the protein shell not being optimized for the influx of the particular substrates and cofactors needed by a specific non-native sequestered enzyme. Future efforts aimed at pore engineering to improve molecular flux across encapsulin shells will likely be able to address this problem and result in fully catalytically active nanoreactors.22, 64, 85, 86

Conclusions and Future Challenges

Encapsulins offer advantages relative to other nanocages used in bioengineering, including their exclusively proteinaceous nature, biophysical robustness, genetic engineerability, and facile in vivo cargo loading that negates the need for additional methods like cargo-scaffold or cargo-capsid genetic fusions, covalent conjugation, or harsh refolding procedures.7, 40, 87, 88 Encapsulin research has made substantial progress over the past decade, generating novel insights into shell structure and dynamics, cargo encapsulation mechanisms, biological function, and engineering applications.24 This progress has been almost exclusively confined to Family 1 systems. However, recent bioinformatic analyses have unveiled thousands of novel encapsulin systems across numerous bacterial and archaeal phyla.23 Many of these systems likely contain novel features useful for future bioengineering and synthetic biology applications, including dynamic and controllable pores, shell insertion domains, and two-component protein shells.38, 65 Recent projects aimed at engineering encapsulins have pointed to a range of promising application areas, many of which have been highlighted in this review. However, some of these studies have also revealed a number of outstanding challenges, including limitations of the various techniques used for determining cargo loading, such as the potential imprecision of gel densitometry, the difficulty of deconvoluting shell and cargo UV/Vis signals, and the cost and nascence of mass photometry.89, 90 Notably, recent studies have also exposed the currently insufficient understanding of the cargo encapsulation mechanism, which at the moment prevents the rational design of stoichiometrically defined cargo-loaded encapsulins. Therefore, systematic studies probing the effects of TP mutations on cargo loading and TP binding mode should be prioritized, along with elucidating the mechanistic details of pore dynamics and molecular flux across the protein shell and investigating potential two-component shell assembly, all of which are potentially highly useful features for various engineering applications. We envision encapsulins as a modular and robust platform technology that, once their basic biophysical and biochemical characteristics are thoroughly understood, will find many applications in biomedicine, biocatalysis, biomaterials research, and bionanotechnology.
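
To illustrate why densitometry-based loading estimates can be imprecise, the following minimal sketch converts two SDS-PAGE band intensities into an estimated number of cargo copies per shell. It assumes that stain intensity scales linearly with protein mass and identically for cargo and shell protein, which is exactly the assumption that may fail in practice. The function name, band intensities, and molecular weights are hypothetical example values; the subunit counts per shell (60 × T) follow from icosahedral geometry.

```python
# Shell subunits per assembled compartment for triangulation numbers T = 1, 3, 4.
SUBUNITS_PER_SHELL = {1: 60, 3: 180, 4: 240}


def cargo_per_shell(cargo_intensity: float, shell_intensity: float,
                    cargo_mw_kda: float, shell_mw_kda: float,
                    t_number: int = 1) -> float:
    """Estimate cargo copies per shell from gel band intensities.

    Assumes intensity is proportional to protein mass for both bands
    (a simplification; dye binding can differ between proteins).
    """
    values = (cargo_intensity, shell_intensity, cargo_mw_kda, shell_mw_kda)
    if min(values) <= 0:
        raise ValueError("intensities and molecular weights must be positive")
    # Mass-proportional intensity divided by molecular weight gives relative moles.
    cargo_moles = cargo_intensity / cargo_mw_kda
    shell_moles = shell_intensity / shell_mw_kda
    return cargo_moles / shell_moles * SUBUNITS_PER_SHELL[t_number]


# Hypothetical example: equal molecular weights, cargo band at 1/6 shell intensity.
estimate = cargo_per_shell(1000.0, 6000.0, 30.0, 30.0, t_number=1)
print(f"estimated cargo copies per T=1 shell: {estimate:.1f}")
```

Because the estimate scales linearly with the measured intensity ratio, a modest error in band quantification or unequal dye binding translates directly into the same relative error in the apparent loading stoichiometry.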

Acknowledgements

This study was supported by the NIH (R35GM133325). Molecular graphics and analyses were performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from the National Institutes of Health R01-GM129325 and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases.

Footnotes

Conflicts of interest

There are no conflicts to declare.

References
