Advanced Science. 2022 May 18;9(18):2201444. doi: 10.1002/advs.202201444

Complete Sequences of the Velvet Worm Slime Proteins Reveal that Slime Formation is Enabled by Disulfide Bonds and Intrinsically Disordered Regions

Yang Lu 1, Bhargy Sharma 1, Wei Long Soon 1, Xiangyan Shi 2, Tianyun Zhao 3, Yan Ting Lim 3, Radoslaw M Sobota 3, Shawn Hoon 4, Giovanni Pilloni 5, Adam Usadi 6, Konstantin Pervushin 7, Ali Miserez 1,7
PMCID: PMC9218773  PMID: 35585665

Abstract

The slime of velvet worms (Onychophora) is a strong and fully biodegradable protein material, which upon ejection undergoes a fast liquid‐to‐solid transition to ensnare prey. However, the molecular mechanisms of slime self‐assembly are still not well understood, notably because the primary structures of slime proteins are yet unknown. Combining transcriptomic and proteomic studies, the authors have obtained the complete primary sequences of slime proteins and identified key features for slime self‐assembly. The high molecular weight slime proteins contain cysteine residues at the N‐ and C‐termini that mediate the formation of multi‐protein complexes via disulfide bonding. Low complexity domains in the N‐termini are also identified and their propensity for liquid‐liquid phase separation is established, which may play a central role in slime biofabrication. Using solid‐state nuclear magnetic resonance, rigid and flexible domains of the slime proteins are mapped to specific peptide domains. The complete sequencing of major slime proteins is an important step toward sustainable fabrication of polymers inspired by the velvet worm slime.

Keywords: fibers, nuclear magnetic resonance, protein sequence, proteomics, slime, structure, velvet worms


Combining transcriptomic and proteomic studies, the full‐length sequences of proteins forming the sticky slime of the velvet worm are obtained, revealing the presence of cysteine residues mediating the formation of disulfide‐bonded multi‐protein complexes, as well as low‐complexity sequences exhibiting liquid‐liquid phase separation. The presence of β‐sheet domains is also predicted and detected in the slime by solid‐state NMR.


1. Introduction

Many living organisms secrete structural fibrous materials in their immediate external environment to secure locomotion,[ 1 ] attach themselves to solid substrates,[ 2 ] or capture prey.[ 3 ] These abilities provide valuable lessons for biomimetic and sustainable biopolymer processing.[ 4 ] A classic example is spider silk, whose biofabrication of mechanically‐tough fibers has been investigated for several decades.[ 5 ] An organism that has more recently garnered interest as a model system for biopolymer fabrication is the velvet worm (Onychophora).

Onychophora are carnivorous invertebrates that are placed in their own phylum and have existed for more than 500 million years. The prey‐capture mechanism of velvet worms is unique, consisting of rapid secretion of a highly adhesive slime out of oral papillae to capture small insects (Figure 1). The secreted slime has many remarkable characteristics: i) it exhibits high extensibility and tensile strength (ultimate tensile strength of 101.9 ± 20.1 MPa);[ 6 ] ii) it rapidly phase‐separates from a concentrated dope solution into the adhesive slime as soon as it is ejected out of the slime gland; iii) it is water soluble; and iv) it can reversibly form fibers after being re‐dissolved. The slime is mainly composed of highly hydrated proteins (90% water in the wet slime),[ 7 ] with a small fraction of lipids. Separation of slime protein components by electrophoresis fractionates them into high, middle, and low molecular weight (MW) proteins, with high MW proteins being the most dominant.[ 8 ] Using Sanger sequencing of an expressed sequence tag library of total RNA extracted from the slime gland, Haritos et al.[ 8 ] obtained partial sequences of the high MW protein from Euperipatoides rowelli, which we hereby call ER_P1. They found ER_P1 to be a proline (Pro)‐rich protein with a highly disordered structure and potential glycosylation sites.[ 7 ] Subsequently, Baer et al.[ 6 ] revealed through atomic force microscopy and cryo‐transmission electron microscopy observations that prior to secretion, the slime precursor consists of nanoglobular building blocks made of proteins and lipids.

Figure 1.


Fibrous slime of Onychophora (velvet worm). a) The Eoperipatus sp. characterized in this study. b) Slime ejection from the velvet worm when threatened. c) A close look at the slime showing droplets decorating a few threads.

Fiber formation and aqueous recycling of the slime are key physicochemical characteristics that have been investigated in recent years, but the molecular mechanisms behind fiber self‐assembly are still uncertain. The most plausible explanations for intermolecular interactions governing fiber formation from the concentrated solution are lipid‐protein interactions,[ 8 ] charge‐charge interactions,[ 6 ] and disulfide bonding.[ 9 ] Baer et al.[ 10 ] revealed that fiber formation is pH‐ and ionic strength‐dependent. With additional prediction of phosphorylation sites on the high MW protein sequences, they proposed that non‐specific, non‐covalent interactions are the major driving force connecting proteins and lipids. Furthermore, crystalline β‐sheet structures comprising more than 20 strands were detected in the solid slime using wide‐angle X‐ray scattering (WAXS).[ 11 ] The β‐sheet is a relatively common structural motif in adhesive materials.[ 12 ] In the velvet worm slime, it was suggested that elongated β‐sheets are formed during protein unfolding triggered by shear forces, leading to rapid fiber formation and curing. The primary structure domains nucleating β‐sheets are still unknown, although alternating patches of positively and negatively charged residues were suggested as a possible source. The presence of divalent ions in native slime may also enhance the formation of β‐sheets through phosphorylation/Ca2+ interactions similar to those in caddisfly larvae silk.[ 13 ] Thus, a reversible fiber assembly model combining both electrostatic interactions and shear‐induced β‐sheet elongation has been proposed.[ 14 ] However, experimental evidence is still lacking to fully explain fiber formation.
Disulfide bonding is also a common linkage in fibrous proteins[ 15 ] and was suggested to be related to the sticky and elastic properties of the slime.[ 16 ] However, the limited number of cysteine (Cys) residues detected in the partial sequence and the inability of a reducing agent to reverse aggregation challenged this theory.[ 8 ]

To elucidate the mechanism of fiber formation, it is critical to obtain the full‐length sequence of slime proteins, which has thus far eluded researchers due to the very high MW of the main slime proteins. Indeed, the aggregation of fibers is often governed by the termini of their constitutive proteins. For example, silk fiber formation is regulated by pH‐, ionic strength‐, and temperature‐dependent aggregation of the C‐ and N‐termini of silk fibroins.[ 17 , 18 ] To obtain these sequences, we combined RNA‐sequencing of the slime gland of an Eoperipatus velvet worm with high‐throughput proteomic studies[ 19 ] and successfully obtained the full‐length sequences of ES_P1 and ES_P2, which are the most abundant proteins of the Eoperipatus sp. Further, with proteomic analyses, we identified post‐translational modifications (PTMs) of the slime proteins including phosphorylation of serine (Ser), threonine (Thr), and tyrosine (Tyr) and hydroxylation of Pro. Although phosphorylation was detected, it was not identified in the high MW proteins. Instead, we find that large complexes of ES_P1, ES_P2, and smaller MW proteins are stabilized by disulfide bonds. In addition, we find that the N‐termini of ES_P1 and ES_P2 are highly enriched in Gly and Ser, with sequence features reminiscent of the low complexity (LC) domain of the fused in sarcoma (FUS) protein that is well‐known to exhibit liquid‐liquid phase separation (LLPS).[ 20 ] We then demonstrated with a recombinant protein construct that this Gly‐ and Ser‐rich domain also exhibits phase separation in vitro, suggesting that LLPS of slime proteins is a central mechanism facilitating the solution‐to‐fiber transition during slime fabrication. Finally, we conducted solid‐state nuclear magnetic resonance (ssNMR) studies of 13C‐labeled slime and identified molecular entities of the slime that are located either in the flexible or the rigid domains of the slime proteins.
Together, our results unveil new molecular insights into the velvet worm slime and its self‐assembly mechanisms that provide bioinspired lessons for future synthesis of green biopolymers.

2. Results and Discussion

2.1. Phylogenetic Identification

The animals, found locally in the secondary forest in Singapore, have ca. 20–30 pairs of legs and a body size ranging from 15 mm at 1 week up to 50 mm for mature animals. 12S rRNA sequences of the current specimen were compared with all known 12S rRNA sequences of Onychophora downloaded from NCBI, and a phylogenetic tree based on 12S rRNA was constructed with the maximum likelihood method (Figure S1, Supporting Information). The specimens are placed close to an Eoperipatus sp. previously found in Thailand (JX568982.1).

2.2. Full‐Length Slime Protein Sequences by Combined RNA‐seq and Proteomics

Total RNA was extracted directly from the isolated slime gland and subjected to RNAseq using the Illumina HiSeq platform, and the raw sequences were assembled de novo using the Trinity software suite to build the transcriptome.[ 21 ] By searching the transcriptome library with TransDecoder,[ 22 ] 9772 complete coding sequences were predicted (the assembled transcriptome and raw RNAseq data have been submitted to NCBI under Bioproject PRJNA806368). Notably, we identified one transcript with very high similarity to the slime ER_P1 from Euperipatoides rowelli [ 8 ] (56% similarity and 99% alignment with E‐value of 0) having a translated MW of 231 kDa, hereafter named ES_P1. The full‐length sequence of the gene encoding ES_P1 was verified by PCR using cDNA as template, which revealed 37 single nucleotide polymorphisms (Table S1, Supporting Information). Another transcript encoding a large MW protein (188 kDa) that shared homology with ES_P1 was also detected in the transcriptome and called ES_P2. Both proteins were characterized by a high content of Pro (17% in ES_P1 and 16% in ES_P2) and Lys (10% in ES_P1 and 9% in ES_P2), a limited number of Cys residues (Figure 2 and Table S2, Supporting Information), as well as an LC domain at the N‐terminus sharing intriguing homology with the intrinsically disordered region (IDR) of the FUS protein (Figure S2, Supporting Information).[ 23 ]
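Residue contents such as the Pro and Lys percentages quoted above can be tallied directly from a translated sequence. The following is a minimal sketch of that calculation, not the tool used in this work; the function name `composition_percent` and the toy sequence are illustrative assumptions, and the toy sequence is not a fragment of ES_P1 or ES_P2.

```python
from collections import Counter


def composition_percent(sequence):
    """Return the percentage of each residue in a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Toy sequence for illustration only (hypothetical, not from the slime proteins).
toy = "MPPKKGSPPIKRGSPKPP"
comp = composition_percent(toy)
print(f"Pro {comp['P']:.1f}%, Lys {comp['K']:.1f}%, Cys {comp.get('C', 0.0):.1f}%")
```

Applied to the full-length sequences in the Supplementary Data, the same tally would be expected to give values close to those reported in the text.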

Figure 2.


SDS‐PAGE of slime proteins and full‐length protein sequences of high MW slime proteins, ES_P1 and ES_P2. a) Long‐range SDS‐PAGE gel of the whole slime with Coomassie blue staining of native (N), dephosphorylated (DP), DTT‐reduced and thermally denatured (TD; heated to 70°C for 10 min), and DTT‐reduced only (R) slime. b) Pro‐Q stain for phosphorylated proteins on N, DP, and TD slime. c) Alcian blue‐silver stain for glycoproteins on N slime, slime deglycosylated under denaturing conditions (DG‐R), and non‐denatured deglycosylated (DG) slime. d) Primary structure of ES_P1 and ES_P2 with peptide coverage obtained by tandem MS on fragments recovered after digestion with trypsin (green) and AspN (cyan). Hydroxyproline was also identified by LC‐MS/MS for both slime proteins. e) Primary structures of ES_P1 and ES_P2 can be divided into 3 main domains: long predicted disordered domains (grey), LC domains at the N‐termini (purple), and repeat domains (red) located along the primary structure. The location of all Cys residues is also highlighted. f) Peptide sequences of repeat domains.

To further corroborate that the identified transcripts translated into slime proteins, liquid chromatography with tandem mass spectrometry (LC‐MS/MS) was conducted on native slime separated by gel electrophoresis. Slime proteins were separated into high, middle, and low MW regions by SDS polyacrylamide gel electrophoresis (SDS‐PAGE) (Figure 2a). The bands from different MW regions were carefully excised, in‐gel digested, subjected to LC‐MS/MS, and searched against the transcriptome library of the slime gland. ES_P1 was identified as the major protein in the high MW bands of the reduced slime (RS_band_1 and RS_band_2), with 71 and 22 peptide fragments matching the translated ES_P1 transcript after trypsin and AspN digestion, respectively (Figure 2d). Furthermore, 13 and 7 peptide fragments from RS_band_3 in the high MW region also matched ES_P2 after trypsin and AspN digestion, respectively.
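The peptide matching described above rests on the known cleavage preferences of the two proteases. The sketch below illustrates the idea with the textbook rules (trypsin cleaves C‐terminal to Lys or Arg unless the next residue is Pro; AspN cleaves N‐terminal to Asp) and a simple coverage calculation. It is a simplified illustration, not the search pipeline used in this study: missed cleavages, modifications, and mass matching are ignored, and the function names and toy sequence are hypothetical.

```python
import re


def trypsin_digest(seq):
    """Split after K or R, except when the next residue is P (simplified rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]


def aspn_digest(seq):
    """Split immediately before every D (simplified AspN rule)."""
    return [p for p in re.split(r"(?=D)", seq) if p]


def coverage(seq, peptides):
    """Fraction of residues in seq covered by at least one matching peptide."""
    if not seq:
        return 0.0
    covered = [False] * len(seq)
    for pep in peptides:
        start = seq.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = seq.find(pep, start + 1)
    return sum(covered) / len(seq)


# Toy sequence for illustration only (hypothetical).
toy = "AKRPGKLLR"
print(trypsin_digest(toy))
print(coverage(toy, ["AK", "LLR"]))
```

In a Pro‐ and Lys‐rich protein such as ES_P1, the Lys‐Pro exception matters: KP sites are skipped by trypsin in this model, which is one reason a second protease such as AspN can extend coverage.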

Bands cut from the middle and low MW regions were also subjected to LC‐MS/MS and probed against the slime gland transcriptome. Identified peptides matched additional transcripts with translated MWs of 71 and 12 kDa, respectively. Peptides of both proteins were recovered in high abundance from the bands in their respective MW regions. The middle MW protein ES_P3 matched the middle MW Pro‐rich proteins from E. rowelli (44.6%, with 83% alignment and E‐value of 4e−141), with a high Pro content (8.6%) in ES_P3. Similar to ES_P1 and ES_P2, an LC domain was found at the N‐terminus. The low MW protein ES_P4 did not align with any Onychophoran slime protein from the literature and its Pro content was low. In contrast, it contained a relatively high number of Cys residues (13 Cys/117 aa).

A significant feature of the LC‐MS/MS data was the detection of 39 and 6 hydroxyproline (HyPro) residues (detected as oxidized Pro) in ES_P1 and ES_P2, respectively, which indicates that at least 19.6% of Pro in ES_P1 and ES_P2 are hydroxylated into HyPro. To verify the presence of HyPro in the native slime, we also conducted amino acid analysis (AAA) of hydrolyzed slime, with the HyPro retention time obtained from an external standard (Figure S3, Supporting Information). From the overall AAA composition of the slime, we identified 4.3 ± 0.1% Pro and 0.38 ± 0.02% HyPro. Since the AAA composition measured here is for the total slime, the significantly lower content of HyPro compared with LC‐MS/MS results from the individual ES_Ps suggests that HyPro residues are mostly located within the high MW proteins, a result that parallels earlier results obtained by AAA of isolated slime proteins.[ 7 ]
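The comparison between the AAA and LC‐MS/MS results can be made explicit with a short calculation. This sketch assumes that both AAA values quoted above are percentages of total residues (the HyPro unit is not stated explicitly in the text) and that HyPro originates only from Pro; the function name is hypothetical.

```python
def hydroxylated_fraction(pro_pct, hypro_pct):
    """Fraction of the total Pro pool (Pro + HyPro) that is hydroxylated."""
    total = pro_pct + hypro_pct
    if total <= 0:
        raise ValueError("Pro + HyPro must be positive")
    return hypro_pct / total


# AAA values for whole slime quoted in the text (HyPro assumed to be in percent).
aaa_fraction = hydroxylated_fraction(4.3, 0.38)
print(f"Hydroxylated Pro in whole slime (AAA): {100 * aaa_fraction:.1f}%")
```

Under these assumptions, the whole‐slime value is well below the lower bound of 19.6% reported from LC‐MS/MS for ES_P1 and ES_P2, in line with the interpretation that HyPro is concentrated in the high MW proteins.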

2.3. Primary Structure Features of ES_P1 and ES_P2

Prior to this study, only partial sequences of velvet worm slime proteins from E. rowelli had been obtained.[ 8 ] Thus, ES_P1 and ES_P2 are the first velvet worm slime proteins with complete and verified end‐to‐end primary structures. Based on homology search and structural predictions, we identify four significant features in ES_P1 and ES_P2, as highlighted in Figure 2f. First, both proteins appear to be largely disordered based on predictions for intrinsically disordered proteins, as discussed in more detail later. Second, they contain repeat motifs about 30 amino acids long in ES_P1 and 20 amino acids long in ES_P2. Notable characteristics within the ES_P1 repeats are the presence of basic di‐peptides (RR, RK, and KK motifs) and a high abundance of Pro appearing as di‐peptides (PP) or flanked with non‐polar amino acids (IP, IIP, PI, and IP motifs), whereas anionic and polar residues appear as single residues. In ES_P2, basic di‐peptides (RR, KR, and KK) are also present, while Pro is usually found in the hydrophobic tripeptides PGP and FGP. In contrast to ES_P1, there are almost no anionic residues in the repeat motifs of ES_P2. Third, the N‐termini of both proteins contain LC domains highly enriched in Gly and Ser residues (Figure S2, Supporting Information) with a pattern (Gx(0,1)SSGGSxGS‐xGGSx(2)SGGxYGxSxGGSx(4)GSxGx(2)GxSx(5)G) (where x can be any amino acid) predicted with bioinformatic tools.[ 24 ] These domains share intriguing homology with LC domains of the FUS RNA‐binding protein,[ 23 ] which is well‐established to exhibit LLPS. Fourth, both proteins contain a small number of Cys residues (6 Cys in ES_P1 and 3 Cys in ES_P2) that are mostly located near their termini, namely at positions 16, 2073, and 2080 in ES_P1, and positions 1713 and 1720 in ES_P2. Since the terminal sequences of slime proteins were unknown prior to this work, these Cys residues were not detected in previous studies.
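The LC‐domain pattern quoted above follows a PROSITE‐like convention in which x denotes any residue and x(n) or x(m,n) denotes a run of wildcards. As a minimal sketch of how such a pattern could be used to scan a sequence, it can be translated into a regular expression. The sketch assumes that the hyphen appearing in the printed pattern is only a separator; the function name and the toy sequence (built by filling every wildcard with Ala) are hypothetical and do not come from ES_P1 or ES_P2.

```python
import re

# Pattern as printed in the text; "\u2010" is the hyphen, assumed to be a separator.
LC_PATTERN = "Gx(0,1)SSGGSxGS\u2010xGGSx(2)SGGxYGxSxGGSx(4)GSxGx(2)GxSx(5)G"


def pattern_to_regex(pattern):
    """Translate a PROSITE-like pattern (x, x(n), x(m,n)) into a regex string."""
    # Remove ASCII and Unicode hyphens, which carry no residue information here.
    cleaned = pattern.replace("-", "").replace("\u2010", "")
    # x(0,1) -> .{0,1} and x(4) -> .{4}
    regex = re.sub(r"x\((\d+(?:,\d+)?)\)", r".{\1}", cleaned)
    # Any remaining lowercase x is a single wildcard position.
    return regex.replace("x", ".")


LC_REGEX = re.compile(pattern_to_regex(LC_PATTERN))

# Toy sequence: the pattern with every wildcard filled by "A", flanked by extra residues.
toy = "MK" + "GASSGGSAGSAGGSAASGGAYGASAGGSAAAAGSAGAAGASAAAAAG" + "EE"
match = LC_REGEX.search(toy)
print(match.start() + 1, match.end())  # 1-based start and end of the hit
```

The same compiled expression could be run over the N‐terminal regions of the full‐length sequences; Cys positions can be listed in the same spirit with `[i + 1 for i, aa in enumerate(seq) if aa == "C"]`.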

2.4. Large MW Complexes in the Velvet Worm Slime

In gluten elastomeric proteins,[ 25 ] Cys residues located at both the C‐ and the N‐terminus have been shown to link glutenin subunits through disulfide bonds in a tail‐to‐head arrangement, which stabilizes the structure and provides it with elasticity. A similar disulfide bridge is also known to link the heavy and light chains of Bombyx mori silk fibroins.[ 26 ] Based on this resemblance, we explored whether disulfide bonds could also assemble slime protein subunits into larger complexes. To this end, we enhanced electrophoretic separation of the slime proteins using long‐range SDS‐PAGE gels. The proteins from the native slime located in the high MW region were further separated into 4 distinct bands (bands 1, 2, 3, and 4 in Figure S4, Supporting Information). After treating the slime with DTT to reduce potential disulfide bonds, these bands shifted to the lower MW region (Figure S4a, Supporting Information, see also Figure 2a) and separated into additional distinct bands, called reduced slime band 1 (RS_band_1), 2 (RS_band_2), and 3 (RS_band_3), confirming that proteins detected in the high MW region of SDS‐PAGE gels are in fact complexes linked by disulfide bonds. We attempted to separate the slime proteins in both native and reduced conditions by size‐exclusion chromatography using a Superose 6 column. For the reduced slime, a broad peak was detected in the first 40 mL of elution volume. In contrast, this broad peak was completely absent in the native slime, suggesting the presence of a very large complex beyond the MW range of the column (Figure S5, Supporting Information). We note, however, that the reduction treatment did not significantly affect the size of the nanoglobules from the redissolved slime. Indeed, a comparison of the size distributions of nanoglobules by dynamic light scattering (DLS) did not indicate any significant differences between the native and reduced conditions (Figure S6, Supporting Information).

To identify the components of each disulfide‐linked complex in the high MW region, each band of the native slime from the long‐range gel was excised, subjected to DTT reduction, and stacked onto a second SDS‐PAGE gel for further separation (Figure S4b, Supporting Information). Proteomic analysis by LC‐MS/MS of bands from this second gel showed that each band consisted of complexes of multiple proteins (Table S3, Supporting Information). Bands 1, 2, and 3 were dominated by ES_P1, whereas band 4 mostly contained ES_P2. Importantly, two additional low MW proteins, called ES_P5 and ES_P6, were detected in these complexes when the peptide fragments were probed against the translated transcriptome (Table S3, Supporting Information). These proteins were distinct from ES_P3 and ES_P4 (no significant match) identified from the middle and low MW regions of the SDS‐PAGE gels of the non‐reduced slime shown in Figure 2a. ES_P5, with a MW of 74 kDa, was detected in all bands in high abundance. There were 2.4% Cys (n = 16), 9.5% Ser, and 5.3% Thr evenly distributed along the primary structure of ES_P5, indicating possible sites for disulfide bonding and phosphorylation, respectively. ES_P6 was only detected in band 2 and was rich in charged residues Glu (13.4%) and Lys (14.4%), often in the form of DD or KK dipeptides. Only four Cys residues were found in ES_P6: three at the N‐terminal region and one at the C‐terminus, similar to ES_P1 and ES_P2. The relative content of Ser and Thr was similar to that of ES_P5. However, no phosphorylation modifications were detected in either ES_P5 or ES_P6.

Both phosphorylation and glycosylation staining indicated that the high MW complexes were phosphorylated and glycosylated (Figure 2b,c). Phosphorylated residues were previously suggested to mediate nanoglobule formation[ 6 ] or to enhance β‐sheet crystalline stacking,[ 11 ] and a similar function has been identified in aquatic caddisworm silk, whereby phosphoserines of heavy fibroin interact with Ca2+ ions to form β‐rich structures and fibers[ 27 ] via electrostatic interactions. However, phosphorylation sites in the velvet worm slime were only predicted in the high MW slime proteins and not experimentally confirmed.[ 6 ] Our Pro‐Q stain after disulfide bond reduction (Figure 2b) clearly indicates that phosphorylation was present mostly in the lower MW proteins of the large complexes, particularly in band 2. However, phosphorylation sites were not detected in either ES_P1 or ES_P2 but only in peptides from a smaller MW protein (ES_P7) present in bands 1 to 3 at low abundance, at positions Thr 133, Tyr 135, and Ser 136 (protein sequences in Supplementary Data). Since phosphorylation was only detected in lower MW slime proteins, its role may be to mediate linkage of the different slime proteins in the larger complex. While our data indicate that the large MW proteins ES_P1 and ES_P2 are not phosphorylated, we cannot rule out that additional phosphorylated sites on these proteins could be detected by LC‐MS/MS using different enzymatic treatments.

In comparison, after disulfide bond reduction, glycosylation was mainly found in the complex dominated by ES_P1 (Figure 2c). To identify the carbohydrate moieties linked to the slime proteins, we conducted lectin binding assays.[ 28 ] FITC‐labeled lectins showed binding to ES slime in a dot blot assay, demonstrating the presence of specific glycans, including β‐D‐galactosyl(1‐3)‐D‐N‐acetyl‐D‐galactosamine, L‐fucose, terminal α‐D‐galactosyl, α‐D‐mannose, and N‐acetyl‐β‐D‐glucosaminyl sugars (Figure S7, Supporting Information). FITC‐fluorescence‐based binding assay measurements indicated that α‐D‐mannose and N‐acetyl‐β‐D‐glucosaminyl sugars were the most abundant glycans (Figure S8, Supporting Information).

2.5. Structural Predictions and Characterization of Slime Proteins

We used the AlphaFold tool[ 29 ] for protein structure prediction, which predicted eleven and thirteen β‐sheet rich domains in ES_P1 and ES_P2, respectively (Figure 3a and Figure S9a, Supporting Information), with high predicted local distance difference test (LDDT) scores, indicating correctly predicted local domains based on interatomic distances.[ 29 ] We also subjected both ES_P1 and ES_P2 to a suite of bioinformatic tools used to identify IDRs, which all predicted the proteins to be largely disordered (Figure 3b and Figure S9b, Supporting Information). Although a high degree of disorder was confirmed, it is interesting to note that the AlphaFold prediction indicates the ability of both proteins to acquire localized short secondary structures (Figure 3c and Figure S9c, Supporting Information), corroborating the crystalline β‐sheets identified in the native slime by WAXS and Raman spectroscopy.[ 11 ]

Figure 3.


Structural predictions and LLPS of ES_P1. a) Secondary structure domains predicted by AlphaFold indicated as straight vertical lines along the sequence (green: α‐helices; red: β‐sheets). b) Prediction of IDRs within ES_P1 using various bioinformatics tools. c) Predicted structure of ES_P1 based on AlphaFold, with regions of secondary conformation mapped within the structure (green: α‐helices, red: β‐sheets). d) Microdroplets formed by the recombinantly expressed N‐terminal region of ES_P1, ES_P1(31‐83) (200 µM; located within the purple blocks in Figure 2e), at pH 7 in citrate‐phosphate buffer observed by light microscopy (top), and GFP‐encapsulated microdroplets of ES_P1(31‐83) observed by fluorescence microscopy (bottom).

Yet, the N‐terminal regions of both proteins containing the LC domains enriched in Gly and Ser were predicted to lack any ordered structure, as evidenced by the low LDDT scores. Based on the striking resemblance of the N‐terminal domains with the LC domain of the FUS protein that exhibits LLPS, we hypothesized that these domains may play a similar role in ES_P1 and ES_P2, namely that they may trigger the formation of concentrated droplets through LLPS. To verify this hypothesis, a construct of ES_P1 from position G31 to Y83 (referred to as ES_P1(31‐83)) was recombinantly expressed in E. coli and purified. When the ES_P1(31‐83) construct solubilized in citrate‐phosphate buffer (at a concentration above 50 µM) was pipetted into buffer solutions at pH 4.5 to 8, microdroplets 0.4 to 5 µm in diameter spontaneously formed (Figure 3d and Figure S10, Supporting Information), confirming its ability to exhibit LLPS. These droplets preferably formed at 37°C but were less visible at room temperature (Figure S11, Supporting Information). To enhance visualization, GFP was added to the buffer solution and subsequently recruited within the droplets during phase separation, as verified by fluorescence microscopy. Overall, these data suggest that the LC domain at the N‐terminus of ES_P1 may promote LLPS of the slime proteins as a way to concentrate them prior to slime ejection, while at the same time preventing premature aggregation. This mechanism is reminiscent of spider silk formation, although in the latter case silk spidroin precursors are concentrated through fully structured domains[ 30 ] as opposed to IDRs in the velvet worm slime.

2.6. Nuclear Magnetic Resonance (NMR) of the Slime

To correlate the structural and molecular features of the slime, we conducted ssNMR measurements on 13C‐enriched slime from velvet worms. Signals of amino acids with greater than 5% abundance in either ES_P1 or ES_P2 were assigned in the 1D 13C cross‐polarization with magic angle spinning (CP‐MAS) and direct polarization with magic angle spinning (DP‐MAS) ssNMR spectra of 13C‐enriched slime (Figure 4a and Table S4, Supporting Information). To achieve a higher peak resolution in CP‐MAS spectra and to enable the acquisition of insensitive nuclei enhanced by polarization transfer (INEPT) spectra, the dried slime was wetted with 10 µL of water. Peaks between 100–160 ppm can be assigned to aromatic carbons of Tyr and Phe, and Cζ of Arg, each of which constitutes more than 3% of both the ES_P1 and ES_P2 sequences (Figure 4b). Carboxylic peaks were not observed in the INEPT spectrum due to the absence of directly bonded protons for polarization transfer. Glu Cγ and Asp Cβ peaks were more prominent in the 1D INEPT but suppressed in CP‐MAS, indicating their presence in flexible regions of the proteins as well as their involvement in hydrogen bonding in the slime, which could potentially stabilize the nanoglobules in solution. Due to the overall abundance of Gly in the slime (27% predicted by AAA, Table S2, Supporting Information), the 13C spectrum of 13C‐Gly‐labeled slime was used to identify the Gly Cα peak, and this peak was deconvoluted in the DP‐MAS spectrum to identify the relative abundance of Gly in different secondary conformations in the slime.[ 31 ] The resonance at 44.4 ppm could be assigned to Gly Cα in random coil given that the signal was significantly enhanced in the INEPT spectrum, whereas the 45.3 ppm resonance was assigned to Gly Cα in β‐sheets[ 32 ] (Figure 4c).
Based on the areas under the deconvoluted peaks, 16% of Gly residues were estimated to be within β‐sheets while 84% of Gly were found in random conformations, allowing the latter to flexibly interact with different residues forming both inter‐ and intra‐molecular bonds in the slime. In comparison, 5 out of 187 Gly (2.7%) and 41 out of 166 Gly residues (24.7%) in ES_P1 and ES_P2, respectively, were predicted by AlphaFold to lie within secondary structures in these two proteins (Figure 4d). Next, amino acid residues within the flexible regions of the slime were assigned using 2D INEPT (Figure 4e). Peaks located at 57.4, 71.8, 80.3, and 108.6 ppm in the 2D INEPT spectrum were assigned to glycan carbons C6, C2, C4, and C1, respectively, strongly indicating the presence of 6‐carbon monomeric sugar moieties on the slime proteins[ 33 ] (Figure S12, Supporting Information). These results corroborate the presence of galactose/glucose/mannose sugars bound to proteins detected using the lectin‐binding assay. Additionally, since the Ser peaks were very prominent in the 1D and 2D INEPT spectra compared to the CP‐MAS spectra, we posit that glycosylation likely occurred on these Ser residues. The unassigned peak at 3.4 ppm could arise from an aliphatic group of the lipid moiety.[ 34 ] In the 2D DARR spectrum collected for 500 ms, the observed 45.3–54.0 ppm cross‐peaks were assigned to Gly Cα‐Lys Cα inter‐residue correlations. Given that 28 pairs of consecutive GK residues exist in ES_P1 and 12 GK pairs in ES_P2, these results suggest that those pairs are located in the rigid structure of the solid slime and therefore involved in its structural arrangement (Figure S13, Supporting Information).
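The Gly fractions discussed above can be set side by side with a few lines of arithmetic. This sketch uses only the counts quoted in the text. The pooled value simply adds the Gly counts of ES_P1 and ES_P2, which is an illustrative assumption that ignores the relative abundance of the two proteins and the Gly contributed by other slime proteins, so it should not be read as a result of this study. The function name is hypothetical.

```python
def fraction(part, total):
    """Return part / total, guarding against an empty total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


# Gly residues predicted by AlphaFold to lie within secondary structures (from the text).
es_p1 = fraction(5, 187)
es_p2 = fraction(41, 166)
# Naive pooling of both proteins (assumption: equal weighting per residue).
pooled = fraction(5 + 41, 187 + 166)

# Fraction of Gly in beta-sheets estimated from the deconvoluted DP-MAS peak areas.
nmr_beta = 0.16

print(f"ES_P1 {100 * es_p1:.1f}%, ES_P2 {100 * es_p2:.1f}%, pooled {100 * pooled:.1f}%")
print(f"ssNMR estimate for whole slime: {100 * nmr_beta:.0f}%")
```

Under this naive pooling, the predicted fraction falls between the two per‐protein values and is of the same order as the ssNMR estimate for the whole slime.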

Figure 4.


NMR characterization of the slime. a) 1D 13C ssNMR DP‐MAS (black), CP‐MAS (green), and INEPT (blue) spectra overlaid for ES slime. Residues of interest are labeled in red. b) Peaks beyond 100 ppm overlaid in 1D 13C spectra. c) Deconvolution of the Gly peak in DP‐MAS spectrum. d) Mapping of Gly residues (red) on the predicted ES_P1 structure. e) 2D 1H‐13C INEPT spectrum of hydrated ES slime in the 0–72 ppm region.

2.7. Fiber Formation after Thermo‐Chemical Treatments

From structural predictions/characterization and SDS‐PAGE of reduced samples combined with LC‐MS/MS, it appears that β‐sheet domains and disulfide bonds are important structural and biochemical characteristics of the slime complex. To gain further insights into the role of these features in slime formation, we subjected the re‐dissolved slime to thermal denaturation (to disrupt β‐sheets) and/or treated it with DTT (to reduce disulfide bonds) and then attempted to draw fibers from the treated solutions. In native conditions, long fibers could readily be drawn (Movie S1, Supporting Information) as expected. When the re‐dissolved slime was heat‐treated to 70°C (Figure S14, Supporting Information), an increase in turbidity was observed and fiber formation was inhibited (Movie S2, Supporting Information), although some weak fibers could sometimes be obtained. In contrast, incubating the slime with a high concentration of DTT (to ensure complete reduction of the disulfide bonds as verified by SDS‐PAGE, see Figure S15, Supporting Information) did not inhibit fiber formation (Movie S3, Supporting Information). Finally, the combination of both heat‐treatment and DTT was the most efficient at inhibiting fiber formation, with no fibers observed in any case (Movie S4, Supporting Information). These results indicate that thermal denaturation of the slime proteins is most efficient at preventing subsequent fiber formation.

2.8. Updated Fiber Formation Model

Fiber formation has recently been proposed to occur by shear‐induced β‐sheet unfolding followed by stabilization mediated by electrostatic interactions of phosphorylated residues.[ 11 ] An earlier suggestion was that intermolecular disulfide bonds help in stabilizing the slime upon ejection.[ 16 ] Based on our findings, we propose an updated fiber formation model in Onychophora, as illustrated in Figure 5. Our results clearly support that native slime proteins do not exist as monomeric units but as multi‐protein complexes linking high MW with low MW proteins by disulfide bonds through Cys residues located at the termini of either ES_P1 or ES_P2 (Figure 2a and Figure S15, Supporting Information). However, since reducing inter‐molecular disulfide bonds in the resolubilized slime does not inhibit fiber drawing (Figure S15 and Movie S3, Supporting Information), we conclude that this latter mechanism is not critical for fiber formation. Multi‐protein complexes linked by disulfide bridges have previously been identified in Bombyx mori silk fibroins consisting of heavy and light protein chains, with the linkage located at the C‐terminus of the heavy protein chain.[ 35 ] Disulfide bridges may also occur in the slime proteins given that Cys residues in ES_P1 and ES_P2 are similarly placed at both termini. Interestingly, in silk fibroins, the light chain component does not directly contribute to protein structure or fiber formation but is crucial for proper cellular secretion of the heavy chain.[ 36 ] β‐sheets in the heavy chains, on the other hand, are well‐established to provide mechanical stability to silk fibers.[ 37 ] Disulfide‐linked complexes in Onychophora slime are discovered here for the first time, and we suggest that the low MW slime proteins may also assist in cellular secretion by preventing early aggregation, but this remains to be validated.

Figure 5.

Proposed fiber formation model in Onychophora slime (SMWSP: small molecular weight slime proteins). Slime proteins are mostly concentrated within nanodroplets as disulfide‐bonded complexes that also contain lipids. Protein complex and lipid phases may also be found between the droplets but at a lower concentration. Upon slime ejection, β‐sheet domains in ES_P1 and ES_P2 act as aggregation hot spots that mediate shear‐induced fiber formation, and lipids migrate towards the outside of the fiber to form a hydrophobic coating (note that in the cartoon, β‐sheet domains are not to scale).

Baer et al.[ 11 ] suggested that β‐sheets present in the initial nanodroplets undergo shear‐induced unfolding, such that the content of β‐sheets in the drawn fibers decreases compared to the nanodroplets. Our fiber formation experiments from the re‐solubilized slime following heat treatment suggest a different mechanism. Indeed, while bioinformatic predictions clearly indicate that ES_P1 and ES_P2 are mostly in random coil configuration in the soluble state, they also confirm the presence of a few β‐sheet domains that were experimentally identified in the re‐dissolved slime by WAXS and FTIR.[ 11 ] The latter are thus the only structural elements that can be thermally denatured and since heat‐treatment of the re‐solubilized slime inhibited fiber formation, these domains appear to be critical for fiber formation. In this picture, the short predicted β‐sheet domains of ES_P1 and ES_P2 (Figure 3b) may act as nucleation sites for shear‐induced intermolecular aggregation and fiber stabilization, concomitantly occurring with the fusion of nanoglobules into the nascent fiber.

PTMs previously detected in the slime include phosphorylation[ 8 ] and Pro hydroxylation,[ 7 ] but it was previously unknown which slime proteins are modified. According to LC MS/MS results, the large MW ES_P1 and ES_P2 are confirmed to contain HyPro whereas phosphorylation is restricted to smaller MW proteins. HyPro is the major PTM reported in spidroin proteins,[ 38 ] and has been linked to increasing protein stability[ 39 ] and mechanical properties of fibrous proteins by enhancing protein‐protein interactions. It is thus tempting to suggest a similar stabilization role in slime proteins. With regard to phosphorylation, its location within a low MW slime protein is intriguing and may suggest a role in linking different proteins in the complexes, similar to the more robust disulfide bonds. We note that according to the 2D DARR NMR spectrum (Figure S13, Supporting Information), the rigid region of the slime contains a high number (28) of the positively‐charged dipeptide GK, most likely from ES_P1, possibly indicating electrostatic pairing with the negatively‐charged phosphorylation side‐chains.

In addition to proteins, the slime also contains lipids and carbohydrates. While the presence of proteins and lipids within the nanoglobules has been observed with stimulated emission depletion microscopy,[ 6 ] further evidence seems necessary to confirm their spatial distribution, especially for the proteins located outside the nanoglobules. At this stage, we assume that proteins and lipids are both present inside and outside the nanoglobules. During slime ejection, we suggest that lipids move outwards and quickly coat the fibers (Figure 5) possibly to enhance their hydrophobicity and prevent early disaggregation. Glycosylation is another modification whose role in slime proteins remains elusive.[ 7 ] According to our INEPT NMR spectra (Figure S12, Supporting Information), glycosylation is located on the flexible regions of the slime, most likely on Ser residues (Figure 4e), thus making it unlikely that carbohydrate side‐chains play a structural role. Given the sticky nature of the slime, an adhesive functionality similar to that of the sericin in silk fibroins[ 40 ] is more probable. However, unlike sericins that are distinct proteins from silk fibroins, glycoprotein staining on SDS‐PAGE gels (Figure 2c) suggests that glycosylation is post‐translationally incorporated into the structural proteins ES_P1 and ES_P2.

Finally, an important finding of our study is the identification of LC sequence domains in the N‐termini of both ES_P1 and ES_P2, with the construct ES_P1(31‐83) demonstrated to exhibit LLPS (Figure 3c). Hence, our data suggest that the velvet worm slime may be added to the growing list of extracellular biological materials, including mussel fibers,[ 41 ] spider silk,[ 42 ] or squid beak,[ 43 ] whose biofabrication is mediated by intermediate phases formed by LLPS. Exploiting this mechanism, the velvet worm may be able to stockpile protein complexes in a concentrated state within the slime gland while at the same time preventing their aggregation prior to fast ejection.

While there remain outstanding questions pertaining to the reversible liquid‐to‐solid transition of the viscous slime into strong adhesive fibers, the complete molecular characterization of all main slime proteins is an important step in this direction and paves the way toward sustainable fabrication of fully recyclable (bio)polymeric materials. We also point out that our study focused on the slime of velvet worms from the Peripatidae family, which is evolutionarily distinct from the Peripatopsidae family for which most biochemical and biophysical data have been obtained to date. Whereas sequence alignment of slime proteins from both families shows high similarity, comparative studies are currently underway to address whether different species have evolved specific biochemical strategies.

3. Experimental Section

Specimens and Slime Collection

Specimens of Onychophora were collected in the local secondary forest in Singapore (Permit No NP/RP19‐037 obtained from the National Parks Board, Singapore) near the island's coast and maintained in plastic boxes with perforated lids. The specimens were kept at a cool temperature (20–26°C), and a 2–3 cm layer of sphagnum moss was used to maintain sufficient moisture. The worms were fed every week with crickets. To enrich the slime with 13C for NMR studies, each cricket was injected with isotope‐labeled D‐glucose (ca. 100 µL of 5 M, U‐13C6, 99% from Cambridge Isotope Laboratories) prior to feeding. Slime was collected every week by stimulating the specimens with a brush and directing the ejected slime onto a pipette tip. To maximize the collection, slime was allowed to dry on the tips and stored at −20°C until further use. Before use, the dried slime was scraped off the tip in pieces under a dissection microscope. Slime samples were prepared as dried pieces for NMR or redissolved in a minimum amount of water for other experiments.

RNA Extraction, Sequencing, and Analysis

Total RNA was extracted from isolated slime glands with Trizol (Thermo Fisher, United States) following the manufacturer's protocol. In brief, 1 mL Trizol solution was added to the isolated slime gland (<50 mg). The sample was vortexed for 5 min and further sonicated on ice with a microtip at 40% power for 2–3 s. After 5 min of incubation at room temperature, 0.2 mL chloroform was added and the mixture was centrifuged at 12 000 × g, 4°C for 15 min. The top aqueous phase was then transferred to a new tube with an equal volume of ethanol (70%). After gentle mixing with the pipette, the solution was transferred to an RNeasy mini kit column (Qiagen, Germany) and centrifuged at 8000 × g for 30 s to capture the RNA on the membrane. After washing twice with the RW1 solution (700 µL) and RPE solution (600 µL) provided with the kit, respectively, the spin column was transferred to a new collection column and total RNA was eluted with RNase‐free water (30 µL) by centrifugation at 8000 × g for 2 min. The extracted RNA was stored at −80°C until further use in RNA sequencing.

RNA seq libraries were then prepared using Illumina compatible NEXTflex Rapid Directional RNA‐Seq kit according to manufacturer's protocol. Sequencing was then performed using 2×150 bp read length on the HiSeq 2000. The raw fastq reads were checked with FastQC[ 44 ] and trimmed by Trimmomatic.[ 45 ] The paired‐end reads were de novo assembled by Trinity[ 21 ] with default settings. Protein sequences were then predicted with TransDecoder.[ 22 ]
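The read-processing chain described above (FastQC, Trimmomatic, Trinity, TransDecoder) can be sketched as a list of command lines. This is only an illustration, not the authors' script: the file names, output directory, trimming steps, CPU and memory values, and the location of the assembled FASTA file are all assumptions, and the `trimmomatic` wrapper name depends on how the tool is installed.

```python
from pathlib import Path


def build_pipeline(r1: str, r2: str, outdir: str = "slime_rnaseq") -> list[list[str]]:
    """Assemble (but do not run) the commands for QC, trimming,
    de novo assembly and ORF prediction of paired-end reads."""
    out = Path(outdir)
    qc_dir = out / "fastqc"
    # Hypothetical names for trimmed paired/unpaired read files.
    p1, u1 = out / "R1.paired.fq.gz", out / "R1.unpaired.fq.gz"
    p2, u2 = out / "R2.paired.fq.gz", out / "R2.unpaired.fq.gz"
    trinity_dir = out / "trinity_out"
    # Assumed location of the assembly; it differs between Trinity versions.
    assembly = trinity_dir / "Trinity.fasta"

    return [
        ["fastqc", r1, r2, "-o", str(qc_dir)],
        # Trimming steps are illustrative; the source does not list them.
        ["trimmomatic", "PE", r1, r2, str(p1), str(u1), str(p2), str(u2),
         "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # Resource values are placeholders; other options stay at defaults.
        ["Trinity", "--seqType", "fq", "--left", str(p1), "--right", str(p2),
         "--CPU", "8", "--max_memory", "50G", "--output", str(trinity_dir)],
        ["TransDecoder.LongOrfs", "-t", str(assembly)],
        ["TransDecoder.Predict", "-t", str(assembly)],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("gland_R1.fq.gz", "gland_R2.fq.gz"):
        print(" ".join(cmd))
```

Returning the commands instead of executing them keeps the sketch easy to inspect; each list could be passed to `subprocess.run` once the paths and resources have been adapted.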

Verification of Full‐Length Sequencing

Re‐sequencing primer sets covering base pairs 83 to 6262 of ES_P1 and base pairs 173 to 4626 of ES_P2 were designed in NCBI with default settings, using the whole RNA‐seq result as the database. Primers with Tm close to 65°C with Platinum Taq were selected for PCR. PCRs were performed according to the manufacturer's instructions with adaptations. Each 50 µL reaction consisted of dNTPs (250 µM), 1x Phusion Plus buffer (with 1.7 mM MgCl2, Thermo Fisher, United States), primers (400 nm L−1 of each), Phusion Plus polymerase (0.05 U, Thermo Fisher, United States), and cDNA (20 ng). PCR products were sent for Sanger sequencing (1st base, Singapore) and results were analyzed with Unipro UGENE.[ 46 ]

Protein Sample Preparation & SDS‐PAGE Staining

Protein concentrations in the collected slime samples were measured by Qubit (Thermo Fisher, United States) with protein broad range assay kits. Proteins (≈15 µg) were loaded into a 4–15% mini gel (Bio‐Rad, England). Slime was treated as follows: i) native slime (N) was directly collected from the worm without further treatment; ii) dephosphorylated (DP) slime was treated with lambda protein phosphatase (New England Biolabs, United States) according to the manufacturer's protocol; iii) reduced (R) and thermally‐denatured (TD) slime were treated with 5 mM (1 mg mL−1) DTT without or with heating to 70°C for 10 min, respectively; iv) deglycosylated slime was treated with protein deglycosylation mix II (New England Biolabs, United States) under non‐denaturing (DG) or denaturing conditions (DG‐R) according to the manufacturer's protocol. Electrophoresis was carried out at 150 V constant voltage for 60 min until the tracking dye reached the bottom of the acrylamide gel. The gel was then incubated in sensitization buffer (30% v/v ethanol, 10% acetic acid, and 10% methanol) for 1 h. After being washed with RO water, the gel was subjected to different staining chemicals accordingly. Coomassie blue staining was performed according to Amini et al.[ 47 ] to visualize all protein contents.

PAGE on slime samples was conducted using a Bio‐Rad Protean II electrophoresis system with hand‐cast 6% Bis‐tris large‐format gels prepared at pH 6.4 and MOPS running buffer at pH 7.7 (50 mM MOPS, 50 mM Tris, 0.03% EDTA, 0.1% SDS). 15 µL of 1 mg mL−1 slime mixed with 4x NuPAGE LDS Sample Buffer was loaded per well. Electrophoresis was carried out at 150 V constant voltage initially and the current was maintained under 100 A throughout to prevent excessive heating of the PAGE system until the dye‐front reached the bottom of the gel. The gels were incubated in sensitization buffer containing 30% v/v ethanol, 10% v/v acetic acid, and 10% v/v methanol to fix the gel and remove the excess running buffer dye, followed by a quick rinse with water. Depending on the sample treatment, the gels were stained using ProQ diamond stain according to the manufacturer's protocol (Invitrogen) to detect phosphoproteins, or Alcian blue/silver stain for glycoprotein detection, and imaged accordingly. Ovalbumin was used as a positive control for the phosphorylated protein stain, and fetuin was used as the control for the glycosylated protein stain.

Liquid Chromatography Tandem Mass Spectrometry (LC MS/MS)

The LC MS/MS experiments were performed according to Amini et al.[ 47 ] with modifications. Briefly, protein bands were excised from the Coomassie‐blue stained gel and subjected to overnight in‐gel digestion with trypsin (1.25 µg, Promega, sequencing grade) or AspN (1 µg, NEB, England). Digested peptides were extracted, dried, and desalted with Oasis HLB 1cc 30 mg columns (Waters, WAT094225) for LC MS/MS analysis using the Easy‐nLC system coupled with an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Scientific) and separated on a 50 cm x 75 µm Easy‐Spray column. Peptides were separated over a 70 min gradient, using mobile phase A (0.1% formic acid in water) and mobile phase B (0.1% formic acid in 95% acetonitrile), and eluted at a constant flow rate of 300 nL min−1 using 2–27% acetonitrile over 45 min, ramped to 50% over 15 min, then to 90% over 5 min and held for 5 min. Acquisition parameters: data‐dependent acquisition (DDA) with survey scan of 60 000 resolution, automatic gain control (AGC) target of 4e5, and maximum injection time (IT) of 100 ms; MS/MS collision‐induced dissociation in ion trap, AGC target of 1.5e4, and maximum IT of 50 ms; collision energy 35%, isolation window 1.2 m/z.
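The elution program of the first LC run can be written as a small table of linear segments, which makes it easy to confirm that the steps add up to the stated 70 min and to look up the nominal acetonitrile percentage at any time point. The sketch below is only a reading aid built from the values in the text; the function names are hypothetical and linear interpolation within each segment is an assumption.

```python
# (duration in min, % acetonitrile at start, % acetonitrile at end),
# taken from the gradient described in the text.
SEGMENTS = [
    (45.0, 2.0, 27.0),   # 2-27% over 45 min
    (15.0, 27.0, 50.0),  # ramp to 50% over 15 min
    (5.0, 50.0, 90.0),   # ramp to 90% over 5 min
    (5.0, 90.0, 90.0),   # hold for 5 min
]


def total_time(segments=SEGMENTS) -> float:
    """Total gradient length in minutes."""
    return sum(duration for duration, _, _ in segments)


def percent_at(t: float, segments=SEGMENTS) -> float:
    """Nominal % acetonitrile at time t (min), assuming linear segments."""
    if t < 0 or t > total_time(segments):
        raise ValueError("time outside the gradient")
    elapsed = 0.0
    for duration, start, end in segments:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    return segments[-1][2]


if __name__ == "__main__":
    print(total_time())
    for t in (0, 22.5, 45, 60, 65, 70):
        print(t, percent_at(t))
```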

The bands further separated on the long‐range gel were excised from the zinc‐imidazole‐stained gel and reduced with 10 mM DTT at 70°C for 10 min. Individual bands were then stacked on a 2nd long‐range gel and electrophoresis was carried out at 150 V constant voltage for another 40 min. After Coomassie blue staining, the protein bands from the 2nd long‐range gel were excised and in‐gel digestion was performed with 1 µg AspN in 100 mM TEAB. Digested peptides were extracted and desalted for LC MS/MS analysis as above, but with an Orbitrap Fusion Tribrid mass spectrometer (Thermo Scientific). Separation parameters were as above, except that peptides were eluted at a constant flow rate of 300 nL min−1 using 2–27% acetonitrile over 45 min, ramped to 55% over 15 min, then to 95% over 5 min, and held for 5 min. Acquisition parameters were as above, with an isolation window of 1.6 m/z instead.

Peak lists were generated in Proteome Discoverer 2.4 (Thermo Scientific) using Mascot 2.6.1 (Matrix Science) and concatenated forward/decoy protein sequences obtained from RNAseq. Search parameters: MS precursor mass tolerance 10 ppm, MS/MS fragment mass tolerance 0.8 Da, 3 missed cleavages; static modifications: Carbamidomethyl (C); variable modifications: Acetyl (Protein N‐term), Deamidated (NQ), Nitro (Y), Oxidation (M), Oxidation (P), Phospho (ST), Phospho (Y). False discovery rate estimation with 2 levels: Strict = FDR 1%, Medium = FDR 5%. Precursor mass peak (MS1) intensities were quantified by label‐free quantification (LFQ) using the Minora feature detector. Peptides present in only one of the replicates were removed prior to quantification analysis. The total abundance of unique peptides from the same master protein was then filtered by a threshold of 10^7 and sorted to identify the major proteins in each sample/band.
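The post-search filtering described above can be sketched as follows. This is a minimal illustration rather than the authors' workflow: the record layout, the peptide names, and the abundance values are hypothetical, the 1e7 cutoff assumes the threshold in the text denotes ten to the seventh power, and summing abundances across replicates is an assumption, since the source does not state how replicates were combined.

```python
from collections import defaultdict

ABUNDANCE_THRESHOLD = 1e7  # assumed reading of the threshold in the text


def major_proteins(peptides, threshold=ABUNDANCE_THRESHOLD):
    """Rank master proteins by summed peptide abundance.

    `peptides` is an iterable of dicts with the keys 'sequence', 'protein'
    and 'abundances' (one value per replicate; None or 0 if not detected).
    """
    totals = defaultdict(float)
    for pep in peptides:
        detected = [a for a in pep["abundances"] if a]
        # Drop peptides seen in fewer than two replicates.
        if len(detected) < 2:
            continue
        totals[pep["protein"]] += sum(detected)
    # Keep proteins at or above the threshold, most abundant first.
    kept = [(prot, total) for prot, total in totals.items() if total >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    example = [  # hypothetical values for illustration only
        {"sequence": "PEPA", "protein": "ES_P1", "abundances": [6e6, 7e6]},
        {"sequence": "PEPB", "protein": "ES_P1", "abundances": [5e6, None]},
        {"sequence": "PEPC", "protein": "ES_P2", "abundances": [2e7, 3e7]},
        {"sequence": "PEPD", "protein": "minor", "abundances": [1e6, 2e6]},
    ]
    for protein, total in major_proteins(example):
        print(protein, total)
```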

Size Exclusion Chromatography of Slime

Native slime and slime reduced by DTT were passed through a Superose 6 prep grade size exclusion chromatography column (17‐0489‐01, GE Healthcare). The data were acquired on the UNICORN 7.5 software with ÄKTA Pure (Cytiva, Sweden) Fast Protein Liquid Chromatography (FPLC) system.

Dynamic Light Scattering (DLS)

The size distributions of nanoglobules from the redissolved slime in native and reduced conditions were measured by DLS with a Nano‐ZS instrument (Malvern Instruments, UK) and the Zetasizer software version 7.11. Redissolved slime samples with or without DTT were loaded into a 3 mm quartz cuvette (Hellma, GmbH) and data were acquired after 300 s equilibration time at 25 °C. Dispersion Technology Software (DTS) was used for data collection and analysis and the size distribution was analyzed in terms of scattering intensity.

Prediction Tools for Disordered Regions and Structure

Full sequences of ES_P1 and ES_P2 were submitted to web servers to detect disordered regions from primary sequences: Espritz,[ 48 ] DisEMBL,[ 49 ] FoldIndex,[ 50 ] PONDR,[ 51 ] Globplot,[ 52 ] IUPred,[ 53 ] and Anchor.[ 54 ] Secondary structures for ES_P1 and ES_P2 were predicted using the AlphaFold2 Google Colab notebook, with each protein predicted as two halves (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). Regions with a pLDDT score below 50 were classified as disordered. The PDB structure files were combined using the ab initio domain assembly (AIDA) server (https://aida.godziklab.org/).
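The pLDDT criterion above amounts to finding contiguous runs of residues whose score falls below 50. A minimal sketch, assuming the per-residue scores have already been read into a Python list (for example from the B-factor column of the predicted PDB files, which is not shown here); the function name and the `min_length` parameter are hypothetical additions.

```python
def disordered_regions(plddt, cutoff=50.0, min_length=1):
    """Return 1-based inclusive (start, end) residue ranges with pLDDT < cutoff.

    `plddt` is a list of per-residue scores in sequence order.
    """
    regions = []
    start = None
    for i, score in enumerate(plddt, start=1):
        if score < cutoff:
            if start is None:
                start = i  # open a new low-confidence run
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start + 1 >= min_length:
        regions.append((start, len(plddt)))
    return regions


if __name__ == "__main__":
    scores = [30, 40, 80, 90, 45, 49.9, 50, 20]  # made-up scores
    print(disordered_regions(scores))
```

A residue scoring exactly 50 is treated as ordered here, matching the "below 50" wording; because each protein was predicted as two halves, residue numbers from the second half would need to be offset before merging.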

Solid State Nuclear Magnetic Resonance (ssNMR)

Dried slime collected over 3 months was loaded into a Bruker 3.2 mm regular wall ZrO2 magic angle spinning (MAS) rotor. NMR data were collected on an 800 MHz Bruker Avance III instrument equipped with a 3.2 mm H/C/N EFree MAS solid‐state probe. 1D 1H–13C cross‐polarization (CP), 13C direct‐polarization (DP), 1D and 2D 13C Insensitive Nuclei Enhanced Polarization Transfer (INEPT), and 2D 13C–13C Dipolar‐assisted Rotational Resonance (DARR) experiments were performed with the MAS spinning frequency set at 15.151 kHz and the variable temperature set at −12°C, which gave an actual sample temperature of 11°C based on the external calibration. Chemical shifts were referenced using the DSS scale with adamantane as a secondary standard for 13C (downfield signal at 40.48 ppm) and were calculated indirectly for 1H. The 1H‐13C CP spectrum was collected with a contact time of 1200 µs. 83 kHz SPINAL‐64 1H decoupling was implemented during data acquisition. The recycle delays were 1.5 and 30 s in the 1D CP and DP MAS experiments, respectively, and the acquisition time was 19.1 ms in both experiments. Additional parameters of the 2D 13C–13C DARR experiment included a 1.5 s recycle delay, a 45 453 Hz sweep width in both dimensions, and 14.1 and 0.9 ms acquisition times in the direct and indirect dimensions, respectively. Data were processed using Topspin 3.6.
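A short arithmetic check relates the DARR acquisition parameters to each other. The stated sweep width equals three times the MAS frequency, which would be consistent with rotor-synchronized sampling, although the source does not say this explicitly. The point counts below assume that the acquisition time equals the number of complex points times a dwell time of 1/sweep width; they are estimates, not values reported by the authors.

```python
MAS_RATE_HZ = 15_151.0      # MAS spinning frequency from the text
SWEEP_WIDTH_HZ = 45_453.0   # DARR sweep width in both dimensions
ACQ_DIRECT_S = 14.1e-3      # direct-dimension acquisition time
ACQ_INDIRECT_S = 0.9e-3     # indirect-dimension acquisition time


def rotor_period_us(mas_rate_hz: float) -> float:
    """Length of one rotor period in microseconds."""
    return 1e6 / mas_rate_hz


def sweep_to_mas_ratio(sweep_width_hz: float, mas_rate_hz: float) -> float:
    """Ratio of spectral width to spinning frequency."""
    return sweep_width_hz / mas_rate_hz


def complex_points(acq_time_s: float, sweep_width_hz: float) -> int:
    """Estimated complex points, assuming dwell time = 1 / sweep width."""
    return round(acq_time_s * sweep_width_hz)


if __name__ == "__main__":
    print(rotor_period_us(MAS_RATE_HZ))
    print(sweep_to_mas_ratio(SWEEP_WIDTH_HZ, MAS_RATE_HZ))
    print(complex_points(ACQ_DIRECT_S, SWEEP_WIDTH_HZ))
    print(complex_points(ACQ_INDIRECT_S, SWEEP_WIDTH_HZ))
```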

NMR Peak Assignment

1D 13C peaks of Ile (Cβ, γ, δ at 36.1, 17.4, 14.1 ppm), Val (Cβ, γ at 32.2, 21.4 ppm), Thr (Cβ, γ at 72.6, 21.4 ppm), Leu (Cβ, δ at 42.1, 25.2 ppm), Lys (Cβ, γ, ε at 32.2, 25.2, 42.1 ppm), Glu (Cγ, δ at 36.1, 183.8 ppm), Pro (Cβ, γ, δ at 32.2, 27.2, 50.8 ppm), Ser (Cβ at 63.9 ppm), Asp (Cβ, γ at 38.2, 180.2 ppm), and Gly (Cα) were assigned based on the average chemical shifts.[ 55 ] Cδ, Cε, Cζ of Phe, and Cγ, Cδ of Tyr were assigned to the peak cluster at 129.8–132.4 ppm, Phe Cγ at 139.2 ppm, Tyr Cε at 118.2 ppm, and Cζ of Tyr/Arg at 159.5 ppm.
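Several of the listed peaks are shared by more than one residue type (for example 32.2 ppm for Val, Lys, and Pro Cβ), so a given 1D peak often has multiple candidate assignments. The lookup below is a small sketch built only from the side-chain shifts listed in this paragraph; the ASCII atom labels (CB, CG, CD, CE, CZ) stand for Cβ, Cγ, Cδ, Cε, Cζ, and the 0.3 ppm matching tolerance is an arbitrary assumption.

```python
# (residue, atom, 13C shift in ppm) as listed in the text.
PEAKS = [
    ("Ile", "CB", 36.1), ("Ile", "CG", 17.4), ("Ile", "CD", 14.1),
    ("Val", "CB", 32.2), ("Val", "CG", 21.4),
    ("Thr", "CB", 72.6), ("Thr", "CG", 21.4),
    ("Leu", "CB", 42.1), ("Leu", "CD", 25.2),
    ("Lys", "CB", 32.2), ("Lys", "CG", 25.2), ("Lys", "CE", 42.1),
    ("Glu", "CG", 36.1), ("Glu", "CD", 183.8),
    ("Pro", "CB", 32.2), ("Pro", "CG", 27.2), ("Pro", "CD", 50.8),
    ("Ser", "CB", 63.9),
    ("Asp", "CB", 38.2), ("Asp", "CG", 180.2),
    ("Phe", "CG", 139.2), ("Tyr", "CE", 118.2), ("Tyr/Arg", "CZ", 159.5),
]


def candidates(shift_ppm: float, tolerance: float = 0.3):
    """Return (residue, atom) pairs whose listed shift lies within tolerance."""
    return [
        (residue, atom)
        for residue, atom, ppm in PEAKS
        if abs(ppm - shift_ppm) <= tolerance
    ]


if __name__ == "__main__":
    for peak in (32.2, 36.1, 63.9):
        print(peak, candidates(peak))
```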

Lectin Binding Assay

Lectin‐binding was tested using a dot blot assay and fluorescence measurement using an enzyme‐linked immunosorbent assay (ELISA) with FITC‐labeled lectins.[ 28 ] Bovine serum albumin (BSA) was used as negative control and fetuin (WGA‐binding glycoprotein) was used as positive control. For the dot blot assay, nitrocellulose membrane strips were wetted with 1x Tris‐buffered saline (50 mM Tris, 0.5 M NaCl, pH 7.4) with 0.05% Tween 20 detergent (TBST) for 1 h with constant shaking. Membranes were taken out of the buffer, 5 µL each of 0.1 mg mL−1 BSA, slime, and fetuin were added onto the nitrocellulose membrane, and the drops were left to dry for 30 min. The membranes were incubated in blocking buffer (5% w/v skim milk powder in TBST) for 2 h with continuous shaking. The membranes were washed 2x for 10 min each with TBST buffer to remove any remaining blocking buffer. The membranes were incubated in a 1:200 dilution of 1 mg mL−1 FITC‐labeled lectins in TBST buffer with continuous shaking in the dark for 2 h. The membranes were washed 2x with TBST buffer and imaged using a ChemiDoc system with Alexa Fluor 488 dye settings.

For fluorescence measurements of the binding assay, 50 µL triplicates of varying concentrations of slime were added per well of a transparent Nunclon Delta Surface 96‐well plate (Thermo Scientific) and left to incubate overnight at 4°C to ensure protein binding. Each well was washed twice with 100 µL of 1x phosphate‐buffered saline with Tween‐20 (PBST: 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, 0.1% w/v Tween‐20) to remove unbound slime. Each FITC‐labeled lectin stock (1 mg mL−1) was diluted 1:200 in PBST to make the working solution. 100 µL of FITC‐labeled lectin solution was added to each well, and the plate was incubated for 2 h in the dark. The lectin solution was removed and each well was washed once with 100 µL PBST. 100 µL PBST was added to each well and fluorescence values were measured (ex: 490 nm; em: 525 nm). The values were plotted using Originlab.
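For orientation, the lectin working concentration and the amount added per well follow from simple dilution arithmetic. The sketch assumes that "1:200" means one part stock in 200 parts final volume; the function names are hypothetical.

```python
def working_concentration_ug_per_ml(stock_mg_per_ml: float, dilution_factor: float) -> float:
    """Concentration after dilution, converted from mg/mL to µg/mL."""
    return stock_mg_per_ml * 1000.0 / dilution_factor


def mass_per_well_ug(conc_ug_per_ml: float, volume_ul: float) -> float:
    """Mass of lectin delivered to one well, in µg."""
    return conc_ug_per_ml * volume_ul / 1000.0


if __name__ == "__main__":
    conc = working_concentration_ug_per_ml(1.0, 200)  # 1 mg/mL stock, 1:200
    print(conc, "µg/mL working solution")
    print(mass_per_well_ug(conc, 100), "µg lectin per 100 µL well")
```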

Recombinant Purification of Slime Proteins

The plasmid for the ES_P1(31‐83) protein was synthesized (Bio‐basic Asia), cloned into a kanamycin‐resistant pSUMO‐LIC vector with a cleavable N‐terminal His‐SUMO tag, and expressed in chloramphenicol‐resistant BL21(DE3) Rosetta T1R E. coli cells. Glycerol stocks were stored at −80°C and were used to inoculate fresh autoclaved Luria Bertani growth medium for overnight growth at 37°C at 220 rpm. These starter cultures were used to grow E. coli at a large scale in Terrific Broth media at 37°C until OD600nm reached 1, and expression was induced with 0.5 mM isopropyl‐beta‐D‐1‐thiogalactopyranoside (IPTG) at 18°C for 18–20 h. Cell pellets were harvested by centrifugation at 6000 rpm and resuspended in lysis buffer containing 50 mM Tris–HCl, pH 7.8, 0.2% Triton X‐100, with 1 mM phenylmethylsulfonyl fluoride (PMSF) added immediately before lysis using sonicator followed by lyophilizer. The lysed sample was centrifuged at 40 000 × g and the protein present in the supernatant was bound to a Ni‐NTA column (Cytiva, Sweden) and eluted in 50 mM Tris–HCl buffer at pH 8 containing 500 mM imidazole, 100 mM NaCl, and 1 mM TCEP. Imidazole was removed using a HiTrap column (Cytiva, Sweden) before protease cleavage of the N‐terminal SUMO tag. The ES_P1(31‐83) was collected by reverse‐IMAC binding to Ni‐NTA, whereby the protein eluted in the flowthrough and the SUMO tag remained bound to the resin. The ES_P1(31‐83) protein stock for LLPS conditions was stored at 4°C in CHAPS buffer, pH 11.

Amino Acid Analysis

Triplicates of slime solutions in water (100 mg) were hydrolyzed in 500 mL of 6 M HCl solution containing 0.5% phenol in vacuo at 110°C for 24 h. Solvents were removed by centrifugal vacuum and the hydrolysates were washed twice with water. Dried samples were kept at −20°C prior to analysis. For cysteine determination, triplicates of slime solutions in water were oxidized with 500 mL of performic acid (9:1 formic acid: hydrogen peroxide) on ice for 18 h. 100 mL of hydrobromic acid was added to the solution, after which solvents were removed by speed vacuum. Samples were then hydrolyzed in 500 mL of 6 M HCl solution with 0.5% phenol under vacuum at 110°C for 24 h. Solvents were removed by speed vacuum, after which hydrolysates were washed twice with water and kept at −20°C prior to analysis. For analysis, hydrolyzed samples (0.5 mg mL−1) were dissolved in a pH 2.2 citrate buffer containing 0.1 mM Nitrotyrosine as internal standard. For performic acid‐treated samples, 0.1 mM Norleucine was used as the internal standard. Composition analysis was performed with an amino acid analyzer S 433 (Sykam, Germany) using a ninhydrin buffer system.

Conflict of Interest

G.P. and A.U. are employees of ExxonMobil, which funded the project.

Supporting information

Supporting Information

Supporting Movie 1

Supporting Movie 2

Supporting Movie 3

Supporting Movie 4

Acknowledgements

Y.L. and B.S. contributed equally to this work. This research was funded by ExxonMobil through the Singapore Energy Research Center (SgEC). The authors also acknowledge financial support from the Singapore Ministry of Education (MOE) through an Academic Research Fund (AcRF) Tier 3 grant (grant no. MOE 2019‐T3‐1‐012). Y.T.L. and R.M.S. thank the support of A*STAR Core funding and the Singapore National Research Foundation under its NRF‐SIS “SingMass” scheme (RMS). The authors would like to thank Jun Jie Loke for his technical input on long‐gel electrophoresis and optical microscopy. The authors thank Ming Wei Chen of the Protein Production Platform at NTU for providing the cloned plasmid for recombinant expression of slime protein. ssNMR experiments were performed at the NTU Center of High Field NMR Spectroscopy and Imaging. The authors also thank Georg Mayer, Alexander Baer, and Matthew Harrington for insightful discussions.

Lu Y., Sharma B., Soon W. L., Shi X., Zhao T., Lim Y. T., Sobota R. M., Hoon S., Pilloni G., Usadi A., Pervushin K., Miserez A., Complete Sequences of the Velvet Worm Slime Proteins Reveal that Slime Formation is Enabled by Disulfide Bonds and Intrinsically Disordered Regions. Adv. Sci. 2022, 9, 2201444. 10.1002/advs.202201444

Data Availability Statement

The assembled transcriptome and raw RNAseq data have been submitted to NCBI under Bioproject PRJNA806368. The raw spectra and search data have been uploaded to the jPOST repository with the following accession numbers: JPST001465 (jPOST) and PXD031722 (ProteomeXchange).

References

  • 1. Sahni V., Harris J., Blackledge T. A., Dhinojwala A., Nat. Commun. 2012, 3, 1106.
  • 2. Waite J. H., J. Exp. Biol. 2017, 220, 517.
  • 3. Wolff J. O., Grawe I., Wirth M., Karstedt A., Gorb S. N., Soft Matter 2015, 11, 2394.
  • 4. Sharma B., Malik P., Jain P., Mater. Today Commun. 2018, 16, 353.
  • 5. Hayashi C. Y., Shipley N. H., Lewis R. V., Int. J. Biol. Macromol. 1999, 24, 271.
  • 6. Baer A., Schmidt S., Haensch S., Eder M., Mayer G., Harrington M. J., Nat. Commun. 2017, 8, 974.
  • 7. Benkendorff K., Beardmore K., Gooley A. A., Packer N. H., Tait N. N., Comp. Biochem. Physiol., Part B: Biochem. Mol. Biol. 1999, 124, 457.
  • 8. Haritos V. S., Niranjane A., Weisman S., Trueman H. E., Sriskantha A., Sutherland T. D., Proc. R. Soc. B 2010, 277, 3255.
  • 9. Röper H., Z. Naturforsch. C 1977, 32, 57.
  • 10. Baer A., Hänsch S., Mayer G., Harrington M. J., Schmidt S., Biomacromolecules 2018, 19, 4034.
  • 11. Baer A., Horbelt N., Nijemeisland M., Garcia S. J., Fratzl P., Schmidt S., Mayer G., Harrington M. J., ACS Nano 2019, 13, 4992.
  • 12. Johnston E. R., Miyagi Y., Chuah J.‐A., Numata K., Serban M. A., ACS Biomater. Sci. Eng. 2018, 4, 2815.
  • 13. Addison J. B., Weber W. S., Mou Q., Ashton N. N., Stewart R. J., Holland G. P., Yarger J. L., Biomacromolecules 2014, 15, 1269.
  • 14. Baer A., Schmidt S., Mayer G., Harrington M. J., Integr. Comp. Biol. 2019, 59, 1690.
  • 15. Matsuhira T., Osaki S., Polym. J. 2015, 47, 456.
  • 16. Graham L. D., Glattauer V., Li D., Tyler M. J., Ramshaw J. A. M., Comp. Biochem. Physiol., Part B: Biochem. Mol. Biol. 2013, 165, 250.
  • 17. Hagn F., Eisoldt L., Hardy J. G., Vendrely C., Coles M., Scheibel T., Kessler H., Nature 2010, 465, 239.
  • 18. Askarieh G., Hedhammar M., Nordling K., Saenz A., Casals C., Rising A., Johansson J., Knight S. D., Nature 2010, 465, 236.
  • 19. Guerette P. A., Hoon S., Seow Y., Raida M., Masic A., Wong F. T., Ho V. H. B., Kong K. W., Demirel M. C., Pena‐Francesch A., Amini S., Tay G. Z., Ding D., Miserez A., Nat. Biotechnol. 2013, 31, 908.
  • 20. Li P., Banjade S., Cheng H.‐C., Kim S., Chen B., Guo L., Llaguno M., Hollingsworth J. V., King D. S., Banani S. F., Russo P. S., Jiang Q.‐X., Nixon B. T., Rosen M. K., Nature 2012, 483, 336.
  • 21. Grabherr M. G., Haas B. J., Yassour M., Levin J. Z., Thompson D. A., Amit I., Adiconis X., Fan L., Raychowdhury R., Zeng Q., Chen Z., Mauceli E., Hacohen N., Gnirke A., Rhind N., di Palma F., Birren B. W., Nusbaum C., Lindblad‐Toh K., Friedman N., Regev A., Nat. Biotechnol. 2011, 29, 644.
  • 22. Haas B. J., Papanicolaou A., Yassour M., Grabherr M., Blood P. D., Bowden J., Couger M. B., Eccles D., Li B., Lieber M., MacManes M. D., Ott M., Orvis J., Pochet N., Strozzi F., Weeks N., Westerman R., William T., Dewey C. N., Henschel R., LeDuc R. D., Friedman N., Regev A., Nat. Protoc. 2013, 8, 1494.
  • 23. Murray D. T., Kato M., Lin Y., Thurber K. R., Hung I., McKnight S. L., Tycko R., Cell 2017, 171, 615.
  • 24. Jonassen I., Collins J. F., Higgins D. G., Protein Sci. 1995, 4, 1587.
  • 25. Shewry P. R., Halford N. G., Belton P. S., Tatham A. S., Philos. Trans. R. Soc. London, Ser. B 2002, 357, 133.
  • 26. Inoue S., Tanaka K., Arisaka F., Kimura S., Ohtomo K., Mizuno S., J. Biol. Chem. 2000, 275, 40517.
  • 27. Ashton N. N., Stewart R. J., FASEB J. 2019, 33, 572.
  • 28. Uchiyama N., Kuno A., Koseki‐Kuno S., Ebe Y., Horio K., Yamada M., Hirabayashi J., Methods Enzymol. 2006, 415, 341.
  • 29. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A., Bridgland A., Meyer C., Kohl S. A. A., Ballard A. J., Cowie A., Romera‐Paredes B., Nikolov S., Jain R., Adler J., Back T., Petersen S., Reiman D., Clancy E., Zielinski M., Steinegger M., Pacholska M., Berghammer T., Bodenstein S., Silver D., Vinyals O., et al., Nature 2021, 596, 583.
  • 30. Rising A., Johansson J., Nat. Chem. Biol. 2015, 11, 309.
  • 31. Holland G. P., Creager M. S., Jenkins J. E., Lewis R. V., Yarger J. L., J. Am. Chem. Soc. 2008, 130, 9871.
  • 32. Spera S., Bax A., J. Am. Chem. Soc. 1991, 113, 5490.
  • 33. Arnold A. A., Genard B., Zito F., Tremblay R., Warschawski D. E., Marcotte I., Biochim. Biophys. Acta, Biomembr. 2015, 1848, 369.
  • 34. Knothe G., Nelsen T. C., J. Chem. Soc., Perkin Trans. 2 1998, 10.1039/a801617h.
  • 35. Zhou C., Confalonieri F., Jacquet M., Perasso R., Li Z., Janin J., Proteins: Struct., Funct., Bioinf. 2001, 44, 119.
  • 36. Mori K., Tanaka K., Kikuchi Y., Waga M., Waga S., Mizuno S., J. Mol. Biol. 1995, 251, 217.
  • 37. Keten S., Xu Z., Ihle B., Buehler M. J., Nat. Mater. 2010, 9, 359.
  • 38. dos Santos‐Pinto J. R. A., Arcuri H. A., Esteves F. G., Palma M. S., Lubec G., Sci. Rep. 2018, 8, 14674.
  • 39. Kaelin W. G. Jr., Annu. Rev. Biochem. 2005, 74, 115.
  • 40. Kunz R. I., Brancalhão R. M. C., Ribeiro L. de F. C., Natali M. R. M., BioMed. Res. Int. 2016, 2016, 8175701.
  • 41. Priemel T., Palia G., Förste F., Jehle F., Sviben S., Mantouvalou I., Zaslansky P., Bertinetti L., Harrington M. J., Science 2021, 374, 206.
  • 42. Malay A. D., Suzuki T., Katashima T., Kono N., Arakawa K., Numata K., Sci. Adv. 2020, 6, eabb6030.
  • 43. Tan Y., Hoon S., Guerette P. A., Wei W., Ghadban A., Hao C., Miserez A., Waite J. H., Nat. Chem. Biol. 2015, 11, 488.
  • 44. Andrews S., FastQC, 2010. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
  • 45. Bolger A. M., Lohse M., Usadel B., Bioinformatics 2014, 30, 2114.
  • 46. Okonechnikov K., Golosova O., Fursov M., Bioinformatics 2012, 28, 1166.
  • 47. Amini S., Tadayon M., Loke J. J., Kumar A., Kanagavel D., Ferrand H. L., Duchamp M., Raida M., Sobota R. M., Chen L., Hoon S., Miserez A., Proc. Natl. Acad. Sci. U. S. A. 2019, 116, 8685.
  • 48. Walsh I., Martin A. J. M., Domenico T. D., Tosatto S. C. E., Bioinformatics 2012, 28, 503.
  • 49. Linding R., Jensen L. J., Diella F., Bork P., Gibson T. J., Russell R. B., Structure 2003, 11, 1453.
  • 50. Prilusky J., Felder C. E., Zeev‐Ben‐Mordehai T., Rydberg E. H., Man O., Beckmann J. S., Silman I., Sussman J. L., Bioinformatics 2005, 21, 3435.
  • 51. Romero P., Obradovic Z., Li X., Garner E. C., Brown C. J., Dunker A. K., Proteins: Struct., Funct., Bioinf. 2001, 42, 38.
  • 52. Linding R., Russell R. B., Neduva V., Gibson T. J., Nucleic Acids Res. 2003, 31, 3701.
  • 53. Erdős G., Pajkos M., Dosztányi Z., Nucleic Acids Res. 2021, 49, gkab408.
  • 54. Dosztányi Z., Mészáros B., Simon I., Bioinformatics 2009, 25, 2745.
  • 55. Wishart D. S., Bigam C. G., Holm A., Hodges R. S., Sykes B. D., J. Biomol. NMR 1995, 5, 67.

Associated Data


Supplementary Materials

Supporting Information

Supporting Movie 1

Supporting Movie 2

Supporting Movie 3

Supporting Movie 4

Data Availability Statement

The assembled transcriptome and raw RNA-seq data have been submitted to NCBI under BioProject PRJNA806368. The raw spectra and search data have been uploaded to the jPOST repository with the following accession numbers: JPST001465 (jPOST) and PXD031722 (ProteomeXchange).


Articles from Advanced Science are provided here courtesy of Wiley
