Author manuscript; available in PMC: 2019 Aug 17.
Published in final edited form as: Biopolymers. 2018 Jul;109(7):e23226. doi: 10.1002/bip.23226

To achieve self-assembled collagen mimetic fibrils using designed peptides

Rebecca Strawn 1, FangFang Chen 2, Parminder Jeet Haven 2, Sam Wong 2, Anne Park-Arias 3, Monique De Leeuw 4, Yujia Xu 2
PMCID: PMC6698151  NIHMSID: NIHMS979806  PMID: 30133697

Abstract

It has proven challenging to obtain collagen-mimetic fibrils by protein design. We recently reported the self-assembly of a mini-fibril showing a 35 nm, D-period like, axially repeating structure using the designed triple helix Col108. Peptide Col108 was made by bacterial expression using a synthetic gene; its triple helix domain consists of three pseudo-identical units of amino acid sequence arranged in tandem. It was postulated that the 35 nm d-period of Col108 mini-fibrils originates from the periodicity of the Col108 primary structure. A mutual staggering of one sequence unit of the associating Col108 triple helices can maximize the inter-helical interactions and produce the observed 35 nm d-period. Based on this unit-staggered model, a triple helix consisting of only two sequence units is expected to have the potential to form the same d-periodic mini-fibrils. Indeed, when such a peptide, peptide 2U108, was made it was found to self-assemble into mini-fibrils having the same d-period of 35 nm. In contrast, no d-periodic mini-fibrils were observed for peptide 1U108, which does not have long-range repeating sequences in its primary structure. The findings of the periodic mini-fibrils of Col108 and 2U108 suggest a way forward to create collagen-mimetic fibrils for biomedical and industrial applications.

Keywords: collagen based materials, collagen mimetic fibrils, protein design of collagen fibrils, self-assembly of collagen triple helical peptides

1 |. INTRODUCTION

Collagen is one of the most sought-after biomaterials for a wide range of biomedical and industrial applications.[1–5] In recent years, there have been several novel approaches using synthetic peptides[6–10] or recombinant technologies[11,12] to create supramolecular assemblies in a length and/or a shape comparable to that of natural collagen fibrils. Some of the new materials have demonstrated the ability to interact with cell receptors.[5,12,13] However, these triple helix-based structures differ from natural collagen in one major aspect: they lack the axially repeating structure of collagen fibrils known as the D-period. The 67 nm D-period is the most prominent structural feature of fibrillar collagens (mainly collagens type I, II, and III) in the extracellular matrix. The interactions of collagen with cell receptors and other molecular ligands are mediated directly by collagen triple helices in the highly packed fibril form.[14,15] The D-periodicity of the fibrils is also implicated in the tensile strength of tissues,[16] and in the mineralization of bones.[17] Thus, capturing the structural features of the D-period in collagen-mimetic materials represents a major step forward in developing biomaterials with improved, native-like properties and functionality.

The D-period of collagen fibrils emerges from the lateral packing of collagen triple helices in a structurally specific manner during the fibrillogenesis of collagen. The triple helix has a rod-shaped conformation consisting of three polypeptide chains wrapped around a common axis. Each peptide chain has a characteristic Gly-X-Y repeating sequence, where X and Y can be any of the 20 amino acid residues; the proline residues in the Y-position are frequently modified post-translationally to 4R-hydroxyproline (Hyp).[18] The Gly residues are buried at the center of the helix, while the side-chains of the residues in the X and the Y positions are displayed in a linear, N-to-C directionality on the surface of the helix. The uniform backbone conformation of the triple helix has been well characterized to have an average helical rise of ~0.9 nm per Gly-X-Y tripeptide.[19] The extensive triple helix domains of fibrillar collagens are about 300 nm long, and comprise more than 1000 amino acid residues (per single peptide chain). In the context of this linear conformation, the 67 nm D-period corresponds to a segment of 234 residues of the triple helix (per peptide chain), and a fully folded triple helix of fibrillar collagen can be divided into 4.4 such D-periods. A mutual staggering of 67 nm of the associating molecules, thus, leads to the formation of alternating gaps and overlaps every 67 nm. When examined using electron microscopy under negative staining, the alternating gaps and overlaps give rise to the canonical striated pattern of collagen fibrils characterized by the alternating dark (the gap) and light (the overlap) bands, which are 0.6D and 0.4D in size, respectively. Alternatively, the gaps and the overlaps of the D-period can be observed as alternating troughs and ridges, respectively, every 67 nm using rotary-shadowing electron microscopy or atomic force microscopy (AFM).
The D-periodic fibrils can be reconstituted in vitro from acid dissolved collagen monomers (the triple helix) isolated from tissues by increasing pH and temperature.[20] The in vitro self-assembly is a spontaneous, entropy driven process,[21,22] directed largely by the molecular interactions of the residues in the X- and Y-positions[23]; though it should be noted, short stretches of peptide at the N- and C-termini of the triple helix domain of type I collagen have also been reported to be involved in the fibril formation.[24]
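
As a quick consistency check on the figures quoted above, the following worked arithmetic uses only the values given in this section (67 nm per 234 residues, and the 0.6D and 0.4D band sizes); it is a sketch, not an independent measurement:

$$
\frac{67\ \text{nm}}{234\ \text{residues}} \approx 0.286\ \text{nm per residue}
\quad\Longrightarrow\quad
3 \times 0.286\ \text{nm} \approx 0.86\ \text{nm per Gly-X-Y tripeptide}
$$

$$
\text{gap} = 0.6 \times 67\ \text{nm} \approx 40\ \text{nm},
\qquad
\text{overlap} = 0.4 \times 67\ \text{nm} \approx 27\ \text{nm}
$$

The implied rise of ~0.86 nm per tripeptide agrees with the quoted average of ~0.9 nm.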

The development of fibril-forming collagen materials is especially difficult since the molecular mechanism of the fibrillogenesis of collagen is still not fully understood. In light of discovery of the heptad repeat sequences of α-helix coiled-coil structures of myosin,[25] it was suspected that the D-period of collagen fibrils also came from the periodicity of the primary structure of the protein. The linear conformation of the triple helix makes it possible to directly connect the structural periodicity to the primary amino acid sequence of collagen. Indeed, based on the analysis of the amino acid sequence of Bovine type I collagen, Hulmes et al. demonstrated that both the hydrophobic interactions and the interactions of the charged residues are optimized between the associating triple helices by a mutual staggering of 234 residues.[26,27] This finding implies a specific periodicity of 234 residues in the primary sequences of the collagens; however, the specific residues associated with this periodicity have never been fully elucidated.

We recently created a self-assembled triple helix mini-fibril using the designed peptide Col108. The self-assembled Col108 mini-fibrils bear a D-period-like alternating gap-overlap region every 35 nm, hereafter termed the d-period of the Col108 mini-fibrils. The triple helix domain of Col108 was designed to have an explicit, built-in periodicity of 123 residues in its primary structure by placing three pseudo-identical, 123-residue sequence units in tandem.[28] In the fully folded Col108 triple helix, each sequence unit corresponds to a segment of the triple helix 35 nm in length, the size of a d-period. To create the structural features of gaps and overlaps, we included an overhang unit with a size equivalent to 0.3d (~10 nm) comprised of the N- and C-terminal sections of peptide bracketing the three sequence units; specifically, the overhang consists of an N-terminal Cys-knot sequence, a C-terminal nucleation sequence, a C-terminal Cys-knot sequence and a C-terminal foldon domain. Within this sequence architecture the critical interacting residues are placed periodically every 123 residues, or every 35 nm along the axis of the fully folded triple helix. We postulated that, given this unit repeating primary structure, a mutual staggering of one sequence unit of the associating triple helices during the self-assembly would bring all interacting residues of associating triple helices into register. The optimized inter-helical interactions would then make the unit-staggered structure the unique, most stable conformation. This unit-staggered model is supported by the observation of an axially repeating structure of exactly 35 nm for the self-assembled Col108 mini-fibrils, consisting of an ~25 nm (0.7d) gap region and an ~10 nm (0.3d) overlap region.[28] To test the robustness of this design strategy, we have made two new peptides: 2U108 and 1U108 containing, respectively, two and one sequence units.
Thus, the 123-residue sequence periodicity is maintained in 2U108 but not in 1U108. The other features of the peptides remain the same as those in Col108, including the overhang unit. According to the unit-staggered model, a minimum of two sequence units is needed to support the axial growth during the self-assembly of the d-period fibrils, and the data of the self-association of the two new peptides support this postulation.
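
The correspondence between the 123-residue sequence unit and the 35 nm d-period can be checked with a short calculation. The sketch below assumes the rise per residue implied by the native D-period quoted earlier (67 nm per 234 residues); the function and variable names are illustrative only and are not taken from the original work.

```python
# Rise per residue implied by the native D-period quoted in the Introduction:
# 67 nm of triple helix corresponds to 234 residues (per chain).
NM_PER_RESIDUE = 67.0 / 234.0


def axial_length_nm(n_residues: int) -> float:
    """Axial length of a fully folded triple-helix segment (residues per chain)."""
    return n_residues * NM_PER_RESIDUE


# One 123-residue sequence unit defines the d-period of the mini-fibrils.
d_period_nm = axial_length_nm(123)

# Overlap (overhang unit) and gap sizes as fractions of d, as stated in the text.
overlap_nm = 0.3 * d_period_nm
gap_nm = 0.7 * d_period_nm

print(f"d-period: {d_period_nm:.1f} nm")
print(f"overlap (0.3d): {overlap_nm:.1f} nm")
print(f"gap (0.7d): {gap_nm:.1f} nm")
```

Under this assumption the computed values fall close to the ~35 nm, ~10 nm, and ~25 nm figures reported for the Col108 mini-fibrils.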

2 |. MATERIAL AND METHODS

2.1 |. The gene constructs of 2U108 and 1U108

The 2U108 and 1U108 plasmids were created by modifying the original Col108 plasmid. To construct the 2U108 plasmid, a KpnI cleavage site was introduced between the first and the second coding sequences of the Col108 plasmid by effecting a CCA → ACC base change by site-directed mutagenesis (Figure 1). Together with an existing KpnI site, the new site brackets the second sequence unit of Col108. The parent plasmid was then digested with KpnI, and the fragments were ligated, resulting in the effective removal of the second sequence unit. The full 2U108 expression plasmid produces a 40.9 kDa fusion protein with a His-tagged thioredoxin attached to the N-terminus of the modified 2U108 peptide, separated by a thrombin cleavage site.

FIGURE 1.

The sequence architecture of the peptides: a schematic drawing of the primary structures of Col108, 2U108, and 1U108. Each blue block represents one sequence unit consisting of an N-terminal (Gly-Pro-Pro)4 sequence and a Col-domain; the amino acid sequence of the Col-domain is shown in an expanded view at the bottom. The sequence of the Cys-knot is highlighted in brown. The ovals represent the 27-residue foldon domain. The black and open arrows indicate the insertion of, respectively, the tripeptides GTP and GSR associated with the restriction enzyme sites included in the gene

Similarly, the 2U108 plasmid construct was used as the starting point to produce the 1U108 plasmid. A KpnI cleavage site was introduced between the second sequence unit and the C-terminal (GPP)4 coding sequence by effecting a CCTG → TACC base change by site-directed mutagenesis (Figure 1). The parent plasmid was then digested with KpnI, and the fragments were ligated, resulting in the removal of a sequence unit. The full 1U108 expression plasmid produces a 30 kDa fusion protein.

2.2 |. Expression and purification

The 2U108 and 1U108 peptides were expressed in bacterial strains JM109(DE3) or BL21(DE3). Expression was induced by 0.1 mM IPTG once the OD (600 nm) reached 0.5–0.6 AU. The expression products for 2U108 and 1U108 plasmids were purified using the protocol previously reported.[28] The final product for 2U108 has a molecular weight of 27.1 kDa and is comprised of a triple helix domain containing two tandemly repeating sequence units with a nucleation sequence, and a C-terminal foldon domain.[29,30] The molecular weight of peptide 1U108 is 16.2 kDa, comprised of a triple helix domain of a single sequence unit with a nucleation domain and a C-terminal foldon domain (Figure 1). Peptides were stored as lyophilized powder at 4°C until use. Stock solutions were made by dissolving the lyophilized powders in 5 mM acetic acid (HAc), pH 4.0, at a concentration of ~1 mg mL−1. The concentration was estimated using extinction coefficients of 0.32 and 0.54, respectively, for 1 mg mL−1 solutions of 2U108 and 1U108 at 280 nm, calculated using the online tool ProtParam.
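
The concentration estimate follows directly from the Beer-Lambert relation. The sketch below uses the two extinction values quoted above; the 1 cm path length, the dilution factor, and the function name are assumptions made for illustration and are not stated in the original protocol.

```python
# Absorbance at 280 nm of a 1 mg/mL solution, as quoted in the text.
A280_PER_MG_ML = {"2U108": 0.32, "1U108": 0.54}


def concentration_mg_per_ml(a280, peptide, path_cm=1.0, dilution=1.0):
    """Estimate peptide concentration (mg/mL) from a measured A280.

    path_cm and dilution are assumed parameters (1 cm cuvette, undiluted sample).
    """
    if a280 < 0:
        raise ValueError("absorbance must be non-negative")
    return a280 * dilution / (A280_PER_MG_ML[peptide] * path_cm)


# Hypothetical readings, for illustration only.
conc_2u = concentration_mg_per_ml(0.16, "2U108")
conc_1u = concentration_mg_per_ml(0.54, "1U108")
print(f"2U108: {conc_2u:.2f} mg/mL, 1U108: {conc_1u:.2f} mg/mL")
```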

2.3 |. The characterization of the triple helix conformation

The triple helix conformations of 2U108 and 1U108 were assessed via circular dichroism (CD). CD (Aviv Biomedical Spectrometer model 202–01) wavelength scans were conducted at 4°C between 180 and 300 nm on 0.5 mg mL−1 peptide samples in the corresponding buffers. Temperature melt experiments were conducted on 0.5 mg mL−1 peptide samples monitored at a wavelength of 225 nm, and covered a temperature range from 4 to 65°C with an equilibration time of 2 min at each temperature, effectively conferring a heating rate of 0.3°C min−1. To aid in the comparison of melt curves between samples, the data were normalized and are displayed in terms of fraction folded, F(T):

$$F(T) = \frac{\theta(T) - \theta_{uf}(T)}{\theta_{f}(T) - \theta_{uf}(T)}$$

where θ(T) is the observed ellipticity at temperature T, and θf(T) and θuf(T) are the ellipticity of the folded and the unfolded triple helix, respectively. The θf(T) and θuf(T) were determined from the linear extrapolation of, respectively, the native and the unfolded baselines of the melting curve. The apparent melting temperature is determined as the mid-point of the transition,[31] where F(Tm) = 0.5.
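
The normalization can be sketched in a few lines of code. The example below is a minimal illustration, not the analysis used in this work: the temperature windows chosen for the baselines, the synthetic melt curve, and all function names are assumptions.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fraction_folded(temps, theta, folded_range, unfolded_range):
    """F(T) = (theta - theta_uf) / (theta_f - theta_uf) with linear baselines."""
    def baseline(t_range):
        lo, hi = t_range
        pts = [(t, th) for t, th in zip(temps, theta) if lo <= t <= hi]
        slope, intercept = linear_fit([p[0] for p in pts], [p[1] for p in pts])
        # Extrapolate the fitted baseline over the full temperature range.
        return [slope * t + intercept for t in temps]

    theta_f = baseline(folded_range)
    theta_uf = baseline(unfolded_range)
    return [(th - uf) / (f - uf) for th, f, uf in zip(theta, theta_f, theta_uf)]


def apparent_tm(temps, frac):
    """Temperature at which F(T) crosses 0.5, by linear interpolation."""
    for (t0, f0), (t1, f1) in zip(zip(temps, frac), zip(temps[1:], frac[1:])):
        if f0 > 0.5 >= f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("F(T) never crosses 0.5")


# Synthetic melt curve (arbitrary ellipticity units) with sloped baselines
# and a transition midpoint placed at 41 degrees C for illustration.
temps = [float(t) for t in range(4, 66)]
true_f = [1.0 / (1.0 + math.exp((t - 41.0) / 1.5)) for t in temps]
theta = [f * (5.0 - 0.01 * t) + (1.0 - f) * (0.5 - 0.005 * t)
         for f, t in zip(true_f, temps)]

frac = fraction_folded(temps, theta, folded_range=(4, 25), unfolded_range=(55, 65))
tm = apparent_tm(temps, frac)
print(f"apparent Tm: {tm:.1f} C")
```

The baseline windows should be chosen from regions where the signal is clearly linear before and after the transition; the windows used here are placeholders.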

2.4 |. Fibrillogenesis

To induce fibrillogenesis, samples at ~1 mg mL−1, previously dissolved and equilibrated in 5 mM HAc at 4°C, were mixed with an equal volume of double strength neutralization buffer (60 mM TES, 60 mM Na2HPO4, and 135 mM NaCl, pH 7.4) precooled to 4°C. Mixing was conducted on ice, and then the samples were immediately transferred to a water bath set at 37°C. The final concentration of peptide was 0.5 mg mL−1 and the final composition of the fibrillogenesis buffer after mixing was 2.5 mM acetic acid, 30 mM TES, 30 mM Na2HPO4, and 67.5 mM NaCl, pH 7.4 (I = 0.09), herein referred to as fibril-forming buffer. The fibrillogenesis samples were tested for fibrils after being incubated for 24 hr at 37°C.
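
The final composition follows from the 1:1 mixing, which halves every component. A small sketch of that arithmetic is given below; the helper name and dictionary labels are illustrative.

```python
def mix_equal_volumes(solution_a, solution_b):
    """Final concentrations after mixing equal volumes of two solutions.

    A component absent from one solution is treated as zero in that solution.
    """
    species = set(solution_a) | set(solution_b)
    return {s: (solution_a.get(s, 0.0) + solution_b.get(s, 0.0)) / 2.0
            for s in species}


# Peptide stock in dilute acetic acid, and the double-strength neutralization buffer.
peptide_stock = {"peptide (mg/mL)": 1.0, "acetic acid (mM)": 5.0}
buffer_2x = {"TES (mM)": 60.0, "Na2HPO4 (mM)": 60.0, "NaCl (mM)": 135.0}

final = mix_equal_volumes(peptide_stock, buffer_2x)
for name in sorted(final):
    print(f"{name}: {final[name]}")
```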

2.5 |. Electrophoresis

Modified SDS-PAGE techniques were used to monitor the self-association of 2U108 and 1U108 in solution, and to test the purity of the samples.[28] The standard denaturation condition was carried out following the standard protocol: 50 μL of sample at 0.5 mg mL−1 were mixed with 12.5 μL 5× SDS (5%) containing 0.2 M DTT or 2% β-Mercaptoethanol, or both, and boiled for ~45 min in partially sealed eppendorf vials. For denaturation under the nonreducing condition, the samples were prepared using the standard protocol but without the addition of any reducing agent. A mild denaturation condition was devised to denature the peptide by the addition of 2% SDS solution only: the samples did not contain any reducing agent, and were not subjected to heat denaturation (no boiling). In some of the experiments, the samples were prepared following the standard procedure but without boiling; this nonboiling (but reduced) condition was used to test the effectiveness of the reduction of the interchain disulfide bonds by reducing agent.

2.6 |. Electron microscope sample preparation

The 2U108 or 1U108 samples were prepared on 400 mesh formvar carbon-coated copper grids. Three microliters of incubated sample were deposited onto the grids and allowed to sit for 100 s. The grids were then washed with deionized water by submersing the grids into water for 5 s. Immediately following this, 3 μL of a 1% sodium phosphotungstate solution, the staining agent, were applied to the grid and allowed to sit for 100 s. The grids were then washed again with deionized water in the previously indicated manner. The grids were air-dried overnight before being examined via electron microscopy (JEM-2100, Jeol).

2.7 |. Molecular model building

The 3D structures of the triple helix and the foldon domain were generated using the program spdbv. The coordinate files for PDB ID 1RFO and PDB ID 1BKV (triple helical peptide T3–785), for the foldon and the triple helix structures, respectively, were downloaded from RCSB PDB. To create the structural model of a section of the Col-domain, the residues of the T3–785 triple helix were modified to those of the Col-domain, followed by energy minimization after each substitution.

3 |. RESULTS

3.1 |. The triple helical peptides 2U108 and 1U108

The primary structures of peptides 2U108 and 1U108 are designed based on the original amino acid sequence of Col108. Different from Col108, the triple helix domain of 2U108 consists of only two sequence units, and that of 1U108 only one (Figure 1). Each sequence unit is composed of a 108-residue Col-domain, and an N-terminal (Pro-Pro-Gly)4 linker. The primary structures of all three peptides share a few common features including the C-terminal (Pro-Pro-Gly)4 nucleation sequence, the Cys-knot sequences at the N- and C-termini of the triple helix domain, and the C-terminal foldon domain. The foldon and the (Pro-Pro-Gly)4 nucleation sequence are included to ensure proper folding of the triple helix, and the Cys-knot to increase the stability of the triple helix conformation through a set of inter-chain disulfide bonds. Similar to Col108, the tripeptide (Gly-Thr-Pro) insert between the two sequence units of 2U108 is due to the use of a KpnI restriction enzyme site in the gene. In all, peptide 2U108 has 295 residues, with 255 of them being in the triple helix domain and having a non-interrupted Gly-X-Y repeating sequence; peptide 1U108 has a total of 172 residues encompassing a 132-residue triple helix domain. As in the case of Col108, the amino acid residues in 2U108 have a periodic placement because of the repeating sequence units: the residues in the Col-domain are repeated every 123 residues. In contrast, none of the residues of 1U108 have any long-range periodic placement.
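
The residue counts quoted above can be reconciled with the sequence architecture by simple bookkeeping. The sketch below assumes that the triple helix domain is made up of the sequence units, one restriction-site tripeptide between adjacent units, and the C-terminal (Pro-Pro-Gly)4 nucleation sequence; this decomposition is an inference from the description and Figure 1 rather than a statement from the original text, and the names are illustrative.

```python
COL_DOMAIN = 108   # residues in one Col-domain
LINKER = 12        # N-terminal (Pro-Pro-Gly)4 linker of each sequence unit
SITE_INSERT = 3    # tripeptide (e.g., Gly-Thr-Pro) between adjacent units
NUCLEATION = 12    # C-terminal (Pro-Pro-Gly)4 nucleation sequence


def helix_domain_residues(n_units: int) -> int:
    """Residues per chain in the triple helix domain (assumed decomposition)."""
    units = n_units * (COL_DOMAIN + LINKER)
    inserts = (n_units - 1) * SITE_INSERT
    return units + inserts + NUCLEATION


# Spacing between equivalent residues of adjacent units: one unit plus the insert.
repeat = COL_DOMAIN + LINKER + SITE_INSERT

helix_2u = helix_domain_residues(2)
helix_1u = helix_domain_residues(1)

# Residues outside the triple helix domain, from the totals quoted in the text.
outside_2u = 295 - helix_2u
outside_1u = 172 - helix_1u

print(repeat, helix_2u, helix_1u, outside_2u, outside_1u)
```

With this decomposition the two peptides differ by exactly one 123-residue repeat, and both carry the same number of residues outside the triple helix domain, consistent with their sharing the same terminal features.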

The CD spectra of both 2U108 and 1U108 in 5 mM acetic acid (pH 4) are indicative of a triple helix conformation, which is characterized by a deep negative peak at 197 nm and a small positive peak at 225 nm (Figure 2A). The Rpn values (the ratio of the two peaks) of ~0.09 for both peptides are comparable to that of a typical triple helix conformation having a high content of Gly-Pro-Pro sequences but no Hyp.[28] The triple helix conformation of both peptides is quite stable despite lacking the stabilizing Hyp in the Y-positions; the melting temperature of both peptides is ~41°C (Figure 2B). The foldon domain, the Cys-knots at both ends of the triple helical domain, and the high content of charged residues may all contribute to the stability.[32–34] Despite the significant differences in size, the melting temperatures of 2U108 and 1U108 are in good agreement with that of Col108.[28] Similar length-independent thermal stability has also been observed in studies of bacterial collagens.[11,33]

FIGURE 2.

Both 2U108 and 1U108 form triple helices in solution. A, The CD spectra of 2U108 (solid symbol) and 1U108 (open symbol); B, The melting curves of 2U108 (solid symbol) and 1U108 (open symbol)

3.2 |. The self-association of 2U108 and 1U108 in TES buffer

Fibrillogenesis of collagen in vitro, and to an extent also in vivo, is a process mediated by electrostatic interactions[19]; triple helices obtained from acid-dissolved tissues in cold temperatures will spontaneously self-associate into fibrils once the pH is increased to ~7, and the temperature increased to the range of physiological temperatures, usually 25–37°C. The self-assemblies of 2U108 and 1U108 were studied following the same in vitro fibrillogenesis procedure of native collagen.[35,36] All samples were equilibrated in the refrigerator in pH 4 buffer for at least 48 hr to ensure proper folding. The fibril formation was initiated by raising pH and temperature. The self-association in the fibril-forming buffer was monitored using a modified SDS-PAGE experiment.[28] In this approach, in addition to the standard denaturation procedure utilizing SDS, boiling and the addition of reducing agent, the peptide samples were prepared using two other denaturation conditions. The mild denaturation condition, which includes the addition of SDS but no boiling and no addition of reducing agent, is devised to maximally preserve the aggregates by minimizing the disruptions of the noncovalent interactions and the unfolding. The nonreducing condition includes SDS and boiling but without the addition of reducing agent. Under this condition, all non-covalent interactions stabilizing the triple helices and the aggregates will be maximally disrupted, but the structures related to disulfide bonds will be preserved. This nonreducing condition can effectively probe the involvement of the disulfide bonds during the self-association of the triple helix.[28]

The SDS-PAGE studies of peptide 2U108 are shown (Figure 3A). Under mild denaturation, the triple helices with fully formed Cys-knots at the N- and/or C-termini are expected to migrate as a trimer on the gel; any self-assembled species formed during incubation in the fibril-forming (TES) buffer will appear as species with molecular weight higher than that of a trimer. Indeed, significant amount of aggregates of 2U108 are preserved and can be observed as bands with molecular weights higher than that of the peptide trimer (Figure 3A, Lane 3). After boiling but without the addition of the reducing agent (the nonreducing condition), only the trimer form of 2U108 was observed (Figure 3A, Lane 2). The complete disappearance of bands of high molecular weight species under the nonreducing condition indicates the self-assembled species of Lane 3 were stabilized by noncovalent interactions between the triple helices without the involvement of the inter-helical disulfide bonds. Finally, the addition of reducing agent (the standard denaturation condition) abolished the intra-helical disulfide bonds, and completed the unfolding of the triple helices and the aggregates into monomers (Figure 3A, Lane 1). Because the Cys-knots are known to form only in the structural context of a fully folded triple helix, the lack of any significant presence of dimer or monomer in lane 2 and lane 3 is a strong indication that the 2U108 peptide is in the triple helix form before the initiation of fibril formation. The trimer form of 2U108 often resolves into multiple bands on a gel under the nonreducing condition due to the compounded effects of the rod-shape of the triple helix conformation and the cross linking of the inter-chain disulfide bonds at the ends. The fully folded, trimeric 2U108 has a large aspect ratio: 150 nm long and only 1–2 nm in diameter. 
With the inter-chain disulfide bonds intact, some of the partially unfolded triple helices may retain a significant level of the rod-shape, and behave differently from the coiled form of the unfolded molecules during the electrophoresis. The multiple trimer bands reflect the different mobilities of a heterogeneous population of partially folded triple helices varying in degrees of “foldedness” and compactness. Without the complications of shape and residual structures, the monomers in Lane 1 emerge as a single, clearly defined band on the gel. A small amount of dimer is often present despite extensive boiling in the presence of reducing agent(s) (~45 min). Each Cys-knot can potentially form three inter-chain disulfide bonds in a folded triple helix; the complete reduction of the multiple disulfide bonds proves to be difficult.

FIGURE 3.

The self-association of 2U108 and 1U108 in solution. A, SDS-PAGE of 2U108 after incubation in TES buffer (lanes 1–3), under the conditions of standard denaturation (SD, Lane 1), nonreducing condition (NR, Lane 2), and mild denaturation (MD, Lane 3). Lanes 4–6: samples of 2U108 in HAc buffer under the conditions of standard denaturation (SD, Lane 4), nonreducing condition (NR, Lane 5), and mild denaturation (MD, Lane 6). The samples were separated by an empty lane to prevent contamination caused by the diffusion of the reducing agent in the gel. The picture in A is composed of different parts of the same gel merged together at the vertical dotted lines. B, SDS-PAGE of 1U108 in TES buffer under the conditions of mild denaturation (MD, Lane 1), nonreducing condition (NR, Lane 2), standard denaturation (SD, Lane 3) and nonboiling condition (NB, Lane 4). Reduced samples are loaded to the right half of the gel (separated by Lane M of molecular marker) to minimize the effects of the diffusion of the reducing agent. The thin arrows on top mark the interface of the stacking gel and the separation gel, which occasionally gave the appearance of a thin band even in lanes without any protein. The lane M in B is Prestained Protein Standard (GenScript), molecular weight range 270–15 kDa (the 5 kDa standard usually cannot be resolved at the acrylamide concentrations used)

In contrast, no bands of molecular weights higher than that of the trimer were observed under the mild denaturation condition for the original 2U108 sample in HAc buffer before fibril formation (Figure 3A, Lane 6); the trimer band(s) appeared to be the major species present with a trace amount of dimer. Similarly, only trimers were observed under the nonreducing condition (Figure 3A, lane 5), and only monomers are found in the presence of the reducing agent (Figure 3A, lane 4). A portion of the sample loading wells is kept in the gel pictures of Figure 3 to check on the possible presence of any aggregates that might be too large to enter the stacking gel and/or the separation gel.

The 1U108 molecule appears to form aggregates as well, as shown in Figure 3B. Several bands with molecular weights higher than that of the trimer are clearly observable under the mild denaturation condition (Figure 3B, Lane 1). The aggregates are reversibly returned to trimers by boiling under the nonreducing condition (Figure 3B, Lane 2), and only the monomer form is present under the standard denaturation condition (Figure 3B, Lane 3). Interestingly, a large portion of the denatured peptide migrated as a trimer in the presence of reducing agent but without boiling (Figure 3B, Lane 4). We suspect the residual structure of the triple helix under this condition prevented full reduction of the disulfide bonds of the Cys-knots.

In summary, the SDS-PAGE results clearly demonstrated the self-association of both 2U108 and 1U108 upon incubation in the fibril-formation buffer. These aggregates do not involve disulfide bonds, and are reversible under nonreducing conditions. It needs to be pointed out that, while effective, the SDS-PAGE approach only provides qualitative information on the aggregation. It is impossible to infer the degree of self-association of the peptides based on this approach alone. The presence of SDS alone can cause considerable dissociation of the aggregates. The actual amount of the aggregates could be significantly greater than what is indicated by the number and density of high molecular weight bands. Neither can this technique provide any information on the size or shape of the aggregates.

3.3 |. The characterization of the self-assembled aggregates

The size, shape and structural features of the aggregates of 1U108 and 2U108 in fibril-forming buffer were examined using transmission electron microscopy (TEM). As shown in Figure 4, the aggregates of 2U108 have the appearance of smooth, mini-fibrils with tipped ends, similar to those observed for Col108.[28] The mini-fibrils are about 500 nm to 1 μm in length, with the diameters of the central part of the mini-fibrils varying between ~20 nm and ~75 nm. The tipped ends are characteristic of collagen fibrils formed by lateral association of the triple helices.[22] A closer examination of the negatively stained mini-fibrils of 2U108 revealed the recognizable striated banding pattern of ~35 nm, consisting of a dark band of ~25 nm and a white strip of ~10 nm (Figure 4B,C). This banding pattern resembles those of collagen and Col108 in appearance, and agrees with the d-period of Col108 mini-fibrils in size.

FIGURE 4.

The d-period of 2U108 mini-fibrils. The TEM images of 2U108 in fibril-forming buffer after incubation at 37°C for 24 hr (A–C), and in 5 mM HAc buffer (D). The scale bars are 1000 nm in A, 500 nm in B, and 200 nm in C and D. The insert in C is an expanded view to show the 35-nm striation (scale bar: 200 nm). The black arrowheads in B indicate regions where the d-period is clearly identifiable under the microscope at the stated magnification; the resolution is compromised after the image is transferred to create the figures

The mini-fibrils form only after the peptide is transferred into the pH 7 buffer. The TEM image of 2U108 in HAc reveals a striking contrast, showing a uniform background of 2U108 triple helices with no mini-fibrils (Figure 4D). The thread-like structures appear to be 2U108 monomers (triple helices) judging by their size (expected triple helix ~85 nm in length, ~1–2 nm in diameter). No large aggregates were observed. The inhibition of fibril formation in pH 4 buffer is also known for native collagens.[20] The low pH condition can, presumably, disrupt the electrostatically driven interactions between the helices by promoting the protonation of acidic residues. The sensitivity of the self-assembly of 2U108 to pH thus supports the view that the mini-fibrils form by a mechanism similar to that for the formation of native collagen fibrils.

The aggregates of 1U108 in pH 7 buffer, on the other hand, look very different from those of 2U108 (Figure 5). A 1U108 triple helix is expected to have a length of about 50 nm, and be ~1–2 nm in diameter. There appears to be a considerable degree of self-association to form aggregates up to about 200 nm in length or, occasionally, considerably larger (Figure 5D). Most of the aggregates have an elongated shape with a high aspect ratio. They are likely formed by lateral association of the triple helices, but do not follow any specific pattern. There is no discernible banding pattern or other structural feature, though this observation is limited by the resolution of the TEM and staining technique. We, therefore, consider the self-associated species of 1U108 to be nonspecific aggregates.

FIGURE 5.

The nonspecific aggregates of 1U108 in TES buffer. The scale bars are 1000 nm in A, 500 nm in B and C, and 200 nm in D. The images in A–C are of different samples after the same treatment; image D is a further magnification of a region of image B

4 |. DISCUSSION

It is no surprise that the 2U108 mini-fibrils have the same d-period as that of Col108 mini-fibrils formed under the same conditions. We anticipated the self-assembly of 2U108 mini-fibrils to follow the same unit-staggered mechanism (Figure 6A).[28] We previously identified two specific factors that work synergistically to make this unit-staggered arrangement the unique, most stable conformation emerging from the self-assembly of Col108: the optimal alignment of interacting residues of associating helices and the reiteration of these interactions through the repeating sequence units. The similar sequence architecture of 2U108 to Col108—the tandem repeats of the same sequence unit—suggests the same stabilizing factors will also be present during the self-assembly of 2U108 (Figure 6A,B).

FIGURE 6.

The self-assembly model of 2U108 and of 1U108. A, The d-period of 2U108 and Col108 mini-fibrils in a two-dimensional schematic presentation. The triple helices are shown as a stick with white vertical bars marking the boundaries of each sequence unit; foldon is shown as a circle (roughly in scale relative to that of the triple helix domain). The dark and white boxes represent the negative staining pattern. B, An expanded view of the orange box in A showing the alignment of the sequence units and the interacting residues in the unit-staggered mini-fibrils; the triple helix is presented as a rectangle with the two sequence units shown in different colors; the dark vertical bars mark the locations of charged residues, and the foldon is shown as a filled circle. The alignment of the side chains in the pink box is further expanded using the stick-and-ball presentation adopted from the crystal structure of triple helix T3–785 (PDB ID: 1BKV); the peptide backbone and the Pro residues are shown in shades of grey; the blues and reds are, respectively, positive and negatively charged side chains, green the polar side chains, and black the hydrophobic side chains. C, a few possible assemblies of 1U108; the foldon is shown as a circle; a white vertical bar indicates the C-terminus of the Col-domain. D, two alternative packings of the unit-staggered 2U108 having different densities of gaps (see text)

The stabilizing interactions of 2U108 and Col108 mini-fibrils come from residues in the Col-domain and, to a lesser extent, from the foldon domain. Regions having a high content of hydrophobic residues, as well as clusters of charged groups, can be readily identified from the amino acid sequence of the Col-domain (Figure 1). In a unit-staggered arrangement, these residues are placed in close vicinity of comparable residues from the neighboring helices, which promotes the stabilizing interactions (Figure 6B, and Ref. 28). The residues on the surface of the foldon domain can potentially interact with the neighboring triple helices and contribute to the stability of the fibril assembly. These foldon interactions, however, are limited in extent and are not considered a determining factor for the self-assembly process of the mini-fibrils.[28] This point is further demonstrated by the lack of any d-periodic mini-fibrils for 1U108. Because 1U108 has the same foldon domain and Col-domain, any interactions involving the foldon in the self-assembly of Col108 and 2U108 mini-fibrils are also available for the self-association of 1U108. Yet no d-periodic mini-fibrils are observed for 1U108. Lacking periodicity in the primary sequence to direct the specific, staggered assembly of the triple helices, the 1U108 interactions lead only to non-specific aggregates.

For a specific structure to emerge from a self-association, or a folding, process, there must be a stabilization bias toward the specific set of molecular interactions for the desired conformation. The size of a triple helix has profound effects on its self-association because it is directly linked to the number of available interacting residues. In studies of bacterial collagens, it was suggested that the limited self-association of a bacterial collagen variant ~1/5 the length of a human fibrillar collagen triple helix is due to its limited size; an increase in its length may promote the self-association.[1,11] The triple helix of 1U108 is only about 1/10 the length of a human fibrillar collagen. Yet there appear to be sufficient molecular interactions between the helices, albeit not in a conformation-specific way, to cause aggregation. More than the insufficient size, the lack of any 1U108 mini-fibrils is likely due to the absence of a design element favoring the d-staggered self-association over other possible conformations. The contact area between two adjacent 2U108 triple helices in the unit-staggered mini-fibril is more or less the size of a 1U108 molecule. Because of the tandem sequence units in the primary structure of 2U108, such interactions can propagate to other associating helices in the unit-staggered assemblage, and ultimately make the d-periodic mini-fibrils the most stable conformation to arise from the process.

A successful design strategy often needs to include a mechanism to weaken potential competing, or misfolded, conformations. We previously argued that the slightly bulkier foldon domain at the C-terminus may play a critical role in this regard by inhibiting end-on-end stacking of triple helices during the self-assembly.[28] The end-on-end stacking, also referred to as the in-register stacking, represents a conformation with the maximum alignment of the interacting residues, and should therefore have the highest extent of interaction and, thus, the highest stability. Yet we did not observe such a structure for any of the three peptides studied. The tightly packed, trimeric, beta-hairpin propeller conformation of the foldon has a diameter of ~25 Å,[37] which is considerably larger than that of a triple helix (~15 Å). We attribute the lack of the end-on-end stacking conformation to the steric hindrance of the bulkier foldon domain at the C-terminal end during the self-assembly. A full understanding of how this bulky structure of the foldon is accommodated in the smooth fibrils of Col108 and 2U108 would require higher-resolution structural studies of the mini-fibrils. A close examination of the structure of the foldon suggests that the effects of its bulkiness may be alleviated somewhat by its unique shape (Figure 7A). Viewed along the threefold symmetry axis of the foldon, which is aligned with the axis of the triple helix, the foldon conformation has three slightly concave faces, each well suited to a snug fit of a triple helix (Figure 7C). We suspect this close packing of triple helices on the curved surfaces of the foldon provides a way for the mini-fibrils to circumvent the steric constraints of the foldon. Nevertheless, the bulkier size of the foldon domains inside the mini-fibrils may still cause steric tension, and can potentially destabilize and/or limit the growth of the fibril assembly.
For future applications, it may prove advantageous or even necessary to remove the foldon domain in the development of collagen-mimetic fibrils. The removal of the foldon from the current construct of Col108 and/or 2U108 can be achieved by including an enzyme digestion site between the foldon and the triple helix domain. However, in the place of a foldon, a new design feature would need to be developed and included to prevent the end-on-end stacking of the triple helices during the self-assembly.

FIGURE 7

Molecular model of the packing of the foldon in mini-fibrils. A, The structure of the foldon domain showing the threefold symmetry. The diameter shown by the double-headed arrow is 2.5 nm; the radii in the end-on image shown to the right are 1.25 and 0.88 nm, respectively, for the long and short arrows. B, The lateral packing of three triple helices. C, The packing of triple helices on the concave surfaces of the foldon. Pictures B and C are shown at the same scale.

The conformational uniqueness of the mini-fibrils is characterized by the d-period—the periodic axial spacing of the gaps and/or the overlaps. The structural characterization based on TEM, using both negative staining and positive staining, and AFM[28] only offers limited resolution on the three-dimensional structure of the mini-fibrils. The gap regions of the mini-fibrils, as well as that of fibrillar collagens, usually appear as a continuous dark band wrapping around the fibril on negatively stained TEM images. The resolution of TEM leaves other structural details of the region unresolved. There are apparently different ways of packing the unit-staggered 2U108 triple helices into mini-fibrils while retaining the 35 nm axial spacing of the gap. As shown in the two-dimensional presentations in Figure 6D, going transversely across the mini-fibrils in the regions marked GB and GA, the gaps are about four to five triple helices apart in GB, but are much closer together in GA—separated by only one triple helix at times. Given the diameter of a triple helix of ~1.5 nm, and the resolution of TEM of ~5 nm, both GA and GB would look like a 25 nm dark band across the whole mini-fibril when examined by TEM. These mini-fibrils with different packings would reveal the same d-period: dark bands every 35 nm intercalated with white strips of ~10 nm. Although the unit-stagger between neighboring triple helices is preserved in all the arrangements, the different packing will generate mini-fibrils with nonuniform distributions of the gaps. It is not clear how the different packing may affect the relative stability of the mini-fibrils. Unless one particular organization is significantly more stable than the others, the self-assembly of 2U108 is likely to be a mixture of mini-fibrils with different distributions of gaps. How such variation in the distribution of gaps affects the other properties of the mini-fibrils remains an interesting question to be fully explored.
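The band widths quoted above can be checked against the unit-staggered model with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the original analysis. It assumes that the light (overlap) band spans the full 0.3d overhang mentioned in the Conclusion and that the dark (gap) band fills the remainder of each 35 nm period. The function names `band_widths` and `band_edges` are hypothetical.

```python
# Illustrative sketch (not from the original study): axial banding expected
# from a unit-staggered fibril under the assumptions stated in the text above.

D_PERIOD_NM = 35.0       # d-period reported for Col108 and 2U108 mini-fibrils
OVERLAP_FRACTION = 0.3   # ASSUMPTION: the overlap equals the 0.3d overhang


def band_widths(d_period_nm, overlap_fraction):
    """Return (gap_nm, overlap_nm) for a single axial period."""
    overlap_nm = overlap_fraction * d_period_nm
    gap_nm = d_period_nm - overlap_nm
    return gap_nm, overlap_nm


def band_edges(d_period_nm, overlap_fraction, n_periods):
    """List (start_nm, end_nm, label) for each band along the fibril axis."""
    gap_nm, _ = band_widths(d_period_nm, overlap_fraction)
    bands = []
    for i in range(n_periods):
        start = i * d_period_nm
        # In negative staining the gap appears dark and the overlap light.
        bands.append((start, start + gap_nm, "gap (dark)"))
        bands.append((start + gap_nm, start + d_period_nm, "overlap (light)"))
    return bands


gap, overlap = band_widths(D_PERIOD_NM, OVERLAP_FRACTION)
print(f"gap ~ {gap:.1f} nm, overlap ~ {overlap:.1f} nm")
for start, end, label in band_edges(D_PERIOD_NM, OVERLAP_FRACTION, 3):
    print(f"{start:6.1f} to {end:6.1f} nm  {label}")
```

Under these assumptions the model gives an overlap of about 10.5 nm and a gap of about 24.5 nm, close to the ~10 nm light strips and ~25 nm dark bands described above. Because only axial positions enter the calculation, the result is the same for any lateral packing that preserves the unit stagger.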

The design motif of tandem repeats of sequences has been utilized in other studies to develop collagen-mimetic fibrils.[1,5,11,12] Using recombinant gene technology, Fertala and colleagues generated several variants of type II collagen, including the m4D peptide, which consists of four tandem repeats of the amino acid sequence corresponding to the D4 period of type II collagen, followed by the 0.4D section at the C-terminus.[5,12] This recombinant variant formed a stable triple helix and could interact with integrin when coated on a molecular scaffold of nano-fibers. Unfortunately, the fibril-forming properties of m4D were not reported. It would be very interesting to see whether the m4D variant, or any other recombinant variant of collagen II, can self-assemble into D-periodic fibrils. In their study of bacterial collagen Scl2, Yoshizumi et al. reported on a recombinant triple helix consisting of two tandem collagen domains, the peptide CL-CL, that formed fibrillar structures at neutral pH.[11] The fibril structure was characterized as bundled in-register arrays of the triple helices, but had no D-period-like axial structures. The authors attributed the lack of D-periodicity to the insufficient size of the CL-CL triple helix. In line with our understanding of the self-assembly of the Col108 and 2U108 mini-fibrils discussed above, we think that, in addition to increasing the size of the triple helix domains, the design of the CL-CL triple helix could benefit from the inclusion of an overhang unit to explicitly establish the structure of the overlap. The in-register array of CL-CL supports our reasoning that end-on-end stacking is the most stable conformation to emerge from the self-association process of a triple helix having repeating sequence units. At the same time, it also highlights the need for a factor equivalent to a foldon to favor the d-staggered arrangement over the in-register stacking during self-assembly.

The approach of developing d-periodic mini-fibrils using the design strategy applied to triple helices Col108 and 2U108 is quite robust. A peptide in which the three Col-domains of Col108 were replaced by another domain consisting of different amino acid residues also formed d-periodic fibrils, having essentially the same structural features as those of the Col108 and 2U108 mini-fibrils (personal communication, Fang Fang Chen). We believe the mini-fibrils and the design strategy presented here will lead to the development of new biomaterials for a broad range of applications.

5 |. CONCLUSION

The identical d-period of 2U108 and Col108 mini-fibrils indicates a similar molecular recognition process during the self-assembly of the two molecules, which mirrors the similarities in their primary structures. The unit-staggered model can explain both the size of the d-period and that of the gap and the overlap regions of the mini-fibrils: the d-period is determined by the size of the sequence unit, and the 0.3d overhang unit contributes to the overlap region. The specific self-assembly of the mini-fibrils is ultimately determined by the optimization of non-covalent interactions of the associating helices; no interhelical disulfide bonds or other covalent bonds are involved. The interactions of the residues on the surface of the helices stabilize the self-assembly, while the tandem repeats of the sequence unit determine the structural specificity of the d-period by prescribing a unique way to maximize those interactions. Without such an explicitly designed stability bias, the self-association of triple helices of 1U108 led only to non-specific aggregates, despite having the same interacting residues. The fibril-forming processes of 2U108 and Col108 share the same sensitivity to pH and temperature as that of native collagen fibrils, indicating that the same kinds of molecular interactions are involved in the self-assembly process. The periodic mini-fibrils of 2U108 demonstrate the robustness of tandem repeats of sequence units as a design strategy for collagen-mimetic biomaterials.

Funding information

National Science Foundation, Division of Chemistry, Grant number: CHEM1022120; National Institute of General Medical Sciences, Grant number: 1SC1GM121273–01; PSC-CUNY, Enhanced Award 2014, The City University of New York; National Science Foundation, Division of Molecular and Cellular Biosciences, Grant number: MRI DBI 0521709

REFERENCES
