Abstract
Recombinant expression of eukaryotic proteins in Escherichia coli is often limited by poor folding and solubility. To address this problem, we employed a recently developed genetic selection for protein folding and solubility based on the bacterial twin-arginine translocation (Tat) pathway to rapidly identify properly folded recombinant proteins or soluble protein domains of mammalian origin. The coding sequences for 29 different mammalian polypeptides were cloned as sandwich fusions between an N-terminal Tat export signal and a C-terminal selectable marker, namely β-lactamase. Hence, expression of the selectable marker and survival on selective media was linked to Tat export of the target mammalian protein. Since the folding quality control feature of the Tat pathway prevents export of misfolded proteins, only correctly folded fusion proteins reached the periplasm and conferred cell survival. In general, the ability to confer growth was found to relate closely to the solubility profile and molecular weight of the protein, although other features such as number of contiguous hydrophobic amino acids and cysteine content may also be important. These results highlight the capacity of Tat selection to reveal the folding potential of mammalian proteins and protein domains without the need for structural or functional information about the target protein.
Keywords: aggregation, folding quality control, misfolded protein, protein export, protein folding and solubility, selectable marker, twin-arginine translocation
Introduction
Recombinant expression of native or modified eukaryotic proteins in Escherichia coli is key for the production of protein pharmaceuticals and for structure determination. In fact, E. coli continues to be the expression system of choice for many aglycosylated therapeutic proteins and also for high-throughput, multiplexed cloning, expression and purification of proteins for structural genomics.1 However, expression of eukaryotic proteins in E. coli is frequently limited by improper folding, aggregation, and inclusion body formation. This is because prokaryotic expression systems lack certain factors such as chaperones, natural binding partners, or post-translational processing machinery that are often needed for correct folding of eukaryotic target proteins. Indeed, expression analysis of 2078 full-length C. elegans genes in E. coli revealed that only 11% were soluble.2 Likewise, only about 25% of 44 cloned human proteins were soluble following expression in E. coli.3 It should be noted that misfolded proteins often accumulate as insoluble aggregates; hence the property of protein solubility is a reliable indicator of correct folding4,5 and is commonly used as a readout of intracellular folding efficiency.
A number of strategies have been developed to improve soluble expression of eukaryotic proteins. One of the simplest approaches is to reduce the protein translation rate by decreasing the temperature6 or inducer concentration7 to a level that favors correct folding. A slightly more laborious strategy is to coexpress folding modulators such as stabilizing binding partners8 or molecular chaperones.9 The host itself can be genetically modified to promote oxidative protein folding in the cytoplasm,10,11 over-express rare tRNAs12 or more efficiently accumulate membrane proteins.13,14 When changing the intracellular folding environment fails to yield correctly folded proteins, soluble proteins can sometimes be obtained by in vitro refolding or instead by synthesizing the proteins entirely in vitro using cell-free translation.15
Since many proteins are recalcitrant to the solubilization techniques described earlier, direct modification of the protein itself may be required. Truncating large multidomain proteins into separate domains can enhance solubility, and has been performed successfully for numerous proteins including the Ephb2 receptor16 and IgG antibodies.17 Soluble expression can also be improved by genetic fusion of the target protein to a solubility-enhancing tag such as the maltose binding protein (MBP), thioredoxin (Trx), or glutathione-S-transferase (GST)18–21 or by directed evolution methods, in which protein diversity libraries are interrogated for soluble variants.22–25 This latter approach is made possible by the recent development of several new protein solubility assays that do not require structural or functional information about the target protein. These assays are based on the notion that a misfolded, insoluble protein will eliminate the activity of a C-terminally fused reporter protein. To date, several different reporter genes have been employed in this type of assay, including chloramphenicol acetyltransferase (CAT),26 dihydrofolate reductase (DHFR),23 green fluorescent protein (GFP),5,27 and β-galactosidase (β-gal).22 Even membrane protein expression is amenable to this technique.28
Along similar lines, we previously reported a novel genetic selection for protein folding in E. coli based on the observation that transport through the bacterial twin-arginine translocation (Tat) pathway depends on correct folding of the substrate protein prior to transport.29 Protein substrates of interest were fused at their C-terminus to the selectable marker protein TEM-1 β-lactamase (Bla), and directed through the Tat pathway via an N-terminal signal peptide derived from E. coli trimethylamine-N-oxide reductase TorA [ssTorA, Fig. 1(a)]. Importantly, the survival of E. coli cells on selective medium correlated with the solubility of the target proteins of interest [Fig. 1(b)]. Using this assay, we recently isolated solubility-enhanced variants of Alzheimer's Aβ42 peptide29 and single-chain Fv (scFv) antibodies30 from large combinatorial libraries. These studies confirm that the folding quality control (QC) feature of the Tat export pathway can be harnessed for discriminating between folded and misfolded proteins, and for molecular evolution of protein fitness in the cytoplasm of E. coli. The advantages of this method versus previously developed protein folding assays are described elsewhere.29,30
Figure 1.

Selection of folding competent proteins based on the quality control feature of the Tat pathway. (a) Schematic of pSALect showing the ssTorA signal peptide followed by a mini-multiple cloning site (MCS) and finally the TEM-1 Bla sequence. The arrow indicates the IPTG-inducible promoter. The plasmid carries the cat gene for chloramphenicol resistance. (b) Schematic showing the basis for the Tat folding selection where ssTorA is the Tat-specific signal peptide from the E. coli trimethylamine-N-oxide reductase TorA enzyme and POI is the protein-of-interest. Those chimeras that are competent for Tat export (i.e., correctly folded) colocalize Bla to the periplasm and render cells resistant to β-lactam antibiotics. Those that are incapable of Tat export due to incorrect folding render cells sensitive to β-lactam antibiotics.
Here, we investigated whether our Tat-mediated genetic selection technique could be used to evaluate the folding and solubility of structurally and functionally diverse eukaryotic proteins. This was accomplished by cloning 30 different proteins (29 mammalian + GFP) between the Tat-dependent ssTorA signal peptide and the selectable marker Bla. The antibiotic resistance phenotype of cells expressing these fusion proteins was characterized in detail and the results allowed us to derive some general conclusions regarding which protein features correlated with correct folding in the cytoplasm of E. coli cells.
Results
A rapid method for Tat-mediated expression and selection of ORFs in E. coli
In this study, we developed a recombinational strategy using the “GATEWAY” cloning system,31 which is based on a modification of phage lambda site-specific recombination.32 Here, we designed primers with 5′ attB1 and 3′ attB2 linkers to PCR amplify open reading frames (ORFs) corresponding to mammalian proteins or protein domains [Fig. 2(a)]. Following PCR, the resulting products were recombined with pDONR/Zeo (Invitrogen) to yield a set of entry clones that were all sequence confirmed. In addition, a destination vector called pDEST-Tat was constructed by modifying pSALect,29 which is a vector previously developed in our laboratory for creating sandwich fusions between an N-terminal ssTorA signal peptide and C-terminal Bla. Recombination between the destination vector pDEST-Tat and any pENTR-ORF entry vector resulted in a final vector called pTatEXP-ORF that enabled expression of ssTorA-ORF-Bla tripartite fusions [Fig. 2(a)]. These fusions are specifically targeted to the Tat protein export pathway and confer resistance to ampicillin (Amp) if the Bla moiety is colocalized to the periplasm by the ORF-encoded protein of interest.29 Since export efficiency via the Tat pathway is regulated by the folding and solubility of the substrate in the cytoplasm prior to export,29,33 this strategy enables direct selection of proteins or protein fragments that are soluble in E. coli.29,30 As a consequence of the recombination reaction between pDEST-Tat and pENTR-ORF, attB1 and attB2 sequences are introduced at the 5′ and 3′ ends of each ORF in pTatEXP-ORF. Since these recombination sites are translated, an additional 12 amino acids are introduced at the N- and C-termini of the ORF-encoded protein. To determine whether these sites interfered with Tat-mediated protein export and subsequent selection of ssTorA-ORF-Bla chimeras, we cloned and expressed the green fluorescent protein (GFP) using our GATEWAY strategy. 
Specifically, pTatEXP-GFP was constructed using the strategy outlined in Figure 2(a) and transformed in wild-type (wt) E. coli strain MC4100 and also in a Tat-deficient mutant strain derived from MC4100 called B1LK0 that lacked the essential TatC component (ΔtatC) of the Tat export pathway.34 Expression of ssTorA-GFP-Bla from pTatEXP-GFP in wt cells resulted in strong fluorescence that localized throughout cells and occasionally showed a polar localization [Fig. 2(b)], consistent with the subcellular distribution seen earlier for Tat-targeted ssTorA-GFP.35 It is noteworthy that cells expressing ssTorA-GFP-Bla with the attB1/B2 linkers were somewhat less resistant to 100 μg/mL Amp compared with wt cells expressing ssTorA-GFP-Bla from pSALect [Fig. 2(c) and Supporting Information Figure 1]. Importantly, however, wt cells expressing ssTorA-GFP-Bla from pTatEXP-GFP were significantly more resistant to 100 μg/mL Amp than ΔtatC cells expressing the same construct, which exhibited only a background level of resistance to this amount of Amp [Fig. 2(c)]. This was entirely consistent with our earlier observation that fusions between Bla and soluble proteins such as GFP can confer significant resistance to wt cells following Tat-dependent export.29 It should also be noted that when Amp was excluded from the medium (i.e., nonselective conditions), wt and ΔtatC cells expressing ssTorA-GFP-Bla from pTatEXP-GFP grew equally well [Fig. 2(c)]. These results confirm that our recombinational cloning strategy can be used to rapidly introduce ORFs of interest between ssTorA and Bla, and that the resulting chimeras are competent for Tat-mediated genetic selection.
Figure 2.

Design and validation of Gateway cloning system for Tat-based selection of mammalian proteins. (a) Gateway cloning of any open reading frame (ORF) of interest is accomplished by: PCR cloning with attB containing primers; BP reaction between PCR product and pDONR/Zeo vector to create entry plasmid pENTR-ORF; LR reaction between pENTR-ORF and destination vector pDEST-Tat to create final Tat selection vector pTatEXP-ORF that expresses ssTorA-ORF-Bla fusion proteins. (b) Fluorescence microscopy of wt E. coli MC4100 cells expressing ssTorA-GFP-Bla from pTatEXP-GFP. (c) Spot plating of serially diluted cells on LB agar supplemented with no Amp (top panel) or 100 μg/mL Amp (bottom panel). Each 5-μL aliquot contained an equivalent number of MC4100 (wt) or B1LK0 (ΔtatC) cells expressing ssTorA-GFP-Bla from the plasmid indicated. Overnight cultures were serially diluted 10-fold as indicated by arrow, with 5 μL of undiluted overnight cells plated in the first column.
Selection of mammalian proteins that are soluble in bacteria
Next, a total of 29 mammalian proteins and protein domains were evaluated for Tat-mediated expression and selection. These proteins were of human or murine origin, and represented several diverse protein families with extracellular, cytoplasmic, and nuclear cell locations (Table I). To determine the limits of Tat selection, we chose a mixture of full-length and truncated proteins. Protein truncation designs were described recently21 and were based on individual domains annotated from the SwissProt or Pfam databases or previous examples of successful expression. The genes were PCR amplified from earlier developed entry plasmids21 using attB-containing primers and the resulting products were subjected to the complete GATEWAY cloning strategy described earlier [see Fig. 2(a)]. Following cloning, each pTatEXP-ORF vector was transformed in both wt and ΔtatC cells and the resulting transformants were phenotypically selected by spot plating 5 μL of serially diluted cells on 100 μg/mL Amp. For 12 of the proteins tested, including Ephb2(LB), Ephb2(TK), Efnb2(EC1), and Epha2, there was no phenotypic difference between wt and ΔtatC cells as visualized by spot plating [Fig. 3(a)]. This data was quantified by normalizing the smallest dilution at which growth of ΔtatC cells was observed by the smallest dilution at which wt cells appeared to grow; hence each of these clones was scored a value of 1 [Table I and Fig. 4(a)]. In all of these cases, the growth of cells on plates supplemented with Amp was very weak (i.e., only observed at dilutions of 10⁰ and 10⁻¹). The remaining 17 proteins all conferred greater resistance when expressed in wt cells compared with their ΔtatC counterparts. For 10 of these, such as Ephb2(SAM) and Efna1(EC), the difference in resistance observed between wt and ΔtatC cells was small (i.e., 10-fold difference in smallest dilution that still supported growth) [Table I, Figs. 3(a) and 4(a)].
In contrast, seven of these proteins (CASP2, CDKN1B, Efna1(FL), GATA2, MAX, MMP1, and RAF1) conferred significantly more resistance (i.e., ≥100-fold difference in smallest dilution that supported growth) to wt cells than to ΔtatC cells [Table I, Figs. 3(a,b) and 4(a)]. The minimum bactericidal concentration (MBC) on Amp for each of these seven proteins was carefully determined and found to range between 50 and 500 μg/mL as denoted in Figure 4(a). For comparison, the MBC of the very soluble ssTorA-GFP-Bla fusion was found to be 400 μg/mL, whereas for the weakly expressed ssTorA-Ephb2(LB)-Bla and ssTorA-Ephb2(TK)-Bla fusions, the MBC values were 6 and 3 μg/mL, respectively. These latter values are comparable with the MBC reported for plasmid-free wt MC4100 cells.37 As an independent measure of selective cell growth, we determined the specific growth rate of cells cultured in liquid medium that had been supplemented with 100 μg/mL Amp. This has been previously shown to be an effective approach for quantifying Tat export efficiency of Bla fusions.29,38 Indeed, the specific growth rates for cells expressing each of the 29 mammalian proteins were in close agreement with the spot plating results with only a few minor exceptions [cf. Fig. 4(a,b)]. Cells expressing Ephb2(SAM) and GRB2, which conferred only a small growth advantage to wt versus ΔtatC cells on solid medium, exhibited unexpectedly fast growth in liquid culture. We suspect that this discrepancy is due to differences in solid versus solution culture conditions that may affect the solubility of these particular clones. Importantly, six of the seven mammalian proteins that conferred the greatest resistance to cells on agar plates were found to confer rapid growth to cells in liquid culture experiments [Fig. 4(b)]. In the one exception, cells expressing CASP2 did not grow very quickly in liquid culture.
Thus, CASP2 appears to be of intermediate solubility because, even though it conferred significant growth to wt but not ΔtatC cells, it only grew at dilutions between 10⁰ and 10⁻², and exhibited an MBC (50 μg/mL Amp) that was greater than that of the negative clones but less than that of the most soluble clones.
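The dilution-factor scoring used for the spot plates can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis script: the function names are hypothetical, each plate is assumed to be summarized by the base-10 exponent of the most dilute spot that still grew, and the bins simply restate the 1-fold, 10-fold, and ≥100-fold categories quoted in the text.

```python
def df_ratio(wt_exponent: int, tatc_exponent: int) -> float:
    """Fold difference in dilution factor (DF ratio) between wt and ΔtatC cells.

    Each argument is the base-10 exponent of the most dilute spot that still
    supported growth (0 for undiluted cells, -5 for the 10^-5 dilution).
    The ΔtatC dilution is divided by the wt dilution, as described in the text.
    """
    return 10.0 ** (tatc_exponent - wt_exponent)


def classify(ratio: float) -> str:
    """Bin a DF ratio using the categories quoted in the text (assumed cut-offs)."""
    if ratio >= 100:
        return "significant"
    if ratio >= 10:
        return "small"
    return "none"


# Illustrative inputs: wt growth down to 10^-2 with ΔtatC growth only when
# undiluted gives a 100-fold difference, the value listed for CASP2.
print(df_ratio(-2, 0), classify(df_ratio(-2, 0)))
# Both strains growing down to 10^-1 gives a ratio of 1 (no Tat-dependent difference).
print(df_ratio(-1, -1), classify(df_ratio(-1, -1)))
```

Working with exponents rather than raw dilutions keeps the ratio an exact power of ten, which matches how the values are reported in Table I.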
Table I.
Selective Growth Conferred by Mammalian Proteins
| No. | Proteinᵃ | Domainᵇ | Organismᶜ | Subcellular Location | Mw (kDa) | pI | Cys (%) | GRAVYᵈ | hp_aaᵉ | DF ratioᶠ | Specific growth rate (h⁻¹) | MBC (μg/mL) | STE ratioᵍ | CATHʰ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Ephb2 | LB | Mm | Extracellular | 22.5 | 5.8 | 2.2 | −0.14 | 4 | 1 | 0.004 | 6 | 0.15 | 2 |
| 2 | Ephb2 | TK | Mm | Cytoplasm | 35.3 | 5.6 | 1.6 | −0.27 | 5 | 1 | 0.009 | 3 | 0.07 | 4 |
| 3 | Ephb2 | SAM | Mm | Cytoplasm | 8.3 | 4.9 | 0 | −0.03 | 2 | 10 | 0.217 | 100 | 0.16 | 1 |
| 4 | Efnb2 | EC2 | Mm | Extracellular | 20.1 | 8.6 | 2.2 | −0.64 | 3 | 1 | 0.004 | 9 | 0.32 | 2 |
| 5 | Efnb2 | EC1 | Mm | Extracellular | 16.6 | 5.3 | 2.7 | −0.47 | 3 | 1 | 0.009 | 9 | 0.06 | 2 |
| 6 | Efna1 | FL | Mm | Extracellular | 21.9 | 6.4 | 2.1 | −0.59 | 8 | 10⁴ | 0.287 | 400 | 0.42 | 2 |
| 7 | Efna1 | EC | Mm | Extracellular | 16.2 | 6.5 | 2.9 | −0.86 | 2 | 10 | 0.013 | 25 | 0.36 | 2 |
| 8 | Epha2 | LB | Mm | Extracellular | 21.1 | 4.7 | 2.7 | −0.30 | 4 | 1 | 0.013 | 6 | 0.03 | 2 |
| 9 | GATA2 | FL | Hs | Nuclear | 50.3 | 9.7 | 2.7 | −0.51 | 13 | 10⁶ | 0.235 | 500 | 0.09 | nc |
| 10 | Fli1 | FL | Mm | Nuclear | 51.0 | 6.6 | 0.9 | −0.79 | 3 | 10 | 0.017 | 25 | 0.03 | nc |
| 11 | Trp53 | FL | Mm | Nuclear/cytoplasm | 43.5 | 7.0 | 3.1 | −0.59 | 3 | 10 | 0.009 | 12 | 0.03 | nc |
| 12 | Mdm2 | FL | Mm | Nuclear/cytoplasm | 54.5 | 4.5 | 3.5 | −0.83 | 4 | 1 | 0.004 | 12 | 0.19 | nc |
| 13 | Mdm2 | p53-bd | Mm | Nuclear/cytoplasm | 11.7 | 8.8 | 0.5 | −0.25 | 4 | 1 | 0.021 | 12 | 0.42 | nc |
| 14 | GRB2 | FL | Hs | Cytoplasm | 25.2 | 5.9 | 0.9 | −0.67 | 5 | 10 | 0.319 | 30 | 0.31 | 3 |
| 15 | EGFR | TK | Hs | Cytoplasm | 37.3 | 5.5 | 1.8 | −0.22 | 3 | 1 | 0.009 | 9 | 0.00 | 4 |
| 16 | RAF1 | Ras-bd | Hs | Cytoplasm | 9.2 | 9.9 | 3.8 | −0.30 | 3 | 10⁶ | 0.224 | 400 | 0.52 | 3 |
| 17 | HRAS | FL | Hs | Cytoplasm | 21.3 | 5.0 | 3.2 | −0.42 | 4 | 1 | 0.009 | 25 | 0.52 | 3 |
| 18 | JUN | FL | Hs | Nuclear | 35.7 | 9.0 | 0.9 | −0.47 | 3 | 10 | 0.017 | 25 | 0.05 | nc |
| 19 | FOS | FL | Hs | Nuclear | 40.7 | 4.6 | 2.1 | −0.37 | 5 | 10 | 0.040 | 25 | 0.01 | nc |
| 20 | MAD | FL | Hs | Nuclear | 25.3 | 8.9 | 1.4 | −0.97 | 2 | 10 | 0.013 | 25 | 0.52 | nc |
| 21 | MAX | FL | Hs | Nuclear | 18.3 | 5.9 | 0 | −1.32 | 2 | 10⁵ | 0.209 | 50 | 0.27 | nc |
| 22 | CDK2 | FL | Hs | Cytoplasm | 33.9 | 8.9 | 1 | −0.08 | 4 | 10 | 0.013 | 25 | 0.35 | 1 |
| 23 | CDK4 | FL | Hs | Cytoplasm | 33.7 | 6.6 | 1.3 | −0.17 | 4 | 10 | 0.039 | 12 | 0.25 | 4 |
| 24 | CCND2 | FL | Hs | Cytoplasm | 33.1 | 4.9 | 4.1 | −0.21 | 4 | 1 | 0.005 | 9 | 0.12 | nc |
| 25 | CDKN1B | FL | Hs | Cytoplasm | 22.1 | 6.6 | 2 | −1.26 | 2 | 10³ | 0.102 | 50 | 0.10 | nc |
| 26 | CASP2 | FL | Hs | Cytoplasm | 48.9 | 6.3 | 4.1 | −0.30 | 5 | 10² | 0.017 | 50 | 0.00 | nc |
| 27 | MMP1 | FL | Hs | Extracellular | 54.0 | 6.5 | 0.6 | −0.57 | 7 | 10⁶ | 0.083 | 50 | 0.09 | 4 |
| 28 | CDKN2A | FL | Hs | Cytoplasm | 16.5 | 5.4 | 0.6 | −0.23 | 4 | 1 | 0.009 | 9 | 0.00 | 1 |
| 29 | CD44 | FL | Hs | Extracellular | 81.6 | 5.0 | 1.2 | −0.77 | 10 | 1 | 0.004 | 6 | 0.13 | nc |
| 30 | GFP | FL | Av | Cytoplasm | 26.9 | 5.6 | 0.8 | −0.52 | 3 | 10⁵ | 0.466 | 400 | 0.69 | 2 |
ᵃ Entrez GeneSymbol.
ᵇ Domain: LB, ligand binding; TK, tyrosine kinase; SAM, sterile alpha motif; EC, extracellular; FL, full-length; bd, binding domain.
ᶜ Organism: Mm, Mus musculus; Hs, Homo sapiens; Av, Aequorea victoria.
ᵈ GRAVY, grand average of hydropathicity index.
ᵉ Highest number of contiguous hydrophobic amino acids (A, V, I, L, W, or F).
ᶠ Fold differences in dilution factor (DF ratio) were obtained by normalizing the smallest dilution at which growth of ΔtatC cells was observed by the smallest dilution at which wt cells appeared to grow.
ᵍ Soluble versus total expression (STE) ratios were obtained for each protein by normalizing the soluble expression data by the total expression data reported in Tables II and III of Dyson et al.21 Values reported are the average over all 6× and 10× his-tagged constructs.
ʰ CATH classification was determined for 17 of the proteins. These proteins displayed 56–100% sequence identity with domains found in the CATH database and 11 displayed >90% identity. The proteins were classified as either mainly alpha (1), mainly beta (2), alpha-beta (3), or multidomain (4). As per Houry et al.,36 13 proteins with amino acid sequences greater than 100 consecutive residues not covered by sequence homology were not classified (nc) in this analysis.
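The sequence-derived columns of Table I (Cys %, GRAVY, and hp_aa) can be recomputed from a one-letter amino acid sequence. The sketch below is illustrative only. The function names are hypothetical, the hydrophobic residue set is the one given in the hp_aa footnote, and GRAVY is assumed to use the Kyte-Doolittle hydropathy scale, which the table does not state explicitly.

```python
HYDROPHOBIC = set("AVILWF")  # residue set given in the hp_aa footnote

# Kyte-Doolittle hydropathy values (assumed to be the scale behind GRAVY)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def longest_hydrophobic_run(seq: str) -> int:
    """Highest number of contiguous hydrophobic residues (hp_aa)."""
    best = current = 0
    for residue in seq.upper():
        current = current + 1 if residue in HYDROPHOBIC else 0
        best = max(best, current)
    return best


def cysteine_percent(seq: str) -> float:
    """Cysteine content as a percentage of all residues."""
    if not seq:
        return 0.0
    return 100.0 * seq.upper().count("C") / len(seq)


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value over all residues."""
    if not seq:
        return 0.0
    return sum(KYTE_DOOLITTLE[r] for r in seq.upper()) / len(seq)


# Toy peptide for illustration only; it is not one of the proteins in Table I.
peptide = "MAVILKWFCC"
print(longest_hydrophobic_run(peptide), cysteine_percent(peptide), round(gravy(peptide), 2))
```

Methionine is deliberately left out of the hydrophobic set here because the footnote lists only A, V, I, L, W, and F.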
Figure 3.

Selective plating of cells as an indicator of protein solubility. Spot plating of serially diluted cells on LB agar supplemented with 100 μg/mL Amp. Each 5-μL aliquot contained an equivalent number of MC4100 (wt) or B1LK0 (ΔtatC) cells expressing different ssTorA-ORF-Bla fusion constructs where the ORF is indicated to the left in each case (numbered as in Table I). Shown are representative spot plating results for clones that conferred: (a) 0–100-fold or (b) >100-fold difference in Amp resistance between wt cells compared to ΔtatC cells. Cells were serially diluted as indicated by arrow and values at top, with 5 μL of undiluted overnight cells plated in the first column.
Figure 4.

Amp resistance phenotypes conferred by mammalian proteins. (a) Phenotypic difference in Amp resistance between wt and ΔtatC cells following spot plating on LB agar supplemented with 100 μg/mL Amp. Values for the fold difference in dilution factor were obtained by normalizing the smallest dilution at which growth of ΔtatC cells was observed by the smallest dilution at which wt cells appeared to grow. Numbers above bars correspond to the MBC values determined for those clones. Numbering of mammalian proteins corresponds to Table I. (b) Specific growth rate of wt cells expressing the same proteins in (a). Cells were grown in LB broth supplemented with 100 μg/mL Amp. Relative specific growth rate values were determined by normalizing the specific growth rate data for each protein (values in Table I) by the specific growth rate for wt cells expressing ssTorA-GFP-Bla.
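The normalization described in the caption can be sketched as follows. The relative rate divides each specific growth rate by the value for ssTorA-GFP-Bla (0.466 h⁻¹ in Table I). How the specific growth rates themselves were measured is not stated in this section, so the two-point exponential-growth formula below is an assumption, and the function names are hypothetical.

```python
import math

GFP_RATE = 0.466  # specific growth rate (per hour) for ssTorA-GFP-Bla, from Table I


def specific_growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Specific growth rate assuming exponential growth between two OD readings.

    This two-point estimate is an assumption for illustration; it is not
    necessarily the procedure used to generate Table I.
    """
    return math.log(od_end / od_start) / hours


def relative_rate(mu: float, reference: float = GFP_RATE) -> float:
    """Specific growth rate normalized to the GFP reference clone."""
    return mu / reference


# Values taken from Table I: RAF1 (Ras-bd) and Ephb2 (LB)
print(round(relative_rate(0.224), 3))  # RAF1, roughly half the GFP rate
print(round(relative_rate(0.004), 3))  # Ephb2(LB), close to zero
```

Normalizing to GFP puts every clone on a 0 to 1 scale, which is what allows the growth rates to be compared against the STE ratios in Figure 6(b).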
To independently confirm Tat-specific localization of the mammalian clones, we expressed these as ssTorA-ORF-FLAG chimeras in which the C-terminal Bla moiety was replaced by a nine-residue FLAG epitope tag. Subcellular fractionation analysis of several representative positive clones, namely Ephb2(SAM), RAF1, CDKN1B, and GFP, revealed that all of these proteins accumulated in the cytoplasm of both wt and ΔtatC cells; however, export to the periplasm was only observed in wt cells [Fig. 5(a)]. These results not only demonstrate Tat-dependent export for these proteins but also confirm the overall solubility of these clones, as expected for substrates that are competent for Tat export.29,30,33 Interestingly, the GRB2 protein, which conferred only a small growth advantage on solid agar to wt versus ΔtatC cells but grew at an unexpectedly fast rate in liquid culture, was present in the soluble extracts corresponding to the cytoplasm and periplasm [Fig. 5(c)]. However, even though GRB2 accumulated in the soluble fractions, a much larger amount of GRB2 was detected in the insoluble fraction where it appeared to form higher molecular weight SDS-resistant aggregates [Fig. 5(c)]. This result suggests that GRB2 is of intermediate/borderline solubility and may in part explain the discrepancy observed between the solid versus solution phase behavior of cells expressing this construct. Similar fractionation analysis was performed for mammalian clones that did not confer resistance to wt cells relative to ΔtatC mutants. For most of these, including Efna1(EC), Efnb2(EC2), and MAD, crossreacting bands were detected in the whole cell lysates but not in the soluble cytoplasmic or periplasmic fractions [Fig. 5(c)]. Instead, these proteins were localized exclusively in the insoluble fraction [Fig. 5(c)]. Several higher molecular weight bands corresponding to SDS-resistant insoluble aggregates were also observed for Efna1(EC) and Efnb2(EC2) [Fig. 5(c)]. 
These results are consistent with the notion that the QC feature of the Tat system discriminates against insoluble, aggregation-prone proteins.29,30,33 For Ephb2(TK) and EGFR, which also did not confer resistance to wt cells relative to ΔtatC mutants, soluble expression was observed in the cytoplasm of wt and ΔtatC cells, but no export was observed in either of these strains [Fig. 5(b)]. Although the exact reasons for this remain unclear, there are a number of possibilities: (i) the proteins may have adopted a soluble conformation that is incompatible with Tat-dependent export from the cytoplasm (e.g., oligomer, molten globule), (ii) their size/shape somehow precludes Tat export, (iii) the proteins may be “slow-folders” that are rejected for export initially, based on the observation that faster folding proteins are exported more efficiently by the Tat system,39 but given more time would fold correctly, and/or (iv) the attB sequences inhibit export of these proteins in a context-dependent manner. These issues are investigated in more detail later.
Figure 5.

Subcellular distribution of Tat-targeted mammalian proteins. (a, b) Western blot analysis of periplasmic (per) and cytoplasmic (cyt) fractions generated from an equivalent number of wt and ΔtatC cells expressing ssTorA-ORF-FLAG fusions lacking the Bla moiety for the clones indicated (numbered according to Table I). Blots were probed with anti-FLAG antibodies. GroEL was used as a fractionation marker by probing with anti-GroEL antibodies (data not shown). (c) Western blot analysis of whole cell lysate (lys), periplasmic (per), cytoplasmic (cyt), and cytoplasmic-insoluble (ins) fractions generated from an equivalent number of wt cells expressing ssTorA-ORF-FLAG fusions lacking the Bla moiety for the clones indicated (numbered according to Table I). Blots were probed with anti-FLAG antibodies. GroEL was used as a fractionation marker by probing with anti-GroEL antibodies (data not shown).
Identification of protein properties that correlate with Tat selection
We next determined whether the antibiotic selection data (see Figs. 3 and 4) correlated with any specific protein features. For most of the properties (e.g., cysteine content, contiguous hydrophobic amino acids, grand average of hydropathicity index, isoelectric point, etc.), no obvious patterns were discernible (Table I). We also investigated whether three-dimensional protein structure (i.e., helical/sheet/loop content) correlated with Tat-based selection in a manner analogous to that of Hartl and coworkers.36 Here, amino acid sequences for all 30 proteins were entered into the CATH database40 and the overall structural class was determined by sequence alignment with protein domains whose structures have been solved (Table I). However, a comparison of the CATH classification with the specific growth rate or MBC on Amp revealed no preferred structural motifs (Supporting Information Figure 2). It should be noted that many higher molecular weight proteins were not included in the analysis because they lacked homology with previously solved structures. Interestingly, many of these higher molecular weight proteins were not competent for export. In fact, the majority of the proteins that conferred significant Tat-dependent Amp resistance were less than 30 kDa in size [Fig. 6(a)], bearing in mind that in our assay each protein was appended with a Tat export signal and a 29-kDa Bla moiety. Thus, with few exceptions, only those chimeras that were less than ∼60 kDa were export competent. This is consistent with the observation that the Tat system is limited to some extent by the length of the protein substrate, and terminates translocation when the length exceeds a threshold (20–30 nm).41
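The size arithmetic behind the ∼60 kDa figure can be sketched briefly. The snippet adds the 29-kDa Bla moiety to molecular weights taken from Table I and compares the result with the approximate cut-off. The function names are hypothetical, and the masses of the ssTorA signal peptide and the attB-encoded linkers are ignored for simplicity, which is an assumption of this sketch.

```python
BLA_KDA = 29.0     # mass of the Bla moiety quoted in the text
CUTOFF_KDA = 60.0  # approximate chimera size limit quoted in the text


def chimera_mass(target_kda: float) -> float:
    """Approximate fusion mass: target plus Bla (signal peptide and linkers ignored)."""
    return target_kda + BLA_KDA


def within_cutoff(target_kda: float) -> bool:
    """True if the approximate chimera mass falls below the ~60 kDa limit."""
    return chimera_mass(target_kda) < CUTOFF_KDA


# Molecular weights copied from Table I
table_i_mw = {"RAF1 (Ras-bd)": 9.2, "GFP": 26.9, "GATA2": 50.3, "CD44": 81.6}
for name, mw in table_i_mw.items():
    print(f"{name}: chimera ~{chimera_mass(mw):.1f} kDa, within cut-off: {within_cutoff(mw)}")
```

GATA2 falls outside the cut-off in this calculation even though Table I lists a high DF ratio and MBC for it, which illustrates why the text qualifies the size rule with "with few exceptions".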
Figure 6.

Protein properties that correlate with soluble expression. (a) Relative specific growth rate for each protein plotted versus its molecular weight (Mw) and (b) soluble versus total expression (STE) ratio plotted versus relative specific growth rate. Data points for which STE ratios correlated with relative specific growth rate are represented by the filled circles (n = 19); values that deviated from this relationship are shown as open triangles. Relative specific growth rate values were determined by normalizing the specific growth rate data for each protein (values in Table I) by the specific growth rate for wt cells expressing ssTorA-GFP-Bla. STE ratios for each protein were obtained by normalizing the soluble expression data by the total expression data reported in Tables II and III of Dyson et al.21 Values reported are the average over all 6× and 10× his-tagged constructs.
Molecular weight alone was not sufficient to explain all the data, given that some of the proteins that were completely incompetent for Tat export were relatively small. For instance, Ephb2(TK) and EGFR expressed without the Bla moiety were each considerably smaller than 60 kDa, yet neither was exported [Fig. 5(b)]. Likewise, several small proteins including Epha2 (21.1 kDa), Efnb2(EC1) (16.6 kDa), and Mdm2(p53-bd) conferred very weak resistance that was indistinguishable in wt and ΔtatC cells (Figs. 3 and 4). In the case of Epha2 and Efnb2(EC1), the inability to transit the Tat system appears to be related to the poor solubility of each of these clones. In fact, previous studies showed that the solubility of these two proteins was very poor.21 To quantify this, we calculated the average solubility ratio for each of the 29 mammalian proteins and GFP by dividing the amount (mg/L) of protein in the soluble fraction by the amount (mg/L) of total expression, using previously reported data in which the proteins were appended with 6× or 10× polyhistidine tags at the N- and C-termini.21 For Epha2 and Efnb2(EC1), the average solubility ratios were 0.03 and 0.06, respectively (Table I). For comparison, GFP and RAF1, proteins that were efficiently exported by the Tat system, were found to have average solubility ratios of 0.69 and 0.52, respectively (Table I). Overall, a fairly linear relationship was observed between the resistance conferred in our Tat selection system and the average solubility ratio for 19 of the proteins [Fig. 6(b), filled circles]. Thus, our assay reliably reports the solubility for the majority of proteins tested in this study. This data is also entirely consistent with our earlier observations that export efficiency through the Tat system is highly dependent on the solubility of the protein substrate in the cytoplasm.29,30,33
It should be noted, however, that 11 proteins deviated significantly from this trend [Fig. 6(b), open triangles]. Each of these conferred very weak resistance in our Tat selection despite the fact that previous expression analysis showed these to be relatively soluble proteins.21 In our hands, several of these proteins including Efna1(EC), Efnb2(EC2), and MAD were clearly insoluble as evidenced by Western blot analysis [see Fig. 5(c)]. We suspect that the observed differences in solubility between our studies and those described earlier are due to different experimental conditions including, for instance, expression vector features (i.e., promoter, 5′-UTR sequence, copy number), expression format (i.e., signal peptide, Bla fusion partner) and the extent of protein induction (i.e., amount of inducer, timing of inducer addition, duration of induction). It should also be noted that certain other proteins may be soluble but export incompetent for other reasons, akin to the Ephb2(TK) and EGFR proteins [Fig. 5(b)].
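The average solubility (STE) ratio described above can be sketched as a short calculation. The yields below are hypothetical placeholders, not values from Dyson et al., and the function name is illustrative. The sketch also assumes that the ratio is computed per construct and then averaged, since the text does not say whether ratios or raw yields were averaged first.

```python
def ste_ratio(constructs: list[tuple[float, float]]) -> float:
    """Average soluble/total expression ratio over several tagged constructs.

    `constructs` holds (soluble_mg_per_l, total_mg_per_l) pairs, one per
    his-tagged construct. Constructs with no detectable total expression
    are skipped to avoid dividing by zero.
    """
    ratios = [soluble / total for soluble, total in constructs if total > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Hypothetical yields (mg/L) for four constructs of one protein
example = [(1.0, 2.0), (3.0, 4.0), (0.5, 5.0), (0.0, 0.0)]
print(round(ste_ratio(example), 3))
```

Averaging per-construct ratios gives each tag format equal weight regardless of its absolute yield; pooling the yields before dividing would instead weight the best-expressing constructs most heavily.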
Discussion
We have developed a simple strategy for evaluating folding and solubility of mammalian proteins and protein domains in E. coli. This was accomplished by combining a rapid recombinational cloning procedure with a previously developed genetic selection for protein folding based on the Tat pathway and its intrinsic QC mechanism. Since Tat pathway QC typically prevents export of misfolded proteins in E. coli,29,33,42 only correctly folded mammalian protein fusions reached the periplasm and conferred cell survival. Our results for 29 different mammalian sequences highlight the ability of Tat selection to assess the fitness of mammalian proteins and protein domains without the need for structural or functional information about the target protein. Although not demonstrated here, we envision that our genetic selection for mammalian protein folding could be used in a combinatorial approach by preparing a library of random point mutations or gene fragments and selecting those that express in a soluble form in E. coli. Along these lines, we previously used Tat selection in a directed evolution format to furnish soluble variants of the aggregation-prone Alzheimer's Aβ42 peptide29 and poorly folded scFv antibodies.30
Our studies focused on mammalian proteins from several diverse families, and examined the relationship between Tat-dependent export competence and various protein properties. We found that there was a strong correlation between soluble expression, and hence successful Tat export, and the molecular weight of the protein. The average molecular weight of proteins that were export competent was 55.2 kDa (including the Bla moiety), whereas the majority of larger proteins (> ∼60 kDa) failed to be exported. This was not entirely surprising given the decreasing probability of successful soluble expression of mammalian proteins in E. coli with increasing molecular weight.21 In addition to poor solubility, the inability of certain large proteins to transit the Tat pathway may be due to the fact that Tat export can terminate when the length of the protein substrate exceeds a threshold of ∼20–30 nm.41 It should be noted, however, that we previously observed Tat-dependent export of heterodimeric complexes, which included Bla, as large as ∼110 kDa.43 Molecular weight also does not explain the lack of Tat export observed for smaller mammalian proteins such as Ephb2(LB), Efnb2(EC1), Efnb2(EC2), Efna1(EC), and Epha2. Interestingly, inspection of Table I revealed that each of these has a cysteine content greater than 2%. Thus, we speculate that misfolding and export incompetence of these proteins was due to the reducing redox potential of the cytoplasm, which disfavors the formation of protein disulfide bonds. In support of this notion, our previous studies demonstrated that E. coli alkaline phosphatase (PhoA) could only be exported by the Tat pathway following formation of its two disulfide bonds, which are critical for the stability and catalytic activity of the protein.33 However, expression of all the nonexported mammalian proteins in E. coli strain DR473, which permits cytoplasmic disulfide bond formation,33,44 did not promote export of any of these constructs (Lim and DeLisa, unpublished observations). Thus, improper or incomplete folding of these proteins must arise from another mechanism. Unfortunately, none of the other features including protein pI, grand average of hydropathicity index (GRAVY),45 subcellular location, highest number of contiguous hydrophobic amino acids or 3D structural features were observed to correlate with Tat selection for any of the negative clones. Likewise, examining the same protein characteristics for the positive clones that were expressed in a soluble form and exported via the Tat system did not identify any meaningful trends. Specifically, Ephb2(SAM), Efna1(FL), MAX, and RAF1 all conferred significant resistance to wt but not ΔtatC cells, yet the pI for these proteins ranged from 4.9 for Ephb2(SAM) to 9.9 for RAF1. This is interesting considering that proteins are thought to be less soluble at a pH near their pI. Similarly unrelated was the cysteine content, which ranged from no cysteines in Ephb2(SAM) and MAX to nearly 4% of the total amino acids in RAF1. The GRAVY values and highest number of contiguous hydrophobic amino acids were equally uninformative.
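For readers who wish to compute the sequence-derived features discussed above, the following Python sketch shows one way to obtain cysteine content, GRAVY, and the longest run of contiguous hydrophobic residues from an amino acid sequence. It is a sketch under stated assumptions: the hydropathy values are the Kyte-Doolittle scale (ref. 45), the set of residues treated as hydrophobic is an assumed choice that is not specified in this section, and the example sequence is a made-up toy peptide rather than any protein from Table I.

```python
# Kyte-Doolittle hydropathy values (ref. 45).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: residues counted as hydrophobic; other definitions are possible.
HYDROPHOBIC = set("AILMFVW")


def cysteine_percent(seq):
    """Cysteine content as a percentage of all residues."""
    return 100.0 * seq.count("C") / len(seq)


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def longest_hydrophobic_run(seq):
    """Length of the longest stretch of consecutive hydrophobic residues."""
    best = run = 0
    for aa in seq:
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best


toy = "MKCAILVKDC"  # hypothetical peptide, for illustration only
print(cysteine_percent(toy))         # 20.0
print(longest_hydrophobic_run(toy))  # 4
print(gravy(toy))
```

With such a definition, the 2% cysteine threshold mentioned above corresponds to `cysteine_percent(seq) > 2.0`.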
Importantly, our results are consistent with the notion that a QC mechanism that discriminates between folded and unfolded proteins, allowing the export of only the former, is an inbuilt feature of the Tat system. In support of this hypothesis, c-type cytochromes or PhoA bearing N-terminal Tat signals become compatible for export only after cytoplasmic folding.33,46 This requirement for folding appears to be sensed directly by the Tat system because even the unfolded version of Tat-targeted PhoA was found to physically associate with the Tat translocase.47,48 Moreover, a perturbed interaction was observed for the signal peptide of unfolded ssTorA-PhoA with the TatB and TatC proteins when compared with that of folded ssTorA-PhoA, suggesting some degree of quality control by TatBC.47 Along similar lines, incorrectly folded FeS substrate proteins with mutations in a single FeS cluster were completely blocked for Tat export, and the Tat apparatus was found to directly initiate the degradation of the rejected molecules.49 The notion that folding is a prerequisite for Tat export is not without controversy, primarily because unstructured, hydrophilic synthetic peptides have been transported by the Tat system.41,50 However, in one case, this could only be accomplished when the TatABC proteins that form the translocase were overexpressed from a multicopy plasmid; without overexpression of the tat genes, transport was hardly detectable.50 In the other case, the unstructured peptide was fused to a much larger soluble protein,41 which might potentially mask the smaller unstructured peptide. Moreover, unstructured synthetic peptides may have significantly different properties than bona fide unfolded proteins.41 Thus, we favor a model in which the Tat system is at the center of an integrated QC system that involves sensing of the folded state of protein substrates before transport, and also the targeted degradation of inappropriately folded or assembled substrates. 
Both proofreading and substrate turnover steps appear to involve productive interaction between the substrate and the Tat components. Since an exposed hydrophobic core is the typical hallmark of misfolded and aggregation-prone proteins, one possible explanation for how the Tat system might monitor the folding state of its substrates is via hydrophobic interactions between the Tat components and the substrate itself. Clearly, the system must be able to tolerate a certain degree of hydrophobicity based on the observation that several of the exported mammalian proteins had >3 consecutive hydrophobic residues. Nonetheless, one thing is for certain: The Tat system clearly favors globular proteins that do not possess exposed hydrophobic patches.
Materials and Methods
Strains, growth conditions, and plasmids
Wild-type E. coli strain MC4100 and an isogenic ΔtatC derivative of MC4100 called B1LK0 (ref. 34) were used for all Tat expression experiments. Liquid cultures were routinely grown aerobically at 37°C in Luria-Bertani (LB) medium, and antibiotic supplements were at the following concentrations: Amp, 100 μg/mL; chloramphenicol (Cam), 20 μg/mL; and kanamycin (Kan), 50 μg/mL. Protein synthesis was induced by adding 1 mM isopropyl-β-d-thiogalactopyranoside (IPTG) when the cells reached an absorbance at 600 nm of ∼0.5. Growth rate in liquid culture was assayed as described previously.29 Briefly, cells expressing the different ssTorA-ORF-Bla fusion proteins were grown in 96-well plates containing liquid LB media supplemented with 100 μg/mL Amp. Absorbance at 600 nm was measured every 30 min using a Bio-Tek Synergy HT microplate reader (Bio-Tek Instruments). The specific growth rate was then calculated from these data as described elsewhere.51 All growth rate data were the average of three cultures grown in parallel. Error was reported as the standard error of the mean (s.e.m.) of these data and was typically less than 5%.
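The growth rate analysis can be illustrated with a short Python sketch. It does not reproduce the procedure of ref. 51; it assumes the common approach of taking the specific growth rate as the least-squares slope of ln(OD600) versus time during exponential growth, and it computes the s.e.m. from the sample standard deviation of replicate cultures. Function names and the numerical readings are hypothetical.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in 1/h.

    Assumption: all readings passed in lie within exponential phase.
    """
    logs = [math.log(v) for v in od600]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return sxy / sxx


def mean_and_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    n = len(values)
    m = sum(values) / n
    variance = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(variance / n)


# Hypothetical OD600 readings taken every 30 min for three parallel cultures.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
cultures = [
    [0.050, 0.068, 0.091, 0.123, 0.166],
    [0.052, 0.069, 0.094, 0.125, 0.170],
    [0.049, 0.066, 0.090, 0.120, 0.163],
]
rates = [specific_growth_rate(times, od) for od in cultures]
mu, sem = mean_and_sem(rates)
print(f"specific growth rate = {mu:.3f} +/- {sem:.3f} 1/h")
```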
All genes for mammalian proteins (Table I) were obtained from a set of 30 Gateway entry clones that were constructed previously.21 The construction of a Gateway-compatible vector system for Tat selection was accomplished as follows. First, we created pSALect-MBP-Kan by replacing the Camr gene in pSALect-MBP-Cam29 with a Kanr selection marker. The Kanr gene was PCR amplified from pBBR1MCS-2 (ref. 52) using forward and reverse primers each containing BstHI restriction sites. The resulting pSALect-MBP-Kan vector was then converted to a destination vector using the Gateway Vector Conversion System (Invitrogen). Briefly, pSALect-MBP-Kan was digested with NdeI and SpeI to remove the malE gene and treated with Klenow fragment (New England BioLabs) to make blunt ends for ligation with the Reading Frame Cassette C.1 (RfC.1). RfC.1 is a blunt-ended cassette containing attR1 and attR2 sites flanking the ccdB gene and the Camr gene. After ligation, E. coli strain DB3.1 (Invitrogen) was transformed with the ligation mixture and Camr colonies were selected. The resulting destination vector, called pDEST-Tat, comprises the Camr/ccdB cassette of RfC.1 flanked at the 5′ end by DNA encoding the ssTorA signal peptide followed by attR1 and at the 3′ end by the attR2 site followed by the bla gene encoding TEM-1 Bla [Fig. 2(a)]. The pDEST-Tat vector was used for LR recombination reactions with entry clones encoding different mammalian genes of interest. Next, all 29 mammalian genes (Table I) and GFP were PCR amplified using forward and reverse primers that introduced 5′ attB1 and 3′ attB2 linkers, respectively. The PCR products were recombined with pDONR/Zeo (Invitrogen) using BP recombinase (Invitrogen) to generate 30 unique pENTR-ORF vectors. Subsequently, all 30 pENTR-ORF plasmids were subjected to LR reaction with pDEST-Tat to yield the final pTatEXP-ORF plasmids for expressing ssTorA-ORF-Bla fusions. E. coli MC4100 and B1LK0 strains were transformed with the LR reaction mixtures and colonies harboring pTatEXP-ORF plasmids were selected on LB agar plates supplemented with 50 μg/mL Kan. The plasmid pTorA-cassette,53 a derivative of pTrc99A, was used as an expression vector for mammalian proteins in the absence of the C-terminal Bla moiety. Individual mammalian clones were PCR amplified with primers that introduced a C-terminal FLAG epitope (DYKDDDDKG) and the resulting PCR products were ligated into pTrc99A-TorA immediately after the DNA encoding ssTorA.
Fluorescence microscopy
Cells expressing ssTorA-GFP-Bla from either pSALect-GFP29 or pTatEXP-GFP were visualized as described previously54 using a Zeiss Axioskop 40 fluorescence microscope with a Spotflex color digital camera and filter sets for GFP (485 nm for excitation and 505 nm for emission) and rhodamine (540 nm for excitation and 600 nm for emission).
Selective plating of bacteria
MC4100 and B1LK0 cells carrying different pTatEXP-ORF vectors were cultured in a shaking incubator overnight at 37°C in LB broth supplemented with 100 μg/mL Amp and 1 mM IPTG. Overnight cultures were pelleted and resuspended in a volume of phosphate-buffered saline (PBS, pH 7.4) that adjusted the absorbance at 600 nm to 0.5. Normalized cells were then serially diluted 10-fold into fresh LB with Cam. Five microliters of each dilution was spot-plated on LB agar plates with 100 μg/mL Amp and incubated at 30°C overnight. Growth on spot plates was quantified by normalizing the smallest dilution at which growth of ΔtatC cells was observed to the smallest dilution at which wt cells appeared to grow. The MBC on Amp was determined by spreading cells (∼200 CFU as estimated by OD600) expressing different target proteins onto LB plates supplemented with Amp ranging from 3 to 600 μg/mL. The MBC was defined as the Amp concentration at which no colony forming units (CFUs) were observed.
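The MBC readout can be expressed as a small Python sketch. The function name and colony counts are hypothetical, and the sketch adopts one interpretation of the definition above as an assumption: the MBC is taken as the lowest tested Amp concentration at which no colonies appear, provided that no colonies appear at any higher tested concentration either.

```python
def minimum_bactericidal_concentration(cfu_by_amp):
    """Return the lowest Amp concentration (ug/mL) with zero CFUs.

    Assumption: all higher tested concentrations must also give zero CFUs.
    Returns None if colonies were seen at the highest concentration tested.
    """
    mbc = None
    # Walk down from the highest concentration until colonies are observed.
    for conc in sorted(cfu_by_amp, reverse=True):
        if cfu_by_amp[conc] > 0:
            break
        mbc = conc
    return mbc


# Hypothetical colony counts on plates spanning 3 to 600 ug/mL Amp.
plates = {3: 210, 12: 180, 50: 35, 100: 0, 200: 0, 600: 0}
print(minimum_bactericidal_concentration(plates))  # 100
```

Returning `None` rather than the highest tested concentration keeps clones whose MBC lies above the tested range distinct from clones that were killed at 600 μg/mL.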
Subcellular fractionation
An equivalent number of cells were harvested 6 h after induction. To generate whole cell lysates (lys), total cell fractions were centrifuged, resuspended in PBS, and broken by 30-s sonication treatments at 0°C. Periplasmic proteins (per) were isolated from the total cell fraction as described previously.53 The pellet remaining after the isolation of the periplasmic proteins was resuspended in PBS and broken by 30-s sonication treatments at 0°C. This preparation was then centrifuged and separated into a soluble fraction, designated the cytoplasmic-soluble fraction (cyt), and an insoluble pellet fraction, designated the cytoplasmic-insoluble fraction (ins). Western blot analysis of these fractions was performed as previously described.33 Detection of C-terminally FLAG-tagged proteins was accomplished using anti-FLAG antibodies (AbCam). The quality of all fractionations was determined by immunodetection of the cytoplasmic GroEL protein and only those samples where GroEL was exclusively in the cytoplasmic fraction were reported.
References
- 1. Goulding CW, Perry LJ. Protein production in Escherichia coli for structural studies by X-ray crystallography. J Struct Biol. 2003;142:133–143. doi: 10.1016/s1047-8477(03)00044-3.
- 2. Finley JB, Qiu SH, Luan CH, Luo M. Structural genomics for Caenorhabditis elegans: high throughput protein expression analysis. Protein Expr Purif. 2004;34:49–55. doi: 10.1016/j.pep.2003.11.026.
- 3. Ding HT, Ren H, Chen Q, Fang G, Li LF, Li R, Wang Z, Jia XY, Liang YH, Hu MH, Li Y, Luo JC, Gu XC, Su XD, Luo M, Lu SY. Parallel cloning, expression, purification and crystallization of human proteins for structural genomics. Acta Cryst. 2002;D58:2102–2108. doi: 10.1107/s0907444902016359.
- 4. Molloy PE, Harris WJ, Strachan G, Watts C, Cunningham C. Production of soluble single-chain T-cell receptor fragments in Escherichia coli trxB mutants. Mol Immunol. 1998;35:73–81. doi: 10.1016/s0161-5890(98)00019-4.
- 5. Waldo GS, Standish BM, Berendzen J, Terwilliger TC. Rapid protein-folding assay using green fluorescent protein. Nat Biotechnol. 1999;17:691–695. doi: 10.1038/10904.
- 6. Schein C, Noteborn N. Formation of soluble recombinant proteins in Escherichia coli is favored by lower growth temperatures. Biotechnology (NY) 1988;6:291–294.
- 7. Winograd E, Pulido MA, Wasserman M. Production of DNA-recombinant polypeptides by tac-inducible vectors using micromolar concentrations of IPTG. Biotechniques. 1993;14:886–890.
- 8. Wang H, Chong S. Visualization of coupled protein folding and binding in bacteria and purification of the heterodimeric complex. Proc Natl Acad Sci USA. 2003;100:478–483. doi: 10.1073/pnas.0236088100.
- 9. Nishihara K, Kanemori M, Kitagawa M, Yanagi H, Yura T. Chaperone coexpression plasmids: differential and synergistic roles of DnaK-DnaJ-GrpE and GroEL-GroES in assisting folding of an allergen of Japanese cedar pollen, Cryj2, in Escherichia coli. Appl Environ Microbiol. 1998;64:1694–1699. doi: 10.1128/aem.64.5.1694-1699.1998.
- 10. Derman AI, Prinz WA, Belin D, Beckwith J. Mutations that allow disulfide bond formation in the cytoplasm of Escherichia coli. Science. 1993;262:1744–1747. doi: 10.1126/science.8259521.
- 11. Bessette PH, Aslund F, Beckwith J, Georgiou G. Efficient folding of proteins with multiple disulfide bonds in the Escherichia coli cytoplasm. Proc Natl Acad Sci USA. 1999;96:13703–13708. doi: 10.1073/pnas.96.24.13703.
- 12. Tan WS, Dyson MR, Murray K. Hepatitis B virus core antigen: enhancement of its production in Escherichia coli, and interaction of the core particles with the viral surface antigen. Biol Chem. 2003;384:363–371. doi: 10.1515/BC.2003.042.
- 13. Miroux B, Walker JE. Over-production of proteins in Escherichia coli: mutant hosts that allow synthesis of some membrane proteins and globular proteins at high levels. J Mol Biol. 1996;260:289–298. doi: 10.1006/jmbi.1996.0399.
- 14. Skretas G, Georgiou G. Genetic analysis of G protein-coupled receptor expression in Escherichia coli: inhibitory role of DnaJ on the membrane integration of the human central cannabinoid receptor. Biotechnol Bioeng. 2009;102:357–367. doi: 10.1002/bit.22097.
- 15. Shimizu Y, Inoue A, Tomari Y, Suzuki T, Yokogawa T, Nishikawa K, Ueda T. Cell-free translation reconstituted with purified components. Nat Biotechnol. 2001;19:751–755. doi: 10.1038/90802.
- 16. Stapleton D, Balan I, Pawson T, Sicheri F. The crystal structure of an Eph receptor SAM domain reveals a mechanism for modular dimerization. Nat Struct Biol. 1999;6:44–49. doi: 10.1038/4917.
- 17. Holliger P, Hudson PJ. Engineered antibody fragments and the rise of single domains. Nat Biotechnol. 2005;23:1126–1136. doi: 10.1038/nbt1142.
- 18. Sachdev D, Chirgwin JM. Solubility of proteins isolated from inclusion bodies is enhanced by fusion to maltose-binding protein or thioredoxin. Protein Expr Purif. 1998;12:122–132. doi: 10.1006/prep.1997.0826.
- 19. Wang C, Castro AF, Wilkes DM, Altenberg GA. Expression and purification of the first nucleotide-binding domain and linker region of human multidrug resistance gene product: comparison of fusions to glutathione S-transferase, thioredoxin and maltose-binding protein. Biochem J. 1999;338:77–81.
- 20. Hammarstrom M, Hellgren N, van Den Berg S, Berglund H, Hard T. Rapid screening for improved solubility of small human proteins produced as fusion proteins in Escherichia coli. Protein Sci. 2002;11:313–321. doi: 10.1110/ps.22102.
- 21. Dyson MR, Shadbolt SP, Vincent KJ, Perera RL, McCafferty J. Production of soluble mammalian proteins in Escherichia coli: identification of protein features that correlate with successful expression. BMC Biotechnol. 2004;4:32. doi: 10.1186/1472-6750-4-32.
- 22. Wigley WC, Stidham RD, Smith NM, Hunt JF, Thomas PJ. Protein solubility and folding monitored in vivo by structural complementation of a genetic marker protein. Nat Biotechnol. 2001;19:131–136. doi: 10.1038/84389.
- 23. Dyson MR, Perera RL, Shadbolt SP, Biderman L, Bromek K, Murzina NV, McCafferty J. Identification of soluble protein fragments by gene fragmentation and genetic selection. Nucleic Acids Res. 2008;36:e51. doi: 10.1093/nar/gkn151.
- 24. Pedelacq JD, Cabantous S, Tran T, Terwilliger TC, Waldo GS. Engineering and characterization of a superfolder green fluorescent protein. Nat Biotechnol. 2006;24:79–88. doi: 10.1038/nbt1172.
- 25. Massey-Gendel E, Zhao A, Boulting G, Kim HY, Balamotis MA, Seligman LM, Nakamoto RK, Bowie JU. Genetic selection system for improving recombinant membrane protein expression in E. coli. Protein Sci. 2009;18:372–383. doi: 10.1002/pro.39.
- 26. Maxwell KL, Mittermaier AK, Forman-Kay JD, Davidson AR. A simple in vivo assay for increased protein solubility. Protein Sci. 1999;8:1908–1911. doi: 10.1110/ps.8.9.1908.
- 27. Cabantous S, Terwilliger TC, Waldo GS. Protein tagging and detection with engineered self-assembling fragments of green fluorescent protein. Nat Biotechnol. 2005;23:102–107. doi: 10.1038/nbt1044.
- 28. Drew D, Slotboom DJ, Friso G, Reda T, Genevaux P, Rapp M, Meindl-Beinker NM, Lambert W, Lerch M, Daley DO, Van Wijk KJ, Hirst J, Kunji E, De Gier JW. A scalable, GFP-based pipeline for membrane protein overexpression screening and purification. Protein Sci. 2005;14:2011–2017. doi: 10.1110/ps.051466205.
- 29. Fisher AC, Kim W, DeLisa MP. Genetic selection for protein solubility enabled by the folding quality control feature of the twin-arginine translocation pathway. Protein Sci. 2006;15:449–458. doi: 10.1110/ps.051902606.
- 30. Fisher AC, DeLisa MP. Efficient isolation of soluble intracellular single-chain antibodies using the twin-arginine translocation machinery. J Mol Biol. 2009;385:299–311. doi: 10.1016/j.jmb.2008.10.051.
- 31. Walhout AJ, Temple GF, Brasch MA, Hartley JL, Lorson MA, van den Heuvel S, Vidal M. GATEWAY recombinational cloning: application to the cloning of large numbers of open reading frames or ORFeomes. Methods Enzymol. 2000;328:575–592. doi: 10.1016/s0076-6879(00)28419-x.
- 32. Landy A. Dynamic, structural, and regulatory aspects of lambda site-specific recombination. Annu Rev Biochem. 1989;58:913–949. doi: 10.1146/annurev.bi.58.070189.004405.
- 33. DeLisa MP, Tullman D, Georgiou G. Folding quality control in the export of proteins by the bacterial twin-arginine translocation pathway. Proc Natl Acad Sci USA. 2003;100:6115–6120. doi: 10.1073/pnas.0937838100.
- 34. Bogsch EG, Sargent F, Stanley NR, Berks BC, Robinson C, Palmer T. An essential component of a novel bacterial protein export system with homologues in plastids and mitochondria. J Biol Chem. 1998;273:18003–18006. doi: 10.1074/jbc.273.29.18003.
- 35. Santini CL, Bernadac A, Zhang M, Chanal A, Ize B, Blanco C, Wu LF. Translocation of jellyfish green fluorescent protein via the Tat system of Escherichia coli and change of its periplasmic localization in response to osmotic up-shock. J Biol Chem. 2001;276:8159–8164. doi: 10.1074/jbc.C000833200.
- 36. Liebscher M, Jahreis G, Lucke C, Grabley S, Raina S, Schiene-Fischer C. Fatty acyl benzamido antibacterials based on inhibition of DnaK-catalyzed protein folding. J Biol Chem. 2007;282:4437–4446. doi: 10.1074/jbc.M607667200.
- 37. Lee LL, Ha H, Chang YT, DeLisa MP. Discovery of amyloid-beta aggregation inhibitors using an engineered assay for intracellular protein folding and solubility. Protein Sci. 2009;18:277–286. doi: 10.1002/pro.33.
- 38. Ribnicky B, van Blarcom T, Georgiou G. A scFv antibody mutant isolated in a genetic screen for improved export via the twin arginine transporter pathway exhibits faster folding. J Mol Biol. 2007;369:631–639. doi: 10.1016/j.jmb.2007.03.068.
- 39. Houry WA, Frishman D, Eckerskorn C, Lottspeich F, Hartl FU. Identification of in vivo substrates of the chaperonin GroEL. Nature. 1999;402:147–154. doi: 10.1038/45977.
- 40. Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM. CATH—a hierarchic classification of protein domain structures. Structure. 1997;5:1093–1108. doi: 10.1016/s0969-2126(97)00260-8.
- 41. Cline K, McCaffery M. Evidence for a dynamic and transient pathway through the TAT protein transport machinery. EMBO J. 2007;26:3039–3049. doi: 10.1038/sj.emboj.7601759.
- 42. Bruser T, Yano T, Brune DC, Daldal F. Membrane targeting of a folded and cofactor-containing protein. Eur J Biochem. 2003;270:1211–1221. doi: 10.1046/j.1432-1033.2003.03481.x.
- 43. Waraho D, DeLisa MP. Versatile selection technology for intracellular protein-protein interactions mediated by a unique bacterial hitchhiker transport mechanism. Proc Natl Acad Sci USA. 2009;106:3692–3697. doi: 10.1073/pnas.0704048106.
- 44. Marrichi M, Camacho L, Russell DG, DeLisa MP. Genetic toggling of alkaline phosphatase folding reveals signal peptides for all major modes of transport across the inner membrane of bacteria. J Biol Chem. 2008;283:35223–35235. doi: 10.1074/jbc.M802660200.
- 45. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157:105–132. doi: 10.1016/0022-2836(82)90515-0.
- 46. Sanders C, Wethkamp N, Lill H. Transport of cytochrome c derivatives by the bacterial Tat protein translocation system. Mol Microbiol. 2001;41:241–246. doi: 10.1046/j.1365-2958.2001.02514.x.
- 47. Panahandeh S, Maurer C, Moser M, DeLisa MP, Muller M. Following the path of a twin-arginine precursor along the TatABC translocase of Escherichia coli. J Biol Chem. 2008;283:33267–33275. doi: 10.1074/jbc.M804225200.
- 48. Richter S, Bruser T. Targeting of unfolded PhoA to the TAT translocon of Escherichia coli. J Biol Chem. 2005;280:42723–42730. doi: 10.1074/jbc.M509570200.
- 49. Matos CF, Robinson C, Di Cola A. The Tat system proofreads FeS protein substrates and directly initiates the disposal of rejected molecules. EMBO J. 2008;27:2055–2063. doi: 10.1038/emboj.2008.132.
- 50. Richter S, Lindenstrauss U, Lucke C, Bayliss R, Bruser T. Functional Tat transport of unstructured, small, hydrophilic proteins. J Biol Chem. 2007;282:33257–33264. doi: 10.1074/jbc.M703303200.
- 51. DeLisa MP, Li J, Rao G, Weigand WA, Bentley WE. Monitoring GFP-operon fusion protein expression during high cell density cultivation of Escherichia coli using an on-line optical sensor. Biotechnol Bioeng. 1999;65:54–64.
- 52. Kovach ME, Elzer PH, Hill DS, Robertson GT, Farris MA, Roop RM, II, Peterson KM. Four new derivatives of the broad-host-range cloning vector pBBR1MCS, carrying different antibiotic-resistance cassettes. Gene. 1995;166:175–176. doi: 10.1016/0378-1119(95)00584-1.
- 53. Kim J-Y, Fogarty EA, Lu FJ, Zhu H, Henderson LA, DeLisa MP. Twin-arginine translocation of active human tissue plasminogen activator in Escherichia coli. Appl Environ Microbiol. 2005;71:8451–8459. doi: 10.1128/AEM.71.12.8451-8459.2005.
- 54. Kim JY, Doody AM, Chen DJ, Cremona GH, Shuler ML, Putnam D, DeLisa MP. Engineered bacterial outer membrane vesicles with enhanced functionality. J Mol Biol. 2008;380:51–66. doi: 10.1016/j.jmb.2008.03.076.
