Protein Science : A Publication of the Protein Society. 2021 Mar 26;30(5):1022–1034. doi: 10.1002/pro.4068

Enzyme catalysis prior to aromatic residues: Reverse engineering of a dephospho‐CoA kinase

Mikhail Makarov 1,2, Jingwei Meng 3, Vyacheslav Tretyachenko 1,2, Pavel Srb 4, Anna Březinová 5, Valerio Guido Giacobelli 1, Lucie Bednárová 4, Jiří Vondrášek 4, A Keith Dunker 3, Klára Hlouchová 1,4
PMCID: PMC8040869  PMID: 33739538

Abstract

The wide variety of protein structures and functions results from the diverse properties of the 20 canonical amino acids. The generally accepted hypothesis is that early protein evolution was associated with enrichment of a primordial alphabet, thereby enabling increased protein catalytic efficiencies and functional diversification. Aromatic amino acids were likely among the last additions to the genetic code. The main objective of this study was to test whether enzyme catalysis can occur without the aromatic residues (aromatics) by studying the structure and function of dephospho‐CoA kinase (DPCK) following aromatic residue depletion. We designed two variants of a putative DPCK from Aquifex aeolicus by substituting (a) Tyr, Phe and Trp or (b) all aromatics (including His). Their structural characterization indicates that substituting the aromatics does not markedly alter their secondary structures but does significantly loosen their side chain packing and increase their sizes. Both variants still possess ATPase activity, although with 150–300 times lower efficiency in comparison with the wild‐type phosphotransferase activity. The transfer of the phosphate group to the dephospho‐CoA substrate becomes heavily uncoupled and only the His‐containing variant is still able to perform the phosphotransferase reaction. These data support the hypothesis that proteins in the early stages of life could support catalytic activities, albeit with low efficiencies. An observed significant contraction upon ligand binding is likely important for appropriate organization of the active site. Formation of firm hydrophobic cores, which enable the assembly of stably structured active sites, is suggested to provide a selective advantage for adding the aromatic residues.

Keywords: aromatic amino acids, catalysis evolution, genetic code evolution, protein disorder, protein structure evolution

1. INTRODUCTION

The extant alphabet of canonical amino acids was apparently selected in the first 10–15% of Earth history from a plethora of amino acids (a) available on primordial Earth and (b) synthesized through gradually developing metabolic pathways. 1 Recent analyses reveal that, compared to alternatives, the extant alphabet comprises an unusually good repertoire of physical properties. 2 , 3 , 4 Even entirely random sequences built from the canonical alphabet give rise to secondary structure‐rich proteins. 5 Nevertheless, soluble and well‐expressing proteins have been successfully recovered from random libraries built from a simpler alphabet of evolutionarily early amino acids. 6 , 7 However, the stage of the amino acid alphabet evolution at which proteins could have gained dominance in binding and catalysis (i.e., functionally support early metabolism) remains unclear.

Aromatic amino acids are considered among the last additions to the genetic coding system, that is, to the canonical amino acid alphabet. 8 , 9 Because of their relatively high redox reactivity, their fixation in the genetic code could have been driven by biospheric oxygen. 10 There is recent support for the proposal that, for some of the aromatics (Tyr and Trp), this possibly happened even in the post‐last universal common ancestor (LUCA) period. 10 , 11 , 12 These proposals suggest that there was a time when living cells existed without aromatic amino acids.

Even though different reduced sets (of 7–13) of the amino acid alphabet have been shown or predicted to be sufficient for protein folding and catalysis, to our knowledge, none of the experimental studies recovered enzyme activity in complete absence of aromatics. 13 , 14 , 15 , 16 , 17 , 18 Computational inquiry indicates that the aromatics are the strongest structure promoters among the 20 amino acid alphabet. 19 This conclusion is consistent with observation that aromatics are mostly clustered within the hydrophobic cores of structured proteins and with quantum chemistry calculations showing the interactions between aromatics to be stronger and more specific than aliphatic side chains interactions. 20 A comparison of the structure/disorder propensities of the 20 amino acids with the chronology of amino acid inclusion into the genetic code indicates that the earliest amino acids are strongly disorder‐promoting while the last to be added, for example, the aromatics, are among the most strongly structure‐promoting. 9 , 19 , 21 Indeed, aromatics are heavily under‐represented in intrinsically disordered proteins and regions (IDPs and IDRs), that is, proteins that lack stable 3D structure and yet frequently carry out crucial biological functions, associated with signaling and regulation in particular. 22 , 23 While some functions can thus be delivered even in lack of tertiary structure, it remains unclear if and how early enzymes could achieve specific catalysis without a stable hydrophobic core supported by the aromatic residues.

Here, we analyze the structural and functional consequences of depleting the amino acid alphabet of aromatic amino acids. As an exemplary target, we choose a highly conserved metabolic enzyme from a hyperthermophilic bacterium (and hence of potential relevance to early life). This enzyme catalyzes the final step of the biosynthesis of coenzyme A, which is known to be essential for all life and is considered among the most ancient cofactors. At the same time, the dephospho‐CoA kinase (DPCK) enzyme belongs to the family of P‐loop NTPases, which have been argued to be one of the oldest and most widely preserved protein architectures. 24 We present evidence that enzyme catalysis can occur in the absence of aromatic amino acids and of a firm hydrophobic core, the formation of which evidently becomes induced upon ligand binding.

2. RESULTS

2.1. Target selection by analysis of LUCA proteins

In order to identify conserved structured protein families, we applied our VSL2B disorder predictor 25 to a collection of LUCA assigned proteins identified by their ubiquity across all kingdoms of cellular life. 26 The non‐enzymes, mostly ribosomal or other RNA binding proteins, were all predicted to be massively disordered while the enzymes were predicted to be structured. These modern‐day versions of the ancient enzymes all contain multiple aromatic residues and we have been unable to identify a single efficient modern enzyme that lacks multiple aromatic residues. Among the LUCA assigned enzymes identified by Brooks and Fresco, we selected DPCK for further study because it has the lowest number of aromatic amino acids. 26 Other advantages of this choice are that there are multiple 3D structures of different DPCK family members and that the DPCK proteins have relatively small sizes.

2.2. Sequence design, expression and purification of DPCK variants

To evaluate the significance of aromatic amino acids for the structure and function of DPCK, the PDB database was first searched for solved structures of confirmed and putative DPCKs from different thermophilic bacterial species (Table S1). An initial test of expression, solubility, ease of large‐scale purification and DPCK activity led to selection of a putative DPCK from Aquifex aeolicus (PDB ID: 2IF2) for this study (Table S1).

Mutant variants of DPCK were designed as follows. First, all Phe, Tyr and Trp residues were substituted by (a) Leu residues (DPCK‐LH) and (b) non‐aromatic amino acids based on the best preservation of thermodynamic stability (DPCK‐MH) using the Hot Spot Wizard server. 27 Second, all of the above amino acids plus His were substituted using the same logic, producing DPCK‐L and DPCK‐M variants respectively. DPCK‐LH/MH and DPCK‐L/M variants thus have 10 and 11% of the total protein sequence substituted, respectively. Synthetic genes of all these variants were subcloned and expressed in Escherichia coli with a C‐terminal polyhistidine tag using standard protocols (see Section 4 for details). Upon preliminary purification and DPCK activity characterization, only DPCK‐LH and DPCK‐M variants were selected for detailed characterization (Figure 1). Intriguingly, the DPCK‐L mutant expressed very poorly in E. coli (even after optimization attempts), and neither the DPCK‐L nor the DPCK‐MH mutant had any measurable phosphotransferase/ATPase activity (Table S2).
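The Leu‐substitution logic behind the DPCK‐LH and DPCK‐L designs can be sketched in a few lines of Python. This is only an illustration: the function name and the toy sequence are hypothetical (the toy sequence is not the A. aeolicus DPCK sequence), and the stability‐guided substitutions of the ‐MH and ‐M variants, which relied on the Hot Spot Wizard server, are not reproduced here.

```python
from typing import Tuple

AROMATICS = "FYW"  # one-letter codes for Phe, Tyr, Trp


def substitute_with_leu(seq: str, include_his: bool = False) -> Tuple[str, float]:
    """Replace Phe/Tyr/Trp (and optionally His) with Leu.

    Returns the substituted sequence and the fraction of positions changed.
    """
    targets = set(AROMATICS + ("H" if include_his else ""))
    mutated = "".join("L" if aa in targets else aa for aa in seq)
    n_changed = sum(1 for a, b in zip(seq, mutated) if a != b)
    return mutated, n_changed / len(seq)


# Hypothetical toy sequence used only to exercise the function.
toy = "MAFKYLGWHSTVEL"
lh_like, frac_lh = substitute_with_leu(toy)                   # His retained
l_like, frac_l = substitute_with_leu(toy, include_his=True)   # His also replaced

print(lh_like, f"{frac_lh:.1%} substituted")
print(l_like, f"{frac_l:.1%} substituted")
```

Applied to a real sequence, the returned fraction corresponds to the kind of figure quoted above (10 and 11% of the sequence substituted).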

FIGURE 1.

Sequence design of dephospho‐CoA kinase (DPCK) variants lacking aromatic amino acids. (a) The chronological order and ranking of 20 amino acids: (i) order of appearance in the genetic code derived by meta‐analyses by Trifonov (9); (ii) order of appearance based on the prebiotic availability and thermodynamic stability by Higgs and Pudritz (8); (iii) ranking based on their increasing propensity to promote structure (19). (b) Aromatic amino acid content of DPCK‐WT, ‐LH and ‐M variants. (c) Aromatic residues highlighted in the structure of DPCK from Aquifex aeolicus (PDB ID:2IF2), with ATP molecule positioned based on structural alignment with the Haemophilus influenzae DPCK complex with ATP (PDB ID:1JJV)

DPCK‐WT, ‐LH and ‐M variants were purified to homogeneity using a three‐step purification protocol (Figure S1). Prior to further experiments, the identity, molecular weight, and oligomeric status of the protein variants were tested by mass spectrometry and analytical size exclusion chromatography (Figure S1). All protein variants were of expected molecular weight. DPCK‐WT and ‐LH eluted as monomers while the ‐M variant resembles either a dimeric or disordered monomeric form in the elution profile.

2.3. Enzyme activity characterization

The specificity and rates of enzyme reactions of the DPCK variants were initially characterized using a commercial kit relying on a coupling detection of ADP, one of the reaction products (Figure 2a). In the assay, ADP is converted to pyruvate which is then quantified by a fluorometric method. Any basal ATP hydrolysis (in the absence of enzyme) was appropriately subtracted (Figure S2a). Because the assay was performed at two regimes (varying the concentration of ATP or dCoA), it was possible to observe significant differences in the reaction specificity of the variants (Figure 2b,c).

FIGURE 2.

Kinetic characterization of dephospho‐CoA kinase (DPCK) variants. (a) DPCK reaction scheme. (b) Michaelis–Menten plots of DPCK proteins for initial velocity versus (TOP) ATP concentration, monitoring production of ADP; reactions were performed without and with 200 μM dCoA to estimate ATPase and phosphotransferase activities of enzymes. (BOTTOM) dCoA concentration, monitoring production of ADP. Reactions were performed in 15 mM Hepes (pH 7.4), 20 mM NaCl, 1 mM EGTA, 0.02% Tween‐20, 10 mM MgCl2, and 0.1% bovine gamma globulin, and were initiated with ATP. The lines represent nonlinear least squares fits. (c) Summary of catalytic efficiencies

DPCK‐WT has similar catalytic efficiency for both ATP and dCoA as substrates while the ATP hydrolysis activity is dependent on dCoA binding (Figure 2). The catalytic efficiency measured here (3.4 × 10⁴ and 5.7 × 10⁴ M⁻¹ s⁻¹ for dCoA and ATP, respectively) is similar to the previously reported efficiency of DPCK from Entamoeba histolytica. 28 In contrast, the catalytic efficiencies of DPCK‐LH and DPCK‐M are significantly lower (355 and 118 M⁻¹ s⁻¹ for ATP, respectively), resulting in a decreased turnover number (Figures 2 and S2b). In the case of DPCK‐M, the reaction rates are independent of varying concentrations of dCoA, implying an impaired efficiency of the phosphate transfer, that is, only ATPase activity is observed (Figure 2c). While both DPCK‐LH and ‐M variants have the ability to hydrolyze ATP in the absence of dCoA (unlike DPCK‐WT), DPCK‐LH also has the dCoA‐dependent phosphotransferase activity (above ~80 μM dCoA) with KM greater than 200 μM. This activity has been difficult to measure using the commercial kit due to the ATP concentration range limitation. To confirm the identity of the reaction products and the reaction specificity, the DPCK reactions were performed at a fixed substrate concentration above the DPCK‐WT KM value (where the reaction rate is less dependent on, or independent of, substrate concentration) in order to reach sufficient substrate conversion for detection of the products using HPLC‐MS analysis. This analysis detected significant CoA formation only in the reaction catalyzed by DPCK‐WT; 100× lower CoA formation was detected in the reactions catalyzed by DPCK‐LH (Figure S2b).
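The relative loss of catalytic efficiency can be worked out directly from the values quoted above. The short Python sketch below does this arithmetic and also writes out the Michaelis–Menten rate law used for the fits in Figure 2. The function and dictionary names are illustrative, the numbers passed to `michaelis_menten` are hypothetical, and the fold‐change depends on which wild‐type value (ATP or dCoA) is taken as the reference, so both are shown.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial velocity at substrate concentration s (same units as km)."""
    return vmax * s / (km + s)


# Catalytic efficiencies (kcat/KM, in M^-1 s^-1) as quoted in the text.
EFFICIENCY = {
    ("WT", "dCoA"): 3.4e4,
    ("WT", "ATP"): 5.7e4,
    ("LH", "ATP"): 355.0,
    ("M", "ATP"): 118.0,
}


def fold_reduction(variant: str, reference_substrate: str) -> float:
    """Wild-type efficiency (for the chosen substrate) over the variant's ATP efficiency."""
    return EFFICIENCY[("WT", reference_substrate)] / EFFICIENCY[(variant, "ATP")]


for variant in ("LH", "M"):
    for ref in ("ATP", "dCoA"):
        ratio = fold_reduction(variant, ref)
        print(f"DPCK-{variant} vs WT ({ref} reference): {ratio:.0f}-fold lower")

# Sanity check of the rate law with hypothetical numbers:
# at [S] = KM the velocity is half of Vmax.
half = michaelis_menten(50.0, vmax=2.0, km=50.0)
```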

2.4. Secondary and tertiary structure characterization

Using the purified proteins, their structural properties were investigated using electronic circular dichroism (ECD), NMR and limited proteolysis.

The ECD spectrum of DPCK‐WT (Figure 3a), with comparable intensity of negative maxima at 209 and 225 nm and with an intense positive maximum at 195 nm, indicates a relatively high proportion of α‐helical structure (~45%). This is confirmed by the numerical data analysis and agrees with the secondary structure assignment of the X‐ray structure (PDB code: 2IF2) (Table S3). In the case of DPCK‐LH, the first negative maximum is blue‐shifted to 207 nm and its intensity is comparable to that of the second negative maximum at 225 nm. This, together with the positive maximum at 195 nm (almost half the intensity of that in the ECD spectrum of DPCK‐WT), reveals a significant content of α‐helical structure (~40%) together with a more pronounced proportion of β‐sheet structure, confirmed also by the numerical data analysis (Table S3). The ECD spectrum of DPCK‐M has the first negative maximum also blue‐shifted, to 205 nm, but this spectral band is more intense than the second negative maximum at 222 nm, which could imply possible formation of 3₁₀‐helical structure as well as enrichment of unordered structure. The overall spectral shape and mainly the spectral intensity of a positive spectral band at 192 nm could be due to a relatively high proportion of β‐sheet structure (Table S3).

FIGURE 3.

Secondary and tertiary structure characterization of dephospho‐CoA kinase (DPCK) variants. (a) Far‐UV CD spectra of DPCK proteins. The spectra were collected in PBS buffer (11.8 mM phosphate (pH 7.6), 137 mM NaCl, 5 mM MgCl2, 2.7 mM KCl and 0.5 mM DTT). (b) Change in ellipticity at 222 nm upon 0–2 M urea titration of DPCK proteins. (c) 2D NMR of DPCK proteins. The spectra were collected in 50 mM phosphate (pH 7.6), 280 mM NaCl, 20 mM KCl, 10 mM MgCl2, and 0.5 mM TCEP

To estimate the influence of aromatic amino acid substitution on overall protein structure stability, the proteins were unfolded with urea in concentration ranging from 0 to 2 M and were further studied using CD spectroscopy. While DPCK‐WT and ‐LH ECD spectra remain relatively constant upon mild urea titration (up to 2 M), the urea titration spectra indicate loss of structural stability in the DPCK‐M variant, starting already in very low urea concentrations (Figure 3b).

The structural resemblance of DPCK‐LH to DPCK‐WT was further confirmed by 1D and 2D HN NMR spectra. The DPCK‐WT spectrum has good signal dispersion in the ‐NH‐ region (6–9 ppm) and clear signals near 1 ppm indicative of methyl groups in the hydrophobic core, all features corresponding to a well‐folded protein. While the signal of the methyl groups in the hydrophobic core is absent in the DPCK‐LH variant spectrum (as expected from the removal of aromatic residues), the signal dispersion in the ‐NH‐ region implies that the ‐LH variant is at least partially folded. In contrast, the signal in the same region is less dispersed for the ‐M variant, implying a lack of specific tertiary structure (Figures 3c and S3). Based on the analyses of N‐edited 3D NOESY spectra, α‐helical peak counts of 131, 57 and 14 were estimated for the DPCK‐WT, ‐LH and ‐M variants, respectively (Table S3).

The tertiary structure of the proteins was additionally characterized by limited proteolysis using endoproteinase Lys‐C, as its cleavage site map is conserved among all studied variants (Figure S4). DPCK‐WT is highly resistant to proteolytic digestion over the whole time scale of the limited proteolysis experiment, reflecting its globular structure. In contrast, both mutant variants are gradually digested by Lys‐C, with the amounts of intact DPCK‐LH and DPCK‐M decreasing exponentially over time. While relatively large cleavage fragments with an approximate size of 15 kDa can be observed during proteolysis of DPCK‐LH, no large cleavage fragments are detected for DPCK‐M, an indication of its loose or absent tertiary structure (Figure 4).

FIGURE 4.

Limited proteolysis of dephospho‐CoA kinase (DPCK) proteins. (a) 14% SDS‐polyacrylamide gels visualized by imidazole‐zinc staining after SDS‐PAGE with the protein samples exposed to Lys‐C endoproteinase for different times. (b) Graphs representing the amount of the proteins remaining at each time point. (c) Determination of proteolysis rate constants (Kp) assuming pseudo‐first order of proteolytic reactions
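Under the pseudo‐first‐order assumption named in the caption, the fraction of intact protein decays as exp(−Kp·t), so Kp can be estimated as the negative slope of ln(fraction remaining) versus time. The sketch below shows one way to do this with an ordinary least‐squares line. The function name is illustrative, the data are synthetic (generated from an assumed rate constant, not taken from Figure 4), and the authors' actual fitting procedure is not described in this section.

```python
import math
from typing import Sequence


def proteolysis_rate_constant(times: Sequence[float],
                              fraction_intact: Sequence[float]) -> float:
    """Estimate a pseudo-first-order rate constant.

    Fits ln(fraction intact) = a - k * t by least squares and returns k
    (in reciprocal units of `times`).
    """
    ys = [math.log(f) for f in fraction_intact]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx


# Synthetic, noise-free example: assumed rate constant of 0.05 per minute.
assumed_k = 0.05
times_min = [0, 5, 10, 20, 40, 60]
fractions = [math.exp(-assumed_k * t) for t in times_min]

k_fit = proteolysis_rate_constant(times_min, fractions)
half_life_min = math.log(2) / k_fit
print(f"k = {k_fit:.3f} per min, half-life = {half_life_min:.1f} min")
```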

In summary, the DPCK‐LH variant (which has all the aromatic amino acids substituted by leucine) shows relatively high conservation of secondary structure but a loose tertiary structure (probably of molten globular nature) when compared with DPCK‐WT. On the other hand, both the secondary and tertiary structures of the DPCK‐M variant, in which all histidines were substituted in addition to the aromatics, are severely impaired.

2.5. Structural characterization of ATP binding

For an efficient phosphorylation reaction, the γ‐phosphate of ATP must be protected from a nucleophilic attack by water molecules. DPCK active site must therefore be shielded from water once the ATP molecule is bound. For several kinases this shielding is accomplished by an induced‐fit conformational change upon ATP binding. Such a conformational change has also been observed for DPCK. 29

To study the structural changes of DPCK variants upon ATP binding, 2D HN NMR spectra were collected in response to ATP titration (see Figure S5). While the NMR spectra of the DPCK‐M variant are of generally low quality, which is probably caused by complex dynamics on the millisecond time scale making the protein signals invisible for NMR spectroscopy, the spectra of both DPCK‐WT and ‐LH variants show expected perturbations upon ATP titration. For DPCK‐WT we observe typical examples of slow exchange behavior, where only free and bound forms are observed with peak intensity proportional to the population. Interestingly, for the DPCK‐LH variant, we typically observed examples of fast exchange with only a single peak visible at a given protein:ATP ratio, although examples of slow exchange are observed as well (Figure 5a). This suggests that compared to the DPCK‐WT:ATP interaction, an additional process occurs during DPCK‐LH titration with ATP.

FIGURE 5.

Structural characterization of dephospho‐CoA kinase (DPCK)‐WT and ‐LH upon substrate binding. (a) An exemplary close‐up of DPCK‐LH 2D NH NMR spectra induced by ATP binding (red—free protein [100 μM], blue—300 μM ATP, green—1000 μM ATP); labeled peaks: (P1) N‐H signal not influenced by protein‐ATP interaction. (P2, P3) N‐H signal undergoing medium‐slow to slow exchange on NMR chemical shift time scale (μs‐ms). (P4) N‐H signal documenting a slow exchange process. (b) Mean hydrodynamic radius of DPCK‐WT and ‐LH variants with and without 200 μM ATP measured by dynamic light scattering. (c) The steady‐state fluorescence spectra of ANS binding at excitation wavelength 380 nm. The spectra were measured at different concentrations of ATP (with and without 200 μM dCoA), and each spectrum is the average of three individual scans. The fluorescence was recorded between 410 and 650 nm after exciting the protein solution at 380 nm

To further investigate this intriguing observation, DPCK‐WT and ‐LH (i.e., those variants that are capable of phosphotransferase activity that requires a hydrophobic core) structural response to substrate binding was tested using dynamic light scattering and 8‐anilinonaphthalene‐1‐sulfonic acid (ANS) titration. The steady‐state fluorescence measurements lend support to the molten globule nature of DPCK‐LH variant since it shows higher fluorescence intensity values in comparison with DPCK‐WT, resulting from the high affinity of ANS to the exposed hydrophobic core of molten globular intermediates. 30 While the fluorescence intensity decreases for both variants upon substrate binding, this change is significantly more dramatic for DPCK‐LH (Figure 5c). ATP (out of the two substrates) has a remarkable effect on additional folding of DPCK‐LH protein, explaining its ability to perform the phosphotransferase activity despite its molten globular nature in the free state. Both 2D HN NMR and ANS titration observations were further supported by DLS measurements where the mean hydrodynamic radius of DPCK‐LH was recorded to be reduced by ~20% and reached that of DPCK‐WT value upon ATP addition (Figure 5b).

3. DISCUSSION

Aromatic residues are essential for formation of a stable hydrophobic core of extant proteins. 20 At the same time, tight protein folding is frequently required for enzyme catalysis even though most enzymes undergo dynamic structural changes during the reaction. With aromatics being apparently the latest addition to the amino acid alphabet, how specific protein catalysis could be achieved in their absence remains unclear. The work reported here sheds some light on this problem.

To examine the contribution of the aromatic amino acids to enzyme catalysis, we performed a detailed analysis of two aromatics‐less mutants of the Aquifex aeolicus DPCK where (a) all Phe, Tyr and Trp residues were substituted by Leu residues (DPCK‐LH), and (b) all Phe, Tyr, Trp and His were substituted by non‐aromatic amino acids based on predicted preservation of thermodynamic stability (DPCK‐M).

DPCK catalyzes the transfer of phosphate group from ATP to dCoA, where dCoA acts as the leading substrate. 31 It belongs to the ancient family of P‐loop NTPases with the preserved three‐layer αβα sandwich architecture. 24 The P‐loop motif has been detected among the primordial peptide fragments and is known to underlie hundreds of essential enzyme families. 32 , 33 Besides mononucleotide binding, polypeptides constructed around this scaffold have been shown to bind polynucleotides/RNA/ssDNA and even unwind dsDNA, pointing to the functional plasticity of the P‐loop motif. 34 , 35 The specific function of an NTPase relies on the topology and overall structural context, including many additional active site residues. DPCK is a well folded protein with high α‐helical content and its domain movements upon ATP binding play a crucial role during catalysis. 29 None of the aromatic amino acid residues has been reported essential for the ligand binding and catalysis in DPCK. 31

Both aromatics‐less mutants and wild type protein were characterized in terms of their structure and activity. Interestingly, the DPCK‐M variant (selected for best predicted preservation of thermodynamic stability) had a more impaired structural integrity than DPCK‐LH. This may be either the consequence of the specific substitutions or the indispensability of the DPCK's His residues. From an evolutionary perspective, His was among the last amino acids incorporated into genetic coding. 9 On the other hand, according to the order–disorder propensity scale, His is among the most disorder‐promoting amino acids, likely due to its significant positive charge and the two hydrogen‐bonding nitrogen atoms that could promote structural instability by hydrogen bond switching. 19 Despite its high disorder‐promoting tendency, His often plays an important role in inducing protein structure formation in the presence of divalent cations, especially zinc, due to its metal ion coordination. Given the speculative role of His for hydrogen bond switching, it would be interesting to determine whether His plays a role in facilitating the domain movements needed for catalysis.

CD and NMR measurements of the DPCK‐LH variant showed a similar content of secondary structure to the wild type protein but limited proteolysis and 2D NMR all imply its molten globule tertiary conformation. DPCK‐M variant has no measurable phosphotransferase activity while both of the mutant variants are able to hydrolyze ATP even in the absence of dCoA. This is likely due to the loss of structural orchestration of the catalytic events and demonstrates that some activities can be performed even in the absence of a firm hydrophobic core. However, this is probably untrue for the phosphotransferase activity where the gamma‐phosphate has to be protected from a nucleophilic attack by water molecules in order to be efficiently transferred to the desired substrate. Interestingly, DPCK‐LH variant is still able to perform this activity although with significantly lower efficiency (~100×) in comparison with the wild type protein. Both DPCK‐WT and ‐LH variants share slow‐exchange behavior in the NMR spectra upon ATP titration, suggesting the ATP‐induced change in their structural conformation. However, the DPCK‐LH variant undergoes significant additional folding upon ATP binding, explaining its ability to perform the phosphotransferase reaction. This ligand‐induced folding scenario is in agreement with previously reported behavior of engineered molten globular enzymes. 36 , 37 , 38 , 39

The study of the engineered molten globular enzyme 37 includes the hypothesis that modern enzymes evolved from molten globular precursors. If the earliest cells indeed existed without aromatics, then the data in this paper add weight to this evolutionary scenario of molten globular polypeptide ➔ molten globular enzyme ➔ modern enzyme, where the last step is enabled by the expansion of the genetic code to include the aromatic residues. While small aromatics‐less peptides have been reported previously to have catalytic properties, 40 , 41 aromatic amino acids have been considered essential for formation of tightly structured proteins to support high‐performance catalysis. Association of protein fold stabilization with genetic code evolution has been addressed by several recent studies. 16 , 18 , 42 , 43 Most significantly for protein folding, basic and aromatic amino acids (at least the canonical ones) were most probably absent in the prebiotic set. 8 , 9 Early protein foldability thus would not be supported by the salt bridges and aromatic core packing interactions that make up extant protein cores. Earlier studies suggested that this hindrance could be compensated by a halophilic environment, because high salt stabilizes protein structure, and thus supported halophilic origin‐of‐life scenarios. 44 , 45 Using a small designed β‐trefoil protein highly enriched in the prebiotic amino acids (and completely devoid of aromatics), Longo et al. demonstrated that incorporation of a single aromatic amino acid can convert a foldable halophilic protein to a stable mesophile. 42 However, two other studies referenced here concluded that robust protein folds can be built with a prebiotically plausible subset of the current 20 amino acids while the other amino acids (i.e., evolutionarily late) contribute mainly to efficient catalysis.
16 , 18 These conclusions were drawn from stability and catalytic characterization of multiple variants of nucleoside diphosphate kinase reduced to 13 and 10 amino acid alphabets, respectively. While our study cannot directly refute their conclusions, it is important to note that only ~80% of the proteins' sequence was occupied by prebiotically available amino acids in the two studies by the Akanuma group. None of the successfully expressed variants was completely free of aromatic and other amino acids that are not regarded as prebiotically plausible (such as positively charged Lys/Arg). Therefore, these studies support (or at least do not rule out) the key importance of aromatics in protein fold stability and their role in the transition from molten globule to stable globular proteins.

If the transition from molten globular to stable folded enzymes was mediated by the evolutionary later amino acids, this transition was also likely accompanied by evolution of functionality and substrate specificity. The specific aim of our study was not to resurrect an early version of DPCK per se but rather to explore the specific effect of the aromatic amino acid replacements on its structure–function relationship. However, future reverse evolution studies of this enzyme class should bear in mind that the early function could be altered or less specific.

To further test the role of aromatics in protein fold evolution, work in progress is to use bioinformatics tools to carry out disorder prediction with VSL2B on DPCK and the other identified ancient enzymes with their modern sequences and with their aromatics replaced by Leu. All of the ancient enzymes so far tested are predicted to be structured with the aromatics and disordered without these residues. The next step will be to apply additional bioinformatic tools that distinguish molten globules from other types of disorder. 46

In summary, we report an enzyme without aromatic amino acids that is still capable of a specific, hydrophobic core dependent catalysis. This enzyme is rich in secondary structure but exhibits a molten globule conformation in an unliganded form. Our study provides evidence that a tightly packed protein environment can be formed upon its ligand binding. This phenomenon could be relevant in the early stages of enzyme catalysis before the fixation of the contemporary amino acid alphabet.

4. METHODS

4.1. Plasmid preparation

DPCK genes for DPCK‐WT, DPCK‐LH, ‐L, ‐MH and ‐M were amplified by PCR using Pfu‐X DNA polymerase (Jena Bioscience, Germany) according to the following program: an initial denaturation at 95°C for 2 min; followed by 32 cycles of denaturation at 95°C for 30 s, annealing at 56°C for 30 s and elongation at 68°C for 30 s; and a final extension at 68°C for 2 min. The PCR amplification for all genes was performed with the same set of primers: forward, 5′‐AAAAACATATGAAACGTATCGGTCTGACC‐3′, and reverse, 5′‐AAAAACTCGAGTTCCAGCGGGTCACGG‐3′. The PCR fragments were digested with XhoI (New England BioLabs) and NdeI (New England BioLabs), purified with Monarch PCR & DNA Cleanup Kit (New England BioLabs) and cloned into the pET‐24a(+) C‐terminal polyhistidine‐tag vector (Novagen, Germany), which had been digested with XhoI and NdeI and dephosphorylated with Antarctic Phosphatase (New England BioLabs) prior to ligation.
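As a quick check of the cloning design, the two primers can be scanned for the restriction sites used in the digest. The Python sketch below does this with plain string search. The recognition sequences (CATATG for NdeI, CTCGAG for XhoI) are the standard ones for these enzymes but are not stated in the text, and the function name is illustrative.

```python
from typing import Dict

# Primer sequences as given in the text (5' to 3').
FORWARD = "AAAAACATATGAAACGTATCGGTCTGACC"
REVERSE = "AAAAACTCGAGTTCCAGCGGGTCACGG"

# Standard recognition sequences (assumption: not quoted in the text).
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}


def find_sites(primer: str) -> Dict[str, int]:
    """Map each enzyme whose site occurs in the primer to its 0-based start index."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


forward_sites = find_sites(FORWARD)
reverse_sites = find_sites(REVERSE)
print("forward:", forward_sites)
print("reverse:", reverse_sites)
```

In both primers the site follows a five‐adenine 5′ overhang, and the ATG inside the NdeI site of the forward primer is positioned to serve as the start codon.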

The plasmids were introduced into One Shot TOP10 Chemically Competent E. coli cells (Thermo Fisher Scientific) by heat shock protocol at 42°C for 60 s, and the cells were grown overnight at 37°C on LB agar plates containing 50 μg/ml of kanamycin (Sigma Aldrich). A single colony was selected, cells were grown overnight at 37°C in 5 ml of LB Broth (Sigma Aldrich) supplemented with 50 μg/ml of kanamycin (Sigma Aldrich) and plasmid DNA was isolated and analyzed by Sanger sequencing.

4.2. Protein expression and purification

Isolated plasmids were introduced into BL21 (DE3) Chemically Competent E. coli cells (Thermo Fisher Scientific), and the cells were grown overnight at 37°C in 5 ml of LB Broth (Sigma Aldrich) in the presence of 50 μg/ml of kanamycin. The overnight cultures were used to inoculate 500 ml of fresh LB medium, and the culture was propagated at 37°C at 220 rpm shaking. When OD600 reached 0.7–0.8, isopropyl β‐d‐thiogalactopyranoside (IPTG, Sigma Aldrich) was added to a final concentration of 0.5 mM and the cultivation was continued for 4 hr at 37°C. The cells were harvested by centrifugation at ×3000g for 20 min at 4°C. The cell pellets were resuspended in 15 ml of lysis buffer (20 mM Tris (pH 8.0), 20 mM NaCl, and 1 mM β‐mercaptoethanol) with one tablet of EASYpack protease inhibitor cocktail (Sigma Aldrich), incubated with 50 μg/ml of Lysozyme (Sigma Aldrich) and 6 U of RNase‐free DNase I (Jena Bioscience, Germany) at room temperature for 30 min, sonicated on ice at 1.5 W (18 cycles, 10 s on, 20 s off) and centrifuged at ×35000g for 30 min at 4°C. Afterwards, Tween‐20 (Sigma Aldrich) was added to the supernatants to a final concentration of 0.1% (vol/vol), and the crude lysates were applied to a 5 ml HiTrap Capto Q column (GE Healthcare Life Sciences) equilibrated with 5 volumes of buffer A (20 mM Tris (pH 8.0), 20 mM NaCl, 1 mM beta‐mercaptoethanol and 0.1% (vol/vol) Tween‐20). Then, the DPCK proteins were eluted with a 0–50% gradient of buffer B (20 mM Tris (pH 8.0), 1 M NaCl, 1 mM beta‐mercaptoethanol and 0.1% (vol/vol) Tween‐20), and fractions from 15 to 35% of buffer B were collected and applied to a 5 ml HisTrap HP column (GE Healthcare Life Sciences) equilibrated with 5 volumes of buffer C (20 mM Tris (pH 7.6), 500 mM NaCl, 10 mM imidazole, 1 mM beta‐mercaptoethanol and 0.1% (vol/vol) Tween‐20).
The column was washed with 3% of buffer D (20 mM Tris (pH 7.6), 500 mM NaCl, 500 mM imidazole, 1 mM beta‐mercaptoethanol and 0.1% (vol/vol) Tween‐20) to remove unbound proteins, and the DPCK proteins were eluted with 0–50% gradient of buffer D. Fractions from 20 to 30% of buffer D were collected, concentrated up to 0.5 ml by centrifugation using 4 ml Amicon Ultra centrifugal unit (MWCO 10000, Millipore) and applied to Superdex 75 10/300 GL column (GE Healthcare Life Sciences) equilibrated with 2 column volumes of buffer E (50 mM Tris (pH 7.6), 500 mM NaCl, 20 mM KCl, 10 mM MgCl2 and 0.5 mM DTT). The DPCK variants were eluted as single peaks with approximate sizes of 29 kDa (DPCK‐WT), 33 kDa (DPCK‐LH) and 55 kDa (DPCK‐M). Molecular weights were estimated using Gel filtration low molecular weight calibration kit (GE Healthcare Life Sciences). After the confirmation of proteins integrity and purity by SDS‐PAGE analysis on 14% SDS‐polyacrylamide gel, the purified proteins were concentrated up to 10 mg/ml concentration and aliquoted. The aliquots were flash frozen in liquid nitrogen and stored at −80°C.
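
The elution windows given as percentages of buffer B or buffer D can be converted to approximate NaCl and imidazole concentrations. The following is a minimal sketch of that arithmetic, assuming that the chromatography system blends the two buffers linearly (buffer A with buffer B on the anion exchanger, buffer C with buffer D on the HisTrap column). The function name `blend_mM` is hypothetical; the buffer compositions are those listed in the purification protocol.

```python
def blend_mM(percent_high: float, low_mM: float, high_mM: float) -> float:
    """Concentration (mM) of a component when `percent_high` % of the
    high-concentration buffer is mixed linearly with the low-concentration one.
    Linear mixing is an assumption, not a statement about the instrument used."""
    fraction = percent_high / 100.0
    return low_mM + fraction * (high_mM - low_mM)

# Anion exchange: buffer A contains 20 mM NaCl, buffer B contains 1 M NaCl.
nacl_window = [blend_mM(p, 20.0, 1000.0) for p in (15, 35)]

# Ni-affinity: buffer C contains 10 mM imidazole, buffer D contains 500 mM.
imidazole_wash = blend_mM(3, 10.0, 500.0)
imidazole_window = [blend_mM(p, 10.0, 500.0) for p in (20, 30)]

print("NaCl window (mM):", nacl_window)
print("Imidazole wash (mM):", imidazole_wash)
print("Imidazole window (mM):", imidazole_window)
```

Under this linear-mixing assumption, the 15–35% buffer B window corresponds to roughly 167–363 mM NaCl, the 3% buffer D wash to about 25 mM imidazole, and the 20–30% buffer D window to roughly 108–157 mM imidazole.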

4.3. Basic biophysical characterization

The identities and molecular weights of the purified proteins were confirmed by mass spectrometry using an UltrafleXtreme MALDI‐TOF/TOF mass spectrometer (Bruker, Germany) according to the standard procedure. Protein concentrations were determined by amino acid analysis using a Biochrom 30+ Series Amino Acid Analyser (Biochrom, United Kingdom).

The size distribution of the protein samples was characterized by dynamic light scattering (DLS). Protein samples were diluted in PBS buffer (11.8 mM phosphate buffer (pH 7.6), 137 mM NaCl, 5 mM MgCl2, 2.7 mM KCl and 0.5 mM DTT) to a final concentration of 0.5 mg/ml and centrifuged at 25,000 × g for 30 min at 4°C. To remove dust particles, the samples were filtered through a 0.22 μm Ultrafree‐MC centrifugation filter (Millipore). The DLS measurements were performed in a quartz glass cuvette (light path 10 mm) at 18°C using a laser spectroscatter‐201 system (RiNA GmbH Berlin, Germany). A series of 35 measurements with a sampling time of 30 s and a wait time of 1 s was conducted for each sample. A diode laser with a wavelength of 685 nm and an optical power of 30 mW was used as the light source. The scattered light was collected at a fixed scattering angle of 90°, and the autocorrelation functions were analyzed with the program CONTIN to obtain hydrodynamic radius distributions. To estimate the effect of ATP binding on the hydrodynamic radius of the proteins, DLS measurements were also performed on protein samples in the presence of 200 μM ATP.
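
Hydrodynamic radii in DLS are conventionally derived from the translational diffusion coefficient through the Stokes–Einstein relation, R_h = k_B·T / (6π·η·D). The following is a minimal sketch of that conversion at the measurement temperature of 18°C. The viscosity (about 1.05 mPa·s, roughly that of pure water at 18°C) and the example diffusion coefficient are assumptions for illustration and are not values from this study; the function name is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value

def hydrodynamic_radius_nm(diff_coeff_m2_s: float,
                           temp_c: float = 18.0,
                           viscosity_pa_s: float = 1.05e-3) -> float:
    """Stokes-Einstein radius (nm) of a sphere with the given diffusion
    coefficient. The default viscosity approximates pure water at 18 degC
    (assumed; a real buffer will differ slightly)."""
    temp_k = temp_c + 273.15
    radius_m = BOLTZMANN_J_PER_K * temp_k / (
        6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s
    )
    return radius_m * 1e9

# Hypothetical diffusion coefficient, chosen only to illustrate the call.
example_radius_nm = hydrodynamic_radius_nm(8.0e-11)
print(f"R_h = {example_radius_nm:.2f} nm")
```

Because the radius is inversely proportional to the diffusion coefficient, a compaction of the protein would appear as faster diffusion and a smaller R_h.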

4.4. Enzyme assays

DPCK activities of the recombinant proteins were measured by a coupled assay using the ADP Quest Assay kit (Eurofins DiscoverX) according to the manufacturer's instructions. Enzyme assays were carried out using 80 ng (32 nM) of DPCK‐WT, 500 ng (214 nM) of DPCK‐LH and 900 ng (386 nM) of DPCK‐M with two substrate titrations: 0–200 μM dephospho‐CoA (dCoA) at 200 μM ATP, and 0–200 μM ATP without and with 200 μM dCoA, to estimate the ATPase and phosphotransferase activities of the enzymes. All reactions were performed in assay buffer containing 15 mM Hepes (pH 7.4), 20 mM NaCl, 1 mM EGTA, 0.02% Tween‐20, 10 mM MgCl2, and 0.1% bovine gamma globulin in 96‐well black microplates in a total volume of 40 μl. After 20 μl of reagent A and 40 μl of reagent B were added, the plates were heated at 37°C for 10 min, and the reactions were started by adding ATP. The fluorescence intensity was measured at 37°C in kinetic mode at 2 min intervals using a CLARIOstar microplate reader (BMG LABTECH, Germany) at excitation/emission wavelengths of 530/590 nm. The kinetic parameters were calculated by non‐linear regression, using a single saturating concentration of the non‐varied substrate. Substrate conversion did not exceed 10%. The experiments were repeated three times, and kinetic values are presented as the means ± SE.
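
The regression software and kinetic model are not named here. As an illustration only, the sketch below assumes the Michaelis–Menten model, v = Vmax·S / (Km + S), and fits it by least squares using only the Python standard library. For any fixed Km the best Vmax has a closed form, so only Km is searched (golden-section search on a logarithmic scale). The function name, the search bounds and the synthetic data are hypothetical and are not taken from the study.

```python
import math

def fit_michaelis_menten(substrate, rate, km_min=1e-3, km_max=1e4, iterations=200):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Vmax, Km).
    Assumes the residual sum of squares has a single minimum in Km
    between km_min and km_max."""
    def best_vmax(km):
        # For fixed Km the model is linear in Vmax, so Vmax has a closed form.
        f = [s / (km + s) for s in substrate]
        return sum(v * fi for v, fi in zip(rate, f)) / sum(fi * fi for fi in f)

    def sse(log_km):
        km = math.exp(log_km)
        vmax = best_vmax(km)
        return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(substrate, rate))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log(km_min), math.log(km_max)
    for _ in range(iterations):
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if sse(left) < sse(right):
            hi = right
        else:
            lo = left
    km = math.exp((lo + hi) / 2.0)
    return best_vmax(km), km

# Synthetic, noise-free rates generated with Vmax = 10 and Km = 40 (arbitrary
# units) over a 0-200 uM substrate range; not experimental data.
substrate_uM = [5, 10, 25, 50, 100, 200]
rates = [10.0 * s / (40.0 + s) for s in substrate_uM]
vmax_fit, km_fit = fit_michaelis_menten(substrate_uM, rates)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f} uM")
```

With real plate-reader data, the initial rates would first be converted from fluorescence units to product concentration, and kcat would follow from Vmax divided by the enzyme concentration used in the assay.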

HPLC‐MS analysis was used for comparative detection of the reaction analytes. For this purpose, 100 μl reaction mixtures were prepared by mixing 1 μg (0.42 μM) of protein, 100 μM dCoA and 100 μM ATP in 25 mM NH4HCO3 (pH 7.6), 300 mM NaCl, 20 mM KCl and 10 mM MgCl2. The reaction mixtures were incubated at 37°C for 1 hr, and the reactions were then stopped by adding 100 μl of acetonitrile (Sigma Aldrich). Precipitated recombinant protein was removed by centrifugation at 20,000 × g at 4°C for 20 min.

The reaction samples were analyzed using a Dionex Ultimate 3000RS HPLC system equipped with a TSQ Quantiva MS detector (Thermo Fisher Scientific). An ESI source was used for ionization in positive mode. The HPLC solvent system consisted of 10 mM (NH4)2CO3 (pH 9.3) (A) and 97% acetonitrile (B). Separation was performed on a SeQuant® ZIC®‐pHILIC column (5 μm, 150 mm × 2.1 mm, Merck) at a flow rate of 0.13 ml/min. One microliter of sample was injected in 50% B, and the analysis used a gradient of 15% A and 85% B for 3.5 min, followed by an increase to 75% A and 25% B over 11.5 min, which was then held for 10 min.
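
The gradient program can be written out as a function of run time. The sketch below assumes that the change from 85% B to 25% B is a linear ramp and that the final composition is held for the last 10 min, which gives a 25 min run; the function name is hypothetical.

```python
FLOW_ML_PER_MIN = 0.13   # flow rate stated for the method
RUN_TIME_MIN = 3.5 + 11.5 + 10.0

def percent_b(t_min: float) -> float:
    """Percentage of solvent B at time t_min (linear ramp assumed)."""
    if t_min < 0.0 or t_min > RUN_TIME_MIN:
        raise ValueError("time outside the gradient program")
    if t_min <= 3.5:
        return 85.0                      # isocratic start: 15% A / 85% B
    if t_min <= 15.0:
        # ramp from 85% B to 25% B over 11.5 min
        return 85.0 + (25.0 - 85.0) * (t_min - 3.5) / 11.5
    return 25.0                          # hold at 75% A / 25% B

total_solvent_ml = FLOW_ML_PER_MIN * RUN_TIME_MIN
print(percent_b(0.0), percent_b(9.25), percent_b(20.0), total_solvent_ml)
```

At 0.13 ml/min, one run under these assumptions consumes about 3.25 ml of mobile phase.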

4.5. Circular dichroism spectroscopy

Electronic circular dichroism (ECD) spectra were collected using a Jasco 1500 spectrometer (JASCO, Japan) in the 195–280 nm spectral range using a 0.01 cm cylindrical quartz cell. The experimental setup was as follows: 0.05 nm step resolution, 5 nm/min scanning speed, 16 s response time, 1 nm spectral bandwidth and 2 accumulations. After baseline correction, the spectra were expressed as molar ellipticity per residue θ (deg·cm²·dmol⁻¹). The protein samples were diluted in PBS buffer (11.8 mM phosphate (pH 7.6), 137 mM NaCl, 5 mM MgCl2, 2.7 mM KCl and 0.5 mM DTT) with the addition of 0–2 M urea (specifically 5, 10, 50, 100, 500, 1000, and 2000 mM urea concentrations). The blank spectrum of the corresponding aqueous buffer (with or without urea at the matching concentration) was used to correct the observed spectrum of the sample. The numerical analysis of secondary structures was performed using the CDPro software package. 47
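
Conversion of a baseline-corrected signal to molar ellipticity per residue usually follows the standard relation [θ] = θ_obs(mdeg) × MRW / (10 × l(cm) × c(mg/ml)), where MRW is the molecular weight divided by the number of peptide bonds. The sketch below illustrates this relation with the 0.01 cm path length stated above. The observed ellipticity, molecular weight, residue count and protein concentration are placeholders, not values from this study, and the function name is hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg: float, mol_weight_da: float,
                             n_residues: int, path_cm: float,
                             conc_mg_ml: float) -> float:
    """Molar ellipticity per residue in deg*cm^2*dmol^-1."""
    mean_residue_weight = mol_weight_da / (n_residues - 1)  # per peptide bond
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_ml)

# Placeholder inputs for illustration only (not measured values).
example_mre = mean_residue_ellipticity(
    theta_mdeg=-20.0,       # hypothetical observed signal
    mol_weight_da=23000.0,  # hypothetical molecular weight
    n_residues=201,         # hypothetical chain length
    path_cm=0.01,           # path length of the quartz cell used here
    conc_mg_ml=1.0,         # hypothetical protein concentration
)
print(f"[theta] = {example_mre:.0f} deg cm^2 dmol^-1")
```

The short path length enters the denominator, so for a given concentration the observed signal in a 0.01 cm cell is 100 times smaller than in a 1 cm cell.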

4.6. Limited proteolysis

Kinetic studies of specific proteolytic cleavage by Lys‐C endoproteinase were performed as follows. First, recombinant proteins were diluted in Lys‐C cleavage buffer (25 mM Tris (pH 8.0), 300 mM NaCl, 1 mM EDTA, and 0.5 mM TCEP) to a final concentration of 1 mg/ml, and reaction mixtures for proteolytic digestion were then prepared by mixing 7 μl of 1 mg/ml recombinant protein and 56 μl of Lys‐C cleavage buffer. After incubation at 37°C for 10 min, proteolytic cleavage was initiated by adding 7 μl of 5 ng/μl Lys‐C endoproteinase. After 0, 2, 5, 10, 20 and 40 min of incubation at 37°C, 10 μl of the reaction mixture was removed, and Lys‐C was inactivated by adding 2 μl of 6× SDS‐PAGE sample buffer (375 mM Tris–HCl (pH 6.8), 9% SDS, 50% glycerol, 9% β‐mercaptoethanol and 0.03% bromophenol blue) followed by heating at 95°C for 10 min. All samples were then subjected to SDS‐PAGE.
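
The composition of the digestion mixture follows directly from the stated volumes and stock concentrations. The short calculation below is only a worked check of that arithmetic; the variable names are illustrative.

```python
# Volumes (ul) and stock concentrations taken from the protocol above.
protein_ul, protein_stock_ng_per_ul = 7.0, 1000.0   # 1 mg/ml == 1000 ng/ul
buffer_ul = 56.0
lysc_ul, lysc_stock_ng_per_ul = 7.0, 5.0

total_ul = protein_ul + buffer_ul + lysc_ul
protein_ng = protein_ul * protein_stock_ng_per_ul
lysc_ng = lysc_ul * lysc_stock_ng_per_ul

final_protein_mg_per_ml = protein_ng / total_ul / 1000.0
final_lysc_ng_per_ul = lysc_ng / total_ul
substrate_to_enzyme_w_w = protein_ng / lysc_ng

# Six time points of 10 ul each must fit within the total reaction volume.
sampled_ul = 6 * 10.0

print(total_ul, final_protein_mg_per_ml, final_lysc_ng_per_ul,
      substrate_to_enzyme_w_w, sampled_ul <= total_ul)
```

The mixture therefore has a total volume of 70 μl, containing 0.1 mg/ml substrate protein and 0.5 ng/μl Lys‐C, an enzyme‐to‐substrate ratio of 1:200 (w/w), and the six 10 μl aliquots account for 60 μl of it.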

For quantitative evaluation of limited proteolysis, the rate constants of proteolysis were determined by monitoring the disappearance of the intact protein in the proteolysis reaction by SDS‐PAGE. The areas of the bands corresponding to the intact proteins were estimated from the gels using the ImageJ program and expressed as the amount of protein remaining at each time point. Assuming pseudo‐first‐order kinetics, the natural logarithms of the intact protein amounts were plotted against time, and the plots were fitted with a first‐order rate equation.
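
The fitting step described above can be sketched as an ordinary least-squares line through ln(intact protein) versus time, whose negative slope is the pseudo-first-order rate constant. The code below is a minimal illustration using the sampling times of the assay; the band intensities are synthetic values generated from an arbitrary rate constant, not densitometry data from this study, and the function name is hypothetical.

```python
import math

def first_order_rate_constant(times_min, intact_amount):
    """Rate constant k (1/min) from a least-squares line through
    ln(intact_amount) versus time; k is the negative of the slope."""
    log_amount = [math.log(a) for a in intact_amount]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(log_amount) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, log_amount))
    return -sxy / sxx

# Sampling times from the protocol; intensities are synthetic (k = 0.05 1/min).
times = [0, 2, 5, 10, 20, 40]
synthetic_bands = [math.exp(-0.05 * t) for t in times]

k_fit = first_order_rate_constant(times, synthetic_bands)
half_life_min = math.log(2.0) / k_fit
print(f"k = {k_fit:.4f} 1/min, half-life = {half_life_min:.1f} min")
```

Band areas are typically normalized to the zero time point before taking logarithms; the normalization changes only the intercept of the line, not the fitted rate constant.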

4.7. Steady‐state ANS fluorescence

Steady‐state fluorescence measurements were performed using a CLARIOstar microplate reader (BMG LABTECH, Germany). Protein samples were diluted to 2 μM with ANS buffer (100 mM Tris (pH 7.6), 300 mM NaCl, 20 mM KCl, 10 mM MgCl2 and 0.5 mM DTT) and incubated with 0, 1, 10, 100 and 1000 μM of adenosine 5′‐[γ‐thio]triphosphate tetralithium salt (ATP‐γ‐S, Sigma Aldrich) at room temperature for 30 min. After incubation, 8‐anilino‐1‐naphthalenesulfonic acid ammonium salt (ANS, Sigma Aldrich) was added to the reaction mixtures to a final concentration of 400 μM, and the reaction mixtures were incubated for an additional 5 min. The final volume of each reaction mixture was 50 μl. The ANS fluorescence was excited at 380 nm, and emission spectra were recorded between 410 and 650 nm. To estimate the conformational changes induced upon dCoA binding, the fluorescence intensity measurements were also performed for protein samples in the presence of 200 μM dCoA. All measurements were performed in triplicate and then averaged to yield steady‐state fluorescence spectra of ANS binding.

4.8. NMR spectroscopy

NMR spectra were obtained using a Bruker Avance HD III 850 MHz instrument equipped with a triple‐resonance cryoprobe. Samples (0.16 ml) were measured in 3 mm NMR tubes in 50 mM phosphate (pH 7.6), 280 mM NaCl, 10 mM MgCl2, 20 mM KCl and 0.5 mM TCEP. The protein concentration was 150 μM for 3D ¹⁵N/¹H NOESY‐HSQC spectra, 30 μM for the ATP titration of DPCK‐WT, and 100 μM for the ATP titrations of both mutants lacking aromatic amino acids. All proteins used in the study were ¹⁵N‐labeled. ATP titrations were followed using a series of standard 1D and 2D HN correlation spectra.

AUTHOR CONTRIBUTIONS

Mikhail Makarov: Formal analysis; investigation; methodology; writing‐original draft; writing‐review & editing. Jingwei Meng: Investigation; methodology. Vyacheslav Tretyachenko: Formal analysis; methodology; supervision. Pavel Srb: Formal analysis; investigation; methodology; validation; visualization. Anna Březinová: Formal analysis; methodology; validation; visualization. Valerio Guido Giacobelli: Formal analysis; methodology; supervision; validation. Lucie Bednárová: Formal analysis; methodology; validation; visualization; writing‐original draft. Jiří Vondrášek: Conceptualization; methodology; software. A. Keith Dunker: Conceptualization; formal analysis; methodology; project administration; resources; supervision; writing‐review & editing. Klára Hlouchová: Conceptualization; formal analysis; funding acquisition; methodology; project administration; supervision; writing‐original draft; writing‐review & editing.

Supporting information

Appendix S1: Supporting Information

ACKNOWLEDGMENTS

We would like to thank Dr Radko Souček for help with the amino acid analysis and Dr Tereza Ormsby, Dr Rozálie Hexnerová and Dr Václav Veverka for helpful suggestions. This work was supported by the Czech Science Foundation (GAČR) grant number 17‐10438Y, project SVV260572/2020 (Vyacheslav Tretyachenko and Mikhail Makarov) and by the project BIOCEV (CZ.1.05/1.1.00/02.0109), from the European Regional Development Fund. We would also like to acknowledge the facility and support of the CMS‐Biocev (“Biophysical techniques”) supported by MEYS ČR (LM2018127).

Makarov M, Meng J, Tretyachenko V, et al. Enzyme catalysis prior to aromatic residues: Reverse engineering of a dephospho‐CoA kinase. Protein Science. 2021;30:1022–1034. 10.1002/pro.4068

Mikhail Makarov and Jingwei Meng share authorship of this study.

Funding information: European Regional Development Fund, Grant/Award Number: CZ.1.05/1.1.00/02.0109; Grantová Agentura České Republiky, Grant/Award Number: 17‐10438Y; Ministerstvo Školství, Mládeže a Tělovýchovy, Grant/Award Number: LM2018127; Univerzita Karlova v Praze, Grant/Award Number: SVV260572/2020

Contributor Information

A. Keith Dunker, Email: kedunker@iu.edu.

Klára Hlouchová, Email: klara.hlouchova@natur.cuni.cz.

REFERENCES

1. Cleaves HJ II. The origin of the biologically coded amino acids. J Theor Biol. 2010;263:490–498.
2. Philip GK, Freeland SJ. Did evolution select a nonrandom "alphabet" of amino acids? Astrobiology. 2011;11:235–240.
3. Ilardo M, Meringer M, Freeland SJ, Rasulev B, Cleaves HJ II. Extraordinarily adaptive properties of the genetically encoded amino acids. Sci Rep. 2015;5:9414.
4. Ilardo M, Bose R, Meringer M, et al. Adaptive properties of the genetically encoded amino acid alphabet are inherited from its subsets. Sci Rep. 2019;9:12468.
5. Tretyachenko V, Vymětal J, Bednárová L, et al. Random protein sequences can form defined secondary structures and are well‐tolerated in vivo. Sci Rep. 2017;7:15449.
6. Tanaka J, Doi N, Takashima H, Yanagawa H. Comparative characterization of random‐sequence proteins consisting of 5, 12, and 20 kinds of amino acids. Protein Sci. 2010;19:786–795.
7. Newton MS, Morrone DJ, Lee KH, Seelig B. Genetic code evolution investigated through the synthesis and characterisation of proteins from reduced‐alphabet libraries. Chembiochem. 2019;20:846–856.
8. Higgs PG, Pudritz RE. A thermodynamic basis for prebiotic amino acid synthesis and the nature of the first genetic code. Astrobiology. 2009;9:483–490.
9. Trifonov EN. Consensus temporal order of amino acids and evolution of the triplet code. Gene. 2000;261:139–151.
10. Granold M, Hajieva P, Toşa MI, Irimie FD, Moosmann B. Modern diversification of the amino acid repertoire driven by oxygen. Proc Natl Acad Sci USA. 2018;115:41–46.
11. Fournier GP, Alm EJ. Ancestral reconstruction of a pre‐LUCA aminoacyl‐tRNA synthetase ancestor supports the late addition of Trp to the genetic code. J Mol Evol. 2015;80:171–185.
12. Yang XL, Otero FJ, Skene RJ, McRee DE, Schimmel P, Ribas de Pouplana L. Crystal structures that suggest late development of genetic code components for differentiating aromatic side chains. Proc Natl Acad Sci USA. 2003;100:15376–15380.
13. Riddle DS, Santiago JV, Bray‐Hall ST, et al. Functional rapidly folding proteins from simplified amino acid sequences. Nat Struct Biol. 1997;4:805–809.
14. Akanuma S, Kigawa T, Yokoyama S. Combinatorial mutagenesis to restrict amino acid usage in an enzyme to a reduced set. Proc Natl Acad Sci USA. 2002;99:13549–13553.
15. Longo LM, Lee J, Blaber M. Simplified protein design biased for prebiotic amino acids yields a foldable, halophilic protein. Proc Natl Acad Sci USA. 2013;110:2135–2139.
16. Shibue R, Sasamoto T, Shimada M, Zhang B, Yamagishi A, Akanuma S. Comprehensive reduction of amino acid set in a protein suggests the importance of prebiotic amino acids for stable proteins. Sci Rep. 2018;8:1227.
17. Solis AD. Reduced alphabet of prebiotic amino acids optimally encodes the conformational space of diverse extant protein folds. BMC Evol Biol. 2019;19:158.
18. Kimura M, Akanuma S. Reconstruction and characterization of thermally stable and catalytically active proteins comprising an alphabet of ~13 amino acids. J Mol Evol. 2020;88:372–381.
19. Campen A, Williams RM, Brown CJ, Meng J, Uversky VN, Dunker AK. TOP‐IDP‐scale: A new amino acid scale measuring propensity for intrinsic disorder. Protein Pept Lett. 2008;15:956–963.
20. Burley SK, Petsko GA. Aromatic‐aromatic interaction: A mechanism of protein structure stabilization. Science. 1985;229:23–28.
21. Di Mauro E, Dunker AK, Trifonov EN. Disorder to order, nonlife to life: In the beginning there was a mistake. In: Seckbach J, editor. Genesis - In the beginning. New York: Springer, 2012; p. 415–435.
22. Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK. Sequence complexity of disordered protein. Proteins. 2001;42:38–48.
23. Oldfield CJ, Dunker AK. Intrinsically disordered proteins and intrinsically disordered protein regions. Annu Rev Biochem. 2014;83:553–584.
24. Bukhari SA, Caetano‐Anollés G. Origin and evolution of protein fold designs inferred from phylogenomic analysis of CATH domain structures in proteomes. PLoS Comput Biol. 2013;9:e1003009.
25. Peng K, Radivojac P, Vucetic S, Dunker AK, Obradovic Z. Length‐dependent prediction of protein intrinsic disorder. BMC Bioinform. 2006;7:208.
26. Brooks DJ, Fresco JR. Increased frequency of cysteine, tyrosine, and phenylalanine residues since the last universal ancestor. Mol Cell Proteomics. 2002;1:125–131.
27. Sumbalova L, Stourac J, Martinek T, Bednar D, Damborsky J. HotSpot Wizard 3.0: Web server for automated design of mutations and smart libraries based on sequence input information. Nucleic Acids Res. 2018;46:W356–W362.
28. Nurkanto A, Jeelani G, Yamamoto T, et al. Biochemical, metabolomic, and genetic analyses of dephospho coenzyme A kinase involved in coenzyme A biosynthesis in the human enteric parasite Entamoeba histolytica. Front Microbiol. 2018;9:2902.
29. Seto A, Murayama K, Toyama M, et al. ATP‐induced structural change of dephosphocoenzyme A kinase from Thermus thermophilus HB8. Proteins. 2005;58:235–242.
30. Semisotnov GV, Rodionova NA, Kutyshenko VP, Ebert B, Blanck J, Ptitsyn OB. Sequential mechanism of refolding of carbonic anhydrase B. FEBS Lett. 1987;224:9–13.
31. Walia G, Surolia A. Insights into the regulatory characteristics of the mycobacterial dephosphocoenzyme A kinase: Implications for the universal CoA biosynthesis pathway. PLoS One. 2011;6:e21390.
32. Alva V, Söding J, Lupas AN. A vocabulary of ancient peptides at the origin of folded proteins. Elife. 2015;4:e09410.
33. Longo LM, Jabłońska J, Vyas P, et al. On the emergence of P‐loop NTPase and Rossmann enzymes from a Beta‐alpha‐Beta ancestral fragment. Elife. 2020;9:e64415.
34. Romero Romero ML, Yang F, Lin YR, et al. Simple yet functional phosphate‐loop proteins. Proc Natl Acad Sci USA. 2018;115:E11943–E11950.
35. Vyas P, Trofimyuk O, Longo LM, Deshmukh FK, Sharon M, Tawfik DS. Helicase‐like functions in phosphate loop containing beta‐alpha polypeptides. BioRxiv. 10.1101/2020.07.30.228619
36. Vamvaca K, Vögeli B, Kast P, Pervushin K, Hilvert D. An enzymatic molten globule: Efficient coupling of folding and catalysis. Proc Natl Acad Sci USA. 2004;101:12860–12864.
37. Pervushin K, Vamvaca K, Vögeli B, Hilvert D. Structure and dynamics of a molten globular enzyme. Nat Struct Mol Biol. 2007;14:1202–1206.
38. Walter KU, Vamvaca K, Hilvert D. An active enzyme constructed from a 9‐amino acid alphabet. J Biol Chem. 2005;280:37742–37746.
39. Sapienza PJ, Li L, Williams T, Lee AL, Carter CW Jr. An ancestral tryptophanyl‐tRNA synthetase precursor achieves high catalytic rate enhancement without ordered ground‐state tertiary structures. ACS Chem Biol. 2016;11:1661–1668.
40. Bonfio C, Valer L, Scintilla S, et al. UV‐light‐driven prebiotic synthesis of iron‐sulfur clusters. Nat Chem. 2017;9:1229–1234.
41. Weber AL, Pizzarello S. The peptide‐catalyzed stereospecific synthesis of tetroses: A possible model for prebiotic molecular evolution. Proc Natl Acad Sci USA. 2006;103:12713–12717.
42. Longo LM, Tenorio CA, Kumru OS, Middaugh CR, Blaber M. A single aromatic core mutation converts a designed "primitive" protein from halophile to mesophile folding. Protein Sci. 2015;24:27–37.
43. Longo LM, Despotović D, Weil‐Ktorza O, et al. Primordial emergence of a nucleic acid‐binding protein via phase separation and statistical ornithine‐to‐arginine conversion. Proc Natl Acad Sci USA. 2020;117:15731–15739.
44. Longo LM, Blaber M. Protein design at the interface of the pre‐biotic and biotic worlds. Arch Biochem Biophys. 2012;526:16–21.
45. Longo LM, Blaber M. Prebiotic protein design supports a halophile origin of foldable proteins. Front Microbiol. 2014;4:418.
46. Huang F, Oldfield C, Meng J, et al. Subclassifying disordered proteins by the CH‐CDF plot method. Pac Symp Biocomput. 2012;17:128–139.
47. Sreerama N, Woody RW. Estimation of protein secondary structure from circular dichroism spectra: comparison of CONTIN, SELCON, and CDSSTR methods with an expanded reference set. Anal Biochem. 2000;287:252–260.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
