Abstract
Recent advances in machine learning techniques have led to development of a number of protein design and engineering approaches. One of them, ProteinMPNN, predicts an amino acid sequence that would fold and match user‐defined backbone structure. Its performance was previously tested for proteins composed of standard amino acids, as well as for peptide‐ and protein‐binding proteins. In this short report, we test whether ProteinMPNN can be used to reengineer a non‐proteinaceous ligand‐binding protein, flavin‐based fluorescent protein CagFbFP. We fixed the native backbone conformation and the identity of 20 amino acids interacting with the chromophore (flavin mononucleotide, FMN) while letting ProteinMPNN predict the rest of the sequence. The software package suggested replacing 36–48 out of the remaining 86 amino acids so that the resulting sequences are 55%–66% identical to the original one. The three designs that we tested experimentally displayed different expression levels, yet all were able to bind FMN and displayed fluorescence, thermal stability, and other properties similar to those of CagFbFP. Our results demonstrate that ProteinMPNN can be used to generate diverging unnatural variants of fluorescent proteins, and, more generally, to reengineer proteins without losing their ligand‐binding capabilities.
Keywords: flavin, fluorescent protein, ligand, machine learning, protein engineering
1. INTRODUCTION
Proteins are versatile molecules that are being used more and more in research and applications. In some cases, natural proteins already possess desirable properties, whereas in others the proteins need to be modified (engineered) or developed from scratch (designed de novo) (Korendovych & DeGrado, 2020; Woolfson, 2021). Both protein engineering and protein design are usually conducted by iterating in silico modeling and experimental testing. Protein engineering is usually easier and requires less experimental trial‐and‐error, whereas protein design is more complicated and requires extensive in vitro and in vivo experimentation; this effort is compensated by the possibility of developing completely new‐to‐nature folds and functionalities.
Recent algorithm and hardware advances led to development of a series of powerful techniques for protein structure prediction, design, and engineering (Baek et al., 2021; Dauparas et al., 2022; Jumper et al., 2021; Watson et al., 2023). Among those, ProteinMPNN is a neural network‐based method, which predicts a sequence of natural amino acids that has high probability of assuming user‐defined backbone conformation (Dauparas et al., 2022). It was used for improving expression of de novo designed (“hallucinated”) symmetric proteins (Wicky et al., 2022), increased the computational efficiency of designing protein binders (Bennett et al., 2023), and facilitated design of new soluble thermostable luciferases (Yeh et al., 2023). Finally, in conjunction with AlphaFold (Jumper et al., 2021) and diffusion models, ProteinMPNN was used to develop a pipeline for de novo protein design displaying a high success rate (Watson et al., 2023).
While many of previously developed approaches are able to take non‐protein atoms into consideration (usually requiring significant expertise and computational resources from the user) (Korbeld & Fürst, 2023), much faster and easier‐to‐use ProteinMPNN currently lacks such capability, thus precluding similarly undemanding engineering of ligand, substrate, or cofactor‐binding proteins. LigandMPNN is announced, but not very well described, and corresponding code is not yet available (Dauparas et al., 2023; Krapp et al., 2023; Krishna et al., 2023). Similar approach CARBonAra also remains to be thoroughly evaluated, and the code is not available. Here, we aimed to test whether ProteinMPNN can be used to reengineer the part of a ligand‐binding protein, which is not in direct contact with the ligand (Figure 1). A priori, it is not clear whether such an approach would work, because (i) protein dynamics might be important for binding, but might be disrupted by mutations in the ligand‐distant part of the protein; (ii) conformations of amino acids in direct contact with the ligand are extremely important for binding but might be affected by mutations that are introduced by ligand‐agnostic software in the next layer of amino acids. Consequently, we decided to test the approach (Figure 1, bottom row) using flavin‐binding fluorescent proteins (FbFPs) as a model.
FIGURE 1.

Protein reengineering approaches. (Top) Protein consists of proteinogenic amino acids only and acts as a monomer or binds peptides or proteins only; whole protein is reengineered using ligand‐agnostic approach. (Middle) Protein consists of proteinogenic amino acids and a non‐proteinaceous inclusion; whole protein is reengineered. (Bottom) Protein consists of proteinogenic amino acids and a non‐proteinaceous inclusion; ligand‐proximal amino acids are fixed, and ligand‐distant amino acids are redesigned.
FbFPs are ~110 amino acids long single domain soluble proteins that bind endogenous flavins and display characteristic fluorescence. FbFPs found several applications as small fluorescent reporters not requiring oxygen (Drepper et al., 2007; Ozbakir et al., 2020). LOV domains, from which FbFPs are derived, have also been developed into optogenetic tools (Losi et al., 2018) and singlet oxygen generators (Westberg et al., 2019). Given the simplicity of handling them, FbFPs are a convenient model for evaluating a new approach. At the same time, flavin‐protein interaction is far from a simple one: isoalloxazine moiety makes both hydrogen bonds and hydrophobic interactions with the protein, while the ribityl moiety is flexible and makes some of its contacts with the protein via ordered water molecules. We tested three ProteinMPNN‐predicted proteins, bearing 55%–66% sequence identity to the original variant CagFbFP (Nazarenko et al., 2019), and found that all three of them fold and function comparably to CagFbFP. Our results highlight ProteinMPNN as a fast and efficient method for reengineering proteins with complex ligand‐binding sites and reveal potential for generation of fluorescent proteins highly divergent from those found in Nature.
2. RESULTS
As the template for FbFP reengineering, we have chosen CagFbFP, a thermostable variant that is well characterized both functionally and structurally (Nazarenko et al., 2019; Smolentseva et al., 2021). The protein essentially consists of a chromophore, FMN, surrounded by a single layer of secondary structure elements (Figure 2a). We reasoned that the well‐conserved amino acids in the immediate vicinity of the chromophore (Glantz et al., 2016) are important, whereas it might be possible to mutate others without compromising binding. Consequently, we fixed (i) amino acids whose side chains contained heavy (non‐hydrogen) atoms within 4.5 Å of any heavy atoms of the ligand, and (ii) glycines that contained heavy atoms at the same distance from the ligand (Gly146 in the case of CagFbFP). This approach allowed us to retain amino acids that interact with the ligand via their side chains, while not constraining the amino acids involved in interactions only through their backbones. By fixing the glycine, we ensured that ProteinMPNN would not substitute it with a larger amino acid that might disrupt the binding site. In total, 20 amino acids were chosen to be fixed during the sequence optimization process (Figure 2a). We also chose to engineer variants devoid of cysteines (to avoid undesirable links), tryptophans (to avoid interference with flavin photophysics), and histidines (to avoid ambiguous protonation in experiments and simulations).
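The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script used in this work: the atom-record layout, the function name `select_fixed_residues`, and the toy coordinates are hypothetical, and a real application would read the atoms from the crystallographic structure instead.

```python
import numpy as np

BACKBONE_ATOMS = {"N", "CA", "C", "O"}

def select_fixed_residues(protein_atoms, ligand_xyz, cutoff=4.5):
    """Return sorted residue numbers to keep fixed during sequence design.

    protein_atoms: iterable of (resnum, resname, atom_name, element, (x, y, z))
    ligand_xyz:    ligand heavy-atom coordinates, shape (M, 3), in angstroms
    """
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    fixed = set()
    for resnum, resname, atom_name, element, xyz in protein_atoms:
        if element == "H":
            continue  # only heavy (non-hydrogen) atoms are considered
        is_side_chain = atom_name not in BACKBONE_ATOMS
        # Rule (i): side-chain heavy atoms; rule (ii): any heavy atom of a glycine
        if not (is_side_chain or resname == "GLY"):
            continue
        distances = np.linalg.norm(ligand_xyz - np.asarray(xyz, dtype=float), axis=1)
        if distances.min() <= cutoff:
            fixed.add(resnum)
    return sorted(fixed)

# Hypothetical toy system: a single ligand atom at the origin
toy_ligand = [(0.0, 0.0, 0.0)]
toy_atoms = [
    (10, "LEU", "CB", "C", (3.0, 0.0, 0.0)),   # side chain within cutoff: fixed
    (11, "SER", "O", "O", (3.0, 0.0, 0.0)),    # backbone-only contact: not fixed
    (11, "SER", "OG", "O", (8.0, 0.0, 0.0)),   # side chain too far
    (12, "GLY", "CA", "C", (4.0, 0.0, 0.0)),   # glycine within cutoff: fixed
    (13, "ALA", "CB", "C", (6.0, 0.0, 0.0)),   # side chain too far
    (13, "ALA", "HB1", "H", (2.0, 0.0, 0.0)),  # hydrogen: ignored
]
toy_fixed = select_fixed_residues(toy_atoms, toy_ligand)
print(toy_fixed)
```

In the toy system only residues 10 and 12 are selected, which mirrors the intent of the rule: side-chain contacts and contacting glycines are retained, whereas backbone-only contacts are left free for redesign.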
FIGURE 2.

(a) Crystallographic structure of CagFbFP (PDB ID 6RHF (Nazarenko et al., 2019)) with FMN (green). Residues that were fixed during the ProteinMPNN sequence optimization are shown in sticks representation, and Gly146 Cα is shown as a sphere. (b) Structural ensembles of engineered proteins observed in 100 ns MD simulations. 50 frames taken each 2 ns are shown. L1 is the loop comprising residues 55–60, and L2 is the loop comprising residues 135–142. (c) Backbone atoms RMSF profiles of engineered variants during 100 ns MD simulation.
All LOV‐domains and FbFPs with known x‐ray structures exhibit nearly identical backbone conformations (core atoms RMSD of 0.5–0.7 Å), with the major differences found in the conformation and length of loops unrelated to the ligand binding site (Yudenko et al., 2021). ProteinMPNN allows introduction of Gaussian noise to the backbone coordinates to diminish influence of a particular structure used for design. Consequently, we generated two sequences named NeuroFbFP_A and B using Gaussian noise with standard deviation (SD) of 0.1 Å and one sequence named NeuroFbFP_C using noise with SD of 0.2 Å. We chose a ProteinMPNN model that predicts amino acid identity considering 48 neighboring amino acids and was trained using Gaussian noise with SD of 0.2 Å. This model provides a good balance between sequence recovery and AlphaFold2 success rate, as reported in the original study (Dauparas et al., 2022). Amino acid sequences of the generated proteins are provided in Supporting Information.
For in silico validation of generated sequences, we first used AlphaFold2‐based ColabFold pipeline (Jumper et al., 2021; Mirdita et al., 2022) to predict their structures. The predicted structures exhibited backbone conformations identical to those of the original protein (PDB ID 6RHF (Nazarenko et al., 2019)) with root‐mean‐square deviations of Cα atoms below 0.7 Å (structure files are available as Supporting Information). Next, we performed molecular dynamics simulations to see if they could identify any major issues such as unfavorable amino acid and ligand arrangements, leading to high fluctuation or distortion of the backbone, protein unfolding, or ligand unbinding. All three proteins maintained a rigid backbone conformation with minimal fluctuations (Figure 2b, c). The only noticeable fluctuations were observed in loops distant to the ligand binding site. Moreover, all hydrogen bonds between protein side chains and the flavin moiety remained intact throughout the simulations. Therefore, all predicted sequences were deemed suitable for subsequent experimental evaluation.
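The two quantities used for in silico validation, the Cα RMSD after optimal superposition and the per-atom RMSF along a trajectory, can be illustrated as follows. This is a generic sketch (Kabsch superposition with NumPy), not the analysis code used here; the function names and synthetic coordinates are hypothetical, and the RMSF function assumes that the frames have already been superposed.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T                 # rotate P onto Q
    return float(np.sqrt(((P_rot - Qc) ** 2).sum(axis=1).mean()))

def rmsf(trajectory):
    """Per-atom RMSF for a pre-aligned trajectory of shape (frames, atoms, 3)."""
    trajectory = np.asarray(trajectory, dtype=float)
    mean_structure = trajectory.mean(axis=0)
    squared_dev = ((trajectory - mean_structure) ** 2).sum(axis=2)
    return np.sqrt(squared_dev.mean(axis=0))

# Synthetic check: a rotated and translated copy has zero RMSD
rng = np.random.default_rng(0)
coords = rng.normal(size=(30, 3))
angle = np.deg2rad(30.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
moved = coords @ rot_z.T + np.array([5.0, -2.0, 1.0])
rmsd_value = kabsch_rmsd(coords, moved)

# Two frames, two atoms: atom 0 oscillates by +/-1 along x, atom 1 is static
toy_traj = np.array([[[-1.0, 0.0, 0.0], [3.0, 0.0, 0.0]],
                     [[ 1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]])
rmsf_values = rmsf(toy_traj)
```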
We synthesized NeuroFbFP genes optimized for Escherichia coli expression and observed that all three designed proteins were successfully expressed. Cells with NeuroFbFP_A and NeuroFbFP_C displayed visible fluorescence on agar plates (Supporting Figure 1). Cells with NeuroFbFP_A exhibited high chromophore‐bound protein expression level and displayed pronounced green color brighter than that of CagFbFP. Conversely, NeuroFbFP_B demonstrated a remarkably low yield of 0.6 mg per liter of LB culture (Table 1). The purified protein samples exhibited characteristic yellow color and were fluorescent, with their emission and excitation spectra matching those of the original protein without any spectral shifts (Figure 3a). None of the purified proteins displayed any propensity to aggregate during storage at 4°C. Peaks in size‐exclusion chromatograms of NeuroFbFP_A and NeuroFbFP_C correspond to molecular weights of 15.0 and 13.9 kDa (Supporting Figure 2), whereas theoretical molecular weight of the proteins is around 12.5 kDa. Small‐angle x‐ray scattering (SAXS) experiments revealed that NeuroFbFP_A exists as a mixture of monomers and dimers in solution, whereas NeuroFbFP_C is monomeric (Supporting Figure 3). NeuroFbFP_B was not tested due to low yield. Original protein CagFbFP is dimeric (Nazarenko et al., 2019).
TABLE 1.
Main properties of the original variant CagFbFP (Nazarenko et al., 2019) and reengineered proteins.
| | CagFbFP | NeuroFbFP_A | NeuroFbFP_B | NeuroFbFP_C |
|---|---|---|---|---|
| Expression yield, mg/L | 21 | 22 | 0.6 | 8 |
| λem, nm | 498 | 498 | 498 | 498 |
| λabs, nm | 447 | 447 | 447 | 448 |
| λex, nm | 450 | 451 | 451 | 452 |
| Tm, °C (a) | 80 | 74 | 71 | 57 |
| Chromophore Kd, nM | 350 | 325 | 186 | 235 |
| Extinction coefficient, M−1 cm−1 | 15,300 | 13,800 | 14,300 | 13,400 |
| Quantum yield | 0.34 | 0.39 | 0.40 | 0.39 |
| Brightness, M−1 cm−1 | 5200 | 5400 | 5700 | 5200 |
(a) Only the highest transition temperature that presumably corresponds to FMN‐bound species (Smolentseva et al., 2021) is reported.
FIGURE 3.

(a) Fluorescence excitation (dashed) and emission (solid) spectra of engineered variants compared to those of CagFbFP. The spectra are normalized so that the maxima correspond to 1. CagFbFP, NeuroFbFP_A, and NeuroFbFP_B spectra are shifted by 0.2 for clarity. (b) Decay of CagFbFP and NeuroFbFP fluorescence upon thermal denaturation. Fluorescence is normalized so that the starting value for the first melt is 1. The same sample is heated to obtain the first melt, then cooled down to 30°C, equilibrated for 10 min at 30°C and then reheated to obtain the second melt in each case. Different transitions correspond to protein species bound to different chromophores: riboflavin and FMN at ~66°C and ~81°C for CagFbFP (Smolentseva et al., 2021).
Given that protein stability may be important in some applications, we measured decay of NeuroFbFP fluorescence upon thermal denaturation (Figure 3b). We observed that NeuroFbFP_A purified from E. coli displays two transitions at 67 and 74°C, which most probably indicates presence of protein species bound to two different chromophores, riboflavin and FMN, as in the case of CagFbFP (Smolentseva et al., 2021). NeuroFbFP_B purified from E. coli displayed gradual loss of fluorescence in the range from 45 to 75°C (Supporting Figure 4), while the sample additionally incubated with FMN displayed a clear peak at ~71°C (Figure 3b). Finally, NeuroFbFP_C purified from E. coli displayed a transition at 57°C (Figure 3b). To evaluate the ability of the proteins to refold and display the same properties, we repeated the denaturation experiments with the same samples. Second heating cycles produced similar results, reflecting similar properties of refolded NeuroFbFPs (Figure 3b). Interestingly, while CagFbFP previously exhibited ~70% refolding efficiency (Nazarenko et al., 2019) (as low as 0% in saturated NaCl and urea solutions), followed by slow partial refolding of the remaining fraction (unpublished data), in the experiments presented here CagFbFP, NeuroFbFP_A and C display very high renaturation efficiency at the timescale of the experiment (conducted at 1°C/min heating and cooling rates, with the two denaturation‐renaturation cycles separated by 10 min of equilibration at 30°C). This refolding efficiency, as judged by fluorescence recovery, is even more remarkable given that the flavins are damaged at high temperature by the probe light of the instrument (Smith & Metzler, 1963; Wingen et al., 2017), thus making complete recovery of fluorescence impossible.
FIGURE 4.

(a) Phylogenetic tree showcasing representative FbFPs and LOV domains. CagFbFP and NeuroFbFPs are highlighted in red. Bacterial proteins are shown with green dots, archaeal with red, and eukaryotic with blue. (b) Sequence alignment comparing CagFbFP and NeuroFbFPs, with the sequence logo of all previously identified putative LOV domains (Glantz et al., 2016) displayed on top. Indices correspond to amino acid numbering in CagFbFP. Amino acids are colored in accordance with BLOSUM62 scores relative to the consensus sequence. Asterisks mark the sites where most natural homologues have hydrophobic or small polar amino acids, but ProteinMPNN suggested a charged amino acid (or Asn instead of Gly90). Triangles mark the sites where most natural homologues have charged amino acids, but ProteinMPNN suggested glycine or alanine in at least one design.
One of the main characteristics of any ligand‐binding protein is the corresponding dissociation constant. Given that FbFPs expressed in E. coli may bind different endogenous chromophores (Smolentseva et al., 2021), we reconstituted NeuroFbFPs with FMN and found the dissociation constants to be in the range of 186–325 nM (Supporting Figure 5, Table 1), similar to the dissociation constant of 350 nM observed for the original protein (Smolentseva et al., 2021). We have also measured extinction coefficients and quantum yields of NeuroFbFPs (Table 1). Interestingly, while the extinction coefficients of all three NeuroFbFPs were slightly lower than that of the original protein, their slightly greater quantum yields resulted in a comparable brightness to that of the CagFbFP.
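The brightness values in Table 1 are consistent with the usual definition of brightness as the product of the extinction coefficient and the quantum yield. The short check below, which assumes this definition and uses only the numbers from Table 1, gives 5202, 5382, 5720, and 5226 M−1 cm−1 before rounding.

```python
# Extinction coefficient (M^-1 cm^-1) and quantum yield, as listed in Table 1
table1 = {
    "CagFbFP": (15300, 0.34),
    "NeuroFbFP_A": (13800, 0.39),
    "NeuroFbFP_B": (14300, 0.40),
    "NeuroFbFP_C": (13400, 0.39),
}

# Assumed definition: brightness = extinction coefficient x quantum yield
brightness = {name: eps * qy for name, (eps, qy) in table1.items()}

for name, value in brightness.items():
    # Round to the nearest hundred, as in Table 1
    print(f"{name}: {value:.0f} -> {round(value, -2):.0f} M^-1 cm^-1")
```

The lower extinction coefficients of the NeuroFbFPs are thus offset by their higher quantum yields, so that all four proteins fall within about 10% of each other in brightness.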
Finally, to rationalize the effects of mutations introduced by ProteinMPNN, we have generated a phylogenetic tree of representative LOV domains and FbFPs (Valle et al., n.d.; Glantz et al., 2016; Nazarenko et al., 2019; Wingen et al., 2017; Yee et al., 2015) (Figure 4a). Surprisingly, all three NeuroFbFPs are positioned further from the root of the tree, rather than closer to it, given that many engineering approaches rely on attaining the ancestral/consensus sequences (Kazlauskas, 2018). Phylogenetic distances (average number of amino acid substitutions per site) from the tree root to CagFbFP, NeuroFbFP_A, B and C are 0.70, 0.81, 0.91 and 1.08, respectively, with sequence identities to other protein family representatives shown in Figure 4a of 47.2% ± 8.2%, 46.7% ± 5.5%, 46.6% ± 5.7%, and 43.7% ± 5.0%, respectively (means and standard deviations). This further deviation of NeuroFbFPs from the root of phylogenetic tree compared to CagFbFP can possibly be attributed to how ProteinMPNN handles surface residues. Notably, it replaced several conserved hydrophobic sites and small polar sites with charged and polar residues (Figure 4b). Moreover, we observed that NeuroFbFP_C displayed an unusually high content of 20 glutamate and lysine residues, compared to the average of 10.6 in natural LOV domains (calculated for a dataset by Glantz et al. (2016)) and 5 in CagFbFP. This finding aligns with the known tendency of ProteinMPNN to position glutamate and lysine residues on the protein surface (Dauparas et al., 2022). Nevertheless, in four cases (Asp55, Asp59, Glu103, Glu137) ProteinMPNN replaced charged amino acids, some of them well‐conserved, with glycine or alanine in some of the designs. 
On the other hand, positioning of NeuroFbFPs on the same branch of phylogenetic tree as CagFbFP might follow (i) from conservation of all ligand‐binding amino acids (hydrophobic ones vary to some degree among LOVs/FbFPs) or (ii) from ProteinMPNN recovering CagFbFP amino acids due to implicit features of the protein backbone such as characteristic structure of loops.
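The sequence statistics discussed above (pairwise identity and the combined glutamate and lysine content) are simple to compute from aligned sequences. The sketch below is illustrative only: the function names and the two short toy sequences are hypothetical, and identity is computed over aligned, non-gap positions, which is one of several possible conventions. The last lines reproduce the identity range quoted in the abstract from the numbers of fixed (20), designable (86), and replaced (36–48) positions.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def count_residues(seq, residues="EK"):
    """Number of occurrences of the given residue types (default: Glu and Lys)."""
    return sum(seq.count(r) for r in residues)

# Hypothetical toy sequences, not the real proteins
toy_original = "MKTAYIAKQR"
toy_design = "MKSAYLAKER"
toy_identity = percent_identity(toy_original, toy_design)   # 7 of 10 match
toy_ek = count_residues(toy_design)                         # 1 Glu + 2 Lys

# Identity implied by the number of substitutions reported in the abstract
n_fixed, n_designable = 20, 86
total = n_fixed + n_designable
identity_range = [100.0 * (total - n_mut) / total for n_mut in (36, 48)]
print([round(x, 1) for x in identity_range])
```

With 36 and 48 substitutions among 106 positions, the implied identities are about 66.0% and 54.7%, in agreement with the reported 55%–66% range.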
3. DISCUSSION
Our results show that it is possible to use ProteinMPNN to reengineer a ligand‐binding (in our case, flavin‐binding) protein to obtain variants with 55%–66% sequence identity to the original variant, although the method was not originally designed to be used in this way. Mutations suggested by ProteinMPNN presumably do not affect conformations of amino acids that are in direct contact with the ligand, as evidenced by relatively unchanged dissociation constants (Table 1). Moreover, inherent ProteinMPNN features allowed engineering of variants devoid of particular undesirable amino acids (in our case, cysteines, tryptophans, and histidines). Generation of new sequences using commonly available hardware takes several minutes, and the genes can be synthesized in a straightforward fashion thereafter. Use of molecular dynamics to validate the designs prior to experiments helped to visualize the engineered variants but did not exclude any designs. Unexpectedly, in our case, all three out of three predicted and tested variants expressed and were functional (fluorescent).
In a paper submitted after this one, Sumida and colleagues demonstrate that ProteinMPNN can be used to reengineer another ligand‐binding colored protein, human myoglobin, with a comparable success rate (Sumida et al., 2024). There, a more conservative cutoff of 7 Å for fixing the ligand‐proximal amino acids during reengineering was chosen, but given the larger size of myoglobin, resulting designs had 41%–55% sequence identity with the most similar protein in the UniRef100 database (Sumida et al., 2024). Several examples where protein‐ or peptide‐binding proteins (TEV protease, ubiquitin, ghrelin receptor) were reengineered using ProteinMPNN similarly display high success rates (de Haas et al., 2023; Goverde et al., 2023; Sumida et al., 2024). Finally, new methods called LigandMPNN and CARBonAra were recently described that explicitly model non‐protein components, but their codes are not yet readily available (Dauparas et al., 2023; Krapp et al., 2023; Krishna et al., 2023). While the optical properties of reengineered FbFPs were similar, the expression level, thermal stability, and refoldability varied, highlighting the need for testing multiple variants. When bound to FMN, all of the proteins were less stable than the original variant, which is based on a protein from a thermophilic host (Nazarenko et al., 2019), but otherwise were well folded and stable at room temperature or 37°C. Lower expression level of NeuroFbFP_B might be the consequence of its lower ability to refold, which might be indicative of slower folding kinetics or lower probability of spontaneous folding, resulting in digestion by cellular proteases in vivo.
While the original protein, CagFbFP, is dimeric (Nazarenko et al., 2019), homooligomerization is hampered in the engineered variants: NeuroFbFP_A is partially monomeric and NeuroFbFP_C is fully monomeric. This is not unexpected, given that we did not aim at engineering proteins that assume a particular oligomeric form and monomeric structure was used as an input for sequence optimization in ProteinMPNN. The core of the dimerization interface is mostly hydrophobic in CagFbFP, and some of the interfacial hydrophobic amino acids were substituted for the charged ones by ProteinMPNN (mutation M51E in all three variants, and mutations F64D and V145R in NeuroFbFP_B and NeuroFbFP_C), which presumably resulted in lower dimerization efficiency in NeuroFbFP variants. Thus, reengineering with ProteinMPNN could be used to disrupt undesirable oligomerization interfaces. On the other hand, engineering of strictly monomeric proteins might be laborious and not straightforward, as observed for GFP‐like fluorescent proteins (Ai et al., 2014; Cranfill et al., 2016).
Previously, FbFPs and LOV domains were already used as model proteins for testing protein engineering approaches, including FoldX computations (Song et al., 2013), plasmid recombineering (Higgins et al., 2017), and directed evolution (Ko et al., 2019; Liang et al., 2022). CagFbFP was used as a scaffold for generation of color‐shifted variants harboring mutations that are usually highly destabilizing (Nikolaev et al., 2023; Röllen et al., 2021). Present data further highlights FbFPs and, in particular, CagFbFP, as convenient and easy‐to‐work‐with model proteins for testing various protein engineering techniques.
In general, application of ProteinMPNN and similar methods to reengineer FbFPs, LOV domains, and other fluorescent proteins might allow for generation of more diverse (more distant from ancestral sequences) and robust templates, allowing directed evolution campaigns that start from yet unexplored points in the sequence space. This strategy is also likely to work for those enzymes, for which dynamics and allostery of the protein overall is less important, given that ProteinMPNN is likely to stabilize the protein. Finally, we should note that facile generation of new diverse variants of well‐known proteins might influence intellectual property‐related strategies, and patenting of the identities and positions of active site amino acids appears necessary for robust protection.
4. MATERIALS AND METHODS
4.1. Sequence design
NeuroFbFP sequences were generated using ProteinMPNN (Dauparas et al., 2022) graphical interface (Dürr, 2023) hosted on Hugging Face platform on March 13, 2023. Model was set to v_48_020, sampling temperature to 0.1, backbone noise to 0.1 for NeuroFbFP_A and B, and 0.2 for NeuroFbFP_C. Sequences for NeuroFbFP_A and B were generated in fully independent runs. Cysteines, tryptophans, and histidines were explicitly excluded from the design. Nucleotide sequences optimized for E. coli expression were generated using Invitrogen GeneArt web service (Thermo Fisher Scientific, USA). iTerm‐PseKNC (Feng et al., 2019) was used for identification of transcriptional terminators. Detected terminators were removed manually by altering respective codons. Synthesized genes were cloned into pET‐28a(+) plasmid via NcoI and BamHI restriction sites. Amino acid and nucleotide sequences are provided in Supporting Information.
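For readers who prefer a scripted workflow, comparable settings could presumably be passed to the standalone ProteinMPNN script. The sketch below is not the procedure used here (the sequences were generated through the web interface); it only assembles a command line and a fixed-positions record. The flag names follow the `protein_mpnn_run.py` script of the public ProteinMPNN repository and should be checked against the installed version. The file names and the listed positions are hypothetical placeholders, and the positions are assumed to be 1-based indices along the chain, not PDB residue numbers.

```python
import json

def build_proteinmpnn_args(pdb_path, chain, fixed_positions_file,
                           out_folder, backbone_noise, num_seqs=1):
    """Assemble an argument list for the standalone ProteinMPNN script."""
    return [
        "python", "protein_mpnn_run.py",
        "--pdb_path", pdb_path,
        "--pdb_path_chains", chain,
        "--fixed_positions_jsonl", fixed_positions_file,
        "--model_name", "v_48_020",       # 48 neighbors, trained with 0.2 A noise
        "--sampling_temp", "0.1",
        "--backbone_noise", str(backbone_noise),
        "--omit_AAs", "CWH",              # no Cys, Trp, or His in the designs
        "--num_seq_per_target", str(num_seqs),
        "--out_folder", out_folder,
    ]

def fixed_positions_line(structure_name, chain, positions):
    """One JSON line mapping structure name -> chain -> fixed positions."""
    return json.dumps({structure_name: {chain: sorted(positions)}})

# Hypothetical placeholders: file names and positions are for illustration only
example_line = fixed_positions_line("6rhf", "A", [12, 3, 7])
example_args = build_proteinmpnn_args(
    "6rhf.pdb", "A", "fixed_positions.jsonl", "mpnn_out", backbone_noise=0.1)
print(" ".join(example_args))
```

The returned list can be passed to `subprocess.run` once the fixed-positions line has been written to the named file; setting `backbone_noise=0.2` would correspond to the setting used for NeuroFbFP_C.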
4.2. Expression, purification, and characterization
Proteins were expressed and purified as described previously (Nikolaev et al., 2023; Remeeva et al., 2023, respectively). E. coli strain C41 (DE3) cells transformed with protein‐encoding plasmids were cultured in shaking flasks in 400 mL LB medium containing 150 mg/L ampicillin until reaching optical density of 0.5–0.7. Protein expression was induced by addition of 1 mM IPTG and followed by incubation for 5 h at 37°C. Harvested cells were resuspended in lysis buffer containing 300 mM NaCl and 50 mM Tris–HCl, pH 8.0, and were disrupted in an M‐110P Lab Homogenizer (Microfluidics, USA). Cell membrane fraction was removed from the lysate by ultracentrifugation at 100,000 g for 45 min at 10°C. Clarified supernatant was incubated with Ni‐nitrilotriacetic acid (Ni‐NTA) resin (Qiagen, Germany) with constant stirring. The supernatant with Ni‐NTA resin was loaded on a gravity flow column and washed with buffer containing 300 mM NaCl and 50 mM Tris–HCl, pH 8.0. The protein was eluted in a buffer containing 200 mM imidazole, 300 mM NaCl, and 50 mM Tris–HCl, pH 8.0. Eluted protein was transferred to the final buffer containing 300 mM NaCl and 50 mM Tris–HCl, pH 8.0 by dialysis.
Emission and excitation spectra, thermostability, extinction coefficients (via denaturation in guanidine hydrochloride), and quantum yields were measured exactly as described previously (Nikolaev et al., 2023). Briefly, the spectra were recorded using Synergy H4 Hybrid Microplate Reader (BioTek, USA). Melting curves of proteins were measured using the Rotor‐Gene Q real‐time PCR cycler (Qiagen, Germany) with excitation at 470 nm and emission measured at 510 nm. For estimation of the extinction coefficients, flavin concentration in each sample was determined by comparing the absorption of the sample incubated in 5.6 M GuHCl at 80°C for 10 min, where the protein is presumably fully denatured and the flavin is fully dissociated, to that of free flavin in the matching conditions. Quantum yields were obtained by comparing the fluorescence emission in the range of 457–648 nm of the NeuroFbFP samples versus that of free flavin, normalized by the absorption at 455 nm. Expression yields of holo‐proteins were calculated based on absorption of purified protein at 450 nm and on measured extinction coefficients.
4.3. Small‐angle x‐ray scattering
SAXS measurements were conducted at the 4C SAXS II beamline (Kim et al., 2017) (PLS II, Pohang, Korea) at 4°C. X‐ray beam wavelength was 1.24 Å. X‐ray beam size at the sample stage was 0.15 (V) × 0.24 (H) mm2, and a 2D SX 165 charge‐coupled detector (Rayonix, Evanston, IL, USA) was placed at a sample‐to‐detector distance of 1.00 m. The magnitude range of the scattering vector, q = (4π/λ) sin θ, was 0.032 Å−1 < q < 0.6 Å−1, where 2θ is the scattering angle. For each ampule, 20 frames with integration time of 5 s were collected. The data were normalized to the intensity of the transmitted beam and radially averaged. The scattering of the buffer solution was acquired under the same conditions and subtracted for background compensation. For the primary data treatment, PrimusQt program from the ATSAS software suite was used (Manalastas‐Cantos et al., 2021). To assess the oligomeric composition of the samples, we prepared models of the proteins in monomeric and dimeric states using AlphaFold, with FMN added manually; models are available in the Supporting Information. Form factors were calculated using CRYSOL (Svergun et al., 1995). The experimental data were approximated by superposition of form factors using OLIGOMER (Konarev et al., 2003).
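The definition of the scattering vector and the idea of approximating a curve by a superposition of monomer and dimer form factors can be illustrated with a simplified two-component sketch. It is not a substitute for CRYSOL and OLIGOMER: the Guinier-type form factors, radii of gyration, and forward intensities below are hypothetical stand-ins, experimental errors are ignored, and the fractions are assumed to sum to one.

```python
import numpy as np

def scattering_vector(two_theta_rad, wavelength_A):
    """q = (4*pi/lambda) * sin(theta), where 2*theta is the scattering angle."""
    return 4.0 * np.pi / wavelength_A * np.sin(np.asarray(two_theta_rad) / 2.0)

def monomer_fraction(I_exp, I_mono, I_dim):
    """Least-squares monomer fraction f in I_exp ~ f*I_mono + (1 - f)*I_dim."""
    diff = I_mono - I_dim
    f = np.dot(I_exp - I_dim, diff) / np.dot(diff, diff)
    return float(np.clip(f, 0.0, 1.0))  # keep the fraction physically meaningful

def guinier(q, i0, rg):
    """Hypothetical stand-in for a computed form factor (Guinier approximation)."""
    return i0 * np.exp(-(q * rg) ** 2 / 3.0)

# Synthetic example on a low-q grid (A^-1); all parameters are illustrative
q_grid = np.linspace(0.032, 0.1, 50)
I_mono = guinier(q_grid, i0=1.0, rg=15.0)
I_dim = guinier(q_grid, i0=2.0, rg=20.0)
I_mix = 0.4 * I_mono + 0.6 * I_dim
fitted_fraction = monomer_fraction(I_mix, I_mono, I_dim)

# Scattering vector for a 0.01 rad scattering angle at the 1.24 A wavelength
q_example = scattering_vector(0.01, 1.24)
```

For noise-free synthetic data the fitted fraction recovers the value of 0.4 used to build the mixture; with real data, the fit quality and the weighting by experimental errors become important.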
4.4. Determination of dissociation constants
Dissociation constants were estimated by measuring the bound‐to‐free chromophore ratio in a series of 7–9 twofold dilutions (Supporting Figure 5) in 300 mM NaCl, 50 mM Tris–HCl, pH 8.0, similar to our previous work (Smolentseva et al., 2021). After each dilution, the solution was gently mixed using a pipette and incubated at 37°C for 1 h to ensure complete equilibration (Jarmoskaite et al., 2020). Starting concentrations of protein‐bound and free flavin were determined from linear decomposition of the solution's absorption spectrum into a weighted sum of the reference spectra of individual components. Due to low absorption, bound‐to‐free flavin ratios for diluted samples were determined from linear decomposition of fluorescence emission spectra. Theoretical curve was fitted to the data using nonlinear least squares regression, with the dissociation constant and the ratio of bound‐to‐free chromophore fluorescence brightness serving as free parameters of the model.
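The dependence of the bound-to-free ratio on dilution follows from the 1:1 binding equilibrium and can be sketched as below. This is a simplified illustration rather than the fitting code used here: it omits the second free parameter (the ratio of bound-to-free fluorescence brightness), replaces nonlinear least squares regression with a plain grid search over Kd, and uses hypothetical starting concentrations and a hypothetical Kd of 300 nM.

```python
import numpy as np

def bound_to_free_ratio(p_total, l_total, kd):
    """Bound/free ligand ratio for 1:1 binding (all quantities in nM)."""
    s = p_total + l_total + kd
    # Physically meaningful root of [PL]^2 - s*[PL] + P_t*L_t = 0
    bound = 0.5 * (s - np.sqrt(s * s - 4.0 * p_total * l_total))
    return bound / (l_total - bound)

def fit_kd(p_totals, l_totals, ratios, kd_grid):
    """Grid-search stand-in for nonlinear least squares over Kd."""
    sse = [np.sum((bound_to_free_ratio(p_totals, l_totals, kd) - ratios) ** 2)
           for kd in kd_grid]
    return float(kd_grid[int(np.argmin(sse))])

# Hypothetical series of eight twofold dilutions starting from 20 uM each
dilution = 2.0 ** -np.arange(8)
p_totals = 20000.0 * dilution   # total protein, nM
l_totals = 20000.0 * dilution   # total flavin, nM
true_kd = 300.0                 # nM, illustrative value
ratios = bound_to_free_ratio(p_totals, l_totals, true_kd)

kd_grid = np.logspace(1, 4, 301)  # 10 nM to 10 uM
kd_fit = fit_kd(p_totals, l_totals, ratios, kd_grid)
print(f"fitted Kd ~ {kd_fit:.0f} nM")
```

The ratio decreases monotonically along the dilution series as the complex dissociates, and it changes most rapidly when the concentrations approach Kd, which is why the series has to span that range.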
4.5. Molecular dynamics simulations
Initial coordinates were generated using AlphaFold2‐based ColabFold pipeline v. 1.5 with the default options and 3 recycles without AMBER relaxation (Jumper et al., 2021; Mirdita et al., 2022). FMN position was modeled from the crystallographic structure of CagFbFP (PDB ID 6RHF (Nazarenko et al., 2019)), the molecule coordinates were copied manually after aligning the model to PDB ID 6RHF. Two structural water molecules bound to the backbone of 60th and 109th residues were added manually. Protonated structure of monomeric protein was prepared using CHARMM‐GUI PDB Reader & Manipulator tool (Park et al., 2023). Protonation states of titratable residues were assigned to pH = 8 based on pKa. The phosphate group of FMN was deprotonated (charge −2e). The system was neutralized by adding Na+ and Cl− ions to ionic strength of 0.15 M. The protein was solvated in a cubic TIP3P water (Jorgensen et al., 1983) box with a water layer of 10 Å around the protein using CHARMM‐GUI Solution Builder (Lee et al., 2016). Amber14SB (Maier et al., 2015) and General Amber Force Field 2 (GAFF2) (He et al., 2020) parameters were used for protein and FMN, respectively. Molecular dynamics simulations were conducted using OpenMM 7.6 (Eastman et al., 2017). The systems were energy minimized, equilibrated, and energy minimized again. After that, accurate partial charges of FMN cofactor were calculated using QM/MM modeling in the presence of the protein environment.
QM/MM modeling was conducted using ORCA 5.0.3 (Neese, 2012; Neese, 2017; Neese et al., 2020). Additive QM/MM scheme with electrostatic embedding was applied. B3LYP density functional (Becke, 1993; Lee et al., 1988; Stephens et al., 1994; Vosko et al., 1980) with ma‐def2‐SVP (Weigend & Ahlrichs, 2005; Zheng et al., 2011) basis set was used for QM part that included the FMN molecule. Optimized geometry of FMN in a static amino acid environment was obtained for the ground state. Partial atomic charges of FMN were calculated using the CHELPG approach (Breneman & Wiberg, 1990) and used for production runs instead of the charges provided by CHARMM‐GUI.
Systems with the updated FMN charges were energy minimized and equilibrated, and production runs of 100 ns were carried out. Langevin integrator with a time step of 3 fs, friction coefficient of 1 ps−1, and reference temperature of 300 K was used. Pressure was kept by Monte‐Carlo barostat at 1 bar, applied every 25 steps. Cutoff of 1 nm was implemented for calculating Lennard‐Jones interaction. Particle mesh Ewald method (Essmann et al., 1995) was used for long‐range electrostatic interactions with error tolerance of 0.0005. Lengths of bonds containing hydrogen were constrained during the simulation.
4.6. Sequence alignment and phylogenetic tree
Multiple sequence alignment of the respective protein sequences was performed using the MUSCLE (Edgar, 2004) algorithm as implemented in MEGA v. 11.0.13 (Tamura et al., 2021). Non‐LOV/FbFP amino acids were trimmed from the alignment. The phylogenetic tree was calculated using FastTree2 v. 2.1.11 (Price et al., 2010) and visualized using FigTree v. 1.4.4 (Rambaut, 2012). Default parameter sets were used for all algorithms. The sequence logo was generated using the Logomaker Python library (Tareen & Kinney, 2020).
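A sequence logo is built from per‐column residue statistics of the alignment. As a minimal, library‐free sketch of that intermediate step, the code below counts residues in each alignment column and computes the Shannon information content in bits, without small‐sample correction. The toy alignment and the function names are hypothetical and serve only to illustrate the calculation; they are not the sequences or the code used in this work.

```python
import math
from collections import Counter


def column_counts(aligned, gap="-"):
    """Per-column residue counts for equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    # Gap characters are ignored when counting residues in a column
    return [Counter(seq[i] for seq in aligned if seq[i] != gap)
            for i in range(length)]


def information_bits(counts, alphabet_size=20):
    """Information content of one column: log2(alphabet) minus entropy."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total)
                   for n in counts.values())
    return math.log2(alphabet_size) - entropy


# Hypothetical toy alignment (not taken from the paper)
toy_alignment = ["MASG-L", "MASGVL", "MTSGIL", "MASG-L"]
counts = column_counts(toy_alignment)
bits = [information_bits(c) for c in counts]
```

Fully conserved columns reach the maximum of log2(20) ≈ 4.32 bits, whereas variable columns score lower; a counts table of this kind is the usual starting point for logo‐drawing tools.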
AUTHOR CONTRIBUTIONS
Ivan Gushchin: Conceptualization; investigation; funding acquisition; project administration; writing – original draft; supervision; visualization; writing – review and editing. Andrey Nikolaev: Conceptualization; investigation; writing – original draft; visualization; writing – review and editing. Alexander Kuzmin: Investigation; writing – review and editing. Elena Markeeva: Investigation; writing – review and editing. Elizaveta Kuznetsova: Investigation; writing – review and editing. Yury L. Ryzhykau: Investigation; writing – review and editing. Oleg Semenov: Investigation; writing – review and editing. Arina Anuchina: Investigation; writing – review and editing. Alina Remeeva: Investigation; funding acquisition; writing – review and editing.
CONFLICT OF INTEREST STATEMENT
The authors declare no competing financial interests.
Supporting information
Supplementary Information: Supporting information including nucleotide and amino acid sequences of studied proteins, supporting Figures 1–5, AlphaFold models and molecular dynamics starting models of studied proteins is available.
Data S1: Supporting Information
ACKNOWLEDGMENTS
SAXS data were acquired at the 4C SAXS II beamline at the Pohang Accelerator in Korea. We are grateful to Anna Yudenko for involvement in earlier stages of planning the project. Protein engineering tasks were supported by a grant from the Ministry of Science and Higher Education of the Russian Federation (agreement 075‐03‐2023‐106, project FSMG‐2021‐0002). Protein characterization tasks were supported by the grant 21‐64‐00018 from the Russian Science Foundation.
Nikolaev A, Kuzmin A, Markeeva E, Kuznetsova E, Ryzhykau YL, Semenov O, et al. Reengineering of a flavin‐binding fluorescent protein using ProteinMPNN. Protein Science. 2024;33(4):e4958. 10.1002/pro.4958
Review Editor: Nir Ben‐Tal
REFERENCES
- Ai H, Baird MA, Shen Y, Davidson MW, Campbell RE. Engineering and characterizing monomeric fluorescent proteins for live‐cell imaging applications. Nat Protoc. 2014;9(4):910–928. 10.1038/nprot.2014.054
- Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, et al. Accurate prediction of protein structures and interactions using a 3‐track neural network. Science. 2021;373(6557):871–876. 10.1126/science.abj8754
- Becke AD. Density‐functional thermochemistry. III. The role of exact exchange. J Chem Phys. 1993;98(7):5648–5652. 10.1063/1.464913
- Bennett NR, Coventry B, Goreshnik I, Huang B, Allen A, Vafeados D, et al. Improving de novo protein binder design with deep learning. Nat Commun. 2023;14:2625. 10.1038/s41467-023-38328-5
- Breneman CM, Wiberg KB. Determining atom‐centered monopoles from molecular electrostatic potentials. The need for high sampling density in formamide conformational analysis. J Comput Chem. 1990;11(3):361–373. 10.1002/jcc.540110311
- Cranfill PJ, Sell BR, Baird MA, Allen JR, Lavagnino Z, de Gruiter HM, et al. Quantitative assessment of fluorescent proteins. Nat Methods. 2016;13(7):557–562. 10.1038/nmeth.3891
- Dauparas J, Anishchenko I, Bennett N, Bai H, Ragotte RJ, Milles LF, et al. Robust deep learning based protein sequence design using ProteinMPNN. Science. 2022;378(6615):49–56. 10.1126/science.add2187
- Dauparas J, Lee GR, Pecoraro R, An L, Anishchenko I, Glasscock C, et al. Atomic context‐conditioned protein sequence design using LigandMPNN. bioRxiv 2023. 10.1101/2023.12.22.573103
- de Haas RJ, Brunette N, Goodson A, Dauparas J, Yi SY, Yang EC, et al. Rapid and automated design of two‐component protein nanomaterials using ProteinMPNN. bioRxiv 2023. 10.1101/2023.08.04.551935
- Drepper T, Eggert T, Circolone F, Heck A, Krauß U, Guterl J‐K, et al. Reporter proteins for in vivo fluorescence without oxygen. Nat Biotech. 2007;25(4):443–445. 10.1038/nbt1293
- Dürr SL. ProteinMPNN Gradio Webapp (v0.3). Zenodo 2023.
- Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, et al. OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Comput Biol. 2017;13(7):e1005659. 10.1371/journal.pcbi.1005659
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. 10.1093/nar/gkh340
- Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG. A smooth particle mesh Ewald method. J Chem Phys. 1995;103(19):8577–8593. 10.1063/1.470117
- Feng C‐Q, Zhang Z‐Y, Zhu X‐J, Lin Y, Chen W, Tang H, et al. iTerm‐PseKNC: a sequence‐based tool for predicting bacterial transcriptional terminators. Bioinformatics. 2019;35(9):1469–1477. 10.1093/bioinformatics/bty827
- Glantz ST, Carpenter EJ, Melkonian M, Gardner KH, Boyden ES, Wong GK‐S, et al. Functional and topological diversity of LOV domain photoreceptors. Proc Natl Acad Sci. 2016;113(11):E1442–E1451. 10.1073/pnas.1509428113
- Goverde CA, Pacesa M, Dornfeld LJ, Goldbach N, Georgeon S, Rosset S, et al. Computational design of soluble analogues of integral membrane protein structures. bioRxiv 2023. 10.1101/2023.05.09.540044
- He X, Man VH, Yang W, Lee T‐S, Wang J. A fast and high‐quality charge model for the next generation general AMBER force field. J Chem Phys. 2020;153(11):114502. 10.1063/5.0019056
- Higgins SA, Ouonkap SVY, Savage DF. Rapid and programmable protein mutagenesis using plasmid recombineering. ACS Synth Biol. 2017;6(10):1825–1833. 10.1021/acssynbio.7b00112
- Jarmoskaite I, AlSadhan I, Vaidyanathan PP, Herschlag D. How to measure and evaluate binding affinities. Elife. 2020;9:e57264. 10.7554/eLife.57264
- Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J Chem Phys. 1983;79(2):926–935. 10.1063/1.445869
- Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–589. 10.1038/s41586-021-03819-2
- Kazlauskas R. Engineering more stable proteins. Chem Soc Rev. 2018;47(24):9026–9045. 10.1039/C8CS00014J
- Kim K‐W, Kim J, Yun YD, Ahn H, Min B, Kim NH, et al. Small‐angle x‐ray scattering beamline BL4C SAXS at Pohang Light Source II. Biodesign. 2017;5(1):24–29.
- Ko S, Hwang B, Na J‐H, Lee J, Jung ST. Engineered Arabidopsis blue light receptor LOV domain variants with improved quantum yield, brightness, and thermostability. J Agric Food Chem. 2019;67(43):12037–12043. 10.1021/acs.jafc.9b05473
- Konarev PV, Volkov VV, Sokolova AV, Koch MHJ, Svergun DI. PRIMUS: a windows PC‐based system for small‐angle scattering data analysis. J Appl Cryst. 2003;36(5):1277–1282. 10.1107/S0021889803012779
- Korbeld KT, Fürst MJLJ. Curse and blessing of non‐proteinogenic parts in computational enzyme engineering. Chembiochem. 2023;24(12):e202300192. 10.1002/cbic.202300192
- Korendovych IV, DeGrado WF. De novo protein design, a retrospective. Q Rev Biophys. 2020;53:e3. 10.1017/S0033583519000131
- Krapp LF, Meireles FA, Abriata LA, Dal Peraro M. Context‐aware geometric deep learning for protein sequence design. bioRxiv 2023. 10.1101/2023.06.19.545381
- Krishna R, Wang J, Ahern W, Sturmfels P, Venkatesh P, Kalvet I, et al. Generalized biomolecular modeling and design with RoseTTAFold all‐atom. bioRxiv 2023. 10.1101/2023.10.09.561603
- Lee C, Yang W, Parr RG. Development of the Colle‐Salvetti correlation‐energy formula into a functional of the electron density. Phys Rev B. 1988;37(2):785–789. 10.1103/PhysRevB.37.785
- Lee J, Cheng X, Swails JM, Yeom MS, Eastman PK, Lemkul JA, et al. CHARMM‐GUI input generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM simulations using the CHARMM36 additive force field. J Chem Theory Comput. 2016;12(1):405–413. 10.1021/acs.jctc.5b00935
- Liang G‐T, Lai C, Yue Z, Zhang H, Li D, Chen Z, et al. Enhanced small green fluorescent proteins as a multisensing platform for biosensor development. Front Bioeng Biotechnol. 2022;10:1039317. 10.3389/fbioe.2022.1039317
- Losi A, Gardner KH, Möglich A. Blue‐light receptors for optogenetics. Chem Rev. 2018;118(21):10659–10709. 10.1021/acs.chemrev.8b00163
- Maier JA, Martinez C, Kasavajhala K, Wickstrom L, Hauser KE, Simmerling C. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J Chem Theory Comput. 2015;11(8):3696–3713. 10.1021/acs.jctc.5b00255
- Manalastas‐Cantos K, Konarev PV, Hajizadeh NR, Kikhney AG, Petoukhov MV, Molodenskiy DS, et al. ATSAS 3.0: expanded functionality and new tools for small‐angle scattering data analysis. J Appl Cryst. 2021;54(1):343–355. 10.1107/S1600576720013412
- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nat Methods. 2022;19(6):679–682. 10.1038/s41592-022-01488-1
- Nazarenko VV, Remeeva A, Yudenko A, Kovalev K, Dubenko A, Goncharov IM, et al. A thermostable flavin‐based fluorescent protein from Chloroflexus aggregans: a framework for ultra‐high resolution structural studies. Photochem Photobiol Sci. 2019;18(7):1793–1805. 10.1039/c9pp00067d
- Neese F. The ORCA program system. WIREs Comput Mol Sci. 2012;2(1):73–78. 10.1002/wcms.81
- Neese F. Software update: the ORCA program system, version 4.0. WIREs Comput Mol Sci. 2017;8(1):e1327. 10.1002/wcms.1327
- Neese F, Wennmohs F, Becker U, Riplinger C. The ORCA quantum chemistry program package. J Chem Phys. 2020;152(22):224108. 10.1063/5.0004608
- Nikolaev A, Yudenko A, Smolentseva A, Bogorodskiy A, Tsybrov F, Borshchevskiy V, et al. Fine spectral tuning of a flavin‐binding fluorescent protein for multicolor imaging. J Biol Chem. 2023;299(3):102977. 10.1016/j.jbc.2023.102977
- Ozbakir HF, Anderson NT, Fan K‐C, Mukherjee A. Beyond the green fluorescent protein: biomolecular reporters for anaerobic and deep‐tissue imaging. Bioconjug Chem. 2020;31(2):293–302. 10.1021/acs.bioconjchem.9b00688
- Park S‐J, Kern N, Brown T, Lee J, Im W. CHARMM‐GUI PDB manipulator: various PDB structural modifications for biomolecular modeling and simulation. J Mol Biol. 2023;435(14):167995. 10.1016/j.jmb.2023.167995
- Price MN, Dehal PS, Arkin AP. FastTree 2—approximately maximum‐likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. 10.1371/journal.pone.0009490
- Rambaut A. FigTree v1.4. 2012.
- Remeeva A, Yudenko A, Nazarenko VV, Semenov O, Smolentseva A, Bogorodskiy A, et al. Development and characterization of flavin‐binding fluorescent proteins, part I: basic characterization. Methods Mol Biol. 2023;2564:121–141. 10.1007/978-1-0716-2667-2_6
- Röllen K, Granzin J, Remeeva A, Davari MD, Gensch T, Nazarenko VV, et al. The molecular basis of spectral tuning in blue‐ and red‐shifted flavin‐binding fluorescent proteins. J Biol Chem. 2021;296:100662. 10.1016/j.jbc.2021.100662
- Smith E, Metzler DE. The photochemical degradation of riboflavin. J Am Chem Soc. 1963;85(20):3285–3288. 10.1021/ja00903a051
- Smolentseva A, Goncharov IM, Yudenko A, Bogorodskiy A, Semenov O, Nazarenko VV, et al. Extreme dependence of Chloroflexus aggregans LOV domain thermo‐ and photostability on the bound flavin species. Photochem Photobiol Sci. 2021;20:1645–1656. 10.1007/s43630-021-00138-3
- Song X, Wang Y, Shu Z, Hong J, Li T, Yao L. Engineering a more thermostable blue light photo receptor Bacillus subtilis YtvA LOV domain by a computer aided rational design method. PLoS Comput Biol. 2013;9(7):e1003129. 10.1371/journal.pcbi.1003129
- Stephens PJ, Devlin FJ, Chabalowski CF, Frisch MJ. Ab initio calculation of vibrational absorption and circular dichroism spectra using density functional force fields. J Phys Chem. 1994;98(45):11623–11627. 10.1021/j100096a001
- Sumida KH, Núñez‐Franco R, Kalvet I, Pellock SJ, Wicky BIM, Milles LF, et al. Improving protein expression, stability, and function with ProteinMPNN. J Am Chem Soc. 2024;146:2054–2061. 10.1021/jacs.3c10941
- Svergun D, Barberato C, Koch MHJ. CRYSOL—a program to evaluate x‐ray solution scattering of biological macromolecules from atomic coordinates. J Appl Cryst. 1995;28(6):768–773. 10.1107/S0021889895007047
- Tamura K, Stecher G, Kumar S. MEGA11: molecular evolutionary genetics analysis version 11. Mol Biol Evol. 2021;38(7):3022–3027. 10.1093/molbev/msab120
- Tareen A, Kinney JB. Logomaker: beautiful sequence logos in python. Bioinformatics. 2020;36(7):2272–2274. 10.1093/bioinformatics/btz921
- Valle L, Coronel Y, Bravo G, Albarracín V, Farias ME, Abatedaga I. Archaeal LOV domains from lake diamante: first functional characterization of an halo‐adapted photoreceptor. 10.21203/rs.3.rs-3073767/v1
- Vosko SH, Wilk L, Nusair M. Accurate spin‐dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can J Phys. 1980;58(8):1200–1211. 10.1139/p80-159
- Watson JL, Juergens D, Bennett NR, Trippe BL, Yim J, Eisenach HE, et al. De novo design of protein structure and function with RFdiffusion. Nature. 2023;620:1089–1100. 10.1038/s41586-023-06415-8
- Weigend F, Ahlrichs R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: design and assessment of accuracy. Phys Chem Chem Phys. 2005;7(18):3297–3305. 10.1039/B508541A
- Westberg M, Etzerodt M, Ogilby PR. Rational design of genetically encoded singlet oxygen photosensitizing proteins. Curr Opin Struct Biol. 2019;57:56–62. 10.1016/j.sbi.2019.01.025
- Wicky BIM, Milles LF, Courbet A, Ragotte RJ, Dauparas J, Kinfu E, et al. Hallucinating symmetric protein assemblies. Science. 2022;378(6615):56–61. 10.1126/science.add1964
- Wingen M, Jaeger K‐E, Gensch T, Drepper T. Novel thermostable flavin‐binding fluorescent proteins from thermophilic organisms. Photochem Photobiol. 2017;93(3):849–856. 10.1111/php.12740
- Woolfson DN. A brief history of de novo protein design: minimal, rational, and computational. J Mol Biol. 2021;433(20):167160. 10.1016/j.jmb.2021.167160
- Yee EF, Diensthuber RP, Vaidya AT, Borbat PP, Engelhard C, Freed JH, et al. Signal transduction in light‐oxygen‐voltage receptors lacking the adduct‐forming cysteine residue. Nat Commun. 2015;6(1):1–10. 10.1038/ncomms10079
- Yeh AH‐W, Norn C, Kipnis Y, Tischer D, Pellock SJ, Evans D, et al. De novo design of luciferases using deep learning. Nature. 2023;614(7949):774–780. 10.1038/s41586-023-05696-3
- Yudenko A, Smolentseva A, Maslov I, Semenov O, Goncharov IM, Nazarenko VV, et al. Rational design of a split flavin‐based fluorescent reporter. ACS Synth Biol. 2021;10(1):72–83. 10.1021/acssynbio.0c00454
- Zheng J, Xu X, Truhlar DG. Minimally augmented Karlsruhe basis sets. Theor Chem Acc. 2011;128(3):295–305. 10.1007/s00214-010-0846-z
