Abstract

The ability to control cellular processes using optogenetics is inducer-limited, with most optogenetic systems responding to blue light. To address this limitation, we leverage an integrated framework combining Lustro, a powerful high-throughput optogenetics platform, and machine learning tools to enable multiplexed control over blue light-sensitive optogenetic systems. Specifically, we identify light induction conditions for sequential activation as well as preferential activation and switching between pairs of light-sensitive split transcription factors in the budding yeast, Saccharomyces cerevisiae. We use the high-throughput data generated from Lustro to build a Bayesian optimization framework that incorporates data-driven learning, uncertainty quantification, and experimental design to enable the prediction of system behavior and the identification of optimal conditions for multiplexed control. This work lays the foundation for designing more advanced synthetic biological circuits incorporating optogenetics, where multiple circuit components can be controlled using designer light induction programs, with broad implications for biotechnology and bioengineering.
Keywords: optogenetics, automation, MoClo, yeast, high throughput, synthetic transcription factors, neural network, modeling, machine learning, multiplexing
Introduction
Optogenetics leverages genetically encoded light-sensitive proteins fused to biological effectors to precisely control cellular behavior in response to light.1−3 Such optogenetic technologies empower researchers to orchestrate cellular processes with exquisite spatiotemporal precision. Optogenetic technologies have found diverse applications, including modulating gene expression,4−7 dissecting intricate signaling pathways,8−10 manipulating protein localization,11−15 and inducing targeted protein degradation.16,17 If independent and simultaneous control of distinct optogenetic systems is possible, engineering multiple optogenetic systems into the same cell or community of cells allows for a higher degree of control over complex biological functions. This independent control can be achieved via multiplexing, where multiple control signals are sent over a shared medium, in this case light.18,19 One approach is orthogonal multiplexing, where optogenetic systems responsive to different wavelengths of light are used. However, a significant limitation in optogenetics is that most protein photoswitches respond to blue light.18 An alternative that overcomes this limitation is dynamic multiplexing, where specific light induction programs of the same wavelength, but with different duration and period of illumination pulses, are used to selectively activate optogenetic systems.
Benzinger and Khammash developed one strategy for dynamic multiplexed control of optogenetic systems by taking advantage of EL222 mutants with different response kinetics, that is, different activation and reversion time scales in the light and dark, respectively.19 The authors built a falling edge detector from two different EL222 mutants, and this circuit generated a distinct response profile to light induction programs relative to another optogenetic system based on the cryptochrome CRY2 and its binding partner CIB1. With sufficiently differentiated response kinetics, multiplexed control of optogenetic systems could be possible without the need for such additional circuitry. However, the response kinetics of optogenetic systems in vivo are not well understood and are difficult to measure at high throughput. The ability to rapidly construct and characterize optogenetic systems, coupled with data-driven modeling, presents a promising avenue to navigate this challenge, cutting down the search space to find maximally differentiated outputs for tailored multiplexing schemes.
In this work, we present a strategy for taking advantage of the native differences in response kinetics between protein photoswitches by using a previously described automated high-throughput optogenetic platform, Lustro,7,20 to identify light induction programs that allow for dynamic multiplexed control over optogenetic systems in Saccharomyces cerevisiae. We use the high-throughput measurement capabilities of Lustro to characterize a set of 13 blue light-responsive optogenetic split transcription factors (TFs). Optogenetic split TFs use complementary dimerization domains fused to a DNA-binding domain (DBD) and an activation domain (AD), such that light-induced dimerization of the protein pair reconstitutes the split TF and expression of the gene of interest is induced. We selected optogenetic split TFs for developing multiplexing strategies as their activity can be readily measured using a fluorescent protein reporter, control of gene expression is useful for a broad range of biological applications, and many mutants of optical dimerizers with different response kinetics are known (see Table 1). We used this high-throughput characterization to empirically identify specific sets of light induction programs that result in distinct activation levels for different blue light-sensitive optogenetic systems, allowing us to multiplex control over them using those light induction programs. We identified conditions for sequential activation, where differences in light sensitivity between optogenetic systems result in differential activation of each optogenetic system. We also identified conditions for “switching,” where one light induction program preferentially activates a first optogenetic system over a second but changing to another light program “switches” to preferential activation of the second optogenetic system over the first. 
We combine the high-throughput characterization with a Bayesian machine learning framework that aims to predict and optimize objectives for optogenetic control. Furthermore, we highlight the powerful synergy between high-throughput data collection and predictive models, showcasing how their integration can unravel the complexities of optogenetic systems, paving the way for a new era of finer cellular control and optimization.
Table 1. Optogenetic Split TFs Characterized in This Work (Figure 1)a.
| Optical dimerizer | Binding partner | Description |
|---|---|---|
| eMagA | eMagB, eMagBF, or eMagBM | Enhanced magnet dimerizer24 |
| eMagAF | eMagB, eMagBF, or eMagBM | Enhanced magnet dimerizer with faster response kinetics24 |
| eMagB | eMagA or eMagAF | Enhanced magnet dimerizer24 |
| eMagBF | eMagA or eMagAF | Enhanced magnet dimerizer with faster response kinetics24 |
| eMagBM | eMagA or eMagAF | Enhanced magnet dimerizer with slower response kinetics7 |
| CRY2FL | CIB1 | Full length CRY225 |
| CRY2PHR | CIB1 | CRY2 truncation (residues 1–498 of CRY2)25 |
| CRY2(535) | CIB1 | CRY2 truncation (residues 1–535 of CRY2)22 |
| CRY2PHR (L348F) | CIB1 | Long-reversion mutant of CRY2PHR22 |
| CRY2PHR (W349R) | CIB1 | Short-reversion mutant of CRY2PHR22 |
| EL222 | N/A (homodimerizer) | Homodimerizer with fast activation and reversion response kinetics26 |
| EL222 (A79Q) | N/A (homodimerizer) | Medium-reversion mutant of EL22227 |
| EL222 (AQTrip) | N/A (homodimerizer) | Long-reversion mutant of EL22227 |
The eMagA/eMagB (and variants) dimerizer pair was previously engineered from VVD (a homodimerizer taken from Neurospora crassa) into a heterodimerizer and used to create two-component split TFs.23,24 The CRY2/CIB1 (and variants) dimerizer pair was derived from Arabidopsis thaliana and engineered into a two-component split TF.25 EL222 and its variants are homodimerizers derived from Erythrobacter litoralis and engineered into a single-component split transcription factor (with the VP16AD) for use in eukaryotes.26 Split TFs derived from heterodimerizers (CRY2/CIB1, eMagA/eMagB, and their variants) use Gal4AD and Gal4DBD.7,28 Additional plasmid information is provided in Table S3.
Results and Discussion
Characterization of Optogenetic Transcription Factors Using Lustro
Lustro7 was used to characterize the expression profiles of a set of blue light-sensitive split transcription factors (see Table 1) in response to different light induction programs. These optogenetic systems drive expression of a fluorescent protein, mScarlet-I,21 so that fluorescence level serves as a proxy for gene induction. Square-wave light pulses were used to induce optogenetic TFs, with varying light pulse intensity, period, and duty cycle, where the period is the time from the start of one light pulse to the start of the next and the duty cycle is the percentage of each period during which the light is on. The response of each optogenetic system to this range of light induction programs depends on the response kinetics (activation and reversion time) of the light-sensitive proteins as well as their native light sensitivity. The relative induction level of gene expression by an optogenetic TF is determined for each light condition by subtracting the constant dark fluorescence from the fluorescence measured under that condition and dividing by the difference in fluorescence between the constant light and constant dark control conditions for that same TF. This allows relative induction to be compared between optogenetic systems, even when the magnitude of response of one system differs from another. While in-depth characterizations have been performed for a subset of optogenetic tools,22,23 this sweep directly compares response kinetics and sensitivity of a range of optogenetic systems side-by-side in the same biological context. The maximum light pulse period within light induction programs used for screening was limited to 4 h, with data compared 10 h into induction. We chose 10 h because cultures are in exponential growth at that point for these strains and conditions. Because Lustro collects data over time, other time points can be readily compared when desirable for a given application.
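The square-wave parameterization described above can be made concrete with a short sketch. The class and method names below are illustrative and are not taken from the Lustro or optoPlate code; the sketch also assumes that each period begins with the light on and that the light dose is simply intensity multiplied by total on-time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LightProgram:
    """Square-wave light induction program (hypothetical helper, not Lustro code)."""
    intensity_uW_cm2: float  # intensity while the light is on
    period_min: float        # time from the start of one pulse to the next
    duty_cycle_pct: float    # percentage of each period with the light on

    def on_minutes_per_period(self) -> float:
        return self.period_min * self.duty_cycle_pct / 100.0

    def is_on(self, t_min: float) -> bool:
        """True if the light is on t_min minutes after induction starts.

        Assumes every period starts with the light on.
        """
        return (t_min % self.period_min) < self.on_minutes_per_period()

    def light_dose(self, duration_h: float) -> float:
        """Total light dose in uW*h/cm^2 delivered over duration_h hours."""
        total_min = duration_h * 60.0
        full_periods, remainder = divmod(total_min, self.period_min)
        on_min = (full_periods * self.on_minutes_per_period()
                  + min(remainder, self.on_minutes_per_period()))
        return self.intensity_uW_cm2 * on_min / 60.0


program = LightProgram(intensity_uW_cm2=1000.0, period_min=30.0, duty_cycle_pct=50.0)
dose_10h = program.light_dose(10.0)  # light is on for half of the 10 h window
```

Two programs with the same dose can still differ in period and duty cycle, which is the degree of freedom that dynamic multiplexing exploits.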
Sequential Activation of Optogenetic Systems
Data from the initial scan (Figure 1) were used to identify candidates for sequential activation, where the first light program preferentially activates one optogenetic system of a pair, and the second light program activates both optogenetic systems. Sequential activation could be useful for bioproduction processes where different stages of fermentation are desired to optimize yield.29 CRY2PHR(L348F)/CIB122,25,28 and eMagAF/eMagBF23,24,30 were identified as a candidate TF pair. CRY2PHR(L348F)/CIB1 is very sensitive to light intensity and reaches a high level of activation at low light doses. eMagAF/eMagBF is less sensitive to light intensity, requiring a higher dose of light to reach maximal activation. To demonstrate that sequential control of blue light systems in the same strain is possible, the TF pair was cloned into the same strain with CRY2PHR(L348F)/CIB1 driving expression of mScarlet-I and eMagAF/eMagBF driving expression of a second, orthogonal reporter, miRFP68031 (and using an orthogonal DNA-binding domain, LexA22). Each strain was characterized in response to a range of light intensities using Lustro (see Figure 2).
Figure 1.
(A) Diagram showing activation of an optogenetic split TF. Blue light causes the split TF to dimerize, localizing the activation domain (AD) to the DNA-binding domain (DBD). This induces expression of the gene of interest (mScarlet-I), causing red fluorescence to increase in the cell. (B) Lustro workflow. Using laboratory automation, cells are cultured in a 96-well plate and subjected to successive rounds of illumination, shaking, and measuring, every 30 min. Fluorescence values are measured and analyzed. (C) Lustro was used to characterize the responses of several different optogenetic split TFs to varying light pulse intensity (μW/cm2), period (min), and duty cycle (%). AUC (area under the curve), indicating the total light dose during the experiment, is in μW·h/cm2. Data shown are relative fluorescence levels (where the constant dark value is set to 0 and the constant light value is set to 1) collected 10 h into light induction for the given TFs. LH (Low–High) and HH (High–High) designate relative expression levels of the two components of a split TF7 (yMM1760 and yMM1761; see Table S1), used here to demonstrate that changes in relative expression levels affect the response kinetics of two-component split TFs.
Figure 2.

Sequential activation of optogenetic systems by a range of light intensities tested with Lustro. Two optogenetic systems are compared, CRY2PHR(L348F)/CIB1 and eMagAF/eMagBF, both engineered into the same strain (yMM1826; darker dots) and each in an individual strain (yMM1825 and yMM1781; lighter dots). The CRY2PHR(L348F)/CIB1 split TF drives expression of mScarlet-I (blue dots) and the eMagAF/eMagBF split TF drives expression of an orthogonal fluorescent reporter, miRFP680 (red dots). Values shown were measured after 10 h of constant light induction. The CRY2PHR(L348F)/CIB1 system activates at lower light intensities than the eMagAF/eMagBF system.
Multiplexed Control for Switching between Optogenetic TFs
We next identified candidate pairs of optogenetic systems for dynamic multiplexed control over switching states18 (Figure 3). These are pairs of optogenetic systems where one system is more highly activated than the other under one light condition, but under a different light condition they switch and the second optogenetic system is more highly activated. We took advantage of the characterization of different response kinetics and sensitivity in response to different light pulses from the initial screen (see Figure 1). We empirically compared relative induction between optogenetic systems and light conditions in a pairwise manner to find which pairs of optogenetic systems and light conditions had the largest difference in relative induction, indicating potential for switching (Figure 3A). The 16 candidate pairs were further validated in technical quadruplicate (with a subset shown in Figure 3B). Additional switching pairs are found in Figure S2. Other useful behaviors, such as one light condition inducing both optogenetic systems to similar relative fluorescence or one optogenetic system that stays at similar relative induction between two light conditions while the other system switches, were also discovered (Figure S2). These optogenetic split TFs were characterized in separate strains, as any split TF pair combination that uses the same binding partners can freely interact and change the optogenetic activation profiles. Selection of split TFs that do not share binding partners will allow for independent control over switchable optogenetic systems in the same strain (as demonstrated in Figure 2 for sequential activation).
Figure 3.
(A) Multiplexing potential for given optogenetic system pairs. The multiplexing potential is given as the negative product of the highest and lowest differences in relative induction between pairs of optogenetic systems and light conditions. First, pairwise differences between relative induction for all light conditions tested (Figure 1) are calculated for each pair of optogenetic systems. The pair of light conditions that yields the largest product of differences for each pair of optogenetic systems is then calculated and plotted as a heat map (larger values of the product of differences are represented by darker green squares, indicating higher potential for multiplexing). (B) Relative activation of pairs of optogenetic systems for pairs of light conditions where switching occurs (with standard error indicated). Intensity is measured in μW/cm2, period in min, duty cycle in %, and AUC (area under the curve, which measures the total light dose) in μW·h/cm2. Data shown are averaged quadruplicates of relative fluorescence, recorded at 10 h into induction. Additional examples are presented in Figure S2.
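As a rough illustration of the multiplexing potential defined in the Figure 3 caption, the sketch below computes it for one pair of optogenetic systems. The function name, the dictionary-based data layout, and the example values are assumptions made for illustration; they are not taken from the published analysis code or data.

```python
def multiplexing_potential(induction_a, induction_b):
    """Return (potential, condition_high, condition_low) for two systems.

    induction_a and induction_b map a light-condition label to the relative
    induction of system A and system B under that condition (illustrative layout).
    """
    shared = sorted(set(induction_a) & set(induction_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared light conditions")
    # Difference in relative induction (A minus B) for every light condition.
    diffs = {c: induction_a[c] - induction_b[c] for c in shared}
    cond_high = max(shared, key=lambda c: diffs[c])  # A most favoured over B
    cond_low = min(shared, key=lambda c: diffs[c])   # B most favoured over A
    # Negative product: positive only when the two differences have opposite sign.
    potential = -(diffs[cond_high] * diffs[cond_low])
    return potential, cond_high, cond_low


# Made-up relative induction values for two hypothetical systems.
system_a = {"cond1": 0.9, "cond2": 0.2, "cond3": 0.5}
system_b = {"cond1": 0.3, "cond2": 0.8, "cond3": 0.5}
potential, high, low = multiplexing_potential(system_a, system_b)
```

Under this definition, a positive value means that one light condition favours the first system and another favours the second, which is the switching behaviour sought here. A negative value means the same system is favoured under every condition tested.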
Response Dynamics Are Insensitive to Activation Domain Strength
In this study, we aimed to optimize the differences in relative response between various optogenetic systems. While controlling relative response is crucial for developing multiplexed control strategies, we recognize that real-world applications may require control over the absolute magnitude of the response. Aiming to modify the magnitude of the gene expression response of our optogenetic split TFs, we explored swapping out the activation domain. The Gal4AD in the eMagA/eMagB split TF was replaced with VP16AD, p65AD, or Msn2AD. These ADs were selected because they have been used successfully in other synthetic split TFs and have been shown to have different strengths.19,26,28,32 Each version was characterized using Lustro, and they were found to exhibit different magnitudes of gene expression (Figure 4A), but similar relative responses across light programs (Figure 4B). This indicates that it is possible to change the activation domain without significantly changing the response kinetics of the optogenetic system, suggesting the ability to adjust the magnitude of a specific optogenetic TF’s gene expression response without changing the activation and deactivation kinetics, thus maintaining differential responses to specific light programs.
Figure 4.

Activation domain swapping. (A) eMagA/eMagB split TF systems utilizing different activation domains are tested under a range of light conditions. Fluorescence values are shown after 10 h of induction, with each condition performed in triplicate. The 50% duty cycle condition has an intensity of 1500 μW/cm2 and a period of 2 s. (B) Comparison of relative induction level (scaling each dark condition to 0 and each light condition to 1) for intermediate light induction conditions of each activation domain. Relative induction does not differ significantly between activation domains across light conditions, except for a small effect for Msn2AD (see Methods). Error bars represent the bootstrapped confidence interval for n = 3 technical replicates.
Predicting System Behavior Using Machine Learning
We next sought to apply the high-throughput data collected by Lustro to generate a predictive model that would allow for selection of bespoke objective functions for various biological applications. We used a feedforward neural network (NN) to predict the relative induction of each split TF given the duty cycle, intensity, and period of the light condition. To train the neural network, we used a Bayesian inference approach to determine an approximate Gaussian distribution for the parameter posterior.33 To evaluate model prediction performance of relative induction, we used 20-fold cross validation. This process involves dividing the data into 20 subsets, training on 19 of the subsets, and evaluating prediction performance on the held-out set. Using only the training data, we use an Expectation-Maximization (EM) algorithm to optimize the precision of a zero-mean Gaussian parameter prior, which determines the degree of “regularization” that penalizes deviations from the prior in order to prevent overfitting (see Methods). The process is repeated 20 times so that each subset is subjected to held-out testing. Prediction performance (Pearson correlation) is computed by comparing the measured relative induction to the predicted relative induction for every condition in the data set (Figure 5). The NN predicted relative induction with a Pearson correlation that ranged from 0.92 to 0.98, demonstrating that this data-driven approach provides accurate predictions of system behavior.
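The cross-validation procedure described above can be sketched in a few lines. The code below is a minimal illustration only: a nearest-neighbour predictor stands in for the Bayesian neural network, and all function names are hypothetical rather than taken from the published code.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_predictions(inputs, targets, fit, k=20, seed=0):
    """Predict every sample with a model trained on the other k - 1 folds.

    fit(train_inputs, train_targets) must return a predict(u) callable.
    """
    predictions = [None] * len(inputs)
    for held_out in k_fold_indices(len(inputs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(inputs)) if i not in held]
        predict = fit([inputs[i] for i in train], [targets[i] for i in train])
        for i in held_out:
            predictions[i] = predict(inputs[i])
    return predictions


def fit_nearest_neighbour(train_inputs, train_targets):
    """Stand-in model (NOT the paper's NN): copy the closest training target."""
    def predict(u):
        nearest = min(range(len(train_inputs)),
                      key=lambda i: math.dist(u, train_inputs[i]))
        return train_targets[nearest]
    return predict
```

With a real model, `fit` would also carry out any hyperparameter tuning using the training folds only, so that the held-out fold never influences the regularization strength.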
Figure 5.
Prediction performance of relative induction using the machine learning model. Prediction performance ranges between a Pearson correlation of 0.92 and 0.98.
Bayesian Optimization for Maximizing Switching
Once trained, the NN can be used to guide the design of experiments to select pairs of light conditions that maximize predicted switching in induction levels between two optogenetic systems. We therefore define an experimental condition as a pair of light conditions applied to a pair of split TFs. The batch data-collection capabilities of the Lustro platform enable the use of a Bayesian optimization algorithm called Thompson sampling34 that selects experimental conditions predicted to maximize the difference in induction between each TF pair. We use an approximate Bayesian inference approach to infer the posterior parameter distribution of the NN.33 Once equipped with this posterior parameter distribution, Thompson sampling involves sampling parameter values from the posterior, and using the resulting model to identify the condition that maximizes the objective. The process of randomly sampling from the posterior and selecting an experimental condition that optimizes the design objective can be repeated in order to design a batch of experimental conditions. To demonstrate the potential utility of this approach, we defined a design space of pairs of light conditions scanning a range of light intensities, duty cycles, and periods. We then used a NN trained on all available experimental data to predict relative induction of all TFs for all conditions in the design space. We used these model predictions as a “ground truth” data set relating light conditions to TF induction. We then randomly selected a batch of 10 light conditions as a preliminary data set and used this preliminary data set to train a new NN. Using the trained model, 10 new pairs of light conditions were selected using the Thompson sampling approach to optimize an objective function defined as the negative of the product of the difference in induction levels between all pairs of TFs (corresponding to the multiplexing potential described in Figure 3). 
The set of selected light conditions and corresponding induction levels of each TF were then queried from the ground truth data set and appended to the training data, which was then used to update the model. The process of selecting a new set of 10 pairs of light conditions was continued over 5 rounds (each round containing a batch of experiments). The overall process was repeated over 10 trials to assess the variation in the ability of the model to optimize the system. We found that when compared to random selection of light programs, the Bayesian optimization framework quickly identified combinations of light conditions that approach the maximum possible switching in relative induction levels (Figure 6), where the maximum possible value corresponds to the light condition with the largest objective in the entire simulated data set. These results illustrate how Lustro and machine learning can be combined to quickly identify optogenetic systems and light induction programs that enable desirable switching properties.
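The batch selection step can be illustrated with a simplified sketch of Thompson sampling over a discrete design space. Here the parameter posterior is assumed to be a set of independent Gaussians (a diagonal stand-in for the approximate NN posterior), and `objective` is a hypothetical callable that scores a candidate under sampled parameters. Neither is taken from the published implementation.

```python
import random


def thompson_batch(candidates, objective, post_mean, post_std, batch_size, seed=0):
    """Select a batch of distinct candidate indices by Thompson sampling.

    candidates: list of experimental conditions (any representation).
    objective(candidate, theta): score of a candidate under parameters theta.
    post_mean, post_std: per-parameter Gaussian posterior (assumed diagonal).
    """
    if batch_size > len(candidates):
        raise ValueError("batch_size exceeds the number of candidates")
    rng = random.Random(seed)
    remaining = list(range(len(candidates)))
    batch = []
    for _ in range(batch_size):
        # One posterior draw per batch member gives diversity within the batch.
        theta = [rng.gauss(m, s) for m, s in zip(post_mean, post_std)]
        best = max(remaining, key=lambda i: objective(candidates[i], theta))
        batch.append(best)
        remaining.remove(best)
    return batch


def linear_objective(candidate, theta):
    """Toy objective for illustration: a weighted sum of candidate features."""
    return sum(w * x for w, x in zip(theta, candidate))


design_space = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
batch = thompson_batch(design_space, linear_objective,
                       post_mean=[1.0, 1.0], post_std=[0.5, 0.5], batch_size=2)
```

In each round of the simulated workflow, the selected conditions would be measured (or, as here, queried from the ground truth data set), appended to the training data, and the posterior refitted before the next batch is drawn.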
Figure 6.

Validation of batch experimental design algorithm using the Bayesian optimization framework. Using simulated experimental data, a neural network was initialized with data from 10 randomly selected light conditions. Using the trained model, a Thompson-sampling Bayesian optimization algorithm was used to select new pairs of light conditions in subsequent experiment rounds. Compared to random selection, the model-guided experimental design algorithm more efficiently identifies light conditions with improved switching in optogenetic TF induction levels. Solid lines indicate the median performance taken over 10 trials in which the initial set of light conditions was randomly selected and shaded regions represent the interquartile range. The “Best possible” line corresponds to the light condition with the largest switching objective in the entire simulated data set.
Conclusion
Using the high-throughput measurement platform Lustro in combination with machine learning, we describe a strategy for dynamic multiplexed control of optogenetic systems that takes advantage of the native differences in response kinetics between different protein photoswitches. We use this approach to identify conditions for sequential activation and “switching” between optogenetic systems. Previous approaches integrate circuits to tune the response of an optogenetic system or generate different types of response behavior, such as OptoINVRT35 and OptoAMP.36 Such circuits could be combined with the optogenetic systems that we have shown to display sequential activation and switching, enabling more binary on–off control as well as finer control of biological behavior. We also demonstrated the ability to tune the magnitude of response of an optogenetic system without affecting light sensitivity by changing the activation domain of the light-sensitive transcription factor.
Leveraging the predictive capabilities of a NN model, we harnessed data-driven insights to forecast the response of optogenetic systems to specific light conditions. In a simulated example, we showed that the proposed Bayesian optimization approach could rapidly identify candidate sets of TF pairs and light conditions that optimize switching in relative induction. In parallel, the implementation of workflow validation with batching represents progress toward more efficient experimental design. Reducing unnecessary iterations streamlines the process of designing and executing future experiments. In these controlled experiments, the NN model was highly successful at predicting the behavior of optogenetic TFs. Transfer learning strategies can be explored for reusing this model if culture conditions or process parameters are modified in future applications. As Lustro allows for dynamic measurements, this information could be incorporated into a dynamic model such as a recurrent neural network or more mechanistic model such as an ordinary differential equation-based model, and a similar optimization approach to the one described here undertaken to design experimental conditions.
While we implemented this strategy for multiplexing light-sensitive split TFs, we propose that this method can be extended to other types of optogenetic systems. For example, this strategy could be applied to characterize and optimize other optogenetic split protein systems, such as split Cas13 systems for regulating RNA in mammalian cells.37 The strategy could also be applied to multiplex control over optogenetic systems that regulate protein localization or oligomerization. The synergistic integration of high-throughput data collection from Lustro, NN predictive modeling, and workflow validation techniques offers a potent toolkit for advancing the frontiers of biological control and could be adapted for real-time feedback control. Optogenetic multiplexed control strategies could be used to control bioproduction processes,35,38 design engineered living materials,39 regulate microbial consortia,40−42 or interrogate complex cellular gene expression networks. Combining high-throughput characterization with machine learning to predict and optimize the behavior of optogenetic systems will rapidly accelerate the design, build, test cycle.
Methods
Strain Construction and Culture Conditions
Strains used in this study were constructed using standard molecular biology techniques, specifically a modular Type IIS Golden Gate assembly toolkit as previously described.4,7,43 The details of constructs used in this work can be found in Table S3 and Table S5. Part plasmids (Level 0) were created through BsmBI Golden Gate assembly of PCR-amplified products (refer to Table S2 for primer details) or gBlocks (see Table S4) into the yTK entry vector (yTK001). Following this, part plasmids were further combined to form cassette plasmids (Level 1) using BsaI Golden Gate assembly. These cassette plasmids were then integrated into multigene plasmids (Level 2) through BsmBI Golden Gate assembly.
Single-construct strains were generated by introducing multigene plasmids, linearized with NotI-HF, into the genome of Saccharomyces cerevisiae strain BY4741 with the genotype MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 gal80::KANMX gal4::spHIS5. The transformations followed an established LiAc/SS carrier DNA/PEG protocol.44 Construct integration occurred at the URA3 or LEU2 sites, and transformants were selected using SC-Ura or SC-Leu dropout media, respectively. Transformants were further screened using previously established methods.7
Overnight yeast cultures were inoculated from colonies on YPD agar plates into 3 mL of liquid SC media and grown at 30 °C with agitation. After incubation, the overnight cultures were diluted in SC media to an optical density of 0.1 at 700 nm (OD700; measured at 700 nm to avoid bias from the red fluorescent marker,45 mScarlet-I). Subsequently, 200 μL of each culture was dispensed into individual wells of a 96-well glass-bottom plate with black walls (Cat. #P96–1.5H–N).
Lustro
Automated optogenetic experiments were conducted as previously described7,20 using a Tecan Fluent Automation Workstation equipped with a Robotic Gripper Arm (RGA) and integrated with an optoPlate,46 a BioShake 3000-T elm heater shaker designed for well plates, and a Tecan Spark plate reader. The optoPlate was assembled and calibrated in accordance with previously established procedures. Programming of the optoPlate was achieved using scripts available at https://github.com/mccleanlab/Optoplate-96. Throughout the experiments, the Fluent workstation was shielded from ambient light by a blackout curtain. Cellvis 96-well glass bottom plates with #1.5 cover glass (Cat. #P96–1.5H–N) were used for all experiments.
Each 96-well plate, containing cultures diluted to an optical density (OD700) of 0.1, underwent a 5-h incubation in the dark at 30 °C with continuous shaking. Light induction commenced after this incubation period. For each light induction cycle, the plate was first positioned on the optoPlate for 26.5 min at 21 °C. It was then transferred to the plate shaker, where it underwent agitation at 1000 rpm with a 2 mm orbital movement for 1 min to resuspend cells. Following this, the plate was moved to the Tecan Spark plate reader for optical density (OD700) and fluorescence measurements (without the lid). Subsequently, the plate was returned to the optoPlate, and this cycle was repeated throughout the experiment. For mScarlet-I,21 fluorescence measurements were recorded with excitation at 563 nm and emission at 606 nm, with an optical gain of 130. For miRFP680,31 fluorescence measurements were recorded with excitation at 652 nm and emission at 697 nm, with an optical gain of 230. The Z-value (vertical distance) was set at 28410 for all fluorescence measurements.
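The timing of one induction cycle can be checked with simple arithmetic. The sketch below assumes a 30 min cycle, as stated in the Figure 1B caption, and attributes whatever time is left after illumination and shaking to plate transfers and the plate-reader measurement. That split is an assumption, since the handling time is not reported directly.

```python
CYCLE_MIN = 30.0         # cycle length from the Figure 1B caption
ILLUMINATION_MIN = 26.5  # time on the optoPlate per cycle
SHAKE_MIN = 1.0          # time on the plate shaker per cycle


def handling_minutes(cycle=CYCLE_MIN, illumination=ILLUMINATION_MIN, shake=SHAKE_MIN):
    """Minutes per cycle left for transfers and reading (assumed remainder)."""
    return cycle - illumination - shake


def measurements_within(duration_h, cycle=CYCLE_MIN):
    """Number of complete measurement cycles within duration_h hours."""
    return int((duration_h * 60.0) // cycle)


def illuminated_fraction(cycle=CYCLE_MIN, illumination=ILLUMINATION_MIN):
    """Fraction of each cycle that the plate spends on the optoPlate."""
    return illumination / cycle
```

Under these assumptions, about 2.5 min per cycle is spent off the optoPlate besides shaking, and 20 measurement cycles fit into the 10 h induction window used for comparison.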
Results from Lustro are reported in either fluorescence arbitrary units (a.u.) or relative induction. Relative induction level for a given light induction program is calculated using the fluorescence measurements from that condition (cond), the constant light induction condition (light), and constant dark condition (dark). Relative induction = (cond – dark)/(light – dark). Standard error is calculated as $\mathrm{SE} = \sigma/\sqrt{n}$, where σ is the standard deviation and n is the sample size. Statistical significance in Figure 4 was calculated by performing two-way ANOVA using anova2 in Matlab which identified significant variation between activation domains in relative induction (p = 0.0016). Tukey’s HSD implemented in Matlab multcompare was used to identify the Msn2AD as causing a significant difference in relative induction, though the effect size was small (on average 10% across light conditions) (p < 0.009) with no other activation domains varying significantly in relative induction (p > 0.95).
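The two summary quantities defined above are straightforward to compute. The following is a minimal sketch with illustrative function names. It assumes that σ is the sample standard deviation (n − 1 denominator), which the text does not specify.

```python
import math
import statistics


def relative_induction(cond, dark, light):
    """(cond - dark) / (light - dark) for one TF at one time point."""
    if light == dark:
        raise ValueError("constant light and constant dark values are identical")
    return (cond - dark) / (light - dark)


def standard_error(values):
    """Standard error of the mean; sample standard deviation is an assumption."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Made-up fluorescence values (a.u.) for illustration only.
replicates = [relative_induction(c, dark=100.0, light=1000.0)
              for c in (520.0, 550.0, 580.0, 550.0)]
mean_induction = statistics.mean(replicates)
sem_induction = standard_error(replicates)
```

Because each TF is normalized to its own constant dark and constant light controls, a value of 0.5 means half of that TF's own dynamic range, regardless of its absolute fluorescence.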
Machine Learning Model
The data and code used for creating the ML model presented in this section can be found at https://github.com/zavalab/ML/tree/master/Optogenetics.
The NN model utilized in this study is designed to predict the relative induction of each TF as a function of a particular light condition. We used a feedforward neural network architecture with a single hidden layer,
where u is a vector defining the light intensity, duty cycle, and the period of the light input and ŷ is a vector of predicted induction levels of each TF. The parameters of the model include the weights and biases, θ = {Wuh,Why,bh,by}, which are learned from data.
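For orientation, one form of a single-hidden-layer feedforward network that is consistent with the parameter set θ = {Wuh,Why,bh,by} is sketched below. The hidden-layer activation function φ and the linear output layer are assumptions made for illustration; the text does not specify them.

```latex
\begin{aligned}
h &= \phi\left(W_{uh}\,u + b_h\right), \\
\hat{y} &= W_{hy}\,h + b_y .
\end{aligned}
```

Here h is the vector of hidden-unit activations, so that Wuh maps the three light-condition inputs to the hidden layer and Why maps the hidden layer to one predicted induction level per TF.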
Bayesian Inference and Uncertainty Quantification
We used a Bayesian framework to infer a Gaussian approximation of the NN posterior parameter distribution and an Expectation-Maximization (EM) algorithm to optimize model hyperparameters, with methods adapted from Thompson et al. 2023.33 Model hyperparameters include the precision (inverse variance) of the parameter prior and the precision in measurement noise for each TF. The parameter prior is assumed to be a zero mean Gaussian with a precision parameter, $\alpha$. We assume that error associated with measuring induction levels of $m$ different TFs is a zero mean Gaussian random variable with precision $\beta_j$ for TF $j$. Given a set of measurements of the induction levels for each TF in response to $n$ different light conditions, $D = \{y(u_1), \ldots, y(u_n)\}$, we define the likelihood of the data as $p(D \mid \theta) = \prod_{i=1}^{n} \prod_{j=1}^{m} \mathcal{N}(y_j(u_i) \mid \hat{y}_j(u_i, \theta), \beta_j^{-1})$. Maximizing the posterior parameter distribution with respect to model parameters is equivalent to maximizing the log of the product of the likelihood and the prior, which gives the maximum a posteriori (MAP) estimate, $\theta_{\mathrm{MAP}} = \arg\max_{\theta} \sum_{i=1}^{n} \sum_{j=1}^{m} \log \mathcal{N}(y_j(u_i) \mid \hat{y}_j(u_i, \theta), \beta_j^{-1}) + \log \mathcal{N}(\theta \mid 0, \alpha^{-1})$. The posterior parameter distribution is approximated as a Gaussian centered at $\theta_{\mathrm{MAP}}$ with a covariance matrix given by the inverse of the matrix of second derivatives of the negative log posterior, which we approximate using the outer product, $\Sigma^{-1} = \alpha I_{\theta} + \sum_{i=1}^{n} \sum_{j=1}^{m} \beta_j \, \nabla_{\theta} \hat{y}_j(u_i, \theta) \, \nabla_{\theta} \hat{y}_j(u_i, \theta)^{T}$, where $I_{\theta}$ is the identity matrix with dimension equal to the number of model parameters. The model hyperparameters $\alpha$ and $\beta$ are optimized using the EM algorithm, which involves maximizing the expectation of the log of the joint probability of the data and the parameter distribution with respect to $\alpha$ and $\beta$, where the expectation is taken with respect to the parameter posterior distribution. Using the updated hyperparameters, inference of the posterior parameter distribution is repeated until convergence of the marginal likelihood.
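The outer-product approximation of the posterior covariance can be sketched as below. This is a minimal illustration under stated assumptions: the function name and the array layout (gradients of each predicted TF output with respect to the flattened parameters, evaluated at the MAP estimate and stacked into an array of shape n × m × p) are hypothetical, and the gradients are taken as given rather than computed from a network.

```python
import numpy as np

def posterior_covariance(grads, alpha, beta):
    """Gaussian posterior covariance from the outer-product approximation.

    grads : array (n, m, p), gradient of y_hat_j(u_i, theta) w.r.t. theta
    alpha : prior precision (scalar)
    beta  : array (m,), measurement-noise precision for each TF
    """
    n, m, p = grads.shape
    precision = alpha * np.eye(p)           # alpha * I_theta
    for j in range(m):
        G = grads[:, j, :]                  # (n, p) gradients for TF j
        precision += beta[j] * (G.T @ G)    # sum over i of g g^T
    return np.linalg.inv(precision)

# Toy example: one light condition, one TF, two parameters.
grads = np.array([[[1.0, 0.0]]])
Sigma = posterior_covariance(grads, alpha=1.0, beta=np.array([3.0]))
print(Sigma)
```

In this toy case the data constrain only the first parameter, so its posterior variance shrinks below the prior variance while the second parameter keeps the prior variance 1/α.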
Experimental Design Using Bayesian Optimization
We used a Bayesian optimization algorithm called Thompson sampling to design a batch of experimental conditions predicted to maximize the difference in induction levels between pairs of TFs in separate light conditions. To do so, we define the objective function as the negative of the minimum product in the difference between predicted induction for each TF,
$J(u_i, u_j, \theta) = -\min \{ (\hat{y}_k(u_i, \theta) - \hat{y}_k(u_j, \theta)) \cdot (\hat{y}_l(u_i, \theta) - \hat{y}_l(u_j, \theta)) \;\; \forall \; k, l \in 1, \ldots, m \}$.
We define the experimental design space as all possible pairs of light conditions, $Q = \{(u_i, u_j) \; \forall \; i \neq j\}$. The Thompson sampling algorithm involves sampling parameter values from the posterior, $\theta^{*} \sim \mathcal{N}(\theta_{\mathrm{MAP}}, \Sigma)$, and then determining the experimental condition that maximizes the objective, $(u_i, u_j)^{*} = \arg\max_{(u_i, u_j) \in Q} J(u_i, u_j, \theta^{*})$. This process is repeated as many times as necessary to generate the desired number of experimental conditions to be tested in the next experiment.
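The sampling loop described above can be sketched as follows. This is a minimal illustration rather than the released code: the function names, the toy linear predictor standing in for the NN, and the candidate light conditions are all hypothetical. The objective is evaluated over all index pairs (k, l), as written in the definition of J.

```python
import itertools
import numpy as np

def objective(y_i, y_j):
    """Negative of the minimum product of induction differences over TF pairs."""
    d = y_i - y_j
    return -np.min(np.outer(d, d))

def thompson_batch(predict, conditions, theta_map, cov, batch_size, rng):
    """Select pairs of light conditions, one posterior sample per selection."""
    pairs = list(itertools.permutations(range(len(conditions)), 2))  # i != j
    chosen = []
    for _ in range(batch_size):
        theta = rng.multivariate_normal(theta_map, cov)   # theta* ~ N(MAP, Sigma)
        preds = [predict(u, theta) for u in conditions]
        scores = [objective(preds[i], preds[j]) for i, j in pairs]
        chosen.append(pairs[int(np.argmax(scores))])
    return chosen

# Toy stand-in for the NN: a linear map from 3 light inputs to 2 TF outputs.
def toy_predict(u, theta):
    return theta.reshape(2, 3) @ u

conditions = [np.array([1.0, 0.0, 0.0]),
              np.array([0.0, 1.0, 0.0]),
              np.array([0.0, 0.0, 0.0])]
theta_map = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
cov = 1e-12 * np.eye(6)
rng = np.random.default_rng(0)
batch = thompson_batch(toy_predict, conditions, theta_map, cov, 4, rng)
print(batch)
```

Because each selection uses an independent posterior sample, the same pair can be chosen more than once when the posterior is narrow; a practical implementation might deduplicate the batch.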
Acknowledgments
This work was supported by National Institutes of Health grants R35GM128873 (awarded to M.N.M.), R01EB030340, and R35GM124774 (awarded to O.S.V.), Army Research Office grant number W911NF1910269 (awarded to O.S.V.), and National Science Foundation grants CBET 2315963 (awarded to V.M.Z.) and MCB 2045493 (awarded to M.N.M.). Megan Nicole McClean, PhD, holds a Career Award at the Scientific Interface from the Burroughs Wellcome Fund. Z.P.H. was supported by an NHGRI training grant to the Genomic Sciences Training Program 5T32HG002760. We thank Amit Nimunkar and Edvard Grødem for building and modifying the optoPlate. We acknowledge fruitful discussions with McClean lab members, Neydis Moreno Morales for providing feedback on the manuscript, and Stephanie Geller for providing pMM1134 and pMM1137.
Data Availability Statement
Key plasmids have been deposited on Addgene. For all other reagent requests, please contact the corresponding author.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acssynbio.3c00761.
Additional data and schematics for the experiments described in the text, strains, plasmids, oligos, gene blocks, and optogenetic constructs used in this work (PDF)
Author Contributions
Z.P.H. and M.N.M. conceived of the study. J.C.T., D.L.C., and V.M.Z. conceived of the modeling approach. Z.P.H. designed optogenetic parts, performed experiments, and analyzed data. J.C.T. designed the neural network and generated the predictive model. M.N.M., V.M.Z. and O.S.V. provided funding. Z.P.H. and J.C.T. wrote the original draft of the manuscript, and Z.P.H., J.C.T., D.L.C., O.S.V., V.M.Z., and M.N.M. wrote, reviewed, and edited the final manuscript.
The authors declare no competing financial interest.
References
- Lan T.-H.; He L.; Huang Y.; Zhou Y. Optogenetics for Transcriptional Programming and Genetic Engineering. Trends in Genetics 2022, 38 (12), 1253–1270. 10.1016/j.tig.2022.05.014.
- Olson E. J.; Tabor J. J. Optogenetic Characterization Methods Overcome Key Challenges in Synthetic and Systems Biology. Nat. Chem. Biol. 2014, 10 (7), 502–511. 10.1038/nchembio.1559.
- Pérez A. L. A.; Piva L. C.; Fulber J. P. C.; de Moraes L. M. P.; De Marco J. L.; Vieira H. L. A.; Coelho C. M.; Reis V. C. B.; Torres F. A. G. Optogenetic Strategies for the Control of Gene Expression in Yeasts. Biotechnology Advances 2022, 54, 107839. 10.1016/j.biotechadv.2021.107839.
- An-adirekkun J.; Stewart C. J.; Geller S. H.; Patel M. T.; Melendez J.; Oakes B. L.; Noyes M. B.; McClean M. N. A Yeast Optogenetic Toolkit (yOTK) for Gene Expression Control in Saccharomyces Cerevisiae. Biotechnol. Bioeng. 2020, 117 (3), 886–893. 10.1002/bit.27234.
- Geller S. H.; Antwi E. B.; Di Ventura B.; McClean M. N. Optogenetic Repressors of Gene Expression in Yeasts Using Light-Controlled Nuclear Localization. Cell Mol. Bioeng 2019, 12 (5), 511–528. 10.1007/s12195-019-00598-9.
- Moreno Morales N.; Patel M. T.; Stewart C. J.; Sweeney K.; McClean M. N. Optogenetic Tools for Control of Public Goods in Saccharomyces Cerevisiae. mSphere 2021, 6 (4), e00581-21. 10.1128/mSphere.00581-21.
- Harmer Z. P.; McClean M. N. Lustro: High-Throughput Optogenetic Experiments Enabled by Automation and a Yeast Optogenetic Toolkit. ACS Synth. Biol. 2023, 12, 1943. 10.1021/acssynbio.3c00215.
- Scott T. D.; Sweeney K.; McClean M. N. Biological Signal Generators: Integrating Synthetic Biology Tools and in Silico Control. Current Opinion in Systems Biology 2019, 14, 58–65. 10.1016/j.coisb.2019.02.007.
- Levskaya A.; Weiner O. D.; Lim W. A.; Voigt C. A. Spatiotemporal Control of Cell Signalling Using A Light-Switchable Protein Interaction. Nature 2009, 461 (7266), 997–1001. 10.1038/nature08446.
- Silva P. M.; Puerner C.; Seminara A.; Bassilana M.; Arkowitz R. A. Secretory Vesicle Clustering in Fungal Filamentous Cells Does Not Require Directional Growth. Cell Reports 2019, 28 (8), 2231–2245. 10.1016/j.celrep.2019.07.062.
- Niopek D.; Benzinger D.; Roensch J.; Draebing T.; Wehler P.; Eils R.; Di Ventura B. Engineering Light-Inducible Nuclear Localization Signals for Precise Spatiotemporal Control of Protein Dynamics in Living Cells. Nat. Commun. 2014, 5 (1), 4404. 10.1038/ncomms5404.
- Yazawa M.; Sadaghiani A. M.; Hsueh B.; Dolmetsch R. E. Induction of Protein-Protein Interactions in Live Cells Using Light. Nat. Biotechnol. 2009, 27 (10), 941–945. 10.1038/nbt.1569.
- Chen S. Y.; Osimiri L. C.; Chevalier M.; Bugaj L. J.; Nguyen T. H.; Greenstein R. A.; Ng A. H.; Stewart-Ornstein J.; Neves L. T.; El-Samad H. Optogenetic Control Reveals Differential Promoter Interpretation of Transcription Factor Nuclear Translocation Dynamics. Cell Syst 2020, 11 (4), 336–353. 10.1016/j.cels.2020.08.009.
- Sweeney K.; McClean M. N. Transcription Factor Localization Dynamics and DNA Binding Drive Distinct Promoter Interpretations. Cell Reports 2023, 42 (5), 112426. 10.1016/j.celrep.2023.112426.
- van Bergeijk P.; Adrian M.; Hoogenraad C. C.; Kapitein L. C. Optogenetic Control of Organelle Transport and Positioning. Nature 2015, 518 (7537), 111–114. 10.1038/nature14128.
- Tague N.; Coriano-Ortiz C.; Sheets M. B.; Dunlop M. J. Light Inducible Protein Degradation in E. coli with LOVtag. bioRxiv, October 26, 2023. 10.1101/2023.02.25.530042.
- Renicke C.; Schuster D.; Usherenko S.; Essen L.-O.; Taxis C. A LOV2 Domain-Based Optogenetic Tool to Control Protein Degradation and Cellular Function. Chem. Biol. 2013, 20 (4), 619–626. 10.1016/j.chembiol.2013.03.005.
- Dwijayanti A.; Zhang C.; Poh C. L.; Lautier T. Toward Multiplexed Optogenetic Circuits. Front. Bioeng. Biotechnol. 2022, 10.3389/fbioe.2021.804563.
- Benzinger D.; Ovinnikov S.; Khammash M. Synthetic Gene Networks Recapitulate Dynamic Signal Decoding and Differential Gene Expression. Cell Syst 2022, 13 (5), 353–364. 10.1016/j.cels.2022.02.004.
- Harmer Z. P.; McClean M. N. High-Throughput Optogenetics Experiments in Yeast Using the Automated Platform Lustro. JoVE (Journal of Visualized Experiments) 2023, (198), e65686. 10.3791/65686.
- Bindels D. S.; Haarbosch L.; van Weeren L.; Postma M.; Wiese K. E.; Mastop M.; Aumonier S.; Gotthard G.; Royant A.; Hink M. A.; Gadella T. W. J. mScarlet: A Bright Monomeric Red Fluorescent Protein for Cellular Imaging. Nat. Methods 2017, 14 (1), 53–56. 10.1038/nmeth.4074.
- Taslimi A.; Zoltowski B.; Miranda J. G.; Pathak G.; Hughes R. M.; Tucker C. L. Optimized Second Generation CRY2/CIB Dimerizers and Photoactivatable Cre Recombinase. Nat. Chem. Biol. 2016, 12 (6), 425–430. 10.1038/nchembio.2063.
- Kawano F.; Suzuki H.; Furuya A.; Sato M. Engineered Pairs of Distinct Photoswitches for Optogenetic Control of Cellular Proteins. Nat. Commun. 2015, 6 (1), 6256. 10.1038/ncomms7256.
- Benedetti L.; Marvin J. S.; Falahati H.; Guillén-Samander A.; Looger L. L.; De Camilli P. Optimized Vivid-Derived Magnets Photodimerizers for Subcellular Optogenetics in Mammalian Cells. eLife 2020, 9, e63230. 10.7554/eLife.63230.
- Kennedy M. J.; Hughes R. M.; Peteya L. A.; Schwartz J. W.; Ehlers M. D.; Tucker C. L. Rapid Blue-Light-Mediated Induction of Protein Interactions in Living Cells. Nat. Methods 2010, 7 (12), 973–975. 10.1038/nmeth.1524.
- Motta-Mena L. B.; Reade A.; Mallory M. J.; Glantz S.; Weiner O. D.; Lynch K. W.; Gardner K. H. An Optogenetic Gene Expression System with Rapid Activation and Deactivation Kinetics. Nat. Chem. Biol. 2014, 10 (3), 196–202. 10.1038/nchembio.1430.
- Zoltowski B. D.; Motta-Mena L. B.; Gardner K. H. Blue Light-Induced Dimerization of a Bacterial LOV-HTH DNA-Binding Protein. Biochemistry 2013, 52 (38), 6653–6661. 10.1021/bi401040m.
- Pathak G. P.; Strickland D.; Vrana J. D.; Tucker C. L. Benchmarking of Optical Dimerizer Systems. ACS Synth. Biol. 2014, 3 (11), 832–838. 10.1021/sb500291r.
- Deindoerfer F. H.; Humphrey A. E. Design of Multistage Systems for Simple Fermentation Processes. Ind. Eng. Chem. 1959, 51 (7), 809–812. 10.1021/ie50595a023.
- di Pietro F.; Herszterg S.; Huang A.; Bosveld F.; Alexandre C.; Sancéré L.; Pelletier S.; Joudat A.; Kapoor V.; Vincent J.-P.; Bellaïche Y. Rapid and Robust Optogenetic Control of Gene Expression in Drosophila. Developmental Cell 2021, 56 (24), 3393–3404. 10.1016/j.devcel.2021.11.016.
- Matlashov M. E.; Shcherbakova D. M.; Alvelid J.; Baloban M.; Pennacchietti F.; Shemetov A. A.; Testa I.; Verkhusha V. V. A Set of Monomeric Near-Infrared Fluorescent Proteins for Multicolor Imaging across Scales. Nat. Commun. 2020, 11 (1), 239. 10.1038/s41467-019-13897-6.
- Wang X.; Chen X.; Yang Y. Spatiotemporal Control of Gene Expression by a Light-Switchable Transgene System. Nat. Methods 2012, 9 (3), 266–269. 10.1038/nmeth.1892.
- Thompson J. C.; Zavala V. M.; Venturelli O. S. Integrating a Tailored Recurrent Neural Network with Bayesian Experimental Design to Optimize Microbial Community Functions. PLoS Comput. Biol. 2023, 19 (9), e1011436. 10.1371/journal.pcbi.1011436.
- Kandasamy K.; Krishnamurthy A.; Schneider J.; Poczos B. Parallelised Bayesian Optimisation via Thompson Sampling. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics; PMLR, 2018; pp 133–142.
- Zhao E. M.; Zhang Y.; Mehl J.; Park H.; Lalwani M. A.; Toettcher J. E.; Avalos J. L. Optogenetic Regulation of Engineered Cellular Metabolism for Microbial Chemical Production. Nature 2018, 555 (7698), 683–687. 10.1038/nature26141.
- Zhao E. M.; Lalwani M. A.; Chen J.-M.; Orillac P.; Toettcher J. E.; Avalos J. L. Optogenetic Amplification Circuits for Light-Induced Metabolic Control. ACS Synth. Biol. 2021, 10 (5), 1143–1154. 10.1021/acssynbio.0c00642.
- Ding Y.; Tous C.; Choi J.; Chen J.; Wong W. W. Orthogonal Inducible Control of Cas13 Circuits Enables Programmable RNA Regulation in Mammalian Cells. Nat. Commun. 2024, 10.1038/s41467-024-45795-x.
- Bertaux F.; Sosa-Carrillo S.; Gross V.; Fraisse A.; Aditya C.; Furstenheim M.; Batt G. Enhancing Bioreactor Arrays for Automated Measurements and Reactive Control with ReacSight. Nat. Commun. 2022, 13 (1), 3363. 10.1038/s41467-022-31033-9.
- Gilbert C.; Tang T.-C.; Ott W.; Dorr B. A.; Shaw W. M.; Sun G. L.; Lu T. K.; Ellis T. Living Materials with Programmable Functionalities Grown from Engineered Microbial Co-Cultures. Nat. Mater. 2021, 20 (5), 691–700. 10.1038/s41563-020-00857-5.
- Gutiérrez Mena J.; Kumar S.; Khammash M. Dynamic Cybergenetic Control of Bacterial Co-Culture Composition via Optogenetic Feedback. Nat. Commun. 2022, 13, 4808. 10.1038/s41467-022-32392-z.
- Aditya C.; Bertaux F.; Batt G.; Ruess J. A Light Tunable Differentiation System for the Creation and Control of Consortia in Yeast. Nat. Commun. 2021, 12 (1), 5829. 10.1038/s41467-021-26129-7.
- Lalwani M. A.; Kawabe H.; Mays R. L.; Hoffman S. M.; Avalos J. L. Optogenetic Control of Microbial Consortia Populations for Chemical Production. ACS Synth. Biol. 2021, 10 (8), 2015–2029. 10.1021/acssynbio.1c00182.
- Lee M. E.; DeLoache W. C.; Cervantes B.; Dueber J. E. A Highly Characterized Yeast Toolkit for Modular, Multipart Assembly. ACS Synth. Biol. 2015, 4 (9), 975–986. 10.1021/sb500366v.
- Gietz R. D.; Schiestl R. H. High-Efficiency Yeast Transformation Using the LiAc/SS Carrier DNA/PEG Method. Nat. Protoc 2007, 2 (1), 31–34. 10.1038/nprot.2007.13.
- Hecht A.; Endy D.; Salit M.; Munson M. S. When Wavelengths Collide: Bias in Cell Abundance Measurements Due to Expressed Fluorescent Proteins. ACS Synth. Biol. 2016, 5 (9), 1024–1027. 10.1021/acssynbio.6b00072.
- Bugaj L. J.; Lim W. A. High-Throughput Multicolor Optogenetics in Microwell Plates. Nat. Protoc 2019, 14 (7), 2205–2228. 10.1038/s41596-019-0178-y.