BioCRNpyler: Compiling chemical reaction networks from biomolecular parts in diverse contexts

William Poole; Ayush Pandey; Andrey Shur; Zoltan A Tuza; Richard M Murray

2022 Apr 20;18(4):e1009987. doi: 10.1371/journal.pcbi.1009987

Editor: Pedro Mendes

PMCID: PMC9060376 PMID: 35442944

Abstract

Biochemical interactions in systems and synthetic biology are often modeled with chemical reaction networks (CRNs). CRNs provide a principled modeling environment capable of expressing a huge range of biochemical processes. In this paper, we present a software toolbox, written in Python, that compiles high-level design specifications represented using a modular library of biochemical parts, mechanisms, and contexts to CRN implementations. This compilation process offers four advantages. First, the building of the actual CRN representation is automatic and outputs Systems Biology Markup Language (SBML) models compatible with numerous simulators. Second, a library of modular biochemical components allows for different architectures and implementations of biochemical circuits to be represented succinctly with design choices propagated throughout the underlying CRN automatically. This prevents the often occurring mismatch between high-level designs and model dynamics. Third, high-level design specification can be embedded into diverse biomolecular environments, such as cell-free extracts and in vivo milieus. Finally, our software toolbox has a parameter database, which allows users to rapidly prototype large models using very few parameters which can be customized later. By using BioCRNpyler, users ranging from expert modelers to novice script-writers can easily build, manage, and explore sophisticated biochemical models using diverse biochemical implementations, environments, and modeling assumptions.

Author summary

This paper describes a new software package, BioCRNpyler (pronounced “Biocompiler”), designed to support rapid development and exploration of mathematical models of biochemical networks and circuits by computational biologists, systems biologists, and synthetic biologists. BioCRNpyler allows its users to generate large, complex models in a modular way using very few lines of code. To do this, BioCRNpyler uses a powerful new representation of biochemical circuits which defines their parts, underlying biochemical mechanisms, and chemical context independently. BioCRNpyler was developed as a Python scripting language designed to be accessible to beginning users as well as easily extendable and customizable for advanced users. Ultimately, we see BioCRNpyler being used to accelerate computer-automated design of biochemical circuits and model-driven hypothesis generation in biology.

This is a PLOS Computational Biology Software paper.

1 Introduction

Chemical reaction networks (CRNs) are the workhorse for modeling in systems and synthetic biology [1]. The power of CRNs lies in their expressivity; CRN models can range from physically realistic descriptions of individual molecules to coarse-grained idealizations of complex multi-step processes [2]. However, this expressivity comes at a cost. Choosing the right level of detail in a model is more an art than a science. The modeling process requires careful consideration of the desired use of the model, the available data to parameterize the model, and prioritization of certain aspects of modeling or analysis over others. Additionally, biological CRN models can be incredibly complex including dozens or even hundreds or thousands of species, reactions, and parameters [3]. Maintaining complex hand-built models is challenging and errors can quickly grow out of control for large models. Software tools can answer many of these challenges by automating and streamlining the model construction process.

Formally, a CRN is a set of species $S = \{S_i\}$ and reactions $R : \{I \overset{\rho(s; \theta)}{\to} O\}$, where I and O are multisets of species, ρ is the rate function or propensity, s is a vector of species’ concentrations (or counts), and θ are rate parameters. Typically, CRNs are simulated as ordinary differential equations (ODEs) and numerically integrated [2]. A stochastic semantics also allows CRNs to be simulated as continuous-time Markov chains [4]. Besides their prevalence in biological modeling, there is a rich theoretical body of work related to CRNs from the mathematical [5], computer science [6], and physics communities [7]. Despite these theoretical foundations, many models are phenomenological in nature and lack mechanistic details of various biological processes. The challenge of constructing correct models is compounded by the difficulty in differentiating between correct and incorrect models based upon experimental data [8–10].
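To make the formal definition concrete, the following is a minimal sketch of a CRN with mass-action propensities and its deterministic (ODE) right-hand side. It is a toy illustration in plain Python and does not use BioCRNpyler's own classes; the names `Reaction` and `ode_rhs` and the two-reaction network are assumptions chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Reaction:
    inputs: tuple   # multiset I, written as a tuple of species names
    outputs: tuple  # multiset O
    k: float        # rate parameter theta (one mass-action constant here)

    def propensity(self, s):
        # Mass-action rate: k times the product of the input concentrations.
        return self.k * prod(s[name] for name in self.inputs)


def ode_rhs(reactions, s):
    """Return ds/dt for every species under deterministic semantics."""
    dsdt = {name: 0.0 for name in s}
    for r in reactions:
        rate = r.propensity(s)
        net = Counter(r.outputs)
        net.subtract(Counter(r.inputs))  # net stoichiometric change
        for name, change in net.items():
            dsdt[name] += change * rate
    return dsdt


# Toy network (hypothetical): G -> G + T with k = 0.1, and T -> nothing with k = 0.5.
reactions = [Reaction(("G",), ("G", "T"), 0.1), Reaction(("T",), (), 0.5)]
state = {"G": 2.0, "T": 1.0}
derivatives = ode_rhs(reactions, state)
```

Passing `ode_rhs` to a numerical integrator gives the ODE semantics, while the same `propensity` values could drive a stochastic simulation over species counts.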

Due to CRNs’ rich history and diverse applications, the available tools for a CRN modeler are vast and include extensive software to generate, simulate, and analyze CRNs [11–14], databases of models [15, 16], and many more. However, even though synthetic biologists have adopted a module and part-driven approach to their laboratory work [17], models are still typically built by hand on a case-by-case basis. Recognizing the fragile, non-modular nature of hand-built models, several synthetic biology design automation tools have been developed for specific purposes such as implementing transcription factor or integrase-based logic [18, 19]. These tools indicate a growing need for design and simulation automation in synthetic biology as part and design libraries are expanded.

As the name would suggest, BioCRNpyler (pronounced bio-compiler) is a Python package that compiles CRNs from simple specifications of biological motifs and contexts. This package is inspired by the molecular compilers developed by the DNA-strand displacement and molecular programming communities which, broadly speaking, aim to compile models of DNA circuit implementations from simpler CRN specifications [20–22], rudimentary programming languages [23, 24], and abstract sequence specifications [25]. This body of work has demonstrated the utility of molecular circuit compilers and highlights that a single specification can be compiled into multiple molecular implementations which in turn can correspond to multiple CRN models at various levels of detail. For example, there are multiple DNA-strand implementations of catalysis [21, 22, 26, 27] and the interactions of the DNA strands involved in each of these implementations can be enumerated to generate different CRN models based upon the assumptions underlying the enumeration algorithm [28]. Drawing from these inspirations, BioCRNpyler is a general-purpose CRN compiler capable of converting abstract specifications of biomolecular components into CRN models with full programmatic control over the compilation process. Importantly, BioCRNpyler is not a CRN simulator; models are saved in the Systems Biology Markup Language (SBML) [29] to be compatible with the user’s simulator of choice.

There are many existing tools that provide some of the features present in BioCRNpyler. Antimony (part of the Tellurium software suite) provides an elegant high-level language that is converted into SBML models [12, 30]. Systems Biology Open Language (SBOL) [31] is a format for sharing DNA sequences with assigned functions and does not compile a CRN. Hierarchical SBML and supporting software [32] provide a file format which encapsulates CRNs as modular functions. The software package iBioSim [33, 34] can compile SBOL specifications into SBML models. Similarly, Virtual Parts Repository uses SBOL specifications to combine existing SBML models together [35]. The rule-based modeling framework BioNetGen [36] allows for a system to be defined via interaction rules which can then be simulated directly or compiled into a CRN. Similarly, PySB [37] provides a library of common biological parts and interactions that compile into more complex rule-based models. Finally, the MATLAB TX-TL Toolbox [38, 39] can be seen as a prototype for BioCRNpyler but lacks the object-oriented framework and extendability beyond cell-free extract systems.

BioCRNpyler complements existing software packages by providing a novel abstraction and framework which allows for complex CRNs to be easily generated and explored via the compilation process. To do this, BioCRNpyler specifies a biochemical system as a set of modular biological parts, biochemical processes codified as CRNs, and biochemical and modeling context. Moreover, BioCRNpyler allows for synthetic biological parts and systems biology motifs to be reused and recombined in diverse biochemical contexts at customizable levels of model complexity with minimal coding requirements (BioCRNpyler is designed to be a scripting language). Additionally, BioCRNpyler is purposefully suited to in silico workflows because it is an extendable object-oriented framework written entirely in Python that integrates existing software development standards and allows complete control over model compilation. Simultaneously, BioCRNpyler accelerates model construction with extensive libraries of biochemical parts, models, and examples relevant to synthetic biologists, bio-engineers, and systems biologists. The BioCRNpyler package is available on GitHub [40] and can be installed via the Python Package Index (PyPI).

2 Design and implementation

BioCRNpyler is an open-source Python package that compiles high-level design specifications into detailed CRN models, which then are saved as SBML files [41]. BioCRNpyler is written in Python with a flexible object-oriented design, extensive documentation, and detailed examples which allow for easy model construction by modelers, customization and extension by developers, and rapid integration into data pipelines. The utility of BioCRNpyler comes from the way it abstracts biological systems using modular objects. A BioCRNpyler model consists of a collection of biological parts called Components which interact via different biological processes called Mechanisms. Sets of Components and Mechanisms are bundled together to form a system, called a Mixture, which represents a specific biological and modeling context. During compilation, each Component in a Mixture generates the species and reactions which model its behavior using Mechanisms. This abstraction is powerful; it allows modelers to examine how a specific system, represented by one or more Components, behaves in diverse environments and/or under different modeling assumptions represented by different Mixtures. Importantly, Mechanisms provide a universal underlying abstraction used to define both the way Components and Mixtures function. In the following subsections, we describe the BioCRNpyler modeling abstraction in detail.

2.1 Internal CRN representation

Underlying BioCRNpyler is a comprehensive chemical reaction network class. The species classes in BioCRNpyler consist of object-oriented data structures with increasing complexity which generate their own unique string representations. Table A in S1 Text describes the different species classes in BioCRNpyler. Similarly, BioCRNpyler comes equipped with many diverse propensity function types including mass-action, Hill functions, and general user specified propensities described in Table B in S1 Text. The CRN classes inside BioCRNpyler provide useful functionality so that users can easily modify CRNs produced via compilation, produce entire CRNs by hand, or interface hand-produced CRNs with compiled CRNs. Additionally, user-friendly printing functionality allows for the easy visualization of CRNs in multiple text formats or as interactive reaction graphs formatted and drawn using Bokeh and ForceAtlas2 [42, 43].

2.2 Mechanisms are reaction schemas

When modeling biological systems, modelers frequently make use of mass-action CRN kinetics which ensure that parameters and states have clear underlying mechanistic meanings. However, for the design of synthetic biological circuits and analysis using experimental data, phenomenological or reduced-order models are commonly utilized as well [2]. Empirical phenomenological models have been successful in predicting and analyzing complex circuit behavior using simple models with only a few lumped parameters [44–46]. Bridging the connections between the different modeling abstractions is a challenging research problem. This has been explored in the literature using various approaches such as by direct mathematical comparison of mechanistic and phenomenological models [47–49] or by studying particular examples of reduced models [2]. BioCRNpyler provides a computational approach using reaction schemas to easily change the mechanisms used in compilation from detailed mass-action to coarse-grained at various levels of complexity.

Reaction schemas refer to BioCRNpyler’s generalization of switching between different mechanistic models: a single process can be modeled using multiple underlying motifs to generate a class of models which may have qualitatively different behavior. Mechanisms are the BioCRNpyler objects responsible for defining reaction schemas. In other words, various levels of abstractions and model reductions can all be represented easily by using built-in and custom Mechanisms in BioCRNpyler. Biologically, reaction schemas can represent different underlying biochemical mechanisms or modeling assumptions and simplifications. For example, to model the process of transcription (as shown in Fig 1), BioCRNpyler allows the use of various phenomenological and mass-action kinetic models by simply changing the choice of reaction schema. The simplest of these schemas, “Simple Transcription”, includes no details about how a gene produces a transcript. “Michaelis Menten Transcription” elaborates on this simplification by including the RNA polymerase enzyme in the model. “Michaelis Menten Transcription with a Hill Function” simplifies the previous mass-action model assuming a quasi-equilibrium approximation of RNA polymerase binding. Finally, the “Multi-Occupancy Michaelis Menten Transcription Model” aims to be more realistic by examining the possibility of multiple RNA polymerase enzymes bound to a single transcript. Of course, these are not the only possible transcription Mechanisms: more detailed models may include transcript elongation or organism-specific co-factors, such as σ-factors in E. coli, which could also easily be included in a BioCRNpyler Mechanism.

Formally, reaction schemas are functions that produce CRN species and reactions from a set of input species and parameters: f : (S′, θ)→(S, R). Here the inputs S′ are chemical species and θ are rate constants. The output S ⊇ S′ is an enlarged set of species and R is a set of reactions. The functions f used to define the transcription reaction schemas in Fig 1 are examples of relatively simple Mechanisms which do not have any internal logic. However, BioCRNpyler allows for reaction schemas to be defined directly in Python. This allows for incredible flexibility in defining Mechanisms capable of complex logic, combinatoric enumeration, or other advanced functionality. The object-oriented design of Mechanisms also allows modelers to generate CRNs at different levels of complexity and reuse CRN motifs for some Components while customizing Mechanisms for others. Internally, each Mechanism class has a type (e.g. transcription) which defines the input and output species it requires. BioCRNpyler contains an extensive library of Mechanisms (Table C in S1 Text) which are easy to repurpose without extensive coding. Custom Mechanisms are also easy to define by subclassing Mechanism as described in Section I in S1 Text. Ultimately, Mechanisms provide a unique capability to quickly compare system models across various levels of abstraction, enabling a more nuanced approach to circuit design and exploring system parameter regimes.
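The idea of a reaction schema as a function f : (S′, θ)→(S, R) can be sketched with two plain Python functions that model the same process, transcription, at different levels of detail. This is an illustrative sketch only: the function names, the tuple encoding of reactions, and the default species name `"RNAP"` are assumptions for this example and are not BioCRNpyler's Mechanism API.

```python
def simple_transcription(gene, transcript, params):
    """Schema: G -> G + T, with no machinery modeled."""
    species = {gene, transcript}
    reactions = [((gene,), (gene, transcript), params["ktx"])]
    return species, reactions


def michaelis_menten_transcription(gene, transcript, params, rnap="RNAP"):
    """Schema: G + P <-> G:P -> G + P + T, with RNA polymerase P explicit."""
    bound = f"{gene}:{rnap}"  # ":" marks a bound complex
    species = {gene, transcript, rnap, bound}
    reactions = [
        ((gene, rnap), (bound,), params["kb"]),               # binding
        ((bound,), (gene, rnap), params["ku"]),               # unbinding
        ((bound,), (gene, rnap, transcript), params["ktx"]),  # transcription
    ]
    return species, reactions


# The same inputs compiled with two different schemas.
params = {"kb": 100, "ku": 10, "ktx": 0.1}
for schema in (simple_transcription, michaelis_menten_transcription):
    species, reactions = schema("G", "T", params)
    print(schema.__name__, len(species), "species,", len(reactions), "reactions")
```

Both functions accept the same input species and return a superset of them, so swapping one for the other changes the level of detail of the model without changing the specification that calls it.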

2.3 Components represent functionality

In BioCRNpyler, Components are biochemical parts or motifs, such as promoters, enzymes and chemical complexes. Components represent biomolecular functionality; a promoter enables transcription, enzymes perform catalysis, and chemical complexes must bind together. Components express their functionality by calling particular Mechanism types during compilation. Importantly, Components are not the same as CRN species; one species might be represented by multiple Components and a Component might produce multiple species. For example, a promoter Component will call transcription Mechanisms like those shown in Fig 1. If the “Simple Transcription” Mechanism is used, the promoter will be represented by a single species G. On the other hand, if the “Michaelis Menten Transcription” schema is used, the promoter will actually have two forms: G and G:RNAP, representing the free promoter and the promoter bound to RNA polymerase. Components are flexible and can behave differently in different contexts or behave context-independently. To define dynamic-context behavior, Components will automatically use Mechanisms and parameters provided by the Mixture. To define context-independent behavior, Components can have their own internal Mechanisms and parameters. The BioCRNpyler library includes many kinds of Components, some of which are listed in Table D in S1 Text. Custom Components can also be easily created by subclassing another Component as described in Section II in S1 Text.

2.4 Mixtures represent context

Mixtures are collections of Components, Mechanisms, and parameters. Mixtures can represent chemical context (e.g. cell extract vs. in vivo), as well as modeling resolution (e.g. what level of detail to model transcription or translation at) by containing different internal Components, Mechanisms, and parameters. BioCRNpyler comes with a variety of Mixtures (see Table E in S1 Text) to represent cell-extracts and cell-like systems with multiple levels of modeling complexity. Custom Mixtures can also be easily created either by subclassing an existing mixture or via a few simple scripting operations as described in Section III in S1 Text.

2.5 Flexible parameter databases

Developing models is a process that involves defining and then parameterizing interactions. Often, at the early stage of model construction, exact parameter values will be unavailable. BioCRNpyler has a sophisticated parameter framework which allows for the software to search user-populated parameter databases for the parameter that most closely matches a specific Mechanism, Component, and parameter name, as illustrated in Fig 2. This allows for models to be rapidly constructed and simulated with “ball-park” parameters and later refined with specific parameters derived from literature or experiments. This framework also makes it easy to incorporate diverse parameter files together and share parameters between many chemical reactions. BioCRNpyler also allows each Component to have its own parameter database, allowing for multiple parameter sources to be combined easily. Components without their own parameters default to the parameters stored in the Mixture.

Fig 2 — If a specific `ParameterKey` (orange boxes) cannot be found, the `ParameterDatabase` automatically defaults to other `ParameterKeys`. This allows for parameter sharing and rapid construction of complex models from relatively few non-specific (e.g. lower in the hierarchy) parameters.
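The defaulting behavior can be sketched as a lookup that tries progressively less specific keys. The sketch below is a simplified illustration in plain Python: the function `find_parameter`, the tuple key layout, and in particular the fallback order are assumptions made for this example, and the order used by the package's own `ParameterDatabase` may differ.

```python
def find_parameter(db, name, mechanism=None, part_id=None):
    """Return the most specific matching value, or raise KeyError."""
    # Candidate keys, from most to least specific (assumed order).
    candidates = [
        (mechanism, part_id, name),
        (None, part_id, name),
        (mechanism, None, name),
        (None, None, name),
    ]
    for key in candidates:
        if key in db:
            return db[key]
    raise KeyError(name)


# Hypothetical database: one ball-park default plus two refinements.
db = {
    (None, None, "ktx"): 0.05,                         # shared default
    ("transcription", None, "ktx"): 0.1,               # default for one mechanism
    ("transcription", "strong_promoter", "ktx"): 2.0,  # part-specific value
}

specific = find_parameter(db, "ktx", "transcription", "strong_promoter")
by_mechanism = find_parameter(db, "ktx", "transcription", "weak_promoter")
fallback = find_parameter(db, "ktx", "translation", "weak_promoter")
```

With this structure a model can be compiled from only the least specific entries, and more specific keys can be added later without touching the model specification.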

2.6 Component enumeration allows for arbitrary complexity

Component enumeration is a powerful and specialized compilation step which allows new Components to be generated dynamically. Internally, this is achieved in BioCRNpyler by subclassing the ComponentEnumerator class to implement an arbitrary function in Python g: C → C′ where C ⊂ C′ are sets of Components. In local component enumeration, the set C consists of just a single Component c which contains its own ComponentEnumerator. In global component enumeration, C consists of all Components in the Mixture. As more Components are generated, C′ will be fed back into g recursively until no new Components are created or a user-defined recursion depth is reached. As with Mechanisms, we emphasize that component enumeration is highly flexible because the enumerators can be written as Python code, allowing for diverse logic, combinatoric enumeration, and more. Section 3.3 describes BioCRNpyler models that make use of both local and global component enumeration.
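The recursion described above is a fixed-point iteration with a depth limit, which can be sketched in a few lines. This is a toy illustration: components are represented as plain strings, and `enumerate_components` and the `pairwise_join` enumerator are hypothetical names that are not part of BioCRNpyler.

```python
def enumerate_components(components, enumerator, max_depth=10):
    """Apply `enumerator` repeatedly until no new components appear."""
    known = set(components)
    for _ in range(max_depth):
        new = set(enumerator(known)) - known
        if not new:
            break  # fixed point reached: g produced nothing new
        known |= new
    return known


def pairwise_join(components):
    """Hypothetical enumerator: join any two strands, up to length 3."""
    return {a + b for a in components for b in components if len(a + b) <= 3}


one_round = enumerate_components({"a", "b"}, pairwise_join, max_depth=1)
fixed_point = enumerate_components({"a", "b"}, pairwise_join)
```

Without the length cap in `pairwise_join` the set would grow forever, which is why a user-defined recursion depth is needed for systems whose enumeration does not terminate on its own.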

2.7 Specification example

Before describing the compilation algorithm in detail, we illustrate the central idea of a BioCRNpyler specification via an example involving a DNAassembly Component which represents a simple piece of DNA, called X, with a promoter, ribosome binding site, and coding sequence for a protein. The DNAassembly uses transcription and translation Mechanisms which will be placed into a Mixture.

from biocrnpyler import DNAassembly, Mixture, SimpleTranscription, SimpleTranslation

# Create Mechanisms
tx = SimpleTranscription()  # Transcription
tl = SimpleTranslation()  # Translation

# Create a Component: a DNA assembly named X that encodes protein X
G = DNAassembly("X", promoter="prom", rbs="rbs", protein="X")

# Define Parameters: binding (kb), unbinding (ku), transcription (ktx), and translation (ktl) rates
params = {"kb": 100, "ku": 10, "ktx": 0.1, "ktl": 0.5}

# Place the Component and Mechanisms in a Mixture
M = Mixture("mixture", components=[G], mechanisms=[tx, tl], parameters=params)

# Compile the CRN
CRN = M.compile_crn()

This simple code compiles the CRN:

\begin{matrix} X_{DNA} \overset{0.1}{\to} X_{DNA} + X_{RNA} \\ X_{RNA} \overset{0.5}{\to} X_{RNA} + X_{Protein} \end{matrix} \qquad (1)
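As a worked illustration, under deterministic mass-action semantics, and writing each concentration with the same symbol as its species, the CRN in Eq (1) corresponds to the ODEs below. These follow directly from the two reactions and their rates (0.1 and 0.5); they are derived here for clarity and are not output produced by the package.

```latex
\begin{aligned}
\frac{d X_{DNA}}{dt} &= 0, \\
\frac{d X_{RNA}}{dt} &= 0.1 \, X_{DNA}, \\
\frac{d X_{Protein}}{dt} &= 0.5 \, X_{RNA}.
\end{aligned}
```

Because this simple CRN contains no degradation or dilution reactions, both the transcript and the protein accumulate without bound.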

The modularity of BioCRNpyler can be illustrated by considering what would happen if we instead used “Michaelis Menten” transcription and translation Mechanisms which model RNA-polymerase (P) and ribosomes (R):

from biocrnpyler import Species, Transcription_MM, Translation_MM

tx = Transcription_MM(rnap=Species("P"))  # Transcription Mechanism
tl = Translation_MM(ribosome=Species("R"))  # Translation Mechanism

This compiles a considerably more complex CRN:

\begin{array}{l} X_{DNA} + P \underset{10}{\overset{100}{\rightleftharpoons}} X_{DNA}:P \overset{0.1}{\to} X_{DNA} + P + X_{RNA} \\ X_{RNA} + R \underset{10}{\overset{100}{\rightleftharpoons}} X_{RNA}:R \overset{0.5}{\to} X_{RNA} + R + X_{Protein}. \end{array}

Here, “:” indicates that two species are bound together to form a new species.

2.8 Chemical reaction network compilation

Having provided an overview of the core classes in BioCRNpyler, we will now describe the compilation algorithm in detail. First, we assume a user has specified a Mixture and populated it with Components, Mechanisms, and parameters. We note that some Components may have their own internal Mechanisms and Parameters while others will be reliant on the Mixture. Compilation proceeds in 7 steps, shown in Fig 3 and elaborated on below.

Fig 3 — A. The organization of classes in BioCRNpyler. Gray arrows indicate the hierarchical organization of objects (e.g. `Components` are contained in a `Mixture`). Dark gray arrows take precedence over light gray arrows (e.g. a `Component` will search for `Mechanisms` in itself before looking at its `Mixture`). Colored arrows denote the generation of objects: `Components` are orange, parameters are blue, and CRN species and reactions are yellow. B. The compilation sequence in BioCRNpyler. The numbers on the arrows in (A) indicate which part of compilation these connections are involved in.

1. Global Component Enumeration: this step is optional and will only occur if a Mixture contains one or more global ComponentEnumerators. All Components in the Mixture will be fed into the ComponentEnumerator recursively until either no new Components are created or a user-specified recursion depth is reached.
2. Local Component Enumeration: this step is optional and will be applied to every Component in the Mixture that contains one or more local ComponentEnumerators. Each of these Components will generate new Components from itself. If these new Components contain local ComponentEnumerators, they will also generate new Components. Like global component enumeration, local component enumeration is stopped when no new Components are created or a user-specified maximum recursion depth is reached.
3. The Mixture iterates through all its internal Components (including those generated via enumeration) and calls each Component’s update_species() and update_reactions() methods.
4. In each Component’s update_species() and update_reactions() method, the Component first searches for Mechanisms of the types it requires. Mechanisms stored inside the Component will be used preferentially. If the Component does not have a particular internal Mechanism, that Mechanism is instead retrieved from the Mixture. The Component then calls the update_species(…) and update_reactions(…) methods of each Mechanism, supplying the proper parameters for that Mechanism.
5. Mechanisms generate species and reactions based upon the arguments supplied by the Component that called them. Mechanisms search for rate parameters in the parameter database of the Component that called them. If no parameters are found, the Mechanism will then search for parameters in the Mixture’s parameter database. Note that the same Mechanism may be called multiple times with different parameters, effectively reusing the reaction schema to compile a large CRN. The species and reactions generated this way are returned to the Mixture.
6. Global Mechanisms are a special kind of Mechanism which are stored in the Mixture and produce new species and reactions from a single species parameter. All species generated in previous steps are passed into the Mixture’s global Mechanisms to generate additional species and reactions. Note that global Mechanisms are not called recursively.
7. The resulting species and reactions generated in the previous steps form a chemical reaction network which can be modified programmatically or exported as SBML.
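Steps 3 to 7 of the sequence above can be sketched as a short loop. This is a heavily simplified toy model written for illustration: the classes below only share their names with BioCRNpyler's classes, the attributes and the `schema` callables are assumptions, and the enumeration steps (1 and 2) are omitted.

```python
class Mechanism:
    """A reaction schema tagged with the process type it models."""

    def __init__(self, kind, schema):
        self.kind = kind      # e.g. "transcription"
        self.schema = schema  # callable(name, params) -> (species, reactions)


class Component:
    def __init__(self, name, uses, mechanisms=(), parameters=None):
        self.name = name
        self.uses = uses  # Mechanism types this Component calls
        self.mechanisms = {m.kind: m for m in mechanisms}
        self.parameters = parameters or {}


class Mixture:
    def __init__(self, components, mechanisms, parameters, global_mechanisms=()):
        self.components = list(components)
        self.mechanisms = {m.kind: m for m in mechanisms}
        self.parameters = parameters
        self.global_mechanisms = list(global_mechanisms)

    def compile_crn(self):
        species, reactions = set(), []
        for comp in self.components:  # step 3
            for kind in comp.uses:
                # Step 4: the Component's own Mechanism takes precedence.
                mech = comp.mechanisms.get(kind, self.mechanisms.get(kind))
                # Step 5: Component parameters override Mixture parameters.
                params = {**self.parameters, **comp.parameters}
                new_species, new_reactions = mech.schema(comp.name, params)
                species |= new_species
                reactions += new_reactions
        # Step 6: one non-recursive pass of global schemas over all species.
        extra = set()
        for s in sorted(species):
            for schema in self.global_mechanisms:
                new_species, new_reactions = schema(s, self.parameters)
                extra |= new_species
                reactions += new_reactions
        return species | extra, reactions  # step 7


def transcription(name, p):
    rna = name + "_rna"
    return {name, rna}, [((name,), (name, rna), p["ktx"])]


def dilution(name, p):
    return set(), [((name,), (), p["kdil"])]


mixture = Mixture(
    components=[Component("A", ["transcription"]),
                Component("B", ["transcription"], parameters={"ktx": 1.0})],
    mechanisms=[Mechanism("transcription", transcription)],
    parameters={"ktx": 0.1, "kdil": 0.01},
    global_mechanisms=[dilution],
)
crn_species, crn_reactions = mixture.compile_crn()
```

In this example both Components reuse the Mixture's transcription schema, B overrides the transcription rate with its own parameter, and the global dilution schema adds one reaction for every species produced in the earlier steps.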

2.9 Integrated testing

BioCRNpyler uses GitHub Actions and Codecov [50] to automate testing on GitHub. Whenever the software is updated, a suite of tests is run including extensive unit tests and functional testing of tutorial and documentation notebooks. Automated testing ensures that changes to the core BioCRNpyler code preserve functionality of the package. The integration of Jupyter notebooks into testing allows users to easily define new functionality for the software and document that functionality with detailed explanations which are simultaneously test cases.

2.10 Documentation and tutorials

The BioCRNpyler GitHub page contains over a dozen tutorial Jupyter notebooks [40] and video presentations explaining everything from the fundamental features of the code to specialized functionality for advanced models to how to add to the BioCRNpyler code-base [51]. This documentation has been used successfully in multiple academic courses and is guaranteed to be up-to-date and functional due to automatic testing.

3 Results

This section highlights the functionality of BioCRNpyler through a collection of models compiled using the software. All model simulations were conducted with Bioscrape [52], circuit diagrams were created with DNAplotlib [53], and reaction network graphs were created with BioCRNpyler’s plotting interface. Detailed descriptions alongside commented code for all the following examples are available in S1 Text Section A and as Jupyter notebooks on the BioCRNpyler GitHub page.

3.1 Synthetic biological circuit examples

Fig 4A, 4B and 4C show three models of synthetic biological circuits which demonstrate the modularity and expressivity of BioCRNpyler. Underlying all these models is a single Component class called a DNAassembly which was described in Section 2.7. These first three examples use idealized models of their underlying biological processes via a very simple Mixture. In Fig 4A, two DNAassemblies are wired together with a repressor (red) repressing a reporter (yellow). The repressor is expressed at a constant rate using the “Simple Transcription” Mechanism shown in Fig 1 which is supplied by the Mixture. The reporter, on the other hand, uses a different transcription Mechanism, “Negative Hill Repression”, stored in its DNAassembly. This illustrates the ability for the same process, transcription, to be modeled in different ways within a single model. In Fig 4B, two DNAassembly Components are wired to repress each other, both using Hill functions, to produce a model of the famous bistable toggle switch [54]. Similarly, Fig 4C wires three repressors together so A represses B, B represses C, and C represses A, giving rise to a transcriptional oscillator called the repressilator [55].

Fig 4D, 4E and 4F examine similar circuits to Fig 4A, 4B and 4C but with more complex implementations modeled in a more detailed context. In these three examples, a less idealized Mixture is used which models transcription, translation, and RNA degradation with biological machinery including RNA polymerase, ribosomes, and RNAses. Fig 4D examines a detailed implementation of a repression circuit consisting of DNAassembly Components which express a guide-RNA (gRNA) and deactivated Cas9 (dCas9) protein [56]. The dCas9-gRNA complex is capable of binding to the promoter of the reporter assembly, repressing transcription. This more complex circuit in a complex context reveals some unexpected behavior; if the amounts of dCas9 and gRNA are not carefully balanced, resource loading can give rise to unexpected increases and decreases of the reporter, a phenomenon known as retroactivity [57]. Fig 4E shows a hypothetical variation of a bistable toggle switch implemented via translational regulation using targeted RNAses (RNAse A degrades the transcript for RNAse B and vice versa). Such a system could potentially be engineered via RNA-targeting Cas9 [58] or more complex fusion proteins [59]. Finally, Fig 4F compiles a model of the repressilator which allows for multiple ribosomes to bind to each transcript. The added complexity creates much more complicated dynamics, but oscillatory behavior still clearly occurs. This example illustrates how BioCRNpyler can be used to test different modeling assumptions (e.g. does multiple occupancy of ribosomes matter?).

Finally, we comment that all the examples from Fig 4 make use of the same underlying set of 10–20 default parameters (estimated from Cell Biology by the Numbers [60]), demonstrating how BioCRNpyler’s parameter database and defaulting behavior make model construction and simulation possible even before detailed experiments or literature review. The efficiency of using BioCRNpyler to explore diverse modeling assumptions and circuit architectures is quantified in Fig 4G, which compares the number of species, reactions, and ordinary differential equation terms in the compiled models to the lines of BioCRNpyler code needed to create these models. In short, BioCRNpyler allows for the rapid generation of large and diverse models. Code for these six examples can be found in Sections I-IV in S1 Text.

3.2 Systems biology circuit example

Fig 5 illustrates how a set of BioCRNpyler Components and Mechanisms can be joined together to produce a systems-level model of the lac operon, a highly studied gene regulatory network in E. coli which regulates whether glucose or lactose is metabolized [61]. This specification is shown in Fig 5A and consists of around a dozen Components and Mechanisms which jointly enumerate hundreds of species and reactions representing the combinatorial set of conformations of the lac operon (depicted by the cartoons in panel B) and its associated transcription factors, transcription, translation, transport, mRNA degradation and dilution. Besides showing how BioCRNpyler can be applied to model the kinds of combinatoric interactions common in systems biology, this example also graphically illustrates the BioCRNpyler abstraction where Components interact via Mechanisms in order to generate a large, complex CRN (panel C). Furthermore, this example highlights that the Component-to-species mapping is not one-to-one. For example, the lac operon is modeled as two Components: one representing the promoter architecture and another coupling that promoter to translation. Jointly, these two Components produce a combinatoric number of formal CRN species (shown in panel C by the many different blue dots). Similarly, β-galactosidase is modeled as two Components: as an enzyme (which metabolizes lactose) and a chemical complex (because it is a homotetramer). Finally, we note that the simulated output of our model (Fig 5D) produces a ~1-2 hour delay between the depletion of glucose and steady state lactose metabolism, consistent with previous models and experiments [61]. Interestingly, this is observed even though we made no efforts to fine-tune our parameters, suggesting that the combinatorial nature of this system may give rise to this behavior in a manner that is robust to detailed kinetic rates. The code used to generate this model can be found in Section VII in S1 Text.

3.3 Component enumeration example

Fig 6 shows three example circuits which make use of component enumeration in order to produce sophisticated CRNs. Local component enumeration is illustrated in Fig 6A. Here, a single DNA Component (top) uses local component enumeration to read through the parts included in its plasmid and determine all possible correctly oriented terminator-promoter pairs. This information is then used to produce multiple RNA Components which model transcription and translation for complex genetic circuit architectures. The CRN and simulation output for this circuit are shown in Fig 6B and 6C, respectively. Fig 6D provides an example of global component enumeration involving the enzymatic recombination of DNA. Specifically, serine integrases (such as Bxb1) are enzymes capable of recombining strands of DNA at specific integration sites [62]. Integration events can happen within a single piece of DNA (top two reactions in panel D) or between multiple DNA species (bottom four reactions in panel D). In these reactions, the integrase binds to attP and attB sites and reorganizes them into attL and attR sites, which can result in DNA insertions, excisions, or re-orientations. Importantly, each new DNA strand produced by an integrase reaction could potentially recombine with itself or with the other strands already produced. Such systems can give rise to theoretically infinite CRNs [63]. BioCRNpyler can approximate integrase systems by recursively using a global component enumerator. In this example, only a single round of recursion is shown for clarity. The clusters of dots in Fig 6E reflect the combinatoric number of bound and unbound states that arise because integrases can bind to and unbind from attP, attB, attL, and attR sites. Finally, the BioCRNpyler framework is designed so that local and global component enumeration are mutually compatible. In Fig 6F, a model of a self-flipping promoter is shown. Initially, the promoter faces right and expresses the integrase Bxb1, which in turn flips the promoter, causing Bxb1 expression to cease in favor of RFP expression. In BioCRNpyler, this model is compiled by first using global component enumeration to produce all the possible DNA Components generated by integrase recombinations. Each of these DNA Components then uses local component enumeration to produce RNA Components. All these Components can then be used to compile a CRN by calling their respective Mechanisms. More details about local and global component enumeration, including code for the example models, can be found in Sections VIII-X in S1 Text.
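
The two-stage procedure for the self-flipping promoter can be sketched in plain Python. This is a toy illustration under stated assumptions, not BioCRNpyler code: the `Part` type, the function names, and the part layout are all hypothetical, the integrase rule only models inversion between one attB and one attP site, and transcripts read through to the end of the DNA because terminators are omitted. The example code the authors actually used is in S1 Text.

```python
from typing import NamedTuple


class Part(NamedTuple):
    name: str
    kind: str      # "promoter", "cds", "attB", "attP", "attL" or "attR"
    forward: bool  # True if the part points left-to-right


def flip(segment):
    """Reverse the order of a run of parts and flip each orientation."""
    return tuple(p._replace(forward=not p.forward) for p in reversed(segment))


def invert_between_sites(construct):
    """Toy integrase rule: invert the DNA between one attB and one attP.

    Returns the recombined construct, or None if the rule does not apply.
    """
    kinds = [p.kind for p in construct]
    if "attB" not in kinds or "attP" not in kinds:
        return None
    i, j = sorted((kinds.index("attB"), kinds.index("attP")))
    if construct[i].forward == construct[j].forward:
        return None  # same orientation would excise DNA; not modeled here
    left = construct[i]._replace(name="attL", kind="attL")
    right = construct[j]._replace(name="attR", kind="attR")
    return construct[:i] + (left,) + flip(construct[i + 1:j]) + (right,) + construct[j + 1:]


def global_enumeration(constructs, max_rounds=1):
    """Apply the recombination rule recursively for at most max_rounds rounds."""
    known = list(constructs)
    frontier = list(constructs)
    for _ in range(max_rounds):
        new = []
        for construct in frontier:
            recombined = invert_between_sites(construct)
            if recombined is not None and recombined not in known and recombined not in new:
                new.append(recombined)
        known.extend(new)
        frontier = new
    return known


def local_enumeration(construct):
    """For each promoter, list the coding sequences it reads through."""
    transcripts = []
    for idx, part in enumerate(construct):
        if part.kind != "promoter":
            continue
        downstream = construct[idx + 1:] if part.forward else reversed(construct[:idx])
        genes = [p.name for p in downstream if p.kind == "cds" and p.forward == part.forward]
        transcripts.append((part.name, genes))
    return transcripts


# Hypothetical layout: a right-facing promoter flanked by opposed att sites.
start = (
    Part("RFP", "cds", False),
    Part("attB", "attB", True),
    Part("P", "promoter", True),
    Part("attP", "attP", False),
    Part("Bxb1", "cds", True),
)

results = [local_enumeration(dna) for dna in global_enumeration([start], max_rounds=1)]
print(results)  # [[('P', ['Bxb1'])], [('P', ['RFP'])]]
```

Bounding the recursion with `max_rounds` is the design choice that keeps the enumeration finite; in this toy case the flipped construct has no attB or attP sites left, so further rounds add nothing.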

4 Availability and future directions

BioCRNpyler aims to be a piece of open-source, community-driven software that is easily accessible to biologists and bioengineers with varying levels of programming experience as well as easily customizable by computational biologists and more advanced developers. Towards these ends, the software package is available via GitHub and PyPI, requires minimal software dependencies, includes extensive examples and documentation in the form of interactive Jupyter notebooks [40] and YouTube tutorials [51], and uses automated testing to ensure stability. Furthermore, this software has been extensively tested via inclusion in bio-modeling courses and bootcamps for users ranging from college freshmen and sophomores with minimal coding experience to advanced computational biologists, demonstrating the accessibility and flexibility of the package. BioCRNpyler has already been deployed to build diverse models in synthetic biology, including models of bacterial gene regulatory networks [64], bacterial circuits in the gut microbiome [65], and cell extract metabolism [66]. Developing new software functionality is also a simple process documented on the GitHub contributions page.

Given the plethora of model building and simulation software already in existence, it is important to highlight how BioCRNpyler fits into the larger context of existing tools. Table 1 gives a high-level overview of how BioCRNpyler compares to other tools. First, BioCRNpyler stands out due to the novel Mixture-Component-Mechanism abstraction. This framework allows users to easily put together complex models using BioCRNpyler’s extensive library or to develop their own extensions by writing Python code. Rule-based frameworks, such as BioNetGen [36] and PySB [37], offer similar abstractions to Mechanisms. However, these must be codified in a formal language specific to the framework (BioNetGen uses .bng files and PySB uses a specialized text format), which offers less flexibility than the arbitrary Python code allowed by BioCRNpyler. The Virtual Parts Repository [35] and iBioSim [34] take a different approach to abstract specifications by generating CRNs from SBOL files. This methodology is similar in spirit to BioCRNpyler but is restricted by its reliance on the SBOL standard, the need for software-specific SBOL annotations, and challenges in generalizing beyond gene regulatory network architectures. BioCRNpyler also differs from many other pieces of software because it includes a detailed library of biological parts and models. PySB, Virtual Parts Repository, and iBioSim similarly include a variety of built-in rules, models, and parts, respectively. However, BioCRNpyler is unique in its modularity: the ability to use the same Component with different Mechanisms placed in different Mixtures allows for a combinatoric variety of models to be easily specified and explored. Finally, we reiterate that BioCRNpyler is not a CRN simulator like COPASI [11], MATLAB Simbiology [13], or Tellurium (via libroadrunner) [12, 14]. This brings us to a final point about BioCRNpyler: it is a pure Python package with minimal dependencies meant to be used as a scripting language, interfaced with existing simulators, used in Jupyter notebooks [67], and integrated into existing pipelines.
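
The modularity argument above, in which one Component yields different CRNs depending on the Mechanisms supplied by its Mixture, can be made concrete with a toy re-implementation in plain Python. The classes `ToyGene` and `ToyMixture`, the method names, and the reaction strings are hypothetical and deliberately do not mirror the real BioCRNpyler classes or signatures; the sketch only shows the separation of concerns.

```python
def simple_expression(gene):
    """Coarse mechanism: DNA produces protein in a single lumped step."""
    return [f"{gene}_dna --> {gene}_dna + {gene}_protein"]


def transcription_translation(gene):
    """Finer mechanism: explicit mRNA intermediate with degradation."""
    return [
        f"{gene}_dna --> {gene}_dna + {gene}_rna",
        f"{gene}_rna --> {gene}_rna + {gene}_protein",
        f"{gene}_rna --> 0",
    ]


class ToyGene:
    """A component that knows which mechanism type it needs, not how it works."""

    def __init__(self, name):
        self.name = name

    def build_reactions(self, mechanisms):
        # Look up the context-provided rule by its type and apply it.
        return mechanisms["expression"](self.name)


class ToyMixture:
    """A context that supplies mechanisms to every component placed in it."""

    def __init__(self, name, mechanisms, components=()):
        self.name = name
        self.mechanisms = dict(mechanisms)
        self.components = list(components)

    def compile(self):
        reactions = []
        for component in self.components:
            reactions.extend(component.build_reactions(self.mechanisms))
        return reactions


gfp = ToyGene("gfp")  # the same component object is reused in both contexts
coarse = ToyMixture("coarse", {"expression": simple_expression}, [gfp]).compile()
detailed = ToyMixture("detailed", {"expression": transcription_translation}, [gfp]).compile()
print(len(coarse), len(detailed))  # 1 3
```

Because the gene never names a concrete mechanism, swapping the modeling assumption requires changing only the context, which is the property the comparison with rule-based and SBOL-based tools emphasizes.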

Table 1. Comparison of different simulation software.

Abstraction: how models can be represented in the software. Library: whether there is a substantial library of pre-existing parts/components/sub-models that can be reused. Simulator: whether the software simulates models numerically. Source: the language(s) the software is written in. UI: the primary way a user interacts with the software. API: the primary programming language the software is designed to be accessed with.

Software	Abstraction	Library	Simulator	Source	UI	API
BioCRNpyler [40]	`Mixtures`, `Components`, `Mechanisms`, & CRNs	Yes	No	Python	Python	Python
BioNetGen [36]	Rules	No	Yes	Perl, C++, Python	.bng files	.bng files
PySB [37]	Rules	Yes	No	Python	Text Rules	Python
Tellurium [12] (using Antimony [30] and libRoadrunner [14])	CRNs	No	Yes	Python	Text Reactions	Python
Virtual Parts Repository [35]	SBOL	Yes	No	Java	Web	Java
iBioSim [34]	SBOL & CRNs	Yes	Yes	Java	GUI	Command line
COPASI [11]	CRNs	No	Yes	Java, C++	GUI	C++ & other derived APIs
MATLAB Simbiology [13]	CRNs	No	Yes	MATLAB	MATLAB	MATLAB

BioCRNpyler is an ongoing effort which will grow and change with the needs of its community. Extending this community via outreach, documentation, and an ever-expanding suite of functionalities is central to the goals of this project. We are particularly interested in facilitating the integration of BioCRNpyler into existing laboratory pipelines in order to make modeling a central part of the design-build-test cycle in synthetic biology. One avenue towards this goal is to add compatibility with existing standards such as SBOL [31] and automation platforms such as DNA-BOT [68] so that BioCRNpyler can automatically compile models of circuits as they are being designed and built. This approach will be a generalization and extension of Roehner et al. [69]. In particular, due to the modular BioCRNpyler compilation process, it will be possible to have programmatic control over the SBML model produced by BioCRNpyler.

We also plan on extending the library to include more realistic and diverse Mixtures, Mechanisms, and Components (particularly experimentally validated models of circuits in E. coli and in cell extracts). We hope that these models will serve as examples and inspiration for other scientists to add their own model systems in other organisms to the software library.

Finally, we believe that the Mixture-Component-Mechanism abstraction of model compilation used in BioCRNpyler is quite fundamental and could be extended to other non-CRN based modeling approaches. Advanced simulation techniques beyond chemical reaction networks will be required to accurately model the diversity and complexity of biological systems. New software frameworks such as Vivarium [64] have the potential to generate models which couple many simulation modalities. The abstractions used in BioCRNpyler could be extended to compile models beyond chemical reaction networks such as mechanical models, flux balance models, and statistical models derived from data. The integration of these models together will naturally depend on both detailed mechanistic descriptions as well as overarching system context. We emphasize that building extendable and reusable frameworks to enable quantitative modeling in biology will become increasingly necessary to understand and design ever more complex biochemical systems.

Supporting information

S1 Text

Table A: (CRN Species Classes in the BioCRNpyler Library). Table B: (Reaction Propensities in the BioCRNpyler Library). Table C: (Some Mechanisms in the BioCRNpyler Library). Table D: (Some Components in the BioCRNpyler Library). Table E: (Some Mixtures in the BioCRNpyler Library).

(PDF)

Acknowledgments

We would like to thank the BE 240, Spring 2020 course (https://murray.cds.caltech.edu/BE_240,_Spring_2020) and the Murray Biocircuits lab for extensive testing of this software and discussions of relevant models, library of parts, and parameters. In particular, we would like to thank Zoila Jurado, Matthieu Kratz, Liana Merk, and Ankita Roychoudhury for contributing to the software library.

Data Availability

BioCRNpyler source code and an extensive set of example notebooks, documentation, and tutorials are available in our GitHub repository: https://github.com/BuildACell/BioCRNPyler. All other data are available within the manuscript and its Supporting information files.

Funding Statement

The authors WP and AP are partially supported by US National Science Foundation (CBET-1903477). AP was also supported by the Defense Advanced Research Projects Agency (Agreement HR0011-17-2-0008). AS was supported by the Institute for Collaborative Biotechnologies through cooperative agreement W911NF-19-2-0026 from the U.S. Army Research Office. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Alon U. An introduction to systems biology: design principles of biological circuits. CRC press; 2019.
2. Vecchio DD, Murray RM. Biomolecular Feedback Systems. Princeton University Press; 2014.
3. Weng G, Bhalla US, Iyengar R. Complexity in Biological Signaling Systems. Science. 1999;284(5411):92–96. doi: 10.1126/science.284.5411.92
4. Gillespie DT. Stochastic simulation of chemical kinetics. Annu Rev Phys Chem. 2007;58:35–55. doi: 10.1146/annurev.physchem.58.032806.104637
5. Gunawardena J. Chemical reaction network theory for in-silico biologists. Notes available for download at http://vcp.med.harvard.edu/papers/crnt.pdf. 2003.
6. Soloveichik D, Cook M, Winfree E, Bruck J. Computation with finite stochastic chemical reaction networks. natural computing. 2008;7(4):615–633. doi: 10.1007/s11047-008-9067-y
7. Schmiedl T, Seifert U. Stochastic thermodynamics of chemical reaction networks. The Journal of chemical physics. 2007;126(4):044101. doi: 10.1063/1.2428297
8. Morrison MJ, Razo-Mejia M, Phillips R. Reconciling Kinetic and Equilibrium Models of Bacterial Transcription. arXiv preprint arXiv:2006.07772. 2020.
9. Cinquemani E. Identifiability and reconstruction of biochemical reaction networks from population snapshot data. Processes. 2018;6(9):136. doi: 10.3390/pr6090136
10. Hsiao V, Swaminathan A, Murray RM. Control theory for synthetic biology: recent advances in system characterization, control design, and controller implementation for synthetic biology. IEEE Control Systems Magazine. 2018;38(3):32–62. doi: 10.1109/MCS.2018.2810459
11. Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, et al. COPASI—a complex pathway simulator. Bioinformatics. 2006;22(24):3067–3074. doi: 10.1093/bioinformatics/btl485
12. Choi K, Medley JK, König M, Stocking K, Smith L, Gu S, et al. Tellurium: an extensible python-based modeling environment for systems and synthetic biology. Biosystems. 2018;171:74–79. doi: 10.1016/j.biosystems.2018.07.006
13. The MathWorks, Inc. MATLAB Simbiology Toolbox; 2022. Available from: https://www.mathworks.com/help/simbio/.
14. Somogyi ET, Bouteiller JM, Glazier JA, König M, Medley JK, Swat MH, et al. libRoadRunner: a high performance SBML simulation and analysis library. Bioinformatics. 2015;31(20):3315–3321. doi: 10.1093/bioinformatics/btv363
15. Le Novere N, Bornstein B, Broicher A, Courtot M, Donizelli M, Dharuri H, et al. BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic acids research. 2006;34(suppl_1):D689–D691. doi: 10.1093/nar/gkj092
16. Cooling MT, Rouilly V, Misirli G, Lawson J, Yu T, Hallinan J, et al. Standard virtual biological parts: a repository of modular modeling components for synthetic biology. Bioinformatics. 2010;26(7):925–931. doi: 10.1093/bioinformatics/btq063
17. Benner SA, Sismour AM. Synthetic biology. Nature Reviews Genetics. 2005;6(7):533–543. doi: 10.1038/nrg1637
18. Nielsen AAK, Der BS, Shin J, Vaidyanathan P, Paralanov V, Strychalski EA, et al. Genetic circuit design automation. Science. 2016;352(6281):aac7341–aac7341. doi: 10.1126/science.aac7341
19. Guiziou S, Pérution-Kihli G, Ulliana F, Leclère M, Bonnet J. Exploring the design space of recombinase logic circuits. bioRxiv. 2019.
20. Soloveichik D, Seelig G, Winfree E. DNA as a universal substrate for chemical kinetics. Proceedings of the National Academy of Sciences. 2010;107(12):5393–5398. doi: 10.1073/pnas.0909380107
21. Qian L, Winfree E. Scaling up digital circuit computation with DNA strand displacement cascades. Science. 2011;332(6034):1196–1201. doi: 10.1126/science.1200520
22. Srinivas N, Parkin J, Seelig G, Winfree E, Soloveichik D. Enzyme-free Nucleic Acid Dynamical Systems. Science. 2017;358(6369). doi: 10.1126/science.aal2052
23. Vasić M, Soloveichik D, Khurshid S. CRN++: Molecular programming language. Natural Computing. 2020; p. 1–17.
24. Spaccasassi C, Lakin MR, Phillips A. A logic programming language for computational nucleic acid devices. ACS synthetic biology. 2018;8(7):1530–1547. doi: 10.1021/acssynbio.8b00229
25. Badelt S, Grun C, Sarma KV, Wolfe B, Shin SW, Winfree E. A domain-level DNA strand displacement reaction enumerator allowing arbitrary non-pseudoknotted secondary structures. Journal of the Royal Society Interface. 2020;17(167):20190866. doi: 10.1098/rsif.2019.0866
26. Seelig G, Yurke B, Winfree E. Catalyzed Relaxation of a Metastable DNA Fuel. Journal of the American Chemical Society. 2006;128(37):12211–12220. doi: 10.1021/ja0635635
27. Zhang DY, Turberfield AJ, Yurke B, Winfree E. Engineering Entropy-driven Reactions and Networks Catalyzed by DNA. Science. 2007;318(5853):1121–1125. doi: 10.1126/science.1148532
28. Badelt S, Shin SW, Johnson RF, Dong Q, Thachuk C, Winfree E. A general-purpose CRN-to-DSD compiler with formal verification, optimization, and simulation capabilities. In: International Conference on DNA-Based Computers. Springer; 2017. p. 232–248.
29. Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19(4):524–531. doi: 10.1093/bioinformatics/btg015
30. Smith LP, Bergmann FT, Chandran D, Sauro HM. Antimony: A Modular Model Definition Language. Bioinformatics. 2009;25(18):2452–2454. doi: 10.1093/bioinformatics/btp401
31. Galdzicki M, Clancy KP, Oberortner E, Pocock M, Quinn JY, Rodriguez CA, et al. The Synthetic Biology Open Language (SBOL) provides a community standard for communicating designs in synthetic biology. Nature biotechnology. 2014;32(6):545–550. doi: 10.1038/nbt.2891
32. Smith LP, Hucka M, Hoops S, Finney A, Ginkel M, Myers CJ, et al. SBML level 3 package: hierarchical model composition, version 1 release 3. Journal of integrative bioinformatics. 2015;12(2):603–659. doi: 10.2390/biecoll-jib-2015-268
33. Myers CJ, et al. iBioSim: a tool for the analysis and design of genetic circuits. Bioinformatics. 2009;25(21):2848–2849. doi: 10.1093/bioinformatics/btp457
34. Watanabe L, Nguyen T, Zhang M, Zundel Z, Zhang Z, Madsen C, et al. iBioSim 3: a tool for model-based genetic circuit design. ACS synthetic biology. 2018;8(7):1560–1563. doi: 10.1021/acssynbio.8b00078
35. Mısırlı G, Yang B, James K, Wipat A. Virtual Parts Repository 2: Model-Driven Design of Genetic Regulatory Circuits. ACS Synthetic Biology. 0;0(0):null.
36. Harris LA, et al. BioNetGen 2.2: advances in rule-based modeling. Bioinformatics. 2016;32(21):3366–3368. doi: 10.1093/bioinformatics/btw469
37. Lopez CF, Muhlich JL, Bachman JA, Sorger PK. Programming biological models in Python using PySB. Molecular systems biology. 2013;9(1):646. doi: 10.1038/msb.2013.1
38. Tuza ZA, et al. An in silico modeling toolbox for rapid prototyping of circuits in a biomolecular “breadboard” system. In: 52nd IEEE Conference on Decision and Control; 2013. p. 1404–1410.
39. Singhal V, Tuza ZA, Sun ZZ, Murray RM. A MATLAB toolbox for modeling genetic circuits in cell-free systems. Synthetic Biology. 2021;6(1):ysab007. doi: 10.1093/synbio/ysab007
40. Poole W, Pandey A, Shur A, Tuza Z, Murray RM. BioCRNpyler Github Repository; 2022. Accessed 01-09-2022. https://github.com/BuildACell/BioCRNpyler.
41. Hucka M, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19(4):524–531. doi: 10.1093/bioinformatics/btg015
42. Bokeh Development Team. Bokeh: Python library for interactive visualization; 2020. Available from: https://bokeh.org/.
43. Jacomy M, Venturini T, Heymann S, Bastian M. ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software. PLoS ONE. 2014;9(6):e98679. doi: 10.1371/journal.pone.0098679
44. Moore SJ, MacDonald JT, Wienecke S, Ishwarbhai A, Tsipa A, Aw R, et al. Rapid acquisition and model-based analysis of cell-free transcription–translation reactions from nonmodel bacteria. Proceedings of the National Academy of Sciences. 2018;115(19):E4340–E4349. doi: 10.1073/pnas.1715806115
45. Meyer AJ, Segall-Shapiro TH, Voigt CA. Marionette: E. coli containing 12 highly-optimized small molecule sensors. bioRxiv. 2018; p. 285866.
46. Hu CY, Varner JD, Lucks JB. Generating effective models and parameters for RNA genetic circuits. ACS synthetic biology. 2015;4(8):914–926. doi: 10.1021/acssynbio.5b00077
47. Pasotti L, Bellato M, De Marchi D, Magni P. Mechanistic models of inducible synthetic circuits for joint description of DNA copy number, regulatory protein level, and cell load. Processes. 2019;7(3):119. doi: 10.3390/pr7030119
48. Transtrum MK, Qiu P. Bridging mechanistic and phenomenological models of complex biological systems. PLoS computational biology. 2016;12(5):e1004915. doi: 10.1371/journal.pcbi.1004915
49. Pandey A, Murray RM. Model Reduction Tools For Phenomenological Modeling of Input-Controlled Biological Circuits. bioRxiv. 2020.
50. Team CD. Codecov Software Package; 2022. https://codecov.io/.
51. Poole W, Pandey A. BuildaCell Youtube Channel; 2020. https://www.youtube.com/watch?v=mu-9MSntd2w&list=PLb2LmjoxZO-g2vbTr3HBcnvVZur8JFiqf.
52. Swaminathan A, et al. Fast and flexible simulation and parameter estimation for synthetic biology using bioscrape. bioRxiv. 2019; p. 121152.
53. Der BS, Glassey E, Bartley BA, Enghuus C, Goodman DB, Gordon DB, et al. DNAplotlib: programmable visualization of genetic designs and associated data. ACS synthetic biology. 2017;6(7):1115–1119. doi: 10.1021/acssynbio.6b00252
54. Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in Escherichia coli. Nature. 2000;403(6767):339–342. doi: 10.1038/35002131
55. Elowitz MB, et al. A synthetic oscillatory network of transcriptional regulators. Nature. 2000;403(6767):335–338. doi: 10.1038/35002125
56. Cress BF, Toparlak OD, Guleria S, Lebovich M, Stieglitz JT, Englaender JA, et al. CRISPathBrick: modular combinatorial assembly of type II-A CRISPR arrays for dCas9-mediated multiplex transcriptional repression in E. coli. ACS synthetic biology. 2015;4(9):987–1000. doi: 10.1021/acssynbio.5b00012
57. Jayanthi S, Nilgiriwala KS, Del Vecchio D. Retroactivity controls the temporal dynamics of gene transcription. ACS synthetic biology. 2013;2(8):431–441. doi: 10.1021/sb300098w
58. Strutt SC, Torrez RM, Kaya E, Negrete OA, Doudna JA. RNA-dependent RNA targeting by CRISPR-Cas9. elife. 2018;7:e32724. doi: 10.7554/eLife.32724
59. Dang DT, Phan AT. Development of a ribonuclease containing a G4-specific binding motif for programmable RNA cleavage. Scientific reports. 2019;9(1):1–7. doi: 10.1038/s41598-019-42143-8
60. Milo R, et al. Cell biology by the numbers. Garland Science; 2015.
61. Santillán M, Mackey MC. Quantitative approaches to the study of bistability in the lac operon of Escherichia coli. Journal of The Royal Society Interface. 2008;5(suppl_1):S29–S39.
62. Rutherford K, Yuan P, Perry K, Sharp R, Van Duyne GD. Attachment site recognition and regulation of directionality by the serine integrases. Nucleic acids research. 2013;41(17):8341–8356. doi: 10.1093/nar/gkt580
63. Paun G. On the Power of the Splicing Operation. International Journal of Computer Mathematics. 1995;59(1-2):27–35. doi: 10.1080/00207169508804451
64. Agmon E, Spangler RK, Skalnik CJ, Poole W, Peirce SM, Morrison JH, et al. Vivarium: an interface and engine for integrative multiscale modeling in computational biology. bioRxiv. 2021.
65. Merk LN, Shur AS, Pandey A, Murray RM, Green LN. Engineering Logical Inflammation Sensing Circuit for Gut Modulation. bioRxiv. 2020.
66. Roychoudhury A. Understanding the Lifetime and Rate of Protein Production in Cell-Free Reactions While Maximizing Energy Use [B.S. Thesis]. California Institute of Technology; 2021.
67. Perkel JM. Why Jupyter is Data Scientists’ Computational Notebook of Choice. Nature. 2018;563(7732):145–147. doi: 10.1038/d41586-018-07196-1
68. Storch M, Haines MC, Baldwin GS. DNA-BOT: a low-cost, automated DNA assembly platform for synthetic biology. Synthetic Biology. 2020;5(1):ysaa010. doi: 10.1093/synbio/ysaa010
69. Roehner N, Zhang Z, Nguyen T, Myers CJ. Generating systems biology markup language models from the synthetic biology open language. ACS synthetic biology. 2015;4(8):873–879. doi: 10.1021/sb5003289

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009987.r001

Decision Letter 0

Pedro Mendes, Dina Schneidman-Duhovny

4 Oct 2021

Dear Mr. Poole,

Thank you very much for submitting your manuscript "BioCRNpyler: Compiling Chemical Reaction Networks from Biomolecular Parts in Diverse Contexts" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Please follow reviewer's suggestions regarding the manuscript organization: the manuscript should explain the functionality of the BioCRNpyler, while the manual and tutorial should be part of the github repository.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Pedro Mendes, PhD

Associate Editor

PLOS Computational Biology

Dina Schneidman

Software Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The paper presents a new compiler from genetic circuit designs specified in a high-level language into chemical reaction network models appropriate for simulation and other analyses. This high-level language and compiler can greatly reduce the size of the description of practical genetic circuit designs as demonstrated in their results shown in Figure 1G. Therefore, this tool is likely to prove to be very useful. The authors are commended for providing this open source with detailed documentation, examples, and tutorials.

The paper though needs some work to better articulate their contributions while positioning themselves with respect to the related work. Here are my more detailed comments:

1) Abstract (and introduction) - you talk about the high-level design specification, which is your key contribution, but very little detail about it is provided in the abstract or introduction. It would be useful to provide a bit of a description in both the abstract and introduction, so the reader can understand at what level of abstraction it works and provide some intuition why it may result in reductions in model complexity. A small motivating example early in the paper would likely be useful to engage the reader sooner. Indeed, all the examples of the actual specification seem to only appear in the supplemental.

2) Intro - "relatively few tools exist to aid in the automated construction of general CRN models from simple specifications", I'm not so sure I completely agree with this statement. There are several tools that the authors do not reference that do just this, such as Antimony, ShortBOL, and the Virtual Parts Repository. Even pySBOL coupled with model generation tools such as VPR or iBioSim provide similar functionality to what they are proposing. I do believe that their approach is novel and useful, but the authors need to do a better job articulating the similarities and differences with these other approaches.

3) Page 8 and a few other places, "objected-oriented" -> "object-oriented"

4) Figure 1G is the key result demonstrating the utility of this work. A bit more intuition for this result is needed. Part of the issue is not having a detailed example up to this point.

5) Figure 6, repeated "relatively" in the caption.

6) A key contribution cited is the flexibility to have alternative models. My understanding is this requires writing python code to create new mechanisms. In some sense, this lessens the impact of this contribution, since other model generators can (and do) provide alternative modeling methods via added code to their model generators. It is unclear to me how much easier it is to develop a new model using their approach versus other model generators. A detailed example of how this is done would be useful to demonstrate this.

Overall, this is a useful tool that may have the potential to enable model-driven design for biologists and bioengineers with limited programming experience. The authors need to be clear that some experience is still needed though, as python programs still need to be created. The authors need to also better articulate the differences between their tool and similar approaches.

Reviewer #2: The manuscript describes a Python package for creating reaction networks in SBML from a high-level design specification. It will be very useful for biologists using Python for modeling.

Pros:

- Easy to install, no extra libraries were required on Mac. A lot of tutorials and examples.

- SBML is correct, tested by importing into COPASI general simulator. Simulations run nicely with LibRoadRunner from Jupyter notebook.

- The libraries of components and mechanisms are useful for compiling multiple models from standard parts.

- Creating new components and mechanisms is a very useful way to extend these libraries.

- A very interesting and useful approach of maintaining a parameter database with hierarchy of components and mechanisms (types have multiple names) and automatic substitution of parameters using this hierarchy, for example if no parameter exists for a specific mechanism name, but a parameter value exists for a mechanism type.
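The fallback behavior described in the last point can be sketched as follows. This is a plain-Python illustration, not BioCRNpyler's implementation: the key layout `(mechanism, part_id, parameter_name)`, the search order, the function name `find_parameter`, and all parameter values are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key layout: (mechanism, part_id, parameter_name).
# None acts as a wildcard for a level that is not specified.
ParamKey = Tuple[Optional[str], Optional[str], str]


def find_parameter(db: Dict[ParamKey, float], mechanism_name: str,
                   mechanism_type: str, part_id: str, name: str) -> float:
    """Return the most specific matching value, falling back to general keys."""
    search_order = [
        (mechanism_name, part_id, name),  # exact mechanism name and part
        (mechanism_type, part_id, name),  # mechanism type and part
        (mechanism_name, None, name),     # mechanism name, any part
        (mechanism_type, None, name),     # mechanism type, any part
        (None, part_id, name),            # any mechanism, this part
        (None, None, name),               # global default
    ]
    for key in search_order:
        if key in db:
            return db[key]
    raise KeyError(f"no value found for parameter {name!r}")


# Made-up values for illustration only.
db = {
    (None, None, "kb"): 100.0,
    ("transcription", None, "ktx"): 0.05,
    ("transcription", "pLac", "ktx"): 0.2,
}

print(find_parameter(db, "simple_tx", "transcription", "pLac", "ktx"))  # 0.2
print(find_parameter(db, "simple_tx", "transcription", "pTet", "ktx"))  # 0.05
print(find_parameter(db, "simple_tx", "transcription", "pTet", "kb"))   # 100.0
```

No entry exists for the mechanism name `simple_tx`, so each lookup falls through to a value stored under the mechanism type or the global default, which is the substitution the reviewer describes.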

Major issues:

1. The manuscript is rather difficult to read. It’s neither a biological paper describing the use of a modeling approach nor a computational paper describing software.

a. The manuscript lists a lot of Python classes but does not give any details on how they work. The only way to understand how the package works is to follow the examples on GitHub and run them one by one. For example, the authors mention OrderedPolymerSpecies and PolymerConformation but never explain what these are, or how DNAconstruct enumerates parts. Most classes are defined so briefly that it’s impossible to understand what they do. “GlobalMechanisms are rules used to generate Species and Reactions at the end of compilation” is repeated three times in different contexts, but it does not help in understanding how it works. …DilutionMixture is neither defined biologically nor explained programmatically, despite being used many times in different contexts. I recommend that the authors look for any complicated Python class name in the manuscript and ask themselves whether the description in the manuscript is enough to understand the term.

b. The use of the term “hierarchical” and mentioning SBML hierarchical extension are misleading: BioCRNpyler is not generating hierarchical models.

c. There are many GitHub folders with many tutorials. Some guide on which classes are described in which tutorials would be helpful.

2. The authors mention several tools that are comparable to BioCRNpyler, but don’t compare and don’t demonstrate current (not potential) advantages over other tools.

a. At which point does the use of BioCRNpyler become easier than specifying models in SBML simulators like COPASI and Tellurium? The complex model of the Lac Operon is definitely easier to define in any rule-based language, but is BioCRNpyler better than BioNetGen or PySB?

b. What’s the difference with PySB? I noticed parameters, but otherwise the same mechanisms can be defined in PySB, and using BioNetGen in PySB is more powerful.

c. What rule-based features are used? Is it a plain combination of all components without any constraints?

d. More comparison with BioNetGen would be useful: are there any advantages of the BioCRNpyler high-level specification over the BNGL language? Can it specify something that BioNetGen cannot?

e. A comparison with iBioSim would be helpful. It can also define genetic components and mechanisms.

f. A comparison with Tellurium/Antimony would be helpful, since it has a human-readable language.

g. The manuscript would gain a lot if an example were provided (maybe as a supplement) in all four languages: BioCRNpyler, PySB, BioNetGen, and Antimony.

3. The only biological use of BioCRNpyler is in Ref 56, but it is not discussed in the manuscript. All examples are just test examples that repeat well-known biological systems that have been modeled many times and are simple to define in any biomodeling simulator.

4. The Lac Operon model is the most complicated model described, but the authors don’t mention what to do with their model of 173 species and 343 reactions. Is it comparable with any previous models? Also, I could not find the code for this specific model among the examples.

Minor issues:

1. Biorxiv 50, 55-57 – provide full citations.

2. Ref 54 is just a review mentioning BioCRNpyler; it does not demonstrate that “BioCRNpyler has already been deployed to build diverse models in systems and synthetic biology.”

3. Ref 55 is not using BioCRNpyler.

4. Why is the new term “mixture” introduced instead of the classical “model”?

Reviewer #3: This work by Poole et al. introduces BioCRNpyler as a tool for users to build reaction networks from high-level design specifications. The tool seems to automate several steps of the network-building process and provides a library of biochemical reactions to build networks using these components. The user is also offered a library of parameters which provides a good starting point for simulation.

Overall, I really wanted to like this work, but I feel the authors missed a chance to present their work and build enthusiasm for their tool. The paper is structured in a way that makes it hard to understand what the tool does, how it works, and what its benefit would be for a reader. The paper reads like a user manual with many examples/tutorials rather than a narrative about the tool.

Major comments:

1. The introduction provides a good context for how CRNs are used but it does not provide a compelling argument for why BioCRNpyler is needed. Why is compiling reactions better than what other tools do? Why would the end-user pick BioCRNpyler over other tools? I think a brief introduction to molecular compilers and their use in DNA circuits could help place the tool in context for the reader.

2. In the introduction, the authors mention what existing tools are comparable to BioCRNpyler. I think that a table in the results section would be better suited to provide this information, along with some benchmarks in the supplement.

3. The authors provide a "laundry list" of motivating examples to describe BioCRNpyler. However, these examples are hard to follow. First, the biological context is not very familiar to most readers. Second, each section is one or two paragraphs with a large amount of data/figures, making it hard for readers to follow. Third, the authors reference software calls without context. All of this makes the examples very hard to follow.

4. Figure 2 is not very informative. It is meant to provide a hierarchical organization of BioCRNpyler but it left me feeling lost. What am I supposed to learn from this figure? Perhaps the author should consider replacing this with a flowchart.

5. I think the readers of this journal would mostly benefit from a well-explained example throughout. This could be perhaps the Lac Operon model from Figure 3. Using one example may better help readers understand the tool.

6. The idea of reaction schemas seems very interesting/compelling for this work! I would like to see this idea expanded and explained in a biological context more thoroughly. The authors instead start by defining the need for a schema but quickly devolve into mathematical and algorithmic details that likely belong in the supplement or in a more specialized section.

Minor comments:

1. Various spelling mistakes are present throughout. For example, in the abstract the authors wrote "complies" when I believe they meant "compiles".

2. Similarly, the tone of the paper often sounds like a user manual or an advertisement for BioCRNpyler rather than a tool that solves a biological problem.

3. Figure 1 is too cluttered, while Figure 2 is not very informative. It is unclear to me what the other figures are trying to convey as well. The figures work best when they flow with the narrative.

4. Although the authors do a decent job in the introduction of presenting other tools, they also miss a chance to place BioCRNpyler in the context of other Python tools. For example, Tellurium, COBRApy, COPASI, etc., are Python tools that may be complementary to BioCRNpyler and should at least be mentioned.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Chris Myers

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. 2022 Apr 20;18(4):e1009987. doi: 10.1371/journal.pcbi.1009987.r002

Author response to Decision Letter 0

11 Jan 2022

Attachment

Submitted filename: BioCRNpyler_reviewer_responses.pdf


PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009987.r003

Decision Letter 1

Pedro Mendes, Dina Schneidman-Duhovny

14 Feb 2022

Dear Mr. Poole,

Thank you very much for submitting your manuscript "BioCRNpyler: Compiling Chemical Reaction Networks from Biomolecular Parts in Diverse Contexts" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Pedro Mendes, PhD

Associate Editor

PLOS Computational Biology

Dina Schneidman

Software Editor

PLOS Computational Biology

***********************


Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have addressed all my concerns.

Reviewer #2: The authors significantly improved the readability of the manuscript.

My general note is about the supplemental material. For Figures 4 and 6, it would be very useful to be able to quickly look into the supplemental material and find a code snippet describing the multiple models mentioned in the manuscript. However, there are no direct references to the supplemental material describing specific code. In the same way that the authors refer to specific supplemental tables, it would be nice to refer to the specific sections of the supplemental material that describe the models. Currently, a reader needs to read the whole supplemental material to match it to specific figures in the manuscript.

There are minor issues in the manuscript that should be fixed:

Lines 91-96 are an almost verbatim repetition of lines 70-74. Perhaps the introduction can do without specific terms that come later.

Line 175: exclamation sign is not necessary.

Lines 226-227: are “a promoter (prom), ribosome binding site (rbs)” necessary for the first code? Explain when and why you’ll need them, or remove – they confuse the reader.

Line 259: fix ComponentEnuemrators

Line 318: is DNAasembly the same as defined in line 234?

Figure 4: explicitly refer to each part of supplemental material for code snippets for each model.

Figure 5 – what do colors for LacOperon species (light blue, dark blue, bright blue and violet) mean?

Lines 424-425: “bio-modeling course and bootcamps with dozens of users ranging from college freshmen and sophomores with minimal coding experience” reads like a grant application. I would remove it or be more specific about why these users use this software, e.g., which biological systems they model and how the software makes that easier.

Link http://buildacell.io/BioCRNPyler from https://github.com/BuildACell/bioCRNpyler is dead.

Reviewer #3: The authors have addressed all my concerns with this revision. There are still some lingering typos throughout (e.g., complies -> compiles) that should be fixed.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Chris J. Myers

Reviewer #2: No

Reviewer #3: No


References:

Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.

If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLoS Comput Biol. 2022 Apr 20;18(4):e1009987. doi: 10.1371/journal.pcbi.1009987.r004

Author response to Decision Letter 1

23 Feb 2022

Attachment

Submitted filename: Reviewer Resposnes 2.pdf


PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009987.r005

Decision Letter 2

Pedro Mendes, Dina Schneidman-Duhovny

3 Mar 2022

Dear Mr. Poole,

We are pleased to inform you that your manuscript 'BioCRNpyler: Compiling Chemical Reaction Networks from Biomolecular Parts in Diverse Contexts' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Pedro Mendes, PhD

Associate Editor

PLOS Computational Biology

Dina Schneidman

Software Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009987.r006

Acceptance letter

Pedro Mendes, Dina Schneidman-Duhovny

12 Apr 2022

PCOMPBIOL-D-21-01367R2

BioCRNpyler: Compiling Chemical Reaction Networks from Biomolecular Parts in Diverse Contexts

Dear Dr Poole,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Agnes Pap

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 Text

(PDF)



Data Availability Statement

1. Alon U. An introduction to systems biology: design principles of biological circuits. CRC Press; 2019.

2. Del Vecchio D, Murray RM. Biomolecular Feedback Systems. Princeton University Press; 2014.

3. Weng G, Bhalla US, Iyengar R. Complexity in Biological Signaling Systems. Science. 1999;284(5411):92–96. doi: 10.1126/science.284.5411.92

4. Gillespie DT. Stochastic simulation of chemical kinetics. Annu Rev Phys Chem. 2007;58:35–55. doi: 10.1146/annurev.physchem.58.032806.104637

5. Gunawardena J. Chemical reaction network theory for in-silico biologists. Notes available for download at http://vcp.med.harvard.edu/papers/crnt.pdf. 2003.

6. Soloveichik D, Cook M, Winfree E, Bruck J. Computation with finite stochastic chemical reaction networks. Natural Computing. 2008;7(4):615–633. doi: 10.1007/s11047-008-9067-y

7. Schmiedl T, Seifert U. Stochastic thermodynamics of chemical reaction networks. The Journal of chemical physics. 2007;126(4):044101. doi: 10.1063/1.2428297

8. Morrison MJ, Razo-Mejia M, Phillips R. Reconciling Kinetic and Equilibrium Models of Bacterial Transcription. arXiv preprint arXiv:2006.07772. 2020.

9. Cinquemani E. Identifiability and reconstruction of biochemical reaction networks from population snapshot data. Processes. 2018;6(9):136. doi: 10.3390/pr6090136

10. Hsiao V, Swaminathan A, Murray RM. Control theory for synthetic biology: recent advances in system characterization, control design, and controller implementation for synthetic biology. IEEE Control Systems Magazine. 2018;38(3):32–62. doi: 10.1109/MCS.2018.2810459

11. Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, et al. COPASI—a complex pathway simulator. Bioinformatics. 2006;22(24):3067–3074. doi: 10.1093/bioinformatics/btl485

12. Choi K, Medley JK, König M, Stocking K, Smith L, Gu S, et al. Tellurium: an extensible python-based modeling environment for systems and synthetic biology. Biosystems. 2018;171:74–79. doi: 10.1016/j.biosystems.2018.07.006

13. The MathWorks, Inc. MATLAB Simbiology Toolbox; 2022. Available from: https://www.mathworks.com/help/simbio/.

14. Somogyi ET, Bouteiller JM, Glazier JA, König M, Medley JK, Swat MH, et al. libRoadRunner: a high performance SBML simulation and analysis library. Bioinformatics. 2015;31(20):3315–3321. doi: 10.1093/bioinformatics/btv363

15. Le Novere N, Bornstein B, Broicher A, Courtot M, Donizelli M, Dharuri H, et al. BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic acids research. 2006;34(suppl_1):D689–D691. doi: 10.1093/nar/gkj092

16. Cooling MT, Rouilly V, Misirli G, Lawson J, Yu T, Hallinan J, et al. Standard virtual biological parts: a repository of modular modeling components for synthetic biology. Bioinformatics. 2010;26(7):925–931. doi: 10.1093/bioinformatics/btq063

17. Benner SA, Sismour AM. Synthetic biology. Nature Reviews Genetics. 2005;6(7):533–543. doi: 10.1038/nrg1637

18. Nielsen AAK, Der BS, Shin J, Vaidyanathan P, Paralanov V, Strychalski EA, et al. Genetic circuit design automation. Science. 2016;352(6281):aac7341–aac7341. doi: 10.1126/science.aac7341

19. Guiziou S, Pérution-Kihli G, Ulliana F, Leclère M, Bonnet J. Exploring the design space of recombinase logic circuits. bioRxiv. 2019.

20. Soloveichik D, Seelig G, Winfree E. DNA as a universal substrate for chemical kinetics. Proceedings of the National Academy of Sciences. 2010;107(12):5393–5398. doi: 10.1073/pnas.0909380107

21. Qian L, Winfree E. Scaling up digital circuit computation with DNA strand displacement cascades. Science. 2011;332(6034):1196–1201. doi: 10.1126/science.1200520

22. Srinivas N, Parkin J, Seelig G, Winfree E, Soloveichik D. Enzyme-free Nucleic Acid Dynamical Systems. Science. 2017;358(6369). doi: 10.1126/science.aal2052

23. Vasić M, Soloveichik D, Khurshid S. CRN++: Molecular programming language. Natural Computing. 2020; p. 1–17.

24. Spaccasassi C, Lakin MR, Phillips A. A logic programming language for computational nucleic acid devices. ACS synthetic biology. 2018;8(7):1530–1547. doi: 10.1021/acssynbio.8b00229

25. Badelt S, Grun C, Sarma KV, Wolfe B, Shin SW, Winfree E. A domain-level DNA strand displacement reaction enumerator allowing arbitrary non-pseudoknotted secondary structures. Journal of the Royal Society Interface. 2020;17(167):20190866. doi: 10.1098/rsif.2019.0866

26. Seelig G, Yurke B, Winfree E. Catalyzed Relaxation of a Metastable DNA Fuel. Journal of the American Chemical Society. 2006;128(37):12211–12220. doi: 10.1021/ja0635635

27. Zhang DY, Turberfield AJ, Yurke B, Winfree E. Engineering Entropy-driven Reactions and Networks Catalyzed by DNA. Science. 2007;318(5853):1121–1125. doi: 10.1126/science.1148532

28. Badelt S, Shin SW, Johnson RF, Dong Q, Thachuk C, Winfree E. A general-purpose CRN-to-DSD compiler with formal verification, optimization, and simulation capabilities. In: International Conference on DNA-Based Computers. Springer; 2017. p. 232–248.

29. Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19(4):524–531. doi: 10.1093/bioinformatics/btg015

30. Smith LP, Bergmann FT, Chandran D, Sauro HM. Antimony: A Modular Model Definition Language. Bioinformatics. 2009;25(18):2452–2454. doi: 10.1093/bioinformatics/btp401

31. Galdzicki M, Clancy KP, Oberortner E, Pocock M, Quinn JY, Rodriguez CA, et al. The Synthetic Biology Open Language (SBOL) provides a community standard for communicating designs in synthetic biology. Nature biotechnology. 2014;32(6):545–550. doi: 10.1038/nbt.2891

32. Smith LP, Hucka M, Hoops S, Finney A, Ginkel M, Myers CJ, et al. SBML level 3 package: hierarchical model composition, version 1 release 3. Journal of integrative bioinformatics. 2015;12(2):603–659. doi: 10.2390/biecoll-jib-2015-268

33. Myers CJ, et al. iBioSim: a tool for the analysis and design of genetic circuits. Bioinformatics. 2009;25(21):2848–2849. doi: 10.1093/bioinformatics/btp457

34. Watanabe L, Nguyen T, Zhang M, Zundel Z, Zhang Z, Madsen C, et al. iBioSim 3: a tool for model-based genetic circuit design. ACS synthetic biology. 2018;8(7):1560–1563. doi: 10.1021/acssynbio.8b00078

35. Mısırlı G, Yang B, James K, Wipat A. Virtual Parts Repository 2: Model-Driven Design of Genetic Regulatory Circuits. ACS Synthetic Biology.

36. Harris LA, et al. BioNetGen 2.2: advances in rule-based modeling. Bioinformatics. 2016;32(21):3366–3368. doi: 10.1093/bioinformatics/btw469

37. Lopez CF, Muhlich JL, Bachman JA, Sorger PK. Programming biological models in Python using PySB. Molecular systems biology. 2013;9(1):646. doi: 10.1038/msb.2013.1

38. Tuza ZA, et al. An in silico modeling toolbox for rapid prototyping of circuits in a biomolecular “breadboard” system. In: 52nd IEEE Conference on Decision and Control; 2013. p. 1404–1410.

39. Singhal V, Tuza ZA, Sun ZZ, Murray RM. A MATLAB toolbox for modeling genetic circuits in cell-free systems. Synthetic Biology. 2021;6(1):ysab007. doi: 10.1093/synbio/ysab007

40. Poole W, Pandey A, Shur A, Tuza Z, Murray RM. BioCRNpyler Github Repository; 2022. Accessed 01-09-2022. https://github.com/BuildACell/BioCRNpyler.

41. Hucka M, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19(4):524–531. doi: 10.1093/bioinformatics/btg015

42. Bokeh Development Team. Bokeh: Python library for interactive visualization; 2020. Available from: https://bokeh.org/.

43. Jacomy M, Venturini T, Heymann S, Bastian M. ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software. PLoS ONE. 2014;9(6):e98679. doi: 10.1371/journal.pone.0098679

44. Moore SJ, MacDonald JT, Wienecke S, Ishwarbhai A, Tsipa A, Aw R, et al. Rapid acquisition and model-based analysis of cell-free transcription–translation reactions from nonmodel bacteria. Proceedings of the National Academy of Sciences. 2018;115(19):E4340–E4349. doi: 10.1073/pnas.1715806115

45. Meyer AJ, Segall-Shapiro TH, Voigt CA. Marionette: E. coli containing 12 highly-optimized small molecule sensors. bioRxiv. 2018; p. 285866.

46. Hu CY, Varner JD, Lucks JB. Generating effective models and parameters for RNA genetic circuits. ACS synthetic biology. 2015;4(8):914–926. doi: 10.1021/acssynbio.5b00077

47. Pasotti L, Bellato M, De Marchi D, Magni P. Mechanistic models of inducible synthetic circuits for joint description of DNA copy number, regulatory protein level, and cell load. Processes. 2019;7(3):119. doi: 10.3390/pr7030119

48. Transtrum MK, Qiu P. Bridging mechanistic and phenomenological models of complex biological systems. PLoS computational biology. 2016;12(5):e1004915. doi: 10.1371/journal.pcbi.1004915

49. Pandey A, Murray RM. Model Reduction Tools For Phenomenological Modeling of Input-Controlled Biological Circuits. bioRxiv. 2020.

50. Team CD. Codecov Software Package; 2022. https://codecov.io/.

[pcbi.1009987.ref051] 51.Poole W, Pandey A. BuildaCell Youtube Channel; 2020. https://www.youtube.com/watch?v=mu-9MSntd2w&list=PLb2LmjoxZO-g2vbTr3HBcnvVZur8JFiqf.

[pcbi.1009987.ref052] 52. Swaminathan A, et al. Fast and flexible simulation and parameter estimation for synthetic biology using bioscrape. bioRxiv. 2019; p. 121152. [Google Scholar]

53. Der BS, Glassey E, Bartley BA, Enghuus C, Goodman DB, Gordon DB, et al. DNAplotlib: programmable visualization of genetic designs and associated data. ACS synthetic biology. 2017;6(7):1115–1119. doi: 10.1021/acssynbio.6b00252

54. Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in Escherichia coli. Nature. 2000;403(6767):339–342. doi: 10.1038/35002131

55. Elowitz MB, et al. A synthetic oscillatory network of transcriptional regulators. Nature. 2000;403(6767):335–338. doi: 10.1038/35002125

56. Cress BF, Toparlak OD, Guleria S, Lebovich M, Stieglitz JT, Englaender JA, et al. CRISPathBrick: modular combinatorial assembly of type II-A CRISPR arrays for dCas9-mediated multiplex transcriptional repression in E. coli. ACS synthetic biology. 2015;4(9):987–1000. doi: 10.1021/acssynbio.5b00012

57. Jayanthi S, Nilgiriwala KS, Del Vecchio D. Retroactivity controls the temporal dynamics of gene transcription. ACS synthetic biology. 2013;2(8):431–441. doi: 10.1021/sb300098w

58. Strutt SC, Torrez RM, Kaya E, Negrete OA, Doudna JA. RNA-dependent RNA targeting by CRISPR-Cas9. eLife. 2018;7:e32724. doi: 10.7554/eLife.32724

59. Dang DT, Phan AT. Development of a ribonuclease containing a G4-specific binding motif for programmable RNA cleavage. Scientific reports. 2019;9(1):1–7. doi: 10.1038/s41598-019-42143-8

60. Milo R, et al. Cell biology by the numbers. Garland Science; 2015.

61. Santillán M, Mackey MC. Quantitative approaches to the study of bistability in the lac operon of Escherichia coli. Journal of The Royal Society Interface. 2008;5(suppl_1):S29–S39.

62. Rutherford K, Yuan P, Perry K, Sharp R, Van Duyne GD. Attachment site recognition and regulation of directionality by the serine integrases. Nucleic acids research. 2013;41(17):8341–8356. doi: 10.1093/nar/gkt580

63. Paun G. On the Power of the Splicing Operation. International Journal of Computer Mathematics. 1995;59(1-2):27–35. doi: 10.1080/00207169508804451

64. Agmon E, Spangler RK, Skalnik CJ, Poole W, Peirce SM, Morrison JH, et al. Vivarium: an interface and engine for integrative multiscale modeling in computational biology. bioRxiv. 2021.

65. Merk LN, Shur AS, Pandey A, Murray RM, Green LN. Engineering Logical Inflammation Sensing Circuit for Gut Modulation. bioRxiv. 2020.

66. Roychoudhury A. Understanding the Lifetime and Rate of Protein Production in Cell-Free Reactions While Maximizing Energy Use [B.S. Thesis]. California Institute of Technology; 2021.

67. Perkel JM. Why Jupyter is Data Scientists’ Computational Notebook of Choice. Nature. 2018;563(7732):145–147. doi: 10.1038/d41586-018-07196-1

68. Storch M, Haines MC, Baldwin GS. DNA-BOT: a low-cost, automated DNA assembly platform for synthetic biology. Synthetic Biology. 2020;5(1):ysaa010. doi: 10.1093/synbio/ysaa010

69. Roehner N, Zhang Z, Nguyen T, Myers CJ. Generating systems biology markup language models from the synthetic biology open language. ACS synthetic biology. 2015;4(8):873–879. doi: 10.1021/sb5003289
