Overview
With the modeling, we aimed to apply a control systems approach to achieve stability of gene expression across bacterial species. The behavior of genetic circuits depends on a lot of variables, most of which change when transferring to different organisms. To make expression hostindependent, we included an incoherent feedforward loop (iFFL) in our design. An iFFL can be used to make the output of a system independent of the input. The input of a genetic circuit can be many variables, such as plasmid copy number, transcriptional and translational rates. We therefore, wanted to apply the iFFL system to make our genetic circuit independent to plasmid copy number, transcriptional and translational rates.
We made a mathematical model of a genetic implementation of the iFFL and derived a steadystate solution analytically. Our analytical steadystate solution of this loop showed that expression was completely independent of plasmid copy number and transcriptionaltranslational rates. We verified this analytical solution by the implementation of a full kinetic model.
The key variables in the design of genetic circuits are plasmid copy number and transcriptionaltranslational rates. These variables determine the steadystate levels of gene expression. However, when transferring genetic circuits between organisms, these variables change in unpredictable ways.
Promoters have different strengths in different organisms. Some promoters only work in a very narrow range of bacterial species (Yang et al., 2018). To circumvent hostrelated changes, we chose our system orthogonal to the host. We implement orthogonality in our system by using T7 RNA polymerase. However, orthogonal transcription might not behave similarly when applied in varying biological contexts. Through the modeling, we show that gene expression levels remain the same in varying biological contexts when using our genetic circuit implementation of an iFFL.
Ribosome binding sites contain the ShineDalgarno sequence, where the 16s rRNA of the ribosome binds. However, this sequence varies across species, and often ribosome binding sites are extremely inefficient when applied in phylogenetically distant species (Salis et al., 2009). The model shows that similar expression levels across organisms can be maintained when all genes in our genetic circuit contain the same ribosome binding site. Assuming translation elongation is similar across different species, the model shows that expression levels in different organisms can be maintained when all genes in our genetic circuit contain the same ribosome binding site. Nevertheless, translation elongation is influenced by codon usage, which differs per organism. We therefore developed a software tool that determines a coding sequence that is similar in codon usage across different species. Similar codon usage minimizes the chance of different translation elongation rates between organisms.
The Core of Our Design  The Incoherent FeedForward Loop
We implemented an incoherent feedforward loop (iFFL) in a genetic circuit. In an iFFL, the input signal regulates both the activator and the repressor of the output of the system in the same way (Figure 1). The iFFL results in perfect adaptation to the input when the binding of the repressor is fully noncooperative (binding of one repressor at a time) (SegallShapiro et al., 2018). In our case, the input is the plasmid copy number of the DNA template, and the output is the steadystate expression of a GOI. In our system, we use a transcription activatorlike effector (TALE) protein as a repressor. TALE proteins recognize DNA by a simple DNAbinding mechanism (Doyle, 2013) and have been shown to bind fully noncooperative (Figure 2) (SegallShapiro et al., 2018). The promoter controlling the GOI has been engineered to contain a binding site of a TALE protein. When the TALE protein is bound to the promoter, the expression of the GOI is repressed as demonstrated by 2018 iGEM Thessaloniki.
We have modeled the function of the genetics of this system. An analytical steadystate solution of the system showed that the steadystate expression level of a GOI is completely independent to plasmid copy number and can be independent of transcriptional and translational rates when the right design choices are made. After further verification through the implementation of a full ordinary differential equation (ODE) model, we designed experiments to test the independence to these variables. Using the model, we were able to identify some key design choices of our project. These consist of:
 The need for good insulation of the genes.
 The promoter strengths of the TALE protein and the GOI need to maintain the same ratio.
 The ribosome binding site strengths of the TALE protein and the GOI need to maintain the same ratio.
Reaction Kinetics
In this section, we explain the kinetics of our iFFL and derive a system of ordinary differential equations to describe the interactions within the genetic circuit. We derive a steadystate solution from the system of equations and describe the properties of the system. In the next sections, we use the steadystate solution and ODE model to describe how our circuit can be used to transfer genetic circuits between prokaryotes. Figure 3 depicts all interactions considered in our system.

Click here to find out more about the details of the kinetic model﹀
 The amount of TALE protein is much larger than the number of binding sites for TALE.
 TALE binding and unbinding occurs much more rapidly than protein production and degradation.
 When the promoter is repressed the expression level is negligable.

Click here to see the full derivation﹀
By making use of assumption 2, we can assume a quasisteady state for $\frac{dP_G}{dt}$ and $\frac{dP_{G.T}}{dt}$. Quasisteady state means we assume the equation reaches steadystate much faster than the other variables in the model and thus the rate of change is zero. This results in the following system of equations:
 ${dm_T \over dt} = {c \cdot a_T  y_m \cdot m_T}$
 $\frac{dT}{dt} = b_T \cdot m_T  y_T \cdot T  n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T} + n \cdot (n1)\cdot y_T \cdot P_{G.T}$
 $\frac{dP_G}{dt} = k_{off} \cdot P_{G.T}  n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$
 $\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G  k_{off} \cdot P_{G.T}  n \cdot y_T \cdot P_{G.T} = 0$
 $\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T}  y_m \cdot m_G $
 $\frac{dG}{dt} = b_G \cdot m_G  y_G \cdot G $
We can now use equation 3 to simplify the system:
 $\color{red}{k_{off} \cdot P_{G.T}  n \cdot k_{on} \cdot T^n \cdot P_G} + n \cdot y_T \cdot P_{G.T} = 0$
 The red part can be taken to one side of the equation:
 $\color{red}{n \cdot k_{on} \cdot T^n \cdot P_G  k_{off} \cdot P_{G.T}} = n \cdot y_T \cdot P_{G.T}$
 Then we substitute that expression in equation 2:
 $\frac{dT}{dt} = b_T \cdot m_T  y_T \cdot T \color{red}{ n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}}+ n \cdot (n1)\cdot y_T \cdot P_{G.T}$
 Which becomes:
$\frac{dT}{dt} = b_T \cdot m_T  y_T \cdot T (n \cdot y_T \cdot P_{G.T}) + n \cdot (n1)\cdot y_T \cdot P_{G.T} = b_T \cdot m_T  y_T \cdot T$
Furthermore, we can use equation 3 to get an expression for $P_G$
 $k_{off} \cdot P_{G.T}  n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T} = 0$
 $P_G = \frac{k_{off}}{k_{on}} \cdot P_{G.T} + \frac{1}{k_{on}} \cdot n \cdot y_T \cdot P_{G.T}$
Using assumption 1, we can say T >> c. It follows that the amount of free repressor barely changes when some of the repressors bind to $P_G$, meaning $T \approx T + nP_{G.T}$
 $P_G = \frac{k_{off}}{k_{on}} \cdot P_{G.T}$
 Using: $c = P_G + P_{G.T}$ we get the following:
 $P_G = \frac{c}{1 + K_D \cdot R^n}$, where: $K_D = \frac{k_{off}}{k_{on}}$
 Plugging this into equation 5 and again making use of $c = P_G + P_{G.T}$, equation 5 becomes:
 $\frac{dm_G}{dt} = c \cdot (a_{Gmin} + (a_{Gmax}  a_{Gmin})[\frac{K_D}{K_D + R^n}])  y_m \cdot m_G$
Assumption 3 tells us that we can assume $a_{Gmin} \approx 0$. Again using assumption 1, we can say $T^n >> K_D$, resulting in $T^n + K_D \approx T^n$. Using these two assumptions equation 5 can be further simplified to:
 $\frac{dm_G}{dt} = c \cdot a_{G} \cdot[\frac{K_D}{T^n}]  y_m \cdot m_G$, where $a_{G} = a_{Gmax}K_D$
Using this reduced system of equations we can now derive the steadystate solution for the GOI.
 $ c \cdot a_T  y_m \cdot m_T = 0$ $\rightarrow$ $ m_T = c \frac{a_T}{y_m} $
 $ b_T \cdot m_{T}  y_T \cdot y_T \cdot T = 0$ $\rightarrow$ $ T = \frac{b_T \cdot m_{T}}{y_T}$
 $ c \cdot a_{G} \cdot[\frac{K_D}{R^n}]  y_m \cdot m_G = 0$ $\rightarrow$ $ m_G = c \frac{a_{G} \cdot[\frac{K_D}{R^n}]}{y_m}$
 $ b_G \cdot m_{G}  \cdot y_G \cdot G = 0$ $\rightarrow$ $ G = \frac{b_G \cdot m_{G}}{y_G} $
Plugging everything into the last equation gives: $$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_Tb_Ty_Gy_m}\right)$$
From these interactions we can derive the following system of ordinary differential equations:
${dm_T \over dt} = {c \cdot a_T  y_m \cdot m_T}$
$\frac{dT}{dt} = b_T \cdot m_T  y_T \cdot T  n \cdot k_{on}\cdot T^n \cdot P_G + n \cdot k_{off} \cdot P_{G.T}$
$\frac{dP_G}{dt} = k_{off} \cdot P_{G.T}  n \cdot k_{on} \cdot T^n \cdot P_G + n \cdot y_T \cdot P_{G.T}$
$\frac{dP_{G.T}}{dt} = n \cdot k_{on} \cdot T^n \cdot P_G  k_{off} \cdot P_{G.T}  n \cdot y_T \cdot P_{G.T} $
$\frac{dm_G}{dt} = a_{Gmax} \cdot P_G + a_{Gmin} \cdot P_{G.T}  y_m \cdot m_G $
$\frac{dG}{dt} = b_G \cdot m_G  y_G \cdot G $
Parameter Value Unit Explanation Source $a_T$ 1.03 nM/min Transcription rate TALE 2018 iGEM Thessaloniki $y_m$ log(2)/5 1/min Degradation rate mRNA Kushwaha and Salis (2015) $b_T$ 0.44 1/min Translation rate TALE 2018 iGEM Thessaloniki $y_T$ 0.0347 1/min Degradation rate TALE Assuming degradation is only dependent on growth rate (20 min) n 1  Cooperativity of binding SegallShapiro et al. (2018) $k_{on}$ 9.85 1/(nM*min) Binding of TALE to promoter 2018 iGEM Thessaloniki $k_{off}$ 2.19 1/min Unbinding of TALE to promoter 2018 iGEM Thessaloniki $a_{Gmax}$ 3.78 1/(nM*min) Maximum transcription of GFP 2018 iGEM Thessaloniki $a_{Gmin}$ 0 1/(nM*min) Minimum transcription of GFP SegallShapiro et al. (2018) $b_G$ 3.65 1/min Translation rate GFP 2018 iGEM Thessaloniki $y_G$ 0.0347 1/min Degradation rate GFP Assuming degradation is only dependent on growth rate (20 min) $c$ variable Unitless Plasmid copy number of plasmid
Variable Explanation $m_T$ Concentration of TALE mRNA T Concentration of TALE $P_G$ Promoter GFP $P_{G.T}$ Promoter GFP with TALE bound $m_G$ Concentration of mRNA GFP G Concentration of GFP
Simplification of the system
This system can be simplified by making a few assumptions (SegallShapiro et al., 2018):
Using these assumptions, we can derive analytically a steadystate solution for this system. This derivation results in the following steadystate solution:
$$G = \left(\frac{c}{c^n}\right) \left(\frac{a_Gb_Gy_T^ny_m^n}{a_Tb_Ty_Gy_m}\right)$$
According to our analytical solution, the level of the protein of interest is only dependent on plasmid copy number, and the ratios of transcription and translation rates of the genes in the circuit. In the next sections, we use this steadystate solution to demonstrate how it can be used to transfer genetic circuits between organisms. Furthermore, we solve the full system of ordinary differential equations in Matlab to gain insight in the kinetics of the system.
Plasmid Copy Number
The expression levels in a genetic circuit are strongly correlated to the plasmid copy number of the DNA template (SegallShapiro et al., 2018). The amount of plasmid copy number can change when the plasmid is transferred between organisms. Therefore there is a need for expression levels independent of plasmid copy number if the same genetic circuit is used in different organisms. The steadystate solution of the model tells us that when our repressor binding is fully noncooperative, n = 1, we have complete independence of plasmid copy number:
$$G = \left(\frac{\color{red}c}{\color{red}c^\color{red}n}\right)_{\color{red}n\color{red}=\color{red}1} \left(\frac{a_Gb_Gy_T^ny_m^n}{a_Tb_Ty_Gy_m}\right)$$
This formula is however based on a few assumptions. To see how the system would behave without making these assumptions we implemented the full system of ordinary differential equations (Figure 4).
The model without assumptions has the same expression level independent of plasmid copy number (Figure 4). We therefore can transfer our circuit between organisms and expect the expression of the GOI to be independent of the changes in plasmid copy number of our orthogonal plasmid.
Wet lab
We tested the prediction of plasmid copy number independence by implementing the iFFL system, with GFP as the output. We cloned the system in backbones containing different origins of replication. As a control, we also cloned GFP into the same backbones to demonstrate different levels of expression. More information can be found here.
Download our code here.Transcriptional Variations
Every promoter might have a different strength when used in different organisms (Yang et al., 2018). Thus, when using the same promoter in different organisms, you can get unpredictable behavior. The steadystate solution of the model tells us that the steadystate expression level of the GOI is only dependent on the ratio of transcription rates of the GOI and TALE.
$$G = \left(\frac{c}{c^n}\right) \left(\frac{\color{red}a_\color{red}G \color{black} b_Gy_T^ny_m^n}{\color{red}a_\color{red}T \color{black} b_Ty_Gy_m}\right)$$
In our full kinetic model we vary both transcription parameters. In Figure 5 we plot the resulting steadystate solutions as a function of the transcription rate of TALE and GFP.
The full kinetic model shows that the expression level of GFP is the same when the transcription rate of TALE and of GFP remain constant (Figure 5). In order to achieve a constant ratio of transcription rates in our genetic circuit, we use the T7 orthogonal transcription system which is transcribed by its own RNA polymerase. We implemented T7 promoters with varying strengths compared to the wildtype, developed by (Ryo Komura et al., 2018). More information can be found here.
Wet lab
We have successfully demonstrated the prediction of transcription rate independence when the same ratio in the transcription rate of both genes is maintained.
We made variations of the system where we changed the promoters of both genes to a 50% strength version of that same promoter (Figure 6). As a control, we also cloned GFP without repression under the control of these same promoters.
Furthermore, we also tested independence to transcriptional variation by using different IPTG concentrations (Figure 7). Again, as a control, we cloned GFP without repression under the control of these same promoters.
More information can be found here.
Download our code here.
Translational Variations
As for in transcription, the model steadystate solution tells us that the steadystate expression level of the GOI is only dependent on the rate of translation of the GOI and TALE:
$$G = \left(\frac{c}{c^n}\right) \left(\frac{a_G \color{red}b_\color{red}Gy_T^ny_m^n}{a_T \color{red}b_\color{red}Ty_Gy_m}\right)$$
In Figure 6, we plot the resulting steadystate solutions as a function of the translation rate of TALE and GOI using the full kinetic model to see how the system without assumptions behaves.
As can be seen in Figure 8 the full kinetic model maintains the same level of GFP expression when the translation rates for both genes remain in a constant ratio. To keep the same ratio in translation rates across organisms we used the same ribosome binding site (RBS) for both genes. Using the same RBS ensures that translation initiation for both genes change in a similar manner (Salis et al., 2009), more on the design choices can be found here.
Wet lab
We tested the prediction of independence to variation in translational rates by implementing the iFFL system, where the output is GFP. We made variations of the system where we change both ribosome binding sites in the same way. As a control, we also cloned GFP without repression under control of the same ribosome binding sites. More information can be found here.
Download our code here.Importance of Insulation
In the model solutions so far we assumed the promoter of the GOI to be completely insulated from expression from the TALE protein. However, in reality when two transcription units are placed in series leaky expression of the second gene can occur. This is due to the efficiency of the terminator of the first gene (YingJa et al,. 2018). The iFFL system originally developed by SegallShapiro et al. (2018), uses the ECK120029600 terminator for the TALE protein. This terminator has a reported efficiency of 1/612, meaning that for every 612 TALE proteins produced, 1 protein of the GOI is made (YingJa et al,. 2018). We incorporate this efficiency into the model and solve again for steadystate GOI expression levels to see the effect of terminator efficiency on plasmid copy number independence.
The model shows that the leaky expression negatively impacts the system's ability to adapt to plasmid copy number (Figure 9). We therefore designed our system to have the transcriptional unit of TALE in a different orientation than the transcriptional unit of the GOI (Figure 10).
Codon Usage  Crossspecies Codon Harmonization
When transferring genetic circuits across different organisms, translation rates are not only dependent on variations in translation initiation but also on translation elongation as codon usage preferences can change.
When heterologous proteins are expressed in new bacterial host cells, they exhibit altered protein expression levels compared to the one observed in the original microorganism.
One of the reasons for a lower expression level is the variation in codon usage between the original organism and the new host cell (Angov et al., 2008).
Bacterial cells contain 20 different amino acids encoded by 64 codons (excluding 3 stop codons). Most of the 20 amino acids are encoded by more than one codon (Nascimento et al., 2018). Nascimento et al. (2018) have proven that cells are making great use of this codon choice, since codon usage directly affects the translation rate. They showed that proteins expressed at high levels contain more frequently used codons, corresponding to higher translation rates. In order to increase the expression level of a heterologous protein in the host cell, new codon optimization tools were developed (Mignon et al., 2008). The codon optimization tools currently available are based on two different algorithms:
 Codon optimization tools: The objective of a codon optimization tool is to achieve the highest translation rate possible. The translation rate is increased by substituting each codon with the codon that is used most frequently for the corresponding amino acid. Moreover, the codons are selected such that their sequence keeps the ribosomal binding site (RBS) freely accessible for the ribosomal subunit RBS by avoiding hairpin formation at the translation initiation site (Puigbò et al., 2018).
The relative codon frequencies are calculated through the Codon Adaptation Index (CAI) as shown in the following equation. In this equation $w_i$ is the CAI, $f_i$ is the frequency of a particular codon, and $max(f_i)$ is the codon that is used most frequently for the corresponding amino acid.  Codon harmonization tools: The basic idea of a codon harmonization tool is to mimic the native translation rate in the host organism by using rare codons at specific places and avoiding hairpin formation as much as possible. The codon usage of the original microorganism functions as a reference (Athey et al., 2017). This approach allows prefolding of the protein during translation in order to reduce the chance of the protein misfolding as much as possible (Figure 11).
$$w_i = \frac{f_i}{max( f_i )}$$
The current limitation of both codon adaptation tools is that the adaptation of the DNA coding sequence can be performed for one single organism at the time.
There is not an option to adapt the DNA coding sequence into one single DNA coding sequence adapted to function across multiple species.
We wanted to enable consistent gene translation elongation rates between bacteria. Therefore we created the first crossspecies codon harmonization tool.
This harmonization tool provides the user with a single DNA coding sequence that will yield the same protein expression level in different bacterial host cells.
In our tool, we take already existing coding sequences and alter the codon usage to yield similar codon usage in multiple organisms, and thus giving similar translation elongation rates in those organisms.
We based our tool on the idea of harmonization. However, instead of matching the codon usage of the host, we use a coding sequence that already functions in a certain organism and try to match the same codon usage in different organisms.
Furthermore, the tool also allows you to make the coding sequence BioBrick compatible by removing standard restriction sites for RFC10 or Type IIs standards.

Click here to find out more about the codon harmonization﹀
 The filtered database containing only the codon usage of the organisms of interest and the reference organisms codon usage (data_formatted = output file name of the filtered database).
 Database containing recognition sites for type IIS restriction enzymes (restriction_enzymes_database).
 A database containing recognition sites for type IIS restriction enzymes (restriction_enzymes_database).
 Nucleotide sequence of the gene of interest that will be harmonized.
 A codonharmonized nucleotide sequence usable in all organisms of interest.
 The deviation of the endogenous GC content of each organism of interest from the mean GC content across these organisms.
Preselection data
Our harmonization tool is based on a database containing the codon usage of 152903 microorganisms obtained from Athey et al. (2017). Since working with such a big database slows down the tool, we designed a MATLAB script that preselects the data of microorganisms of your interest. This MATLAB script is available in the supplementary list below.
The taxonomy identification number (taxid) is used as an input for the preselection, the standard for organism specification. The organism in which the coding sequence has been shown to work functions as the reference point for the tool. The regions with highfrequency codon usage will be substituted with high codon frequency usage in the organism of interest, and lowfrequency codon usage regions will be substituted with lowfrequency codon usage.
In our case we used eGFP as the heterologous protein and Escherichia coli strain BL21(DE3) as host (taxid 469008). When generating the filtered database containing data of only the organisms of interest, we designed the code in such a way that the first input row will contain data of the reference organism, and the rows below contain data of the potential new host organisms. Each new row is a new potential organism. A schematic representation of the preselection is shown in Figure 12.Harmonization code
Before the harmonized coding sequence for the organisms of interest is generated, the deviation of the endogenous GC content of each organism of interest from the mean GC content is calculated. This calculation has been added to inform the user about whether the GC content of the organisms of interest is very different from each other. In case the GC content for one of the organisms deviates more than 5% from the mean, the harmonized coding sequence may not function in the same way as the host. Users are therefore notified if the GC content deviation is larger than 5%. The codon frequency is calculated in the same way as was described by Athey et al. (2017). We calculate the codon frequency and variance in the same way as Athey et al. (2017). We order the calculated variance for each position from lowest to highest. In the first iteration of generating the final sequence, we use the lowest variance codon at every position. The generated sequence then goes through screening for selected restriction enzymes recognition sequences to make it MoClo compatible and easier to use in a construct. In case a site is found, the codon at that particular position is substituted with the synonymous codon that has the secondlowest variance.
Inputs and Outputs
Our codon harmonization tool requires three inputs and will generate two outputs. Inputs:
The codon harmonization script will generate the following two outputs:
Obtained output
For the validation for the model we used the microorganisms listed below:Name organism taxid Escherichia coli BL21 (DE3) 469008 Vibrio natriegens NBRC 15636 = ATCC 14048 = DSM 759) 1219067 Bacillus subtilis subsp. subtilis str. 16 224308 As input sequence we use the coding sequence for eGFP (click here to get this sequence)
The generated output sequence is here
Download our crossspecies harmonization tool here.
References
 Angov, E., Hillier, C. J., Kincaid, R. L., & Lyon, J. A. (2008). Heterologous protein expression is enhanced by harmonizing the codon usage frequencies of the target gene with those of the expression host. FEMS PLoS ONE, 3(5): e2189.
 Athey, J., Alexaki, A., Osipova, E., Rostovtsev, A., SantanaQuintero, L. V., Katneni, U., KimchiSarfaty, C. (2017). A new and updated resource for codon usage tables. BMC bioinformatics, 18(1), 391.
 Chen, Y.J., Liu, P., Nielsen, A. A. K., Brophy, J. A. N., Clancy, K., Peterson, T., & Voigt, C. A. (2013). Characterization of 582 natural and synthetic terminators and quantification of their design constraints. Nature Methods, 10(7), 659–664.
 Doyle, E. L. (2013). Computational and experimental analysis of TALeffectorDNA binding. Plant Pathology and Microbiology, Iowa State University. Dissertation.
 Federhen, S. (2011, April 7). Entrez Taxonomy Quick Start.
 Jain, A. and P. Srivastava (2013). "Broad host range plasmids." FEMS Microbiology Letters 348(2): 8796.
 Kushwaha, M., & Salis, H. M. (2015). A portable expression resource for engineering crossspecies genetic circuits and pathways. Nature Communications, 6(1).
 Mignon, C., Mariano, N., Stadthagen, G., Lugari, A., Lagoutte, P., Donnat, S., Werle, B. (2018). Codon harmonization – going beyond the speed limit for protein expression. FEBS Letters.
 Nascimento, J. de F., Kelly, S., Sunter, J., & Carrington, M. (2018). Codon choice directs constitutive mRNA levels in trypanosomes. ELife.
 Puigbò, P., Bravo, I. G., & GarciaVallve, S. (2008). CAIcal: a combined set of tools to assess codon usage adaptation. Biology direct, 3, 38.
 Ryo Komura, W. A., Keisuke Motone, Atsushi Satomura, Mitsuyoshi Ueda (2018). "Highthroughput evaluation of T7 promoter variants using biased randomization and DNA barcoding." PLOS ONE.
 Salis, H. M., et al. (2009). "Automated design of synthetic ribosome binding sites to control protein expression." Nature Biotechnology 27(10): 946950.
 SegallShapiro, T. H., et al. (2018). "Engineered promoters enable constant gene expression at any plasmid copy number in bacteria." Nature Biotechnology 36: 352.
 Yang, S., et al. (2018). "Construction and Characterization of BroadSpectrum Promoters for Synthetic Biology." ACS Synthetic Biology 7(1): 287291.