Introduction
Modern biological studies rely on computation. Completion of the human genome project and the modeling of protein folding would not have been possible without advances in the analysis and interpretation of data. In iGEM, mathematical models provide insight into the mechanisms of biological processes. A perfect example is the construction of the Collins switch[1], the first engineered genetic toggle switch, where the mathematical model guided the design choices of the project. We built a descriptive model of differential equations that characterizes the expression of the output proteins of our system. Our model sets out to characterize the effectiveness of our bistable switch system, specifically as it switches between different states. The purpose of this is to be able to compare it with other existing biological switches. A secondary goal is to test the elasticity of variables in our model so that we understand where further improvements could be made.
Notation
The following symbols are for the concentrations of the components to be measured on our system:
The flowchart below is a simplification of the reactions occuring on our system. Note that \(I_A\), and \(I_L\) represent the arabinose and lactose as the inducers respectively. They will be passed to our system as inputs.
Center section of the flowchart represents the bindings of the sgNAs with the dCas protein and with the asRNAs relating to both GFP and RFP. The sgRNA bindings are reversible reactions as represented by the left pointing arrows in the diagram. The reactions in the system are modeled through mass-action kinetics resulting in a system of first order differential equations.
Model
In the model \(\gamma_x\) is the symbol to be used to express the degradation of proteins or nucleic acid \(x\). If there is no subscript, then it refers to the global dilution of the system. We decided to include global dilution into our model because the simulation was expected to run time periods in hours. \(\alpha\)'s are used for protein expression constants and \(\beta\) for unmodified protein expression constants. Expression of messenger RNAs and the production of proteins is modeled following the methods in chapter 7 of [2]. \(k\) is used for the binding of sgRNAs and \(\eta\) for induced mRNA expression. dCas9 expression is modeled as in [3]: \begin{gather} \frac{d}{dt}[M_d]= \beta_d - (\gamma_{M_d}+\gamma)[M_d]\\ \frac{d}{dt}[d] = \alpha_d[M_d] + k_{g:c}^r([C_G]+[C_R]) - k_{g:c}^f([S_G]+[S_R])[d] -(\gamma_d+\gamma)[d]\\ \end{gather} Through mass action kinetics we model the expression of dCas9 as well as its interaction to the RNA regulators.
RNA regulators: \begin{gather} \frac{d}{dt} [S_G]= \eta_l(S_G) + k_{g:c}^r[C_G] + k_a^r[AS_G] - k_{g:c}^f[S_G][d] - k_a^f[A_G][S_G] - (\gamma_{S_G}+\gamma) [S_G]\\ \frac{d}{dt}[S_R]= \eta_a(S_R) + k_{g:c}^r[C_R] + k_a^r[AS_R] - k_{g:c}^f[S_R][d] - k_a^f[A_R][S_R] - (\gamma_{S_R}+\gamma) [S_R]\\ \frac{d}{dt} [A_G] = \eta_a(A_G) - k_a^f[A_G][S_G] + k_a^r[AS_G]- (\gamma_{A_G}+\gamma)[A_G]\\ \frac{d}{dt} [A_R] = \eta_l(A_R) - k_a^f[A_R][S_R] + k_a^r[AS_R]- (\gamma_{A_R}+\gamma)[A_R]\\ \end{gather} This models the interactions of the RNA regulators as well as their expression. \(\eta\) refers to the rate of expression which is induced either by arabinose or IPTG, and modeled as in [4]: \begin{gather} \eta_a(X) =K_a^{max}(X)\dfrac{[Arab:AraC]^{n}}{(K_a^{half}(X))^{n}+[Arab:AraC]^{n}},\hspace{2cm} \eta_l(X) =K_l^{max}(X)\dfrac{[IPTG:Lacl]^{n}}{(K_l^{half}(X))^{n}+[IPTG:Lacl]^{n}} \end{gather} Fluorescent Proteins production: \begin{gather} \frac{d}{dt} [M_G] = \delta_G - (\gamma_{M_G}+\gamma) [M_G]\\ \frac{d}{dt} [G] = \alpha_G[M_G] - (\gamma_G+\gamma) [G]\\ \frac{d}{dt} [M_R] = \delta_R - (\gamma_{M_R}+\gamma) [M_R]\\ \frac{d}{dt} [R] = \alpha_R[M_R] - (\gamma_R+\gamma) [R]\\ \end{gather} Here the production is affected by the CRISPRi hence it depends on the concentrations of dcas9 and single guide RNAs: \begin{gather} \delta_G = \beta_R/\varepsilon_G + \beta_G\dfrac{1}{1+([C_G] \frac{k_{c:p}^f}{k_{c:p}^r})^{n_r}}, \hspace{2cm} \delta_R = \beta_R/\varepsilon_R + \beta_R\dfrac{1}{1+([C_R] \frac{k_{c:p}^f}{k_{c:p}^r})^{n_r}} \end{gather} sgRNA Binded Molecules: \begin{gather} \frac{d}{dt} [C_G] = k_{g:c}^f [S_G] [d] - k_{g:c}^r [C_G] -\gamma[C_G]\\ \frac{d}{dt} [C_R]= k_{g:c}^f [S_R][d] - k_{g:c}^r [C_R] -\gamma[C_R]\\ \frac{d}{dt} [AS_G] = k_a^f[A_G][S_G]-k_a^r[AS_G]-\gamma[AS_G]\\ \frac{d}{dt} [AS_R] = k_a^f[A_R][S_R]-k_a^r[AS_R]-\gamma[AS_G] \end{gather} Now we make the assumption that the guide RNAs bind to the asRNAs and the dcas9 at a faster rate so they are always on a balanced state. This is known as the Quasi-Steady State assumption, more information on it can be found on [2]. For the QSS assumption we consider: \begin{gather} \frac{d}{dt} [C_G] = k_{g:c}^f [S_G] [d] - k_{g:c}^r [C_G^{QSS}] -\gamma[C_G^{QSS}]=0\\ \frac{d}{dt} [C_R]= k_{g:c}^f [S_R] [d] - k_{g:c}^r [C_R^{QSS}] -\gamma[C_R^{QSS}]=0\\ \Downarrow\\ [C_G^{QSS}] = \frac{k_{g:c}^f [S_G] [d]}{k_{g:c}^r+\gamma}\\ [C_R^{QSS}]= \frac{ k_{g:c}^f [S_R] [d]}{k_{g:c}^r+\gamma} \end{gather} We can also apply QSS to the antisense part with: \begin{gather} \frac{d}{dt} [AS_G] = k_a^f[A_G][S_G]-k_a^r[AS_G^{QSS}] -\gamma[AS_G^{QSS}]= 0 \\ \frac{d}{dt} [AS_R] = k_a^f[A_R][S_R]-k_a^r[AS_R^{QSS}] -\gamma[AS_R^{QSS}] = 0\\ \Downarrow\\ [AS_G^{QSS}] = \frac{k_a^f[A_G][S_G]}{k_a^r+\gamma}\\ [AS_R^{QSS}] = \frac{k_a^f[A_R][S_R]}{k_a^r+\gamma} \end{gather}
Simplifying the complete system to 7 differential equations, the finalized system that we will use: \begin{gather} \frac{d}{dt}[M_d]= \beta_d - (\gamma_{M_d}+\gamma)[M_d] \tag{1}\\ \frac{d}{dt}[d] = \alpha_d[M_d] + k_{g:c}^r([C_G]+[C_R]) - k_{g:c}^f([S_G]+[S_R])[d] -(\gamma_d+\gamma)[d]\tag{2}\\ \frac{d}{dt} [S_G]= \eta_l(S_G) + k_{g:c}^r[C_G] + k_a^r[AS_R] - k_{g:c}^f[S_G][d] - k_a^f[A_G][S_G] - (\gamma_{S_G}+\gamma) [S_G] \tag{3}\\ \frac{d}{dt}[S_R]= \eta_a(S_R) + k_{g:c}^r[C_R] + k_a^r[AS_R] - k_{g:c}^f[S_R][d] - k_a^f[A_R][S_R] - (\gamma_{S_R}+\gamma) [S_R]\tag{4}\\ \frac{d}{dt} [A_G] = \eta_a(A_G) - k_a^f[A_G][S_G] + k_a^r[AS_G]- (\gamma_{A_G}+\gamma)[A_G] \tag{11}\\ \frac{d}{dt} [A_R] = \eta_l(A_R) - k_a^f[A_R][S_R] + k_a^r[AS_R]- (\gamma_{A_R}+\gamma)[A_R] \tag{12}\\ \frac{d}{dt} [M_G] = \delta_G - (\gamma_{M_G}+\gamma) [M_G] \tag{7}\\ \frac{d}{dt} [G] = \alpha_G[M_G] - (\gamma_G+\gamma) [G] \tag{8}\\ \frac{d}{dt} [M_R] = \delta_R - (\gamma_{M_R}+\gamma) [M_R] \tag{9}\\ \frac{d}{dt} [R] = \alpha_R[M_R] - (\gamma_R+\gamma) [R] \tag{10} \end{gather} With assumptions: \begin{gather} [C_G^{QSS}] = \frac{k_{g:c}^f [S_G] [d]}{k_{g:c}^r+\gamma} \tag{5*}\\ [C_R^{QSS}]= \frac{ k_{g:c}^f [S_R] [d] }{k_{g:c}^r+\gamma} \tag{5*}\\ [AS_G^{QSS}] = \frac{k_a^f[A_G][S_G]}{k_a^r+\gamma} \tag{13*}\\ [AS_R^{QSS}] = \frac{k_a^f[A_R][S_R]}{k_a^r+\gamma} \tag{14*} \end{gather} And substitutions: \begin{gather} \delta_G = \beta_G/\varepsilon_G + \beta_G(1-\frac{1}{\varepsilon_G})\dfrac{1}{1+([C_G] \frac{k_{c:p}^f}{k_{c:p}^r})^{n_r}} \tag{15}\\ \delta_R = \beta_R/\varepsilon_R + \beta_R(1-\frac{1}{\varepsilon_R})\dfrac{1}{1+([C_R] \frac{k_{c:p}^f}{k_{c:p}^r})^{n_r}}\tag{16}\\ \eta_a(X) = K_a^{max}(X)\dfrac{[Arab:AraC]^{n}}{(K_a^{half})(X)^{n}+[Arab:AraC]^{n}}\tag{17}\\ \eta_l(X) =K_l^{max}(X)\dfrac{[IPTG:Lacl]^{n}}{(K_l^{half}(X))^{n}+[IPTG:Lacl]^{n}}\tag{18} \end{gather}
Results
Using our estimated parameters, we ran simulations of our model on Python to characterize the time it takes to switch between the states. The inducer levels inputted were similar to the ones tested in [5]. This, so that we could compare the results of our project with theirs. The inputs had to be provided at multiple points adjusted multiple times to simulate the inducer change that causes the state switching. The system was started with concentrations of \(200000nM\) for arabinose and negligible concentrations of \(IPTG\). This was to understand how our model reacts to the inducers to reach an equilibrium state.
As part of a collaboration, GT college ran multiple runs of this simulation. A piece of key information from this assay was how incrementing the amount inducer for the protein we want produced also helped to repress the production of the other protein. We then ran our simulations for the switching behavior of the model. We start with the same inducer amounts as last time, but now at 5 hours in the concentrations of inducers where changed to \(200000nM\) of IPTG inducer and negligible ones for arabinose. This inducer as shown in the diagram is the one responsible for the transcription of \(S_G\)and \(A_R\), hence repressing the production of GFP and de-repressing the RFP production. Later, at 10 hours in, the inducer concentrations were once again swapped.
Our finishing graph shows that the system will take around 5 hours to switch between states. Unfortunately comparing it to experimental results We see that the system begins switching states almost immediately, this is different from results observed in [5]. We account for this to the measuring of fluorescence instead. Especially since in the Collins switch this same quick-change behavior can be observed. Collins switch takes 6 hours while ours should take around 5 for the change of state so we expect it to be indeed faster. It is indeed more complicated than collins but we believe the strength of our witch is in its orthogonality, specificity and versatility.
Estimated Parameters
Most of the parameters arise from literature:
The rest of the symbols are as follows:
The \(\eta\), \(\delta\), and complexes are substitutions or inputs so they don't have any constants attached to them. For \(\beta_G, \beta_R\), we considered them to be on the plausible range \(0.0001\text{nM/min}-0.01\text{nM/min}\). Finally, for the \(\alpha\) variables we estimated the values using the peptide elongation rate of E. coli 12 to 21[10], the length of the sequence to be translated and the ribosome concentration in E. coli[11]). This results in a value in the range of 41.52/length of Sequence to 72.66/length of sequence.
References:
[1] Gardner, T., Cantor, C., & Collins, J. (2000). Construction of a genetic toggle switch in Escherichia coli. Nature, 403(6767), 339-342. doi: 10.1038/35002131
[2] Brian P. Ingells. (2013). Mathematical Modeling in Systems Biology. The MIT Press, Cambridge, Massachusetts.
[3] Waterloo iGEM Team. (2014). Math Book: CRISPR Interference. https://2014.igem.org/Team:Waterloo/Math_Book/CRISPRi
[4] Oxford iGEM Team. (2015). Modelling. https://2015.igem.org/Team:Oxford/Modeling
[5] Lee, Y., Hoynes-O'Connor, A., Leong, M., & Moon, T. (2016). Programmable control of bacterial gene expression with the combined CRISPR and antisense RNA system. Nucleic Acids Research, 44(5), 2462-2473. doi: 10.1093/nar/gkw056
[6] Samuel E. Clamons and Richard M. Murray. (2017). Modeling dynamic transcriptional circuits with crispri. bioRxiv.
[7] Halter, M., Tona, A., Bhadriraju, K., Plant, A., & Elliott, J. (2007). Automated live cell imaging of green fluorescent protein degradation in individual fibroblasts. Cytometry Part A, 71A(10), 827-834. doi: 10.1002/cyto.a.20461
[8] Qi, L., Larson, M., Gilbert, L., Doudna, J., Weissman, J., Arkin, A., & Lim, W. (2013). Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression. Cell, 152(5), 1173-1183. doi: 10.1016/j.cell.2013.02.022
[9] HKUST iGEM Team. (2017). Sensing Model. https://2017.igem.org/Team:Hong_Kong_HKUST/Model/Sensing
[10] Karpinets, T., Greenwood, D., Sams, C., & Ammons, J. (2006). RNA:protein ratio of the unicellular organism as a characteristic of phosphorous and nitrogen stoichiometry and of the cellular requirement of ribosomes for protein synthesis. BMC Biology, 4(1). doi: 10.1186/1741-7007-4-30
[11] Ron Milo, Paul Jorgensen, Uri Moran, Griffin Weber, and Michael Springer. (2019). Bionumbers—the database of keynumbers in molecular and cell biology. Nucleic Acids Research, 38(suppl1):D750–D753.