Team:SASTRA Thanjavur/Model

Modelling

Modelling formed an integral aspect of our project that corroborated the interdisciplinary nature of synthetic biology, ranging from collating and utilising results provided by our software team to giving useful suggestions and insights to our wet lab team towards design of experiments and results validation. Our model is designed in terms of providing a quantitative basis towards differential expression of miRNA biomarkers.

Our modelling consists of three major facets

Biomarker choice through differential expression profiling and survival curve analysis
ViennaRNA suite for the design of our toehold constructs
Mass action kinetics model

Biomarker Panel Development

The Cancer Genome Atlas (TCGA) is a reliable cancer genomics program that began in 2006 as a joint effort between the National Cancer Institute and the National Human Genome Research Institute. It is a standalone database that consists of over 20,000 primary cancer and matched normal samples with molecular characterization, covering around 33 different tumour types. In this context, the TCGA has released a large amount of miRNA sequencing data from CESC (Comprehensive RNA Expression Signature for Cervical Cancer) patients [1]. The analysis of differentially expressed miRNA markers obtained from the corresponding cervical cancer miRNA-Seq omics data (obtained from TCGA), for normal tissue samples and cervical adenocarcinoma based tissue samples revealed plausible prognosis value associated with these markers. This opens up a novel way of detecting the survival rates of a patient via this miRNA signature.

Omics Analysis

Normalized and log2-transformed Illumina HiSeq RNASeq gene expression data was processed by the RSEM pipeline (which were in turn obtained from TCGA via the firebrowse.org portal). The patient sample barcodes UUID encoded in variable “Hybridisation REF” was parsed and correspondingly annotated as cancer and normal. The stage information of the patient sample is then stored in the clinical stage variable, which essentially corresponds to the surgical stage, prior to any treatment received, determined with the tissue obtained at the time of surgery.

The miRNA expression analyses were performed using the “limma” package in R [2]. The linear model was subjected to empirical Bayes adjustment to obtain moderated t-statistics. To account for multiple hypothesis testing and the false discovery rate, the p values of the F-statistic of the linear fit were adjusted using the method of Benjamini and Hochberg. miRNA-coding genes were considered Differentially Expressed Genes (DEGs) if the absolute log-fold changes were more than 1.5 and the p value was <0.05 [3]. For top 100 DEGs obtained from the linear model, the following 4 DEGs were chosen based on the above differential expression profiling, corroborated by our extensive literature backing.

Table 1. Differential Expression profiling of 4 chosen DEGs from top 100 DEGs obtained from linear model

DEGs	logFC	Average Expression	t statistic	p value	Adjusted p value	B
hsa-mir-21-5p	-18.6676	18.39681	-27.2302	1.47E-84	7.97E-82	143.6128
hsa-mir-20a-5p	-7.5285	8.7422	-6.59869	1.77E-10	1.16E-09	12.06215
hsa-mir-29a-3p	-12.0232	12.4346	-10.8081	2.38E-23	3.58E-22	36.86442
hsa-mir-200a-3p	-9.31171	10.26658	-6.41446	5.21E-10	3.17E-09	11.15374

Top 4 DEG genes of the linear model with the log-fold change expression of the genes in each stage relative to the controls are given below in the table, followed by p value adjustment for the false discovery rate.

Table 2. Stage wise log fold change of gene expression relative to controls

DEGs	Stage_1 - Control	Stage_2 - Control	Stage_3 - Control	Stage_4 - Control	Average Expression	F value	p value	Adjusted p value
hsa-mir-21	-18.6872	-18.6273	-18.641	-18.6328	18.40209	185.0495	8.12E-80	4.40E-77
hsa-mir-20a	-7.63549	-7.46839	-7.44558	-7.38787	8.720818	11.57827	9.32E-09	6.01E-08
hsa-mir-29a	-11.9232	-12.0534	-12.1063	-12.3912	12.44322	30.29603	3.39E-21	5.41E-20
hsa-mir-200a	-9.23351	-9.46469	-9.44375	-9.01185	10.27391	10.9008	2.88E-08	1.66E-07

Survival Analysis

The differential expression profiling of miRNAs associated with early-stage cervical cancer furnished us with the requisite biomarkers for our project, which was further validated by the survival curve analysis via a Cox proportional hazards model using Kaplan-Meier curves. The association between the expression of differentially expressed miRNAs and the overall survival of a patient was evaluated by univariate Cox proportional hazards regression analysis using the survival R package [4]. The Hazards Ratio (HR) associated with a predictor variable is given by the exponent of its coefficient. A hazard ratio above 1 indicates a covariate that is positively associated with the event probability, and thus negatively associated with the length of survival.

Table 3. Hazard Ratio, Z-Score and p value for three chosen miRNA biomarkers

Gene Symbol	HR	Z-Score	p value
hsa-mir-20a-5p	0.5231	-2.479	0.0132
hsa-mir-200a-5p	1.8075	2.076	0.0379
hsa-mir-21-5p	0.5951	-1.961	0.0498
hsa-mir-29a-3p	0.6735	-1.497	0.1340

The Kaplan-Meier estimator is used to estimate the survival function. The visual representation of this function is usually called the Kaplan-Meier curve, as the probability of surviving in a given length of time while considering time in many small intervals. The time starting from a defined point to the occurrence of a given event, for example death is called as survival time and the analysis of group data as survival analysis [5]. The log rank test is the most common method used for comparing the survival curve. The K-M plot for the differentially expressed miRNA of interest is as follows.

Figure 1. K-M plot for (a) hsa-miR-20a-5p, (b) hsa-miR-200a-5p, (c) hsa-miR-21-5p and (d) hsa-miR-29a-3p

In order to explore the strategy of relying on a panel of miRNA biomarkers for early-stage cervical cancer detection, a predictive model was defined as a linear combination of the expression levels of the three miRNAs weighted by their relative coefficients in the multivariate Cox regression test as follows: survival risk score(SRS) = (-0.5967 * Expression value of hsa-miR-20a) + (0.5376 * Expression value of hsa-miR-200a) + ( - 0.4736* Expression value of hsa-miR-21) [6]. hsa-miR-29a was dropped as a prospective biomarker owing to its p value ≮ 0.05. hsa-miR-20a and hsa-miR-21 showed a negative coefficient in Cox regression analysis indicating a protective effect of the variable with which it is associated whereas hsa-miR-200a showed a positive coefficient indicates a worse prognosis.

Table 4. Cox Regression, Z-Score and P value for three chosen miRNA biomarkers

Gene Symbol	RC	Z-Score	p value
hsa-mir-20a-5p	-0.5967	-2.280	0.0226
hsa-mir-200a-5p	0.5376	1.879	0.0603
hsa-mir-21-5p	-0.4739	-1.786	0.0740

According to the predictive miRNA signature model, the risk score for each of the 307 patients was calculated. The patients were then classified into high-risk or low-risk group by identifying optimal cut point using survminer package. Overall survival curves were generated using the Kaplan-Meier method, and two-sided log-rank tests were employed to compare the differences in overall survival time between the high-risk and low-risk patient groups. The Kaplan-Meier overall survival curves of the two groups based on the three miRNAs were notably different (log-rank p = 0.00068< 0.001). The results of the multivariate Cox regression analysis suggested that one miRNA, hsa-mir-20a can act as an independent prognostic factor in CESC cancer (p < 0.05).

Figure 2. (a) Kaplan-Meier curves of the two groups based on the three miRNA signatures (b) Kaplan-Meier curves of the two groups based on the two miRNA signatures

The multivariate Cox regression model with just the hsa-miR-21 and hsa-miR-20a biomarkers as a biomarker panel was tested over other possible two-marker models, considering the weight of literature on the significance of miR-21 in cancer. The survival risk score (SRS) = (-0.6170*Expression value of hsa_mir_20a) + (-0.4799* Expression value of hsa_mir_21) was calculated. hsa-mirR-20a and hsa-miR-21 showed a negative coefficient in Cox regression analysis indicating a protective effect of the variable with which it is associated, as observed in the previous trial.

The Kaplan-Meier overall survival curves of the two groups based on the two miRNAs were extremely significant (log-rank p = 0.00075 < 0.001), indicating the prospective use of hsa-miR-21 and hsa-miR-20a as members of the biomarker panel for early-stage cervical cancer detection due to minimal differences in the p value between the panel group with three miRNA markers and two miRNA markers. Thus, the consideration of hsa-miR-21 and hsa-miR-20a as biomarkers for cervical cancer stands significant and anchors the biomarker choice for our project

Toehold Design

We used the ViennaRNA suite to visualize the secondary structures of our toehold switches. Specifically, we used the RNAfold web server version to visualize our designed structures. We were able to find and optimize the nucleotides after passing the sequence via RNAfold. The ensemble diversity of the structure provides insight into the average base-pair distance between all the structures in the thermodynamic ensemble and the minimum free energy value of a sequence indicates the energy value of the most stable structure of that sequence. Therefore, we observed that our designed sequences adopted a toehold switch MFE structure.

The calculated free energy value for the second generation toehold switch for hsa-miR-21-5p was -21.4 kcal/mol, with an ensemble diversity of 4.95. The calculated free energy value for the second generation toehold switch for hsa-miR-21-5p was -22.68 kcal/mol, with an ensemble diversity of 8.79.

Base Diagrams created in ViennaRNA suite for (a) hsa-miR-21-5p (b) hsa-miR-20a-5p

Reaction Modelling

In mass action kinetic models, the dynamics of concentration of the product is seen to depend on the instantaneous values of the reactants. Due to the lesser complexity and minimal parameter requirements associated with these deterministic models, we chose to use it for our project. We opted for MATLAB R2017b as a platform to run the simulations as all three of us in the modelling team were familiar with its programming language.

We referred to the excellent documentation of iGEM CLSB-UK (2017) [6] and initially decided to emulate their model for us to get an intuition and insight of their kinetic model and anticipate the plausible errors that could be encountered during our simulations. This emulation in turn gave us intuition and confidence, setting up a substantial premise towards the modelling. A literature survey was also carried out by our modelling team to gain more clarity and gather information on the different mass action kinetic models proposed so far in relation to toehold switches and GFP expression.

Literature Review

We looked into the back references cited by iGEM CLSB-UK (2017) for their modelling and went through the same in detail. Of many papers we looked into, the research article titled Experiment and mathematical modeling of gene expression dynamics in a cell-free system, authored by Stogbeaur et al. and his corresponding thesis proved to be most useful [7]. The former in particular paved the way towards understanding mathematical equations and their formulations in modelling a cell free system, while the latter gave us insights on the cell free system (NEB’s PURExpress Protein Synthesis kit) and the experimental procedure to be followed to carry out fluorescence measurements, where requisite information from our understanding of the thesis was passed onto our sensor team for the design of experiments.

Owing to the use of second-generation toehold switches in our project, it was requisite for us to model the miR-antimiR interactions in the solution system for which we decided to search for methods that could aid us in estimating the kinetic aspects for modelling such interactions. In this search, so as to include the thermodynamic aspects in the model, we went through the documentation provided by iGEM Exeter (2015) and simulated their MATLAB code. Their work represented a Brownian motion based model for interaction of 2 particles in a solution system. We decided to emulate this simulation twice so as to model the interaction between antimiR-miR that facilitates the formation of a primary complex and then model the interaction between the formed complex and toehold to form the secondary complex. We spent quite some time understanding the intricate aspects in the MATLAB code that was provided on their wiki, which involved neglecting the interactions with the walls of the container, formulating equations for the geometry of the container (in our case a round bottom well instead of a microcentrifuge tube used by the Exeter team) so as to define the boundary conditions for the particles in the system [8]. The model in turn aided the visualisation of interactions happening within the system; following this we had an organic discussion with our primary principal investigator, Dr. Ashok Palaniapan, who also happens to be our systems biology instructor. However, post the discussion we found out that the model gave only a spatial representation of the dynamics involved in the container and not any useful information on the kinetics/thermodynamic aspects. Hence, we did not proceed further on this front.

Initial works in the process of Modelling

With no headway in terms of modelling the anti-miR and miR interactions, we shifted our full focus towards modelling the miR and toehold interactions. We decided to do so because, once we simulate the interactions between the miRNA and the toehold it becomes only a matter of replacing the miRNA with complex (miR-anti_miR complex) and then simulating the kinetics of its complex formation with the toehold.

The Coxd “Experiment and mathematical modelling of gene expression in cell free systems” by Stogbauer et.al [9].furnished us with the variations in GFP fluorescence as a function of DNA concentration, which forms the major aspect of our project. The article gave insights on intricate aspects such as including an incubation time to facilitate significant transcription of the toehold DNA to RNA using the master mix. The incubation time was calculated to be 8 minutes from the transcription rate, provided in this article.

As per our plan, we kicked off our simulations via emulating the kinetic model of iGEM CLSB-UK (2017). The initial model was fairly simple as it involved only two of our major reactants, the closed toehold switch and miRNA reacting to give away an intermediate partially bound toehold switch, which eventually forms the open toehold switch upon its linearization.

CTS - Closed Toehold Switch

PBTS - Partially Bound Toehold Switch

OTS - Open Toehold Switch

k - Rates of the respective reactions

The above kinetic equation was translated into the following differential equations:

Using the interactive platforms provided in the iGEM CLSB-UK 2017 wiki, which had sliders for varying the parameters involved in their modelling, we set our parameters such that the forward reactions were more favoured and initially plotted our graphs.

Figure 3. Results obtained from emulating the initial model of CLSB using the following parameters; CTS(0) = 1 M; miRNA(0) = 0.8 M;

k₁ = 0.1 s^-1; k₂ = 0.2 M^-1s^-1; k_-1 = 0.01 s^-1; k_-2 = 0.01 s^-1

The responses we obtained were akin to the plots displayed by the interactive sliders on the iGEM CLSB-UK 2017 wiki, confirming the working of our MATLAB code. Briefly, from the above plot, it could be seen that the open toehold switch concentration increases and saturates with time while the reactants (closed toehold switch and miRNA) concentration decreases with time along with a slight initial increase of the partially bound toehold switch and its subsequent decrease with time, when a threshold concentration of the open toehold switch is reached.

First Generation Toehold Kinetics Modelling

Although this had set a premise towards our modelling journey, it represented a very crude model and hence in order to make our model more realistic and useful, we emulated the final model of CLSB which had certain added quintessential parameters such as transcription rate, translation rate and characteristic decay of individual RNA elements.

This was again translated into the following ODEs:

The ‘sensor team’ provided us with the approximate concentration of PCR amplified DNA and we utilised the same (~120 nM) as the initial concentration for the toehold DNA. The other parameters were maintained similar to what CLSB had used.

Output:

Figure 4. Results obtained from emulating the initial model of CLSB using the following parameters; miRNA(0) = 1 M; k_{transcription} = 1.1×10^-3s^-1; k_translation = 1.7×10^-2s^-1; k_open = 6×10⁵ M^-1s^-1; k_decay = 1.28×10^-3;

The responses we obtained were similar to the plots of the simple kinetic model used before, wherein from the above plot, it could be seen that the GFP concentration increases steadily and saturates with time while the reactants (closed toehold switch and miRNA) concentration decreases sharply with time along with a slight initial increase of the open toehold switch and its subsequent decrease with time, when a threshold concentration of the GFP is reached.

Throughout the process of emulating the work of iGEM CLSB-UK 2017, we had frequent discussions amongst ourselves in terms of interpreting trends and discrepancies exhibited in the graphs. This gave us more clarity and boosted our confidence in the modelling aspect. Following this, we decided to make a plot of GFP expression for different initial concentrations of miRNA so as to check the correctness of expected trends (i.e increasing fluorescence of GFP with increase in concentration of miRNA)

We utilised the rate constants given by iGEM CLSB-UK 2017 for the initial model and later planned to carry out a curve fitting with results provided by the ‘sensor team’.

Plot of GFP expression for different initial concentrations of miRNA:

We varied the miRNA concentration from 1 fM to 1 µM with a 10- fold difference for each simulation, upon receiving inputs from the sensor team on their experiment design. However, on plotting the concentration curves, we observed negative concentration values for miRNA. We speculated that this could be due to a faster decay rate of RNA that was employed in the model. On decreasing the k_decayvalue, the negative values disappeared. We correspondingly chose a value of 3×10^-4which was used in the CLSB model for curve fitting purpose.

The plot for all components at different concentrations of miRNA ranging from 1 fM to 1 µM is shown below. The DNA concentration used for this simulation run was 120 nM. The GFP and OTS curves are alone plotted in semi-logarithmic scale (Y axis in log), owing to the ~10 fold difference in the concentrations with change in miRNA.

k_{transcription} = 1.1×10^-3s^-1; k_translation = 1.7×10^-2s^-1; k_open = 10⁵ M^-1s^-1; k_decay = 3×10^-4;

Figure 5. Graphical representation of concentration of (a) CTS (b) miRNA (c) OTS and (d) GFP with time for varying initial concentration of miRNA from 1 fM to 1 µM for 2 h

It could be seen that the GFP concentration increases with increase in the initial concentration of miRNA. This is expected since, higher concentrations of miRNA would open more closed toehold switches for translation of GFP. Similar trends were observed with OTS also, which in turn propels the GFP expression.

Relationship between miRNA and initial DNA concentration

The dynamics of CTS production (via transcription) depends on the initial concentration of DNA. We felt a necessity to probe into the relationship between concentrations of DNA and miRNA, since the latter binds to the transcribed DNA (CTS) and facilitates GFP expression. Hence, in order to get significant fluorescence an optimum concentration of toehold DNA has to be fixed.

If the initial concentration of DNA is high when compared to miRNA concentration, more amounts of CTS would be present owing to the conversion of the toehold DNA into the corresponding CTS.

So, when the miRNA concentration is lower than that of DNA, the production rate (i.e. transcription) of CTS dominates in the equation, as a result of which we observe overlapping curves in the graph from miRNA concentrations between 1 fM-10 nM.

However, when the DNA and miRNA are at comparable concentrations ~100 nM and > 100 nM, where until 1000 s, the curve increases slowly (as CTS gets converted to OTS). Beyond 1000 s (after miRNA(t)=0), the curve again increases linearly, since the miRNA present is consumed and hence, transcription begins to dominate in the equation. The concentration of the latter toehold DNA when similar to that of miRNA concentration would result in lower CTS amounts, owing to the interaction of miRNA with much of the transcribed CTS (from DNA) resulting in the conversion to OTS.

When miRNA(0) = 1 µM, the concentration of miRNA exceeds that of DNA by ~8 fold. In this case, the consumption of CTS to obtain OTS dominates, and hence we observe very little increase in the concentration of CTS.

Since it is the miRNA that is acting as a switch for initiating the fluorescence in the system, it became imperative for us to probe into the concentration profiles associated with these.

The studies performed by Stogbeuar et.al. clearly indicated the fluorescence to saturate after 2 h and hence we decided to choose an initial concentration of miRNA that gets consumed by that time. At a high concentration of miRNA(0) = 1 µM, we observe a steady decrease in the plot, where even at t = 7200 s, we find the miRNA concentration to be ~100 nM. However, in the case of miRNA(0) = 100 nM and other lesser concentrations, the decay occurs to 0 by 1000 s.

Second Generation Toehold Kinetics Modelling

At this point in time, the major architecture of our model was built; following this, we had to replace the miR/toehold interaction with miR-anti_miR/toehold interaction. Here is where a stiff challenge awaited us; with the aid of CLSB-UK’s documentation we were able to assign meaningful values to the rate constants and proceed with the simulation; however there were no sources we could rely on, to gather information on kinetic aspects of miR-anti_miR interactions.

Obtaining rate constants from the kinetics of miR-anti_miR interaction

The structures obtained from NUPACK simulations [10] of miR-21 and anti_miR-21 corresponded to their MFE state; we initially planned to utilise the concept of MFE towards unearthing the rate constants for the kinetics of miR-anti_miR complex formation.

We were required to know the energy required to form the hydrogen bonds associated with the miR-anti_miR complex formation. If the cumulative energy required to form hydrogen bonds with the participating nucleotides in the miR-anti_miR complex is less than the complex’s MFE (since the free energy contributes to bond formation), the miR and anti-miR in their MFE state could take part in bond formation.

Initially, we did not look into the energy requirements associated with hydrogen bond formation for miR-anti_miR complex formation; from the thermodynamics relationship of Gibbs free energy, we proceed to find the equilibrium constant for MFE;

Calculations for equilibrium constant

Values used: R = 1.987*10^-3kcal mol^-1 K^-1, T = 298 K,

For antimiR-miR binding: ΔG = -22.2 kcal mol^-1

k_eq= 1.9167*10¹⁶

For complex-CTS binding: ΔG = -26 kcal mol^-1

k_eq= 1.1733*10¹⁹

Parameters used:

kmiRantimiR_b=1.9167*10¹⁶; (3.3 * 10^5)

kmiRantimiR_ub=1;

kcomplexcts_b=1.1733*10¹⁰;

ktranscription=1.1*10^-3;

ktranslation=1.7*10^-2;

kdecay=3*10^-4;

We obtained an enormously high value for the equilibrium constant; Our major objective was to find out the rate constants associated with the reactions, for this, we assumed the backward rate to be 1 and estimated the forward reaction rate; we thought it was a reasonable assumption due to very large associated equilibrium constants; However, for such very high rate constants, the MATLAB code halted at some point of time during the execution.

To combat this, we arbitrarily reduced the values of the fed forward rate constants and tried executing the code; the code seemed to work fine for reduced values of rate constants (around 10^5).

Perplexed by the trends, we rechecked our calculations but did not find any discrepancies. However, when checked the MFE values, we found out that those values correspond to the complex and not for the individual anti_miRNA and miRNA. We ascertained this to be cause for such trends.

Also, this time we resorted to a clearer way of ascertaining the equilibrium constant from the NUPACK through the equilibrium concentrations (this we obtained via feeding the separate miRNA and anti_miRNA sequences in NUPACK [3]); with the equilibrium concentrations, we found out the equilibrium constant and correspondingly the forward rate of the reaction. The obtained forward rates were in the order of 10^5 supporting the serendipitous simulation results obtained in the previous case. Also, here in this calculation we substituted a temperature value of around 18 °C (input furnished by the ‘sensor team’, because the incubation of the miR and anti-miR was carried out at this temperature only).

The possibility of temperature giving very high values of equilibrium constants in the previous case was immediately neglected, owing to its little intuitive sense.

For miR-21, we gather that all of the miR and anti_miR had hybridised to yield the complex. However, when we looked into the histogram data, the equilibrium concentrations were found to be:

Complex = 99.97 nM

miRNA and anti_miR: 0.0264 nM

Estimated equilibrium constant from above data: 1*10⁵nM^-1

For miR 20a,

Equilibrium concentrations of complex = 0.099 µM;

miRNA and antimiRNA = 0.001 µM

Estimated equilibrium constant from above data: 9.9*10⁴µM^-1

The sole purpose for us to obtain the equilibrium constant was to get an idea about the rate constants(k_forward/k_backward). We used this value as the forward rate constant by assuming the backward rate constant to be 1.

As per plan, we just had to replace the miRNA terms used in the previous case with complex term (miR-anti_miR) and hence correspondingly we had formulated the following equations,

However, this was in negation to the experimental design, where it was elaborated by the sensor team that miR-anti_miR complex is formed separately and then this complex is allowed to interact with the toehold DNA mixture. Hence, we formulated two separate set of equations that catered to the discussed experimental design;

Set 1:

Set 2:

The initial experimental design of the sensor team was to prepare varying concentrations of the miR and anti_miR separately, incubate them for complex formation and hence, quantify the fluorescence intensity. However, from the simulation results, we obtained a trend that did not follow any logical trend which might be attributed to the low concentrations of miR and anti-miR available in the sample solution in case of diluted concentrations, resulting in higher time for their binding due to the low probability of finding each other spatially in the solution space. Following these results, we suggested the ‘sensor team’ to prepare a stock solution of complex and subject it to serial dilutions for their required concentrations and this proved valuable towards their experimental procedure.
The graph obtained for the suggested simulation method is as follows;

Figure 6. Graph representing the variation in concentration of miRNA, antimiRNA and subsequent complex formed within 2 h time frame with miR-antimiR binding rate constant as 10⁵mol^-1s^-1

The simulation time was chosen to be 7200 s since the sensor team had reported an incubation time of 2 h for the complex formation. However, from the above graph it is evident that the complex formation takes place much more quickly (well before 500 s) and both the complex and miRNA/anti-miRNA attains saturation by then.

Figure 7. Graph representing the variation in concentration of complex, CTS and OTS formed within 2 h time frame with Complex-CTS binding rate constant as 10⁵mol^-1s^-1 However, negative concentration obtained for the complex, which was resolved after inclusion of incubation time for transcription (Figure 9)

Although the intention of any literature survey is to give us a prelude towards mathematical modelling of cell free systems, it surprisingly aided us to cruise an anomaly associated with the occurring negative concentrations of CTS. Since, the emergence of negative concentration made little sense, we looked into means to resolve this issue; The Stogbauer et.al’s paper gave insights on intricate aspects such as including an incubation time so as to facilitate significant transcription of the toehold DNA to RNA using the master mix. The incubation time was calculated to be 8 minutes from the transcription rate.On adding the incubation time associated with transcription of toehold DNA to CTS (around 8 minutes), this anomaly was removed.

Figure 8. Graph representing the variation in concentration of GFP and mature GFP for 2 h time frame with maturation factor of 0.033 s^-1

Modelling GFP Fluorescence

The output of our biosensor experiments was a measure of GFP fluorescence. We also included the GFP maturation factor (0.2 min^-1), which the CLSB-UK team had omitted for their convenience. We utilised the maturation factor/scaling factor to be 79.429 calculated by iGEM Valencia_UPV 2018 (http://parts.igem.org/Part:BBa_E0040) team and used it in our code [11]. We changed the DNA concentration also to 223.5 nM after confirming the final experimental protocol with the ‘sensor team’.
We ran the simulation for a 2 h time frame and we obtained the following graph;

Figure 9. Variation in concentration of Complex, CTS and OTS with time; the ambiguity in terms of negative concentration was resolved via including the incubation time of 8 minutes for transcription process

We plotted a graph to understand the trends in concentration of the various components of our model with time. The CTS concentration initially was around 118 nM owing to the transcriptional incubation time. The complex concentration was obtained on solving the ‘Set 1’ differential equations. The CTS and complex rapidly hybridise to open the CTS, which is evident from the drop in their concentrations and increase in OTS concentration. However, at around 500 s, the complex is completely consumed, following which OTS begins to decay, while CTS increases due to continued transcription.

When we look into the dynamics of GFP and mature GFP, in the initial 500 s, translation of GFP increases as OTS increases, which begins to mature slowly. Following this period, the GFP decreases as the OTS concentration decreases. However, the mature GFP increases due to maturation of the continually produced GFP.

The next stage in our modeling process was to obtain trends for fluorescence intensity with variations in complex concentration. We plotted the intensity at 2 h against the complex concentration and obtained a linear fit with a correlation coefficient value (R² = 0.9993). This showed that our model was working fine and we repeated the process for a time period for 4 h.

Figure 10. The fluorescence intensity varies linearly with concentration of complex for time frame of 2 h

Figure 11. The fluorescence intensity varies linearly with concentration of complex for time frame of 4 h

Table 5. Table for different concentraion of CTS, OTS, GFP and mature GFP with its corresponding fluorescence intensity after 2 h

miRNA/anti-miR concentration	100 pM	1 nM	10 nM	25 nM	50 nM	75 nM	100 nM
Complex (miR-antimiR- Toehold switch)	98 pM	980 pM	9.87 nM	24.69 nM	49.37 nM	74.05 nM	98.74 nM
CTS	739 nM	738 nM	737 nM	736 nM	733 nM	729 nM	726 nM
OTS	0.018 nM	0.18 nM	1.17 nM	3.01 nM	5.85 nM	9.74 nM	12.1 nM
GFP	0.067 nM	0.67 nM	6.66 nM	16.9 nM	32.32 nM	54.1 nM	65.8 nM
Mature GFP	0.0048 uM	0.048 uM	0.48 uM	1.21 uM	2.43 uM	3.08 uM	4.85 uM
Fluorescence Intensity	1.74	17.40	174.16	432.12	868.03	1348.51	1728.04

Table 6. Table for different concentraion of CTS, OTS, GFP and mature GFP with its corresponding fluorescence intensity after 4 h

miRNA/ anti-miR concentration	100 pM	1 nM	10 nM	25 nM	50 nM	75 nM	100 nM
Complex (miR-antimiR- Toehold switch)	98 pM	980 pM	9.87 nM	24.69 nM	49.37 nM	74.05 nM	98.74 nM
CTS	810 nM	810 nM	810 nM	810 nM	810 nM	810 nM	810 nM
OTS	0.0013 nM	0.010 nM	0.13 nM	0.34 nM	0.73 nM	1.13 nM	1.38 nM
GFP	0.0079 nM	0.0734 nM	0.785 nM	1.97 nM	3.23 nM	5.58 nM	7.96 nM
Mature GFP	0.0055 uM	0.055 uM	0.551 uM	1.38 uM	2.75 uM	4.32 uM	5.51 uM
Fluorescence Intensity	4.76	47.73	477.17	1188.49	2382.83	3722.41	4754.4

The trend obtained for 4 h was similar to that of 2 h with the intensity increasing linearly with increase in concentration of complex (R² = 0.9992)

However, on referring to the ‘sensors team’, we gathered that the GFP fluorescence intensity saturated for them after a concentration of 10 nM. We observed that CLSB-UK had faced a similar issue and hence had included a term for GFP decay, which we followed as well (GFPdecay = 5*10^-5). Furthermore, on discussion with our PI, we realised that the probability of the toehold DNA being a limiting factor could be an ad-hoc explanation for the saturation profile (beyond 10 nM) observed with our experimental results. He suggested that even though the ‘sensor team’ had used a DNA concentration of 220 nM there are chances that due to the round bottom wells that were used for the fluorescence measurements, a significant amount of DNA would have been “trapped” to the walls of the wells and hence not available for transcription by the cell-free solutions used, possibly leading to saturation at 10 nM itself. Hence, we decreased the DNA concentration 20 fold to 10 nM.

Figure 12. The fluorescence intensity is seen to exhibit a linear trend from 10 nM to 100 nM following which it saturates

Following implementation of these two changes, we obtained saturation of fluorescence intensity at 100 nM. However, the ‘sensors team’ obtained it at 10 nM. This difference could be attributed to experimental errors and could be addressed by further experimentation for estimating the parameters for several rates including miRNA-anti_miR binding, complex- CTS binding.

Matlab Code

Download our Matlab Code here:

Initial Model Final Model Complete Model

Modelling Diary

Download our modelling diary here.

References

Broad Institute TCGA Genome Data Analysis Center. Analysis-ready standardized TCGA data from broad GDAC firehose 2016_01_28 run. Broad institute of MIT and Harvard. Dataset; 2016. https://doi.org/10.7908/C11G0KM9.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015). “limma powers differential expression analyses for RNA-sequencing and microarray studies.” Nucleic Acids Research, 43(7), e47.
Arjun Sarathi, Ashok Palaniappan. Novel significant stage-specific differentially expressed genes in liver hepatocellular carcinoma.BMC Cancer 2019: 19:663-685
T. Therneau, A package for survival analysis in S. Version 2.38 (2015). http://CRAN.R-project.org/package=survival.
Chen Q, Zeng X, Huang D, Qiu X. Identification of differentially expressed miRNAs in early-stage cervical cancer with lymph node metastasis across The Cancer Genome Atlas datasets. Cancer Manag Res. 2018;10:6489–6504. Published 2018 Nov 28.
https://2017.igem.org/Team:CLSB-UK
Stögbauer, S. (2012) Experiment and quantitative modeling of cell-free gene expression dynamics. Ludwig Maximilian University of Munich, Germany
https://2015.igem.org/Team:Exeter/Modeling
Stögbauer, T., Windhager, L., Zimmer, R., & Rädler, J. O. (2012). Experiment and mathematical modeling of gene expression dynamics in a cell-free system. Integrative Biology, 4(5), 494. doi: 10.1039/c2ib00102k
Zadeh, J. N., Steenberg, C. D., Bois, J. S., Wolfe, B. R., Pierce, M. B., Khan, A. R., … Pierce, N. A. (2010). NUPACK: Analysis and design of nucleic acid systems. Journal of Computational Chemistry, 32(1), 170–173. doi: 10.1002/jcc.21596
http://parts.igem.org/Part:BBa_E0040