Team:Thessaly/Model

The interaction between Wet Lab and Dry Lab in a Synthetic Biology project is one of the most crucial preliminary steps, in order to better understand a biological system. The dry lab procedures served our project in two different directions. Firstly, conducting an in silico analysis of the candidate primers and toeholds, we were able to pinpoint the most promising sequences, thus saving time and resources from the wet lab experiments. Moreover, for the purposes of our diagnostic tool we've built a kinetic model to predict the minimum time needed for a visible and reliable result.

In silico analysis

Primer design

The first and most crucial step to detect a DNA fragment in a biological sample, is to amplify it in a specific manner. For our project, an isothermal amplification step with Recombinase Polymerase Amplification (RPA) was included to achieve this goal. Below we describe the workflow followed for our project’s primer design.


Designing primers for isothermal amplification reactions

Before initiating the design process, we first had to set and consider the amplification reaction's parameters. According to the literature, DNA fragments from Mycobacterium tuberculosis (MTB) in urine are found in the length range of 100-200 bp [1,2]. Moreover, the RPA’s optimal primer length is between 30-36 bp. Thus, the main pillars of the design were the amplicon length to be less than 150bp and the primer length to be in the optimal range. In order to proceed with the design, we used BLAST (Primer-BLAST).


Primer-BLAST main parameters

For the Primer-BLAST, we used the following parameters in general:
● IS6110 (BBa_K2973009) was used as the reaction’s DNA template.
● Amplification product size was limited to 150bp.
● Primer melting temperatures did not matter, since there is no temperature change in our reaction, and left as default.
● For the primer pair specificity, we included both the reference sequence mRNA and non-redundant, in order to include major genomes (e.g. Homo sapiens) as well as MTB strains.
● For the primer parameters, primer size was set to 30-36 bp, with a GC content of 30-70%, according to RPA’s manual (TwistDX). GC clamp was set to 2, max Poly-X to 3.
● Max self- and pair- complementary for the primer candidates were set to 5 for “Any” and 2 for the 3’ end.


Checking primers for secondary structures and dimers

After selecting a pool of the 18 most promising primer sets, we checked each primer’s hairpin, self-dimer and hetero-dimer structures, with IDT’s OligoAnalyzer. Moreover, we were interested in the primers’ structures in the reaction’s temperature, which is 42℃, so all the structures created with IDT’s software considered that fact as well.


Adding overhangs to our primers and structure prediction

One of the most significant elements of our project is the overhangs that lie in the 5’ end of the primers. Thus, it was of utter importance to have the least secondary structure formation in our primer sets, including the 5’ end overhangs on both primers. In order to achieve that, we screened a pool of 25 trigger sequences found in the literature [3,4], aiming to find at least one combination that had good enough thermodynamics (dG close to zero and no 3’ end structures) and could perform specific amplification. After checking more than 485 combinations of primers + overhangs, we narrowed down to 4 primer sets, along with 3 trigger sequences that were promising. Mostly because of resource limitation, we selected the most promising of those combinations to conduct our experiments in the lab.


Incorporating a Self-Avoiding Molecular Recognition System in the primer design

Whilst RPA has a lot of advantages over other isothermal and covnentional amplification methods, it embeds some issues. The biggest issue is the strong primer-dimer formation, mostly due to primer length and the low temperature the reaction is incubated. When developing a diagnostic test, specificity is an important aspect of judgment. So, in order to reduce primer-dimer formation, we introduced components of a Self-Avoiding Molecular Recognition System (SAMRS) into our primer sequences. A “Self-Avoiding Molecular Recognition System” (SAMRS) is a species of DNA that binds to natural DNA but not to other members of the same SAMRS species [5]. Following reported literature on RPA-SAMRS primers [6], we exchanged 4 consecutive bases at the 3’ end (starting from the one before the last base) with SAMRS nucleotides. This way, we hope to reduce the primer-dimer formation that we encountered in our non-SAMRS primers.

Toehold design

Toehold switch sensors are programmable synthetic riboregulators that control the translation of a gene through the binding of a trans-acting trigger RNA. The switches contain a hairpin structure that blocks gene translation in cis by sequestration of the ribosome binding site (RBS) and start codon. Upon a switch binding to a complementary trigger RNA, sequestration of the RBS and start codon is relieved, activating gene translation. [3,4]


Design criteria

For our project, we needed to find DNA sequences that are only present in hyperthermophile bacteria and archaea. So, for that reason, we aimed at the 16S rRNA gene of hyperthermophiles in order to generate the trigger part of our toehold switch. Hyperthermophiles contain sequences that are very rare to find in humans so it eliminates the possibility of a false positive result. We confirmed that there is no sequence in humans that could activate the toehold switch by using the NCBI BLAST tool. Other criteria for the design of our biosensor were the structure and the free energy of the trigger sequence and the toehold switch. Specifically, the toehold switch should contain its stems and grooves in the right places and hide the Ribosomal Binding Site, suppressing the expression of the downstream gene in absence of the trigger sequence. Among the toehold switches designed, we tested the two most promising, due to resource limitation. Those toehold switches were designed according to the structure of the 32B toehold switch of Pardee et al. 2016 [3], by replacing the trigger complementary region of the Zika virus with the sequence from the hyperthermofiles. In order to design the toehold switches for our project, we used the NUPACK Design & Analysis algorithms [7-14], [19].


Selection criteria

We designed 15 new toehold switches from 15 different hyperthermophile bacteria and archaea. However, most of their structures were not ideal for our purpose. On the other hand, the toehold switches No.13 and No.14 showed a similar structure to Pardee’s toehold switch 32B[3], so these are the two we tested in the lab. In addition, another criterion that we took into account for the design of our toehold switches, was the similarity of the trigger’s sequence to the DNA sequences that are found in the human genome and in common bacteria that may dwell in the human body. To check the specificity of our biosensor we performed BLAST (Basic Local Alignment Search Tool), a tool that helped us find regions in the human genome that are similar to our trigger and could activate the toehold switch [20]. Out of all the potential triggers, the trigger No.13 was the best, because it showed no similarity with other DNA sequences that appeared in humans. The rest of the triggers, showed some similarity, while the trigger No.14 had 20 nucleotides in common with the Homo sapiens cAMP regulated phosphoprotein 21 (ARPP21) gene. For those reasons, we expected that the Toehold switch system No.13 would have a better performance in the wet lab experiments because the structure of the toehold is more similar to Pardee’s 32B Toehold than Toehold No.14.

The two toehold switches and their triggers, that we finally tested in the lab, are shown below:

Figure 1-2. Trigger sequence No.13 and Toehold switch No.13

The first set of toehold switch/trigger No 13 was designed by using the 16S rRNA of the thermophilic Geobacillus kaustophilus as the source of the trigger sequence.

Figure 3-4. Trigger No.14 and Toehold switch No.14

The toehold switch No.14 contains a region that is complementary to a certain sequence of the 16S rRNA gene of Dictyoglomus turgidum, an organism that thrives in extremely high temperatures.

Minimum time prediction

For our diagnostic platform, we firstly had to define the main pillars of the designed cell-free system, in order to have a concise view of the reactions’ model. To do that, we divided the project into 3 different parts. First, Recombinase Polymerase Amplification (RPA) uses our DNA biomarker as an input to propagate it exponentially over a given amount of time. This amplified DNA sequence is then used as a template for DNA transcription into RNA, by T7 RNA Polymerase, producing many RNA molecules for each DNA molecule. The RNA produced activates the toehold-switch device, which in turn is translated by ribosomes, producing a functional enzyme, β-lactamase. Lastly, β-lactamase hydrolyzes its substrate, nitrocefin, in an enzymatic assay, producing a colorimetric readout.

In collaboration with our Wet Lab team we developed a reaction cascade by coupling RPA to the in vitro transcription/translation system and enzymatic assay. The main goal was to predict the minimum time that our system needs in order to provide a visual result. To do this, we designed a model based on ordinary differential equations (ODE). The mathematical models were solved using MATLAB as a software tool.

Testing concentrations

To determine the sensitivity of the test, we had to use different concentrations of DNA input in our model, to predict an outcome in colorimetric readout. Although in the model we could stochastically lower the DNA input until no signal is produced, there are certain limitations when it comes to a biological system. In theory, at least one copy of the DNA template is needed for an amplification reaction to occur. So, to determine the lowest concentration of DNA input for the model and in extent for our experiments, we had to calculate DNA copy number according to the following formula:

$${Copy~number}={{Amount(ng)*6.022*10^{23}}\over {Length(bp)*10^9*660}}$$

TWe assume that 1 bp of dsDNA is 660 Daltons. The missing variables here are number of copies, amount (in ng) and length of the template (in bp). In our case, the length of our DNA template is 987 bp and we are looking for the corresponding DNA amount in nanograms when the number of copies is 1. If we solve this equation for the amount of DNA as variable X, the solution is that an approximate of 10-9 nanograms of DNA corresponds to 1 copy of it.

In order to provide the Wet Lab team with sufficient data to determine the actual sensitivity, serial dilutions were calculated, starting from 10-2ng/ul to 10-9ng/ul. These concentrations were selected because 1 ul of DNA template is used in each amplification reaction.

First step: Recombinase Polymerase Amplification (RPA)

Τhe first stage of our model is the Recombinase Polymerase Amplification (RPA), a newly developed method which shows great promise in point of care diagnostics. Using RPA, we assume that perfect symmetry exists between the reactions involving forward and reverse primers. Moreover, the ATP depletion is assumed to be negligible for the examined time spans. Supposing that the fluid is well mixed we can safely assume that all the occurring reactions are kinetically controlled. It is worth mentioning that all processes occur at 50 μl reaction mix.

The following figure describes the interactions between the species during the RPA amplification process.

Figure 5. The interactions between the species during RPA

The mechanism of RPA facilitates a recombinase, along with a Single-Stranded Binding Protein (Gp32) to achieve complementarity between primer and DNA template. First, a Gp32 (G) molecule binds to the primer. Gp32 binding helps remove secondary structures from primers to facilitate the binding of recombinase, while several Gp32 molecules bind on each primer (7bp binding site). Next, the pre-assembled recombinase complex, R, displaces Gp32 from the primer. This process is repeated for each Gp32 molecule that is bound to the primer. Recombinase has a 4bp binding site, so R complexes bind to each primer.

Once recombinase has bound a primer, it seeks out homology in the DNA template. Consuming energy, it inserts the primer into the DNA and dissociates. Then, Polymerase, P, extends the primer using resources found in the reaction buffer to produce a new strand of DNA. This process is repeated many times, resulting in millions of DNA strands being produced. [15]

The reactions 1-8 summarize the RPA mechanism:
\[\begin{array}{l} m*G + F \leftrightarrow FG\;(1)\\ R + FG \leftrightarrow FGR'\;(2a)\\ FGR' \to FGR + G\;(2b)\\ FGR + \left( {N - 1} \right){\rm{ }}*{\rm{ }}FGnR'\left( {2c} \right)\\ FGnR' \to FnR(m - 1)*G\;(2d)\\ FnR + D + 2*m*G \leftrightarrow FnRD\;(3)\\ FnRD + n*ATP \to FD + n*R + n*AMP + n*PPi\;(4a)\\ FnRD + n*ATP \to FD + n*ADP + {H_3}P{O_4}\;(4b)\\ FD + P \leftrightarrow PFD\;(5)\\ PFD + 2*B*dNTP \to P + 2*B*PPi + 2*D\;(6)\\ FD + np*PPi \to D + np*dNTP\;(7)\\ D + B*PPi \to B*dNTP + FD\;(8) \end{array}\]

Using our mathematical model, we compared our results with the experimental results and found reasonable agreement with data provided by our wet lab team at both minimum and maximum level of DNA concentration.

Figure 6: Concentration of amplified DNA (M) using 2.94∙10¯¹⁷M as DNA input during time (sec)

For $$minimum ~DNA_{input} = 2.94*10^{-17}M,$$ dry lab results to a concentration of amplified DNA equal to $$DNA_{output}=1.6667*10^{-8} M$$.

Figure 7. Concentration of amplified DNA (M) using 5.8∙10¯¹¹M as DNA input during time (sec)

For maximum DNA input =5.8∙10¯¹¹M, dry lab results to a concentration of amplified DNA output equal to 1.9473∙10¯⁸M.

Comparing the experimental data with our simulation results we point that the difference is less than 2 orders of magnitude, existing only at the minimum concentration level, which is tolerable. Hence, we conclude that as the concentration of input DNA grows, our model results converge to the ones measured in the wet lab. It is possible that in lower input DNA concentrations diffusion phenomena govern the reaction rate resulting in the detected inconsistency. The addition of diffusion terms in RPA the RPA kinetic model is a high priority future work of our team.

Second step: In vitro transcription/translation system

The next step in our model was the in vitro protein synthesis of the desired reporter.Τhe PURExpress kit (Protein synthesis Using Recombinant Elements) was our cell-free system of choice, because of its well characterized and defined nature, since all components exist in precise concentration for each reaction. That makes the system amenable to precise modeling.

Figure 8. The in vitro transcription/translation model for our project.

In our system, the trigger DNA sequence [Trigger_DNA] that has been amplified by the previous step, works as the input, while Hydrolyzed Nitrocefin ([Nitrocefin_Hydrolyzed]) as the output. The model calculates the transcription of a toehold-switch regulated β-lactamase mRNA, the activation of the mRNA by a trigger sequence, the translation of β-lactamase, as well as RNA denaturation. According to the literature[16,17], we assigned Michaelis-Menten kinetics to the transcription and translation reactions.

The reactions taking place in the model are:
\[\begin{array}{l} Toehold\_DNA \to Toehold\_RNA + Toehold\_DNA~~~~~~~~~(9)\\ Trigger\_DNA \to Trigger\_RNA + Trigger\_DNA~~~~~~~~~~(10)\\ Activated \leftrightarrow Toehold\_RNA + Trigger\_RNA~~~~~~~~~~~~~~~~(11)\\ Activated \to blact + Activated~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(12)\\ Toehold\_RNA \to \emptyset ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(13)\\ Trigger\_DNA \to \emptyset~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(14) \end{array}\]

At this point we tested if our experimental data of β-lactamase production fits the results of our modeling data. To obtain experimental data in molar values, we had to convert the reporter absorbance (Abs) from the plate-reader enzymatic assay into enzyme concentration. To do that, we used the Beer-Lambert law:
$$A = {e * l * c}$$
where A is the absorbance, c is the concentration of the enzyme, l is the path length of the photometer, and e is the molar attenuation coefficient of the substrate. The molar attenuation coefficient for nitrocefin is known to be 20,500 M-1 cm-1.

The path length is 1 cm for a standard cuvette, but since we were using a micro-plate reader and thus flat 96-well plates, we had to calibrate it. To do that, we measured 3 replicate wells on the reaction volume, for A900 and A977. After calculating the means for each wavelength, we determined the pathlength in cm by the formula:
$$L = {(A977-A900)\over0.18}$$
which was equivalent to 0.55 cm.

Now we had all the components to measure the concentration of the enzyme in molar and then apply the C1V1=C2V2 law to match the model volume.

The results are depicted in the graph below.

Figure 9. Comparison of the β-lactamase concentration (M) results between wet lab and dry lab during time (hours).

The Wet value represents the experimentally measured concentration of β-lactamase after a 3-hour incubation, while the blact curve represents the model’s calculations for enzyme production, over the same 3-hour incubation. The experimental data from the wet lab matched the model calculated data. It is noteworthy that the same amount of trigger DNA was used both in the experimental and model runs presented here (3*10-7M).

In contrast, when no trigger is present in the model calculation, the blact production is significantly reduced, putatively resulting in no color change, as confirmed by the experimental data.

However, when comparing the model values with the wet lab values, we found out that even though the activated toehold results almost converge, the leakage values were different by several orders of magnitude (Fig. 10)

Figure 10. Comparison between Activated Toehold and Leakage values on wet lab and model experiments.

To address this issue, we tested different Vmax values for the leakage reaction (Fig. 11). The goal was to obtain results that matched the experimental data regarding the leakage. As seen in Fig. 11, when we changed the Vmax value to 3.7*10-9, the results were almost identical with the wet lab value.

Figure 11. Calculated absorbance of hydrolyzed nitrocefin in different Vmax values for the leakage reaction (no trigger added).

Since changing the values of the leakage reactions affect the whole model, we checked the model values of the activated toehold switch, with different Vmax values as well (Fig 12.). As it turns out, changing the Vmax of the toehold leakage does not affect the outcome of the reaction when the trigger sequence is present.

Figure 12. Calculated absorbance of hydrolyzed nitrocefin in different Vmax values for the leakage reaction (no trigger added).

Consequently, increasing the Vmax value by ~3 orders of magnitude, we obtained almost identical values between Wet and Dry labs. This means that in practice, the toehold switch we tested produced more background signal than expected, but that did not affect the outcome of trigger activated reactions.

It is interesting though, how small change in enzyme concentration can result in maximum or not hydrolysis. We assume that the threshold for enzyme concentration between leakage and activated toehold values lie between the values of the 2 lines of Fig. 13.

Figure 13. Comparison of the β-lactamase production under the regulation of a toehold switch when the trigger is absent.

Third step: Enzymatic assay

The last step in our model was to simulate the kinetics of the β-lactamase enzymatic assay, which are well known and characterized (Km and Kcat) [18].

To obtain comparable results with the experimental data, the value of Nitrocefin was 5*10-5M, as added in our enzymatic assay reactions.

$$blact + Nitrocefin \to Nitrocefin\_Hydrolyzed + blact~(15)$$

Using as input the value of produced β-lactamase (blact) from the previous step, we simulated the reaction according to the equation above. The results are depicted in the following graph:

Figure 14. The nitrocefin hydrolysis by the β-lactamase during time (min).

As expected, Nitrocefin is rapidly hydrolyzed while the reaction reaches a plateau during the first 10 minutes. This confirms and supplements the experimental data, regarding the β-lactamase assay.

Fourth step: Predicting the minimum in vitro protein synthesis time

The ultimate goal of the model we have created is to be able to accurately and precisely predict certain outcomes when fed with the corresponding parameters. To exemplify the power of modeling in a wet lab experiment, we sought to predict the minimum time of incubation for the PURExpress reaction, to produce sufficient enzyme to hydrolyze nitrocefin rapidly. Due to budget limitation, we have tested incubation times for as little as 1 hour, and the produced signal was robust, resulting in a visible color change.

We hypothesized that nitrocefin was added in the same reaction with the protein synthesis, in order to assess how fast the substrate would be hydrolyzed by β-lactamase. Moreover, such an experimental setup is easier to conduct, because the enzymatic assay step is completely removed. The results can be seen in the graph below:

Figure 15. Nitrocefin hydrolysis from the β-lactamase through the in vitro transcription and translation system leading to the prediction of the minimum time that is needed for the reaction to be complete.

Nitrocefin is rapidly hydrolyzed in the first 30 minutes of the reaction and then reaches a plateau for the rest of the simulation. It is also noteworthy that 30nM of trigger DNA as input is significantly low.

Hence, considering the above data, it is safe to assume that the in vitro protein synthesis and enzymatic assay steps could be combined into one and conducted in less than 1 hour, making our diagnostic test rapid, since RPA needs as little as 5 minutes to robustly amplify even a single copy of the DNA biomarker.

Nomencalture

References

1. Geldmacher C, Dheda K, Hoelscher M, Labugger I. Evaluation of a Urine-Based Rapid Molecular Diagnostic Test with Potential to be used at Point-of-Care for Pulmonary Tuberculosis: Cape Town Cohort. The Journal of Molecular Diagnostics;(2018). doi:10.1016/j.jmoldx.2017.11.005

2.Labugger I, Heyckendorf J, Dees S, Häussinger E, Herzmann C, Kohl T A, Richter E, Milla E R, Lange C. Detection of transrenal DNA for the diagnosis of pulmonary tuberculosis and treatment monitoring, Infection (2016). doi:10.1007/s15010-016-0955-2

3. Pardee Keith, Alexander A Green, Melissa K Takahashi, David H O Connor, Lee Gehrke, James J Collins (2016). “Rapid , Low-Cost Detection of Zika Virus Using Programmable Biomolecular Components Resources” Cell, 1–12. https://doi.org/10.1016/j.cell.2016.04.059.

4. Pardee, Keith, Alexander A Green, Tom Ferrante, D Ewen Cameron, Ajay Daleykeyser, and Peng Yin (2014). “Paper-Based Synthetic Gene Networks.” Cell 159 (4): 940–54. https://doi.org/10.1016/j.cell.2014.10.004.

5. Hoshika S, Chen F, Leal NA, Benner SA. Artificial genetic systems: self-avoiding DNA in PCR and multiplexed PCR. Angew Chem Int Ed Engl. 2010;49(32):5554–5557. doi:10.1002/anie.201001977

6. Sharma, N., Hoshika, S., Hutter, D., Bradley, K. M., & Benner, S. A. (2014). Recombinase-Based Isothermal Amplification of Nucleic Acids with Self-Avoiding Molecular Recognition Systems (SAMRS). ChemBioChem, 15(15),2268–2274. doi: 10.1002/cbic.201402250

7. J. N. Zadeh, C. D. Steenberg, J. S. Bois, B. R. Wolfe, M. B. Pierce, A. R. Khan, R. M. Dirks, N. A. Pierce. NUPACK: analysis and design of nucleic acid systems. J Comput Chem, 32:170–173, 2011.

8. R. M. Dirks, J. S. Bois, J. M. Schaeffer, E. Winfree, and N. A. Pierce. Thermodynamic analysis of interacting nucleic acid strands. SIAM Rev, 49:65-88, 2007.

9. R. M. Dirks and N. A. Pierce. An algorithm for computing nucleic acid base-pairing probabilities including pseudoknots. J Comput Chem, 25:1295-1304, 2004.

10. R. M. Dirks and N. A. Pierce. A partition function algorithm for nucleic acid secondary structure including pseudoknots. J Comput Chem, 24:1664-1677, 2003.

11. B. R. Wolfe, N. J. Porubsky, J. N. Zadeh, R. M. Dirks, and N. A. Pierce. Constrained multistate sequence design for nucleic acid reaction pathway engineering. J Am Chem Soc, 139:3134-3144, 2017.

12. B. R. Wolfe and N. A. Pierce. Sequence design for a test tube of interacting nucleic acid strands. ACS Synth Biol, 4:1086-1100, 2015.

13. J. N. Zadeh, B. R. Wolfe, and N. A. Pierce. Nucleic acid sequence design via efficient ensemble defect optimization. J Comput Chem, 32:439–452, 2011.

14. R. M. Dirks, M. Lin, E. Winfree, and N. A. Pierce. Paradigms for computational nucleic acid design. Nucl Acids Res, 32:1392-1403, 2004.

15. ‘’A mathematical model of recombinase polymerase amplificationunder continuously stirred conditions’’,by ClintMoodya,HeatherNewella,HendrikViljoena

16. ‘’Quantitative characterization of translationalriboregulators using an in vitro transcription-translation system’’,Anis Senoussi,Jonathan Lee Tin Wah, Yoshihiro Shimizu,JeromeRobert, Alfonso Jaramillo, Sven Findeiss, Ilka M.Axmannand AndreEstevez-Torres

17. Modelling cell-free RNA and protein synthesis with minimalsystems,Anne Doerr, Elise de Reu, Pauline van Nies, Mischa van der Haar, Katy Wei, Johannes Kattan,AljoschaWahland Christophe Danelon

18. Chow C, Xu H, Blanchard JS. Kinetic characterization of hydrolysis of nitrocefin, cefoxitin, and meropenem by β-lactamase from Mycobacterium tuberculosis. Biochemistry. 2013;52(23):4097–4104. doi:10.1021/bi400177y

19. J. N. Zadeh, C. D. Steenberg, J. S. Bois, B. R. Wolfe, M. B. Pierce, A. R. Khan, R. M. Dirks, N. A. Pierce. NUPACK: analysis and design of nucleic acid systems. J Comput Chem, 32:170–173, 2011.

20. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410. PubMed