Introduction
One of the fundamental bases of Poseidon has undoubtedly been the selection of the appropriate DNA Strand Displacement (DSD) circuit. After extensive bibliographic research and long brainstorming sessions, we concluded in a variety of existing circuits that we modified and we had to test their functionality. The idea of testing the possible circuits in the lab was unappealing, as it demanded consuming both time and laboratory resources. However, as our project development resolved around the idea of a robust DNA circuit the need of testing functional DSD systems was fulfilled by designing full circuit simulations. In the process of designing the simulations, we had to initially encounter with two variables, time and accuracy. At first, we aimed for high accuracy by running simulations at the sequence level with specific nucleotides. The scripts for the simulations are present in our github DNA-Strand-Displacement repository .
After conducting the first laboratory experiments for each system, we were able to obtain time data from the lab results and implement them in our model. Thus, the model was able to predict the time at which the system would reach equilibrium. This was achieved by employing software for calculations of the binding equilibrium constant and the kinetics of each domain and toehold. By exploiting this improvement, we used these predictions as a guide for estimating the remaining Input DNA that was left after the DNA-Protein binding.
Another achievement of our dry lab is the development of an algorithm for photobleaching correction of the experimental data. Observing the depletion of fluorophores over time in the laboratory, we managed to measure the bleaching rate for different values of the initial concentration. By using it we calculated the correct values of the experimental data.
Visual DSD Tool
Visual DSD Tool is a software developed by Microsoft able to perform stochastic and deterministic simulations for a network of DNA Strand Displacement reactions. It can provide very quickly (~5 min or less) representative simulations for very big systems. Analyzing at the domain level, the Visual DSD Tool gives user the ability to obtain qualitative information, such as the approximation of the yield and a crude prevalence of the side and polymerization reactions. Integrating some experimental results and an energetic model for the toeholds to the model, let us obtain similar reaction dynamics between the model and the lab experiments.
Throughout the simulations, we used a stochastic model, which we ran multiple times to obtain multiple trajectories. Hence, we could acquire results for many possible routes and observe the most prevalent. We decided on using such a model because DNA Strand Displacement systems can reach many different, but relevant, equilibrium states.
The inputs for the Visual DSD Tool are the initial amount of molecules for each complex. The Visual DSD Tool converts the given quantities to concentration units. For our system, nM units are described in the best way of our quantities.
Following the above, the stochastic algorithm (SSA) enumerates all possible reactions resulting in the chemical reaction network. Finally, using the Gillespie algorithm, SSA creates trajectories over the specified time frame. Based on our experience, running the simulation 10 times and obtaining 10 trajectory files provided a satisfactory amount of options. In the result section, we show the most prevalent trajectory.
Summing up, using Visual DSD Tool we were able to run quick simulations on systems or parts of our systems, before conducting experiments in the lab. In the analysis section, we present two of the circuits obtained from the model that we also tested experimentally in the lab.
The scripts for the simulations using the Visual DSD Tool, are present in our github repository DNA-Strand-Displacment.
Improving the rates
We managed to improve the kinetic profile of the simulations by using more accurate values for the binding and unbinding rates. These rates were calculated based on an energetic model. Furthermore, by using the experiment results, the rates were better fitted and added to the code based on the experimental time-frame.
To specify the bind and unbind rates, we used the toehold binding free energies, ΔG, calculated by the python package stickydesign. Using this information, we calculated the binding equilibrium Ka for each toehold sequence. As it is described in equation 1, the binding equilibrium can be derived from the deviation of the binding and unbinding rate.
(Eq. 1)
Experimenting with fluctuating values of k_bind and k_unbind, while maintaining their quotient equal to Ka and with the extra data from the wet lab team was the way to go. However, instead of putting random values to the rates, we employed the Kinetic Calculator of the NAB Lab of the Department of Bioengineering, Rice University. Using this tool, we have set some of the binding rates equal to the kinetic rates of the corresponding domain or toehold. The k_unbind was the division of the k_bind with the binding constant. We have to note that some rates were selected through experimenting, without the use of the Kinetic Calculator, so that the simulation has grater correspondence with the laboratory experiment.
Concluding, the usage of custom binding and unbinding rates, instead of the default, improved the time frame of the simulations. More specifically, with the implementation of experimental data and data from the work of NAB lab, we were able to predict, with an acceptable error, the time that the system would reach equilibrium.
KinDA
KinDA is a python package that can predict the kinetic and thermodynamic behavior of systems at the sequence level. With KinDA, researchers can determine if the chosen sequences will result in a functional system. The scripts used for the KinDA simulations along with some results, are demonstrated in our github DNA-Strand-Displacement repository.
The first step of a simulation with KinDA is the enumeration, which creates the Chain Reaction Network (CRN). Starting with complexes, specified by the user, KinDA generates all possible reactions at the domain level every time a new product comes up. More specifically the user has to provide information about the sequences, domains, strands and all the desired or important complexes. Next, the simulation proceeds with the calculation of the kinetic rates.
KinDA mainly uses the first-step reaction model, where each bimolecular reaction is considered to have an intermediate before reaching its final state. Thus, every reaction is composed of reaction pairs that have the intermediate as a common component. As a result, for each reaction there are 3 kinetic rates to be calculated.
A + B ⇌ XP (k11/k21)
XP ⇀ C + D (k12)
or
A + B ⇌ XD (k11/k21) XD ⇀ C (k12)
Before calculating the kinetics for every reaction, the reaction trajectories are being created. The starting state of each trajectory is composed of all the reactants for a set of reactions and the final state is the equilibrium. Every state m of the trajectory is different from its neighbors by the secondary structure of one nucleotide. So state m has one unpaired nucleotide which is paired in state j or vice versa. The probability of moving from state i to the next state, m, of all the possible states j, is:
(Eq. 2)
where k_m,i is the transition rate and not the kinetic rate. The value of transition rates is calculated from the difference of the energy between two neighbor states.
(Eq. 3)
(Eq. 4-5)
The time needed for each transition is taken randomly from the distribution:
(Eq. 6)
(Eq. 7)
The kinetic rates are calculated from the trajectories. Each trajectory Ni begins with a state, which contains the two reactants, and ends with either the final products or intermediate species. From these trajectories, KinDA can calculate the collision rate, k_coll,i, from the net rate of forming any initial base pair between the two structures. The kinetic rate constant is defined as the probability of a trajectory to have products at the last state multiplied with the average collision rates of those trajectories.
(Eq. 8)
Sin=1 for a trajectory with products at the last state, otherwise it is 0.
With this data, the estimator for the kinetic rate constant can also be calculated. This estimator will be used to calculate the error. In equation 10, N represents the number of trajectories.
(Eq. 9)
The error in the value of a kinetic rate is defined as the standard deviation of a distribution of k_i among the trajectories.
(Eq. 10)
Visual DSD Tool
The first system we tested for its functionality, was a hybrid of two systems found in the bibliography, which we call Lufree. The model predicted that the system would not work, so we did not order it. The reactants are shown in the picture below.
The simulation dynamics show clearly that the reaction including the complex F2 does not result in the desired products. We came to this conclusion, as the complex SS is not being consumed while the complex F2 is. That means that the complex F2 reacts but the desired product, which would react with SS, is not produced. From data not shown here, we concluded that a polymerization reaction occurs. The experimental results also showed that the system was dysfunctional.
After trying out different systems, we concluded that the main system should contain an autonomous binding area. We created a system composed of a gate containing the binding site, two gates for signal amplification and of course the reporter. This system, which was named γ-TLA, was inspired by the “Plasmid-derived DNA Strand Displacement Gates for Implementing Chemical Reaction Network“. A simulation of the system for initial concentrations of 35 nM of each Gate (FORK, JOIN), 25 nM of the INPUT Gate, 50 nM of the signal strands and 100 nM of the REPORTER is shown below:
The simulation demonstrated in Figure 2 corresponds to the system we used for the study of both P65 and ELK1. No nucleotides or binding rates have been specified in the code. The concentration of the strand with the fluorophore, C, seems to increase much faster than it does in the laboratory experiments. This disparity was partially resolved once the experimental results were incorporated into the project. The method we used for this improvement is demonstrated in the Improving the DSD rates part of the Simulation Process tab.
Simulation Results
In an attempt to improve the model’s approach on the simulation dynamics, we used custom binding and unbinding rates in the Visual DSD Tool, which resulted in successful predictions of the time at which the system reaches equilibrium.
Using the python package stickydesign we were able to calculate the binding free energy of each toehold, which as shown in Eq. 1 is equal to the k_bind divided by the k_unbinding. By combining this information with the laboratory results we conducted simulations using different pairs of k_bind and k_unbinding for each toehold. As mentioned in the simulation process tab, some k_bind were chosen according to the Kinetic Calculator, a tool developed from the NAB Lab. Finally, we were able to find values that showed the best correspondence with the actual experiment concerning the system equilibrium. The following figure demonstrates two simulations in which we used this technique to improve the mentioned rates. The simulation on the left includes the nucleotides designed for the study of the P65 homodimer and the simulation on the right, the nucleotides designed for the study of the ELK1 homodimer.
The similarity between the results of the two systems is expected because Visual DSD Tool is a software for domain level analysis. The simulations are in accordance with the experimental results. More specifically, the figure below, demonstrates the fluorophore strand concentration of the experimental data for the P65 system in comparison with the in silico prediction of the left figure of Figure 3.
Despite the fact that the dynamic of the simulation differs significantly from the real one, the difference between the two decreases as they reach the equilibrium. For every comparison between experimental and in silico results, we were calculating the deviation between them with the following equation.
Furthermore, simulations of varying concentrations of the INPUT gate were conducted to imitate the depletion of the initial INPUT concentration because of the Transcription Factor. In other words, we simulated the dynamics of C (Fluorophore) when a different fraction of the initial INPUT concentration (25 nM) was available to react. The result is shown in the following Figure.
KinDA
KinDA is a python package developed by DNA and Natural Algorithms Group in Caltech. The package is capable of simulating DNA Strand Displacement systems with high accuracy), and enumerate the reaction network as it can conduct both sequence and domain level analysis. In general, it is suitable for extracting kinetic data and predicting spurious and unproductive reactions.
We used the package to validate some parts of our circuits prior to ordering DNA and to evaluate our system. With the help of KinDA we were able to make decisions to reject or proceed with a part of a circuit or with specific DNA strands. More specific, we primarily made use of the enumeration, the kinetic data and the simulations. The final results needed vast amount time to be computed so we abandoned simulations using KinDA after the first system simulation, though we did use it for other tasks. In the figure below full system simulations are demonstrated for a system similar to the one described by Dr. Soloveichik:
The system in Figure 1, contains the Soloveichik system with custom DNA sequences that differs from our system by two reactions. It took 33 days for the simulation to complete on an 8 core processor with an 0.8 error-goal. The great amount of time for this error demonstrates how slow more accurate simulations can be. The 0.8 means that the calculated k_i is one of the 80% values that are closer to the average value.
This simulation contains the kinetic data for 19,963 reactions while our system is composed of 40,563. From those reactions, only 8 are desired reactions and all 19,959 are unproductive and spurious reactions with low kinetic rates. This is to be expected for systems with large systems with multiple reactants. The results given from 0.8 error goal simulation, have very fast dynamics that do not correspond to reality. Thus using KinDA is regarded to be impractical for simulations of big systems, but instead it seems practical when used for parts of systems.
Photobleaching
One of the main problems of spectrofluorometry is photobleaching. A fluorophore becomes photobleached when a photon breaks it and makes it incapable of producing light. The scripts used for the corrections over photobleaching, are demonstrated in our github repository DNA-Strand-Displacement.
In our experiments photobleaching was one of the main reasons for measurement errors. The scale of photobleaching in our experiments and how it can affect the results, can be better understood with the following figure. Figure 7 shows the change in the signal for a reaction between the Reporter and its input strand, tcC. The reaction is complete in the first 10 minutes of the experiment, where one can observe the maximum signal. After that time, the the signal should be constant, but instead it decreases due to photobleaching.
In order to correct our data over photobleaching, we have created a correction algorithm. The algorithm makes the assumption that fluorophores can become photobleached even when they are next to the quencer. The assumption causes a deviation from the true value but it is at an acceptable level. First we calculated the rate of signal reduction, Rr, from the maximum value. The data for this calculation are derived from an Reporter-tcC experiment with tcC initial concentration of 20 nM. The reduction rate for the time t is given from the following equation.
The experimental data we want to correct, are divided by these rates. More specifically, each point in time is divided with the reduction rate that corresponds to same moment in time.
The resulting light is converted to units of concentration via the trendline equation for multiple Reporter-tcC experiments, for different concentrations of tcC. A demonstration of this correction for a full system with 20 nM of INPUT gate is shown below, in figure 8. The initial concentration for the JOIN and FORK Gates was also 20 nM.
References
[1]J. Berleant, Ch. Berlind, S. Badelt, F. Dannenberg, J. Schaeffer and E. Winfree (2018) Automated sequence-level analysis of kinetics and thermodynamics for domain-level DNA strand-displacement systems.
[2]KinDA
[3]M. R. Lakin, S. Youssef, F. Polo, S. Emmott and A. Phillips (2011) Visual DSD: a design and analysis tool for DNA strand displacement systems
[4]Visual DSD Tool
[5]C. Grun, K. Sarma, B. Wolfe, S. W. Shin and E. Winfree (2015) A domain-level DNA strand displacement reaction enumerator allowing arbitrary non-pseudoknotted secondary structures
[6]Peppercornenumerator
[7]Kinetic Calculator
[8]Evans C. G., Winfree E. (2013) DNA Sticky End Design and Assignment for Robust Algorithmic Self-assembly.
[9]Stickydesign